path: root/copy.c
2025-05-30  skip deleted phis in use width scan  (Quentin Carbonneaux)
2025-03-14  gvn/gcm review  (Quentin Carbonneaux)
- Many stylistic nits.
- Removed blkmerge().
- Some minor bug fixes.
- GCM reassoc is now "sink"; a pass that moves trivial ops in their
  target block with the same goal of reducing register pressure, but
  starting from instructions that benefit from having their inputs
  close.
2025-03-14  Global Value Numbering / Global Code Motion  (Roland Paterson-Jones)
More or less as proposed in its ninth iteration, with the addition of
a gcmmove() functionality to restore coherent local schedules.

Changes since RFC 8:

Features:
- generalization of phi 1/0 detection
- collapse linear jmp chains before GVN; simplifies if-graph detection
  used in 0/non-0 value inference and if-elim...
- infer 0/non-0 values from dominating blk jnz; eliminates redundant
  cmp eq/ne 0 and associated jnz/blocks, for example redundant null
  pointer checks (hare codebase likes this)
- remove (emergent) empty if-then-else graphlets between GVN and GCM;
  improves GCM instruction placement, particularly cmps.
- merge %addr =l add %addr1, N sequences - reduces tmp count, register
  pressure.
- squash consecutive associative ops with constant args, e.g.
  t1 = add t, N ... t2 = add t1, M -> t2 = add t, N+M (see the sketch
  after this message)

Bug Fixes:
- remove "cmp eq/ne of non-identical RCon's" in copyref(). RCon's are
  not guaranteed to be dedup'ed, and symbols can alias.

Codebase:
- moved some stuff into cfg.c including blkmerge()
- some refactoring in gvn.c
- simplification of reassoc.c - always reassoc all cmp ops and
  Kl add %t, N. Better on coremark, smaller codebase.
- minor simplification of movins() - use vins

Testing:
- standard QBE, cproc, hare, harec, coremark [still have Rust build
  issues with latest roland]

Benchmark:
- coremark is ~15%+ faster than master
- hare "HARETEST_INCLUDE='slow' make check" ~8% faster
  (crypto::sha1::sha1_1gb is biggest obvious win - ~25% faster)

Changes since RFC 7:

Bug fixes:
- remove isbad4gcm() in GVN/GCM - it is unsound due to different state
  at GVN vs GCM time; replace with "reassociation" pass after GCM
- fix intra-blk use-before-def after GCM
- prevent GVN from deduping trapping instructions cos GCM will not
  move them
- remove cmp eq/ne identical arg copy detection for floating point, it
  is not valid for NaN
- fix cges/cged flagged as commutative in ops.h instead of cnes/cned
  respectively; just a typo

Minor features:
- copy detection handles cmp le/lt/ge/gt with identical args
- treat (integer) div/rem by non-zero constant as non-trapping
- eliminate add N/sub N pairs in copy detection
- maintain accurate tmp use in GVN; not strictly necessary but enables
  interim global state sanity checking
- "reassociation" of trivial constant offset load/store addresses, and
  cmp ops with point-of-use in pass after GCM
- normalise commutative op arg order - e.g. op con, tmp -> op tmp, con
  to simplify copy detection and GVN instruction dedup

Codebase:
- split out core copy detection and constant folding (back) into
  copy.c, fold.c respectively; gvn.c was getting monolithic
- generic support for instruction moving in ins.c - used by GCM and
  reassoc
- new reassociation pass in reassoc.c
- other minor clean-up/refactor

Changes since RFC 6:
- More ext elimination in GVN by examination of def and use bit width
- elimination of redundant and mask by bit width examination
- Incorporation of Song's patch

Changes since RFC 5:
- avoidance of "bad" candidates for GVN/GCM - trivial address offset
  calculations, and comparisons
- more copy detection mostly around boolean values
- allow elimination of unused load, alloc, trapping instructions
- detection of trivial boolean v ? 1 : 0 phi patterns
- bug fix for (removal of) "chg" optimisation in ins recreation - it
  was missing removal of unused instructions in some cases
- ifelim() between GVN and GCM; deeper nopunused()
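The associative-op squashing listed above can be shown with a small
standalone C program. The toy struct ins and squash() below are
invented for the illustration; they are not the gvn.c/reassoc.c code,
which works on QBE's Ins/Ref types.

	#include <stdio.h>

	/*
	 * Toy instruction: dst = add src, con.  A src of 0 stands for
	 * the original temporary "t"; positive numbers name earlier
	 * results.
	 */
	struct ins { int dst, src; long long con; };

	/* Rewrite dst = add (add t, N), M into dst = add t, N+M. */
	static void
	squash(struct ins *prog, int n)
	{
		int i, j;

		for (i = 0; i < n; i++)
			for (j = 0; j < i; j++)
				if (prog[i].src == prog[j].dst) {
					prog[i].src = prog[j].src;
					prog[i].con += prog[j].con;
				}
	}

	int
	main(void)
	{
		/* t1 = add t, 4;  t2 = add t1, 8  ==>  t2 = add t, 12 */
		struct ins prog[] = { {1, 0, 4}, {2, 1, 8} };

		squash(prog, 2);
		printf("t%d = add t, %lld\n", prog[1].dst, prog[1].con);
		return 0;
	}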
2024-06-09  Optab-driven copy detection  (Roland Paterson-Jones)
2022-12-25  new UNDEF Ref  (Quentin Carbonneaux)
Crashing loads of uninitialized memory proved to be a problem when implementing unions using qbe. This patch introduces a new UNDEF Ref to represent data that is known to be uninitialized. Optimization passes can make use of it to eliminate some code. In the last compilation stages, UNDEF is treated as the constant 0xdeaddead.
2022-11-21  recognize some phis as copies  (Quentin Carbonneaux)
The copy elimination pass is not complete. This patch improves things
a bit, but I think we still have quite a bit of incompleteness.

We now consistently mark phis with all arguments identical as copies.
Previously, they were inconsistently eliminated by phisimpl(). An
example where they were not eliminated is the following:

	@blk2
		%a = phi @blk0 %x, @blk1 %x
		jnz ?, @blk3, @blk4
	@blk3
		%b = copy %x
	@blk4
		%c = phi @blk2 %a, @blk3 %b

In this example, neither %c nor %a were marked as copies of %x
because, when phisimpl() is called, the copy information for %b is not
available.

The incompleteness is still present and can be observed by modifying
the example above so that %a takes a copy of %x through a back-edge.
Then, phisimpl()'s lack of copy information about %b will prevent
optimization.
2022-10-03  fix case of Pool constants  (Quentin Carbonneaux)
2021-08-02  copy: consider identity element for more instructions  (Michael Forney)
udiv %x, 1 == %x, and for each of sub, or, xor, sar, shr, and shl, <op> %x, 0 == %x.
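The rule stated above can be summarised by a small standalone check.
The op names and isidcopy() below are illustrative only, not the
copy.c code; the constant is assumed to be the second argument, which
is what matters for the non-commutative ops.

	#include <stdbool.h>
	#include <stdio.h>

	enum op { Oudiv, Osub, Oor, Oxor, Osar, Oshr, Oshl };

	/* <op> %x, id == %x for the identity element id of each op. */
	static bool
	isidcopy(enum op o, long long rhs)
	{
		switch (o) {
		case Oudiv:
			return rhs == 1;
		case Osub: case Oor: case Oxor:
		case Osar: case Oshr: case Oshl:
			return rhs == 0;
		}
		return false;
	}

	int
	main(void)
	{
		printf("udiv x, 1 is a copy: %d\n", isidcopy(Oudiv, 1));
		printf("shl  x, 0 is a copy: %d\n", isidcopy(Oshl, 0));
		printf("sub  x, 1 is a copy: %d\n", isidcopy(Osub, 1));
		return 0;
	}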
2019-11-25  copy: Fix use of compound literal outside its scope  (Michael Forney)
C99 6.5.2.5p6:

> If the compound literal occurs outside the body of a function,
> the object has static storage duration; otherwise, it has automatic
> storage duration associated with the enclosing block.

So, we can't use the address of a compound literal here. Instead, just
set p to NULL, and make the loop conditional on p being non-NULL.

Remarks from Quentin:

  I made a cosmetic change to Michael's original patch and merely
  pushed the literal at toplevel.
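The following standalone program illustrates the C99 rule quoted
above; bad(), good() and struct range are invented for the example and
are not the copy.c code.

	#include <stdio.h>

	struct range { int lo, hi; };

	/*
	 * Inside a function, a compound literal has automatic storage
	 * duration tied to its enclosing block, so letting its address
	 * escape that block is invalid.
	 */
	const struct range *
	bad(void)
	{
		const struct range *p;

		{
			p = &(struct range){0, 64}; /* dies at this block's '}' */
		}
		return p; /* dangling pointer */
	}

	/*
	 * At file scope the same literal has static storage duration,
	 * so its address stays valid; the patch took the analogous
	 * route of pushing the literal to top level.
	 */
	static const struct range *top = &(struct range){0, 64};

	const struct range *
	good(void)
	{
		return top;
	}

	int
	main(void)
	{
		printf("hi = %d\n", good()->hi);
		(void)bad;
		return 0;
	}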
2019-05-14  fix a bad bug in copy detection  (Quentin Carbonneaux)
The code used to see add 0, 10 as a copy of 0.
2019-05-02  detect ubiquitous simple copies  (Quentin Carbonneaux)
When lowering pointer arithmetic, it is natural for a C frontend to generate those instructions.
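As a hypothetical illustration (not taken from any particular
frontend): even a no-op address computation such as &p[0] is commonly
lowered to an add with a 0 offset, e.g. something like
"%q =l add %p, 0", which copy detection can fold into a plain use of
%p.

	long *
	first(long *p)
	{
		return &p[0]; /* offset 0 * sizeof(long) == 0 */
	}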
2019-02-26  new copy elimination pass  (Quentin Carbonneaux)
The sparse data-flow analysis used for copy elimination before this
patch could sometimes diverge. The core reason for this behavior is
that the visitphi() function was not monotonic in the following
copy-of lattice:

	     top       (represented as the temp itself)
	    / | \
	   x  y  z ...
	    \ | /
	     bot       (represented as R)

This monotonicity defect could be fixed by reverting 2f41ff03, but
then the pass would end up missing some redundant phis.

This patch re-implements the pass from scratch using a different
approach. The new algorithm should get rid of all redundant copies. On
the other hand, it can run slower than the monotonic sparse data-flow
analysis because, in the worst case, an instruction in a phi cluster
can be visited as many times as there are phis in the input program.

Thanks to Michael Forney for reviewing and testing the new pass.
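The copy-of lattice drawn above can be captured by a tiny meet
function. TOP, BOT and meet() are illustrative names for this sketch;
this is neither the old nor the new pass, only the lattice.

	#include <assert.h>

	/*
	 * 0 stands for top (no information yet, i.e. the temp itself),
	 * -1 for bot (not a copy, R in the message above), and positive
	 * values name a concrete ref.
	 */
	enum { TOP = 0, BOT = -1 };

	static int
	meet(int a, int b)
	{
		if (a == TOP)
			return b;
		if (b == TOP)
			return a;
		if (a == b)
			return a;
		return BOT;
	}

	int
	main(void)
	{
		assert(meet(TOP, 3) == 3); /* no info yet: adopt the argument */
		assert(meet(3, 3) == 3);   /* agreeing arguments stay a copy */
		assert(meet(3, 4) == BOT); /* disagreement falls to bottom   */
		return 0;
	}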
2018-04-26  Fix compiler warnings.  (Emil Skoeldberg)
Compiler warned about comparison between signed and unsigned values.
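An illustration of that warning class (not the actual diff): comparing
a signed int index against an unsigned size_t bound trips
-Wsign-compare; using an unsigned index, or a cast, silences it
without changing behaviour for non-negative values. sum() is an
invented example function.

	#include <stddef.h>

	int
	sum(const int *a, size_t n)
	{
		size_t i; /* was: int i */
		int s = 0;

		for (i = 0; i < n; i++)
			s += a[i];
		return s;
	}

	int
	main(void)
	{
		int a[] = {1, 2, 3};

		return sum(a, 3) == 6 ? 0 : 1;
	}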
2017-02-27  cosmetic fixes  (Quentin Carbonneaux)
2017-02-25  do sign/zero extensions removal in copy.c  (Quentin Carbonneaux)
2016-12-08  use a queue for copy elimination  (Quentin Carbonneaux)
2016-10-22  fix bug in copy propagation  (Quentin Carbonneaux)
The pass was not doing anything incorrect, but it missed some
opportunities to optimize. On a copy heavy example I observed that, in
the output of the pass, a phi of the following shape remained:

	%a =w phi @A %c, @B %a

Originally the phi for %a was:

	%a =w phi @A %b, @B %a

Since %b was discovered a copy of %c, %a should have been eliminated
and replaced with %c.

I think the problem is that knowledge on the first argument of the phi
%a changes as the algorithm progresses; a more detailed walk-through
follows.

In the first round of the algorithm, %a is discovered to be a copy of
its first argument %b.

	phi(%b, %a) -> %b

In the second round, %a is computed as the phi of %c (since the first
argument changed) and %b (the result of the first iteration); in our
lattice, the glb of those two is bottom.

	phi(%c, %b) -> %a (Bottom)

Finally, there is a third round in which we compute %a as the phi of
%a and %c, which again, gives bottom.

	phi(%c, %a) -> %a (Bottom)

The bug is not tied to a phi of a copy. For example, if the first
argument is speculated to be a copy of 0 and then this knowledge is
retracted, we will compute in sequence:

	phi(0, %a)  -> 0
	phi(%b, 0)  -> %a (Bottom)
	phi(%b, %a) -> %a (Bottom)

The way I fixed this is by ignoring arguments of the phi that were
discovered to be copies of the phi node itself. This will make the
last rounds above do the correct thing.
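A minimal sketch of that fix, assuming refs are plain integers and a
global copyof[] table; phicopy() and NREF are invented names for the
illustration, not the copy.c code.

	#include <stdio.h>

	enum { NREF = 16 };

	/* copyof[r] == r means r is not (known to be) a copy. */
	static int copyof[NREF];

	/*
	 * Decide what a phi is a copy of: ignore arguments whose current
	 * copy-of is the phi itself and require the remaining arguments
	 * to agree.
	 */
	static int
	phicopy(int phi, const int *arg, int narg)
	{
		int r, a, i;

		r = -1; /* no non-self argument seen yet */
		for (i = 0; i < narg; i++) {
			a = copyof[arg[i]];
			if (a == phi)
				continue; /* copy of the phi itself: skip */
			if (r == -1 || r == a)
				r = a;
			else
				return phi; /* disagreement: not a copy */
		}
		return r == -1 ? phi : r;
	}

	int
	main(void)
	{
		int i, args[2];

		for (i = 0; i < NREF; i++)
			copyof[i] = i;

		/* %a =w phi @A %c, @B %a: with the self argument ignored,
		 * the phi is a copy of %c (%c = 1 and %a = 2 here). */
		args[0] = 1; /* %c */
		args[1] = 2; /* %a */
		printf("copyof(%%a) = %d (expect 1)\n", phicopy(2, args, 2));
		return 0;
	}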
2016-04-20  match jumps/ops with il text  (Quentin Carbonneaux)
2016-04-19  use assert for ssa invariants in fold/copy  (Quentin Carbonneaux)
2016-04-12  diagnose some undefined uses  (Quentin Carbonneaux)
2016-03-31  cleanup error handling  (Quentin Carbonneaux)
2016-03-29  new layout, put LICENSE in root  (Quentin Carbonneaux)