| Age | Commit message | Author |
|
|
|
- Many stylistic nits.
- Removed blkmerge().
- Some minor bug fixes.
- GCM reassoc is now "sink": a pass that
  moves trivial ops into their target block
  with the same goal of reducing register
  pressure, but starting from instructions
  that benefit from having their inputs
  close.
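A hedged IL sketch of the kind of motion sink performs
(example mine, not from the patch):
function w $f(w %c, l %base) {
@start
	%addr =l add %base, 8     # before sink: %addr is live
	jnz %c, @yes, @no         # across the branch
@yes
	%v =w loadw %addr
	ret %v
@no
	ret 0
}
Sink would move the add into @yes, next to the load that
consumes it, so %addr no longer ties up a register on the
@no path.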
|
|
More or less as proposed in its ninth iteration, with the
addition of gcmmove() functionality to restore coherent
local schedules.
Changes since RFC 8:
Features:
- generalization of phi 1/0 detection
- collapse linear jmp chains before GVN; simplifies if-graph
detection used in 0/non-0 value inference and if-elim...
- infer 0/non-0 values from dominating blk jnz; eliminates
redundant cmp eq/ne 0 and associated jnz/blocks, for example
redundant null pointer checks (hare codebase likes this)
- remove (emergent) empty if-then-else graphlets between GVN and
GCM; improves GCM instruction placement, particularly cmps.
- merge %addr =l add %addr1, N sequences - reduces tmp count,
register pressure.
- squash consecutive associative ops with constant args, e.g.
  t1 = add t, N ... t2 = add t1, M -> t2 = add t, N+M
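A hedged IL sketch of the last two items (example mine):
%a1 =l add %a0, 8
%a2 =l add %a1, 16
%v =w loadw %a2
# after merging/squashing the chain:
%a2 =l add %a0, 24
%v =w loadw %a2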
Bug Fixes:
- remove "cmp eq/ne of non-identical RCon's " in copyref().
RCon's are not guaranteed to be dedup'ed, and symbols can
alias.
Codebase:
- moved some stuff into cfg.c including blkmerge()
- some refactoring in gvn.c
- simplification of reassoc.c - always reassoc all cmp ops
and Kl add %t, N. Better on coremark, smaller codebase.
- minor simplification of movins() - use vins
Testing - standard QBE, cproc, hare, harec, coremark
[still have Rust build issues with latest roland]
Benchmark
- coremark is ~15%+ faster than master
- hare "HARETEST_INCLUDE='slow' make check" ~8% faster
(crypto::sha1::sha1_1gb is biggest obvious win - ~25% faster)
Changes since RFC 7:
Bug fixes:
- remove isbad4gcm() in GVN/GCM - it is unsound due to different state
at GVN vs GCM time; replace with "reassociation" pass after GCM
- fix intra-blk use-before-def after GCM
- prevent GVN from deduping trapping instructions because GCM
  will not move them
- remove cmp eq/ne identical arg copy detection for floating point, it
is not valid for NaN
- fix cges/cged flagged as commutative in ops.h instead of cnes/cned
respectively; just a typo
Minor features:
- copy detection handles cmp le/lt/ge/gt with identical args
- treat (integer) div/rem by non-zero constant as non-trapping
- eliminate add N/sub N pairs in copy detection (sketch after this list)
- maintain accurate tmp use in GVN; not strictly necessary but enables
interim global state sanity checking
- "reassociation" of trivial constant offset load/store addresses, and
cmp ops with point-of-use in pass after GCM
- normalise commutative op arg order - e.g. op con, tmp -> op tmp, con
to simplify copy detection and GVN instruction dedup
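Hedged IL sketches of the add N/sub N and arg-order items
(examples mine):
# add N / sub N pair: %z is detected as a copy of %x
%y =w add %x, 8
%z =w sub %y, 8
# commutative arg normalisation: op con, tmp -> op tmp, con
%r =w add 10, %x
# becomes
%r =w add %x, 10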
Codebase:
- split out core copy detection and constant folding (back) out into
copy.c, fold.c respectively; gvn.c was getting monolithic
- generic support for instruction moving in ins.c - used by GCM and
reassoc
- new reassociation pass in reassoc.c
- other minor clean-up/refactor
Changes since RFC 6:
- More ext elimination in GVN by examination of def and use bit widths
- elimination of redundant and-masks by bit-width examination
  (sketch below)
- Incorporation of Song's patch
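A hedged IL sketch of the two width-based eliminations above
(example mine):
%b =w and %a, 255   # def leaves only the low 8 bits set
%c =w extub %b      # redundant ext: %b is already zero-extended
%d =w and %c, 255   # redundant mask: it cannot change the value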
Changes since RFC 5:
- avoidance of "bad" candidates for GVN/GCM - trivial address offset
calculations, and comparisons
- more copy detection mostly around boolean values
- allow elimination of unused load, alloc, trapping instructions
- detection of trivial boolean v ? 1 : 0 phi patterns (sketch below)
- bug fix for (removal of) "chg" optimisation in ins recreation - it
  was missing removal of unused instructions in some cases
- ifelim() between GVN and GCM; deeper nopunused()
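A hedged IL sketch of the trivial boolean phi pattern (example
mine; the rewrite shown is my assumption of the obvious one):
@start
	jnz %v, @t, @f
@t
	jmp @join
@f
	jmp @join
@join
	%b =w phi @t 1, @f 0
# %b is just "%v != 0", so it can be computed directly:
	%b =w cnew %v, 0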
|
|
Crashing loads of uninitialized memory
proved to be a problem when implementing
unions using qbe. This patch introduces
a new UNDEF Ref to represent data that is
known to be uninitialized. Optimization
passes can make use of it to eliminate
some code. In the last compilation stages,
UNDEF is treated as the constant 0xdeaddead.
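One hypothetical C-level source of such loads (example mine):
copying a partially initialized union forces the frontend to
load bytes that were never stored.

union u { char c; long l; };

union u
make(void)
{
	union u x;
	x.c = 'a';   /* only 1 of 8 bytes is written */
	return x;    /* the copy-out loads all 8 */
}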
|
|
The copy elimination pass is not
complete. This patch improves
things a bit, but I think we still
have quite a bit of incompleteness.
We now consistently mark phis with
all arguments identical as copies.
Previously, they were inconsistently
eliminated by phisimpl(). An example
where they were not eliminated is
the following:
@blk2
%a = phi @blk0 %x, @blk1 %x
jnz ?, @blk3, @blk4
@blk3
%b = copy %x
@blk4
%c = phi @blk2 %a, @blk3 %b
In this example, neither %c nor %a
were marked as copies of %x because,
when phisimpl() is called, the copy
information for %b is not available.
The incompleteness is still present
and can be observed by modifying
the example above so that %a takes
a copy of %x through a back-edge.
Then, phisimpl()'s lack of copy
information about %b will prevent
optimization.
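For concreteness, a variant with such a back-edge
(reconstructed from the description above):
@blk2
%a = phi @blk0 %x, @blk3 %b
jnz ?, @blk3, @blk4
@blk3
%b = copy %x
jmp @blk2
@blk4
%c = copy %a
Here @blk3 -> @blk2 is a back-edge; when phisimpl() processes
%a, the copy information for %b is still missing, so %a is
never recognised as a copy of %x.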
|
|
udiv %x, 1 == %x, and for each of sub, or, xor, sar, shr, and shl,
<op> %x, 0 == %x.
|
|
C99 6.5.2.5p6:
> If the compound literal occurs outside the body of a function,
> the object has static storage duration; otherwise, it has automatic
> storage duration associated with the enclosing block.
So, we can't use the address of a compound literal here. Instead,
just set p to NULL, and make the loop conditional on p being non-NULL.
Remarks from Quentin:
I made a cosmetic change to Michael's
original patch and merely moved the
literal to toplevel.
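A minimal, hypothetical illustration of the problem and of both
fixes (names mine, not the actual QBE code):

#include <stddef.h>

struct lit { struct lit *next; };

struct lit *
broken(void)
{
	/* the literal's storage ends with the enclosing block
	 * (C99 6.5.2.5p6), so the returned address dangles */
	struct lit *p = &(struct lit){0};
	return p;
}

void
fixed(void)
{
	struct lit *p = NULL;       /* Michael's fix: no literal */
	for (; p; p = p->next)      /* loop runs only when p != NULL */
		;
}

/* Quentin's variant: at file scope the literal has static
 * storage duration, so its address is a valid constant */
static struct lit *q = &(struct lit){0};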
|
|
The code used to see add 0, 10 as
a copy of 0.
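In IL terms (reading mine):
%x =w add 0, 10
# copy detection matched the identity element on the wrong
# argument: %x is a copy of 10, not of 0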
|
|
When lowering pointer arithmetic, it is
natural for a C frontend to generate
those instructions.
|
|
The sparse data-flow analysis used for
copy elimination before this patch
could sometimes diverge. The core
reason for this behavior is that the
visitphi() function was not monotonic
in the following copy-of lattice:
        top       (represented as the temp itself)
       / | \
      x  y  z ...
       \ | /
        bot       (represented as R)
This monotonicity defect could be
fixed by reverting 2f41ff03, but
then the pass would end up missing
some redundant phis.
This patch re-implements the pass
from scratch using a different
approach. The new algorithm should
get rid of all redundant copies. On
the other hand, it can run slower
than the monotonic sparse data-flow
analysis because, in the worst case,
an instruction in a phi cluster can
be visited as many times as there
are phis in the input program.
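As a hypothetical illustration of the lattice (sketch mine,
not QBE's code), a monotone meet over copy-of values would
look like this; the pre-patch visitphi() did not behave like
one:

/* uses Ref, req() and R from QBE's all.h */
Ref
meet(Ref a, Ref b, Ref top)
{
	if (req(a, top))
		return b;    /* meet(top, x) = x */
	if (req(b, top))
		return a;
	if (req(a, b))
		return a;    /* meet(x, x) = x */
	return R;            /* two distinct copies: bottom */
}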
Thanks to Michael Forney for reviewing
and testing the new pass.
|
|
Compiler warned about comparison between signed and unsigned values.
|
|
|
|
|
|
|
|
The pass was not doing anything incorrect, but
it missed some opportunities to optimize. On
a copy-heavy example I observed that, in the
output of the pass, a phi of the following shape
remained:
%a =w phi @A %c, @B %a
Originally the phi for %a was:
%a =w phi @A %b, @B %a
Since %b was discovered a copy of %c, %a should
have been eliminated and replaced with %c.
I think the problem is that knowledge about the
first argument of the phi %a changes as the
algorithm progresses; a more detailed walk-
through follows.
In the first round of the algorithm, %a is
discovered to be a copy of its first argument
%b.
phi(%b, %a) -> %b
In the second round, %a is computed as the phi
of %c (since the first argument changed) and %b
(the result of the first iteration); in our
lattice, the glb of those two is bottom.
phi(%c, %b) -> %a (Bottom)
Finally, there is a third round in which we
compute %a as the phi of %a and %c, which again,
gives bottom.
phi(%c, %a) -> %a (Bottom)
The bug is not tied to a phi of a copy. For
example, if the first argument is speculated
to be a copy of 0 and this knowledge is then
retracted, we will compute in sequence:
phi(0, %a) -> 0
phi(%b, 0) -> %a (Bottom)
phi(%b, %a) -> %a (Bottom)
The way I fixed this is by ignoring arguments
of the phi that were discovered to be copies of
the phi node itself. This will make the last
rounds above do the correct thing.
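A hedged C sketch of that fix (copyof() is a hypothetical
helper returning the current copy-of value of a Ref; Phi, Ref,
req() and R are QBE's):

/* compute the copy-of value of a phi, skipping arguments
 * that are themselves known copies of the phi */
Ref
phicopy(Phi *p)
{
	Ref r, a;
	uint n;

	r = R;
	for (n=0; n<p->narg; n++) {
		a = copyof(p->arg[n]);  /* hypothetical helper */
		if (req(a, p->to))
			continue;       /* copy of the phi itself */
		if (req(r, R) || req(r, a))
			r = a;
		else
			return p->to;   /* two values: bottom */
	}
	return r;
}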
|
|