aboutsummaryrefslogtreecommitdiff
path: root/util.c
AgeCommit message (Collapse)Author
13 daysIf-conversion RFC 4 - x86 only (for now), use cmovXXRoland Paterson-Jones
Replacement of tiny conditional jump graphlets with conditional move instructions. Currently enabled only for x86. Arm64 support using cselXX will be essentially identical. Adds (internal) frontend sel0/sel1 ops with flag-specific backend xselXX following jnz implementation pattern. Testing: standard QBE, cproc, harec, hare, roland
2025-03-14gvn/gcm reviewQuentin Carbonneaux
- Many stylistic nits. - Removed blkmerge(). - Some minor bug fixes. - GCM reassoc is now "sink"; a pass that moves trivial ops in their target block with the same goal of reducing register pressure, but starting from instructions that benefit from having their inputs close.
2025-03-14Get rid of movins() infra.Roland Paterson-Jones
2025-03-14Global Value Numbering / Global Code MotionRoland Paterson-Jones
More or less as proposed in its ninth iteration with the addition of a gcmmove() functionality to restore coherent local schedules. Changes since RFC 8: Features: - generalization of phi 1/0 detection - collapse linear jmp chains before GVN; simplifies if-graph detection used in 0/non-0 value inference and if-elim... - infer 0/non-0 values from dominating blk jnz; eliminates redundant cmp eq/ne 0 and associated jnz/blocks, for example redundant null pointer checks (hare codebase likes this) - remove (emergent) empty if-then-else graphlets between GVN and GCM; improves GCM instruction placement, particularly cmps. - merge %addr =l add %addr1, N sequences - reduces tmp count, register pressure. - squash consecutive associative ops with constant args, e.g. t1 = add t, N ... t2 = add t2, M -> t2 = add t, N+M Bug Fixes: - remove "cmp eq/ne of non-identical RCon's " in copyref(). RCon's are not guaranteed to be dedup'ed, and symbols can alias. Codebase: - moved some stuff into cfg.c including blkmerge() - some refactoring in gvn.c - simplification of reassoc.c - always reassoc all cmp ops and Kl add %t, N. Better on coremark, smaller codebase. - minor simplification of movins() - use vins Testing - standard QBE, cproc, hare, harec, coremark [still have Rust build issues with latest roland] Benchmark - coremark is ~15%+ faster than master - hare "HARETEST_INCLUDE='slow' make check" ~8% faster (crypto::sha1::sha1_1gb is biggest obvious win - ~25% faster) Changes since RFC 7: Bug fixes: - remove isbad4gcm() in GVN/GCM - it is unsound due to different state at GVN vs GCM time; replace with "reassociation" pass after GCM - fix intra-blk use-before-def after GCM - prevent GVN from deduping trapping instructions cos GCM will not move them - remove cmp eq/ne identical arg copy detection for floating point, it is not valid for NaN - fix cges/cged flagged as commutative in ops.h instead of cnes/cned respectively; just a typo Minor features: - copy detection handles cmp le/lt/ge/gt with identical args - treat (integer) div/rem by non-zero constant as non-trapping - eliminate add N/sub N pairs in copy detection - maintain accurate tmp use in GVN; not strictly necessary but enables interim global state sanity checking - "reassociation" of trivial constant offset load/store addresses, and cmp ops with point-of-use in pass after GCM - normalise commutative op arg order - e.g. op con, tmp -> op tmp, con to simplify copy detection and GVN instruction dedup Codebase: - split out core copy detection and constant folding (back) out into copy.c, fold.c respectively; gvn.c was getting monolithic - generic support for instruction moving in ins.c - used by GCM and reassoc - new reassociation pass in reassoc.c - other minor clean-up/refactor Changes since RFC 6: - More ext elimination in GVN by examination of def and use bit width - elimination of redundant and mask by bit width examination - Incorporation of Song's patch Changes since RFC 5: - avoidance of "bad" candidates for GVN/GCM - trivial address offset calculations, and comparisons - more copy detection mostly around boolean values - allow elimination of unused load, alloc, trapping instructions - detection of trivial boolean v ? 1 : 0 phi patterns - bug fix for (removal of) "chg" optimisation in ins recreation - it was missing removal of unused instructions in some cases ifelim() between GVN and GCM; deeper nopunused()
2025-03-14Re-use (vgrow) b->ins vector in backend xxx_abi() fn's.Roland Paterson-Jones
Removes last re-allocation of b->ins.
2025-03-14idup(Ins **, Ins *, ulong) -> idup(Blk *, Ins *, ulong)Roland Paterson-Jones
Always used this way and factors setting b->nins. Makes b->ins vector contract more obvious.
2025-03-14Blk::ins is a vectorRoland Paterson-Jones
Scratching an itch - avoid unnecesary re-allocation in idup() which is called often in the optimisation chain. Blk::ins is reallocated in xxx_abi() - needs further fiddling.
2024-04-11fold scaled offsets in addressesQuentin Carbonneaux
2024-04-09use mgen in amd64/isel.cQuentin Carbonneaux
2023-03-19naming nitQuentin Carbonneaux
2023-03-16silence format warning more reliablyQuentin Carbonneaux
2022-12-25new UNDEF RefQuentin Carbonneaux
Crashing loads of uninitialized memory proved to be a problem when implementing unions using qbe. This patch introduces a new UNDEF Ref to represent data that is known to be uninitialized. Optimization passes can make use of it to eliminate some code. In the last compilation stages, UNDEF is treated as the constant 0xdeaddead.
2022-12-14new blit instructionQuentin Carbonneaux
2022-11-22use a new struct for symbolsQuentin Carbonneaux
Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in gobal memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
2022-10-08"rel" fields become "reloc"Quentin Carbonneaux
2022-10-08add support for thread-local storageQuentin Carbonneaux
The apple targets are not done yet.
2022-10-03fix case of Pool constantsQuentin Carbonneaux
2022-03-14dynamic stack allocs for arm64Quentin Carbonneaux
I also moved some isel logic that would have been repeated a third time in util.c.
2022-03-08cosmeticsQuentin Carbonneaux
2021-11-22reuse previous address constants in fold()Michael Forney
parseref() has code to reuse address constants, but this is not done in other passes such as fold or isel. Introduce a new function newcon() which takes a Con and returns a Ref for that constant, and use this whenever creating address constants. This is necessary to fix folding of address constants when one operand is already folded. For example, in %a =l add $x, 1 %b =l add %a, 2 %c =w loadw %b %a and %b were folded to $x+1 and $x+3 respectively, but then the second add is visited again since it uses %a. This gets folded to $x+3 as well, but as a new distinct constant. This results in %b getting labeled as bottom instead of either constant, disabling the replacement of %b by a constant in subsequent instructions (such as the loadw).
2021-10-11util: fix typo preventing 4-byte copy in blit()Michael Forney
2021-07-30err when an address contains a sum $a+$b (afl)Quentin Carbonneaux
Reported by Alessandro Mantovani. These addresses are likely bogus, but they triggered an unwarranted assertion failure. We now raise a civilized error.
2021-03-02fix a couple asan complaintsQuentin Carbonneaux
2017-06-06isreg() does not need to be inlinedQuentin Carbonneaux
2017-05-17intern symbol namesQuentin Carbonneaux
Symbols in the source file are still limited in length because the rest of the code assumes that strings always fit in NString bytes. Regardless, there is already a benefit because comparing/copying symbol names does not require using strcmp()/strcpy() anymore.
2017-05-16do not account for interferences in phi classesQuentin Carbonneaux
Before this commit, I tried to make sure that two interfering temporaries never ended up in the same phi class. This was to make sure that their register hints were not counterproductively stepping on each other's toes. The idea is fine, but: * the implementation is clumsy because it mixes the orthogonal concepts of (i) interference and (ii) phi classes; * the hinting process in the register allocator is hard to understand because the disjoint-set data structure used for phi classes is cut in arbitrary places. After this commit, the phi classes *really* are phi classes represented with a proper disjoint-set data structure.
2017-04-08rework storage of typesQuentin Carbonneaux
The arm64 ABI needs to know precisely what floating point types are being used, so we need to store that information. I also made typ[] a dynamic array.
2017-04-08prepare for multi-targetQuentin Carbonneaux
This big diff does multiple changes to allow the addition of new targets to qbe. The changes are listed below in decreasing order of impact. 1. Add a new Target structure. To add support for a given target, one has to implement all the members of the Target structure. All the source files where changed to use this interface where needed. 2. Single out amd64-specific code. In this commit, the amd64 target T_amd64_sysv is the only target available, it is implemented in the amd64/ directory. All the non-static items in this directory are prefixed with either amd64_ or amd64_sysv (for items that are specific to the System V ABI). 3. Centralize Ops information. There is now a file 'ops.h' that must be used to store all the available operations together with their metadata. The various targets will only select what they need; but it is beneficial that there is only *one* place to change to add a new instruction. One good side effect of this change is that any operation 'xyz' in the IL now as a corresponding 'Oxyz' in the code. 4. Misc fixes. One notable change is that instruction selection now generates generic comparison operations and the lowering to the target's comparisons is done in the emitter. GAS directives for data are the same for many targets, so data emission was extracted in a file 'gas.c'. 5. Modularize the Makefile. The Makefile now has a list of C files that are target-independent (SRC), and one list of C files per target. Each target can also use its own 'all.h' header (for example to define registers).
2017-01-12use a less obtuse api for vnew()Quentin Carbonneaux
2017-01-04improve performance of bsiter()Quentin Carbonneaux
2016-12-12create cfg.c for cfg-related functionsQuentin Carbonneaux
2016-12-12make newtmp() return zeroed out temporariesQuentin Carbonneaux
This was not necessary as temporaries were never freed and returned from an array zero initialized. But in the coming load optimization, we sometimes free temporaries by resetting fn->ntmp.
2016-08-15specify the allocation function in vnewQuentin Carbonneaux
2016-04-18factor some subtyping logic in clsmerge()Quentin Carbonneaux
2016-04-13oops, memcpy -> memmoveQuentin Carbonneaux
2016-04-13handle odd jumps in blkdel() an renblk()Quentin Carbonneaux
2016-04-13separate name and index in newtmp()Quentin Carbonneaux
2016-04-09did I loose my c?Quentin Carbonneaux
2016-04-09add a proper block deletion routineQuentin Carbonneaux
2016-04-05speedup bscount()Quentin Carbonneaux
2016-04-01tradeoff the type of bsiter()Quentin Carbonneaux
int is used all over the place for temporaries, maybe this should be changed, I don't know. Another thing to consider is that temporaries are currently on 12 bits (and will be on 29 or 30 bits in the future), so int will always be safe to store them. We just loose the free invariant of non-negativity.
2016-03-31move abi code in a new fileQuentin Carbonneaux
2016-03-31cleanup error handlingQuentin Carbonneaux
2016-03-29new layout, put LICENSE in rootQuentin Carbonneaux