path: root/parse.c
Age  Commit message  Author
2025-03-14  gvn/gcm review  Quentin Carbonneaux
- Many stylistic nits.
- Removed blkmerge().
- Some minor bug fixes.
- GCM reassoc is now "sink"; a pass that moves trivial ops in their target block with the same goal of reducing register pressure, but starting from instructions that benefit from having their inputs close.
2025-03-14  Global Value Numbering / Global Code Motion  Roland Paterson-Jones
More or less as proposed in its ninth iteration with the addition of a gcmmove() functionality to restore coherent local schedules.

Changes since RFC 8:

Features:
- generalization of phi 1/0 detection
- collapse linear jmp chains before GVN; simplifies if-graph detection used in 0/non-0 value inference and if-elim...
- infer 0/non-0 values from dominating blk jnz; eliminates redundant cmp eq/ne 0 and associated jnz/blocks, for example redundant null pointer checks (hare codebase likes this)
- remove (emergent) empty if-then-else graphlets between GVN and GCM; improves GCM instruction placement, particularly cmps.
- merge %addr =l add %addr1, N sequences - reduces tmp count, register pressure.
- squash consecutive associative ops with constant args, e.g. t1 = add t, N ... t2 = add t1, M -> t2 = add t, N+M

Bug Fixes:
- remove "cmp eq/ne of non-identical RCon's" in copyref(). RCon's are not guaranteed to be dedup'ed, and symbols can alias.

Codebase:
- moved some stuff into cfg.c including blkmerge()
- some refactoring in gvn.c
- simplification of reassoc.c - always reassoc all cmp ops and Kl add %t, N. Better on coremark, smaller codebase.
- minor simplification of movins() - use vins

Testing:
- standard QBE, cproc, hare, harec, coremark [still have Rust build issues with latest roland]

Benchmark:
- coremark is ~15%+ faster than master
- hare "HARETEST_INCLUDE='slow' make check" ~8% faster (crypto::sha1::sha1_1gb is biggest obvious win - ~25% faster)

Changes since RFC 7:

Bug fixes:
- remove isbad4gcm() in GVN/GCM - it is unsound due to different state at GVN vs GCM time; replace with "reassociation" pass after GCM
- fix intra-blk use-before-def after GCM
- prevent GVN from deduping trapping instructions cos GCM will not move them
- remove cmp eq/ne identical arg copy detection for floating point, it is not valid for NaN
- fix cges/cged flagged as commutative in ops.h instead of cnes/cned respectively; just a typo

Minor features:
- copy detection handles cmp le/lt/ge/gt with identical args
- treat (integer) div/rem by non-zero constant as non-trapping
- eliminate add N/sub N pairs in copy detection
- maintain accurate tmp use in GVN; not strictly necessary but enables interim global state sanity checking
- "reassociation" of trivial constant offset load/store addresses, and cmp ops with point-of-use in pass after GCM
- normalise commutative op arg order - e.g. op con, tmp -> op tmp, con to simplify copy detection and GVN instruction dedup

Codebase:
- split out core copy detection and constant folding (back) out into copy.c, fold.c respectively; gvn.c was getting monolithic
- generic support for instruction moving in ins.c - used by GCM and reassoc
- new reassociation pass in reassoc.c
- other minor clean-up/refactor

Changes since RFC 6:
- More ext elimination in GVN by examination of def and use bit width
- elimination of redundant and mask by bit width examination
- Incorporation of Song's patch

Changes since RFC 5:
- avoidance of "bad" candidates for GVN/GCM - trivial address offset calculations, and comparisons
- more copy detection mostly around boolean values
- allow elimination of unused load, alloc, trapping instructions
- detection of trivial boolean v ? 1 : 0 phi patterns
- bug fix for (removal of) "chg" optimisation in ins recreation - it was missing removal of unused instructions in some cases

ifelim() between GVN and GCM; deeper nopunused()
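The RFC 7 note that identical-argument cmp eq/ne folding "is not valid for NaN" can be illustrated with a small C sketch. The function names cmp_eq_l and cmp_eq_d are hypothetical stand-ins for an integer and a double equality comparison; they are not QBE functions.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for an integer equality compare. */
bool cmp_eq_l(long a, long b)
{
    return a == b;
}

/* Hypothetical stand-in for a double equality compare. */
bool cmp_eq_d(double a, double b)
{
    return a == b;
}

/* With identical arguments the integer compare is always true, so it
 * can be replaced by the constant 1. The double compare cannot: a NaN
 * compares unequal to itself under IEEE 754. */
bool same_arg_eq_l(long x)
{
    return cmp_eq_l(x, x);
}

bool same_arg_eq_d(double x)
{
    return cmp_eq_d(x, x);
}
```

Calling same_arg_eq_d(NAN) yields false while same_arg_eq_l is true for every input, which is why the fold is only safe for integer classes.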
2025-03-14  idup(Ins **, Ins *, ulong) -> idup(Blk *, Ins *, ulong)  Roland Paterson-Jones
Always used this way and factors setting b->nins. Makes b->ins vector contract more obvious.
2024-08-23  skip preludes for some leaf fns  Quentin Carbonneaux
When rbp is not necessary to compile a leaf function, we skip saving and restoring it.
2024-08-15  align emitted code  Quentin Carbonneaux
Functions are now aligned on 16-byte boundaries. This mimics gcc and should help reduce the maximum perf impact of cosmetic code changes. Previously, any change in the output of qbe could have far reaching implications on alignment. Thanks to Roland Paterson-Jones for pointing out the variability issue.
2024-06-09  Optab-driven copy detection  Roland Paterson-Jones
2024-04-13  parse: use dynamically sized hashtable for temporaries  Michael Forney
This significantly improves parsing performance for massive functions with a huge number of temporaries. Parsing the 86MiB IL produced by cproc during zig bootstrap drops from 17m15s to 2.5s (over 400x speedup). The speedup is much smaller for IL produced from normal non-autogenerated C code. Parsing the sqlite3 amalgamation drops from 0.40s to 0.33s.
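A minimal sketch of a dynamically sized hash table that maps temporary names to ids, to illustrate why growth keeps lookups cheap for huge functions. Everything here is an assumption made for illustration (the TmpTabSketch type, the FNV-1a hash, linear probing, the load factor of 1/2); it is not the code from this commit.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: open addressing, capacity is a power of two. */
typedef struct {
    const char **names; /* slot -> name, NULL when the slot is empty */
    int *ids;           /* slot -> temporary id */
    size_t cap;         /* number of slots */
    size_t len;         /* number of stored names */
} TmpTabSketch;

/* FNV-1a, chosen only for brevity. */
static uint32_t hashstr(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Insert into a table that is known to have a free slot. */
static void rawput(TmpTabSketch *t, const char *name, int id)
{
    size_t i = hashstr(name) & (t->cap - 1);
    while (t->names[i])
        i = (i + 1) & (t->cap - 1);
    t->names[i] = name;
    t->ids[i] = id;
}

/* Double the capacity and rehash every stored entry. */
static void grow(TmpTabSketch *t)
{
    TmpTabSketch old = *t;
    size_t i;

    t->cap = old.cap ? old.cap * 2 : 16;
    t->names = calloc(t->cap, sizeof *t->names);
    t->ids = calloc(t->cap, sizeof *t->ids);
    if (!t->names || !t->ids)
        abort();
    for (i = 0; i < old.cap; i++)
        if (old.names[i])
            rawput(t, old.names[i], old.ids[i]);
    free(old.names);
    free(old.ids);
}

/* Return the id of name, assigning the next free id on first sight.
 * The caller must keep the name string alive. */
int tmpid(TmpTabSketch *t, const char *name)
{
    size_t i;

    if (2 * (t->len + 1) > t->cap) /* keep the load factor <= 1/2 */
        grow(t);
    i = hashstr(name) & (t->cap - 1);
    while (t->names[i]) {
        if (strcmp(t->names[i], name) == 0)
            return t->ids[i];
        i = (i + 1) & (t->cap - 1);
    }
    t->names[i] = name;
    t->ids[i] = (int)t->len;
    return (int)t->len++;
}
```

Starting from a zero-initialized TmpTabSketch, repeated tmpid() calls return stable ids. Because the table doubles whenever it would exceed half full, probe sequences stay short regardless of how many temporaries a function defines.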
2024-04-12  add common linkage for data  Quentin Carbonneaux
2024-04-09  use mgen in amd64/isel.c  Quentin Carbonneaux
2024-04-04  fix accidentally noop loop  Quentin Carbonneaux
Credit goes to Roland Paterson-Jones for spotting this bug.
2024-04-03  do not parse +N constants  Quentin Carbonneaux
The parsing code for these constants conflicts with the Tplus token.
2024-03-28  check that data alignment is in range and a power of two  Michael Forney
Otherwise, the alignment gets truncated to fit in char, so `align 256` is handled as no alignment requirement.
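The truncation problem and the check described above can be sketched in C as follows. The helper names and the max parameter are hypothetical; the actual upper bound used by qbe is not stated in this log.

```c
#include <stdbool.h>

/* What happens when an alignment is stored in an 8-bit field without
 * validation: only the low 8 bits survive, so 256 becomes 0. */
unsigned char truncalign(long n)
{
    return (unsigned char)n;
}

/* Hypothetical validation: n must be positive, no larger than max,
 * and a power of two (exactly one bit set). */
bool alignok(long n, long max)
{
    return n > 0 && n <= max && (n & (n - 1)) == 0;
}
```

With a hypothetical limit of 128, alignok(8, 128) holds, while alignok(256, 128) and alignok(12, 128) are rejected before the value can be truncated.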
2024-01-02  dbgloc: add column argument  Drew DeVault
dbgloc line [col]
This is implemented in a backwards-compatible manner.
2023-08-18  file,loc become dbgfile,dbgloc  Quentin Carbonneaux
2023-06-07  parseline() tweaks  Quentin Carbonneaux
2023-06-06  implement line number info tracking  Thomas Bracht Laumann Jespersen
Support "file" and "loc" directives. "file" takes a string (a file name), assigns it a number, sets the current file to that number, and records the string for later. "loc" takes a single number and outputs location information with a reference to the current file.
2023-04-02  print prefix for thread-local symbols  Quentin Carbonneaux
2023-03-22  rename blknew() to newblk()  Quentin Carbonneaux
This is consistent with newtmp() and newcon().
2022-12-25  new UNDEF Ref  Quentin Carbonneaux
Crashing loads of uninitialized memory proved to be a problem when implementing unions using qbe. This patch introduces a new UNDEF Ref to represent data that is known to be uninitialized. Optimization passes can make use of it to eliminate some code. In the last compilation stages, UNDEF is treated as the constant 0xdeaddead.
2022-12-14  new blit instruction  Quentin Carbonneaux
2022-12-12  new rsval() helper for signed Refs  Quentin Carbonneaux
The .val field is signed in RSlot. Add a new dedicated function to fetch it as a signed int.
2022-11-27  new hlt block terminator  Quentin Carbonneaux
It is handy to express when the end of a block cannot be reached. If a hlt terminator is executed, it traps the program. We don't go the llvm way and specify execution semantics as undefined behavior.
2022-11-22  use a new struct for symbols  Quentin Carbonneaux
Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in global memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
2022-10-08  "rel" fields become "reloc"  Quentin Carbonneaux
2022-10-08  add support for thread-local storage  Quentin Carbonneaux
The apple targets are not done yet.
2022-10-03  flag bad vastart uses  Quentin Carbonneaux
2022-10-03  fix case of Pool constants  Quentin Carbonneaux
2022-10-03  parse sb,ub,sh,uh abi types  Quentin Carbonneaux
2022-09-15  Fix parsing of multiple globals in datadef  Ember Sawady
Eg. data $a = { w $b $c }
2022-07-01  Reject multiple section definition for a symbol  Roberto E. Vargas Caballero
2022-07-01  Add qbe identifier in error strings  Roberto E. Vargas Caballero
When qbe is used with other tools, it is a bit hard to identify which tool is generating the error. Adding an identifier at the beginning of the line makes it much easier to identify the tool generating the error.
2022-04-11  do not leak type fields  Quentin Carbonneaux
Thanks to Daniel Xu for reporting.
2022-03-08  flag types defined as unions  Quentin Carbonneaux
The risc-v abi needs to know if a type is defined as a union or not. We cannot use nunion to obtain this information because the risc-v abi made the unfortunate decision of treating union { int i; } differently from int i; So, instead, I introduce a single bit flag 'isunion'.
2022-02-24  parse: allow string after first data item  Paul Ouellette
2022-02-11  gas: put zero data into .bss by default  Michael Forney
This allows frontends to use BSS generically, without knowledge of platform-dependent details.
2022-02-02  shared linkage logic for func/data  Quentin Carbonneaux
2022-01-28  update token hash params  Quentin Carbonneaux
2022-01-23  increase token limit to 255  Bor Grošelj Simić
2022-01-23  Add a negation instruction  Eyal Sawady
Necessary for floating-point negation, because `%result = sub 0, %operand` doesn't give the correct sign for 0/-0.
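The signed-zero issue can be reproduced in plain C, assuming IEEE 754 doubles and the default rounding mode. The function names below are hypothetical and only mirror the two ways of negating.

```c
#include <math.h>
#include <stdbool.h>

/* Negation expressed as a subtraction from zero. */
double neg_by_sub(double x)
{
    return 0.0 - x;
}

/* True negation: flips the sign bit. */
double neg_direct(double x)
{
    return -x;
}

/* Report whether the sign bit of x is set (true for -0.0). */
bool is_negative_signed(double x)
{
    return signbit(x) != 0;
}
```

For x = 0.0, neg_by_sub returns +0.0 (0.0 - 0.0 rounds to positive zero), whereas neg_direct returns -0.0. For nonzero finite inputs the two agree, so the difference only shows up at zero.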
2021-11-22  reuse previous address constants in fold()  Michael Forney
parseref() has code to reuse address constants, but this is not done in other passes such as fold or isel. Introduce a new function newcon() which takes a Con and returns a Ref for that constant, and use this whenever creating address constants. This is necessary to fix folding of address constants when one operand is already folded. For example, in

    %a =l add $x, 1
    %b =l add %a, 2
    %c =w loadw %b

%a and %b were folded to $x+1 and $x+3 respectively, but then the second add is visited again since it uses %a. This gets folded to $x+3 as well, but as a new distinct constant. This results in %b getting labeled as bottom instead of either constant, disabling the replacement of %b by a constant in subsequent instructions (such as the loadw).
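The idea of returning an existing constant instead of creating a duplicate can be sketched as below. The types ConSketch and FnSketch, the fixed-size pool, and the function intern_con() are all hypothetical simplifications; they do not reproduce the real Con, Fn, or newcon() definitions.

```c
#include <assert.h>
#include <string.h>

enum { MAXCON = 64 };

/* Hypothetical address constant: a symbol name plus an offset. */
typedef struct {
    char sym[32];
    long off;
} ConSketch;

/* Hypothetical per-function constant pool. */
typedef struct {
    ConSketch con[MAXCON];
    int ncon;
} FnSketch;

/* Return the index of the constant sym+off, reusing an existing
 * entry when one matches and appending a new one otherwise. */
int intern_con(FnSketch *fn, const char *sym, long off)
{
    int i;

    for (i = 0; i < fn->ncon; i++)
        if (fn->con[i].off == off && strcmp(fn->con[i].sym, sym) == 0)
            return i;
    assert(fn->ncon < MAXCON);
    assert(strlen(sym) < sizeof fn->con[0].sym);
    strcpy(fn->con[fn->ncon].sym, sym);
    fn->con[fn->ncon].off = off;
    return fn->ncon++;
}
```

Folding the second add twice would call intern_con(fn, "x", 3) twice and receive the same index both times, so a pass that compares constants by index sees one value rather than two distinct ones.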
2021-10-22  make variadic args explicit  Quentin Carbonneaux
Some abis, like the riscv one, treat arguments differently depending on whether they are variadic or not. To prepare for the upcoming riscv target, we change the variadic call syntax and give meaning to the location of the '...' marker.

    # new syntax
    %ret =w call $f(w %regular, ..., w %variadic)

By nature of their abis, the change is backwards compatible for existing targets.
2021-09-20  parse: fix loadw when assigned to l temporary  Michael Forney
The documentation states that loadw is syntactic sugar for loadsw, but it actually got parsed as Oload. If the result is an l temporary, Oload behaves like Oloadl, not Oloadsw. To fix this, parse Tloadw as Oloadsw explicitly.
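The difference between a sign-extending word load and a long load can be modeled in C. The functions below are hypothetical models of the load semantics named in the message (loadsw, loaduw, loadl), written with memcpy so they do not depend on alignment; they are not qbe code.

```c
#include <stdint.h>
#include <string.h>

/* Model of loadsw: read 32 bits and sign-extend to 64. */
int64_t model_loadsw(const void *p)
{
    int32_t w;
    memcpy(&w, p, sizeof w);
    return (int64_t)w;
}

/* Model of loaduw: read 32 bits and zero-extend to 64. */
int64_t model_loaduw(const void *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return (int64_t)w;
}

/* Model of loadl: read all 64 bits, including the neighbouring word. */
int64_t model_loadl(const void *p)
{
    int64_t l;
    memcpy(&l, p, sizeof l);
    return l;
}
```

Given memory holding the 32-bit words -1 and 0, model_loadsw returns -1, while model_loadl also reads the adjacent word and returns a different value. That mismatch is the kind of behavior the fix avoids when the result is an l temporary.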
2021-08-23  parsefields: fix padding calculation  Drew DeVault
This was causing issues with aggregate types. A simple reproduction is:

    type :type.1 = align 8 { 24 }
    type :type.2 = align 8 { w 1, :type.1 1 }

The size of type.2 should be 32, adding only 4 bytes of padding between the first and second field. Prior to this patch, 20 bytes of padding was added instead, causing the type to have a size of 48.

Signed-off-by: Drew DeVault <[email protected]>
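The expected size of 32 follows from the usual layout rule: pad each field up to its own alignment, then round the total up to the aggregate's alignment. A minimal sketch of that rule follows, with the hypothetical names FieldSketch, alignup(), and aggsize(); it is not the parsefields() code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field description: size and alignment in bytes. */
typedef struct {
    size_t size;
    size_t align; /* must be a power of two */
} FieldSketch;

/* Round n up to the next multiple of the power of two a. */
size_t alignup(size_t n, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n + a - 1) & ~(a - 1);
}

/* Size of an aggregate with the given fields and minimum alignment. */
size_t aggsize(const FieldSketch *f, int n, size_t minalign)
{
    size_t off = 0, al = minalign;
    int i;

    for (i = 0; i < n; i++) {
        off = alignup(off, f[i].align); /* padding before the field */
        off += f[i].size;
        if (f[i].align > al)
            al = f[i].align;
    }
    return alignup(off, al); /* tail padding */
}
```

For type.2, the w field occupies bytes 0 to 3, the :type.1 field (size 24, align 8) starts at offset 8 after 4 bytes of padding, and the total is 8 + 24 = 32, which is already a multiple of 8.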
2021-07-28  fix buffer overflow in parser (afl)  Quentin Carbonneaux
Reported by Alessandro Mantovani. Overly long function names would trigger out-of-bounds accesses.
2021-03-02  add data $name = section "section" ...  Drew DeVault
This allows you to explicitly specify the section to emit the data directive for, allowing for sections other than .data: for example, .bss or .init_array.
2020-08-06  Move NPred in parse.c and decrease it  Michael Forney
This now only limits the number of arguments when parsing the input SSA, which is usually a small fixed size (depending on the frontend).
2020-08-06  Use a dynamic array for phi arguments  Michael Forney
2019-05-15  Allow specifying literal global names  Michael Forney
2019-03-14  Rearrange the fields in Ins so the bit-fields get packed together  Michael Forney
2019-03-08  use a hash table to parse temporaries  Quentin Carbonneaux