path: root/arm64
Age | Commit message | Author
13 days | If-conversion RFC 4 - x86 only (for now), use cmovXX | Roland Paterson-Jones
Replacement of tiny conditional jump graphlets with conditional move instructions. Currently enabled only for x86. Arm64 support using cselXX will be essentially identical. Adds (internal) frontend sel0/sel1 ops with flag-specific backend xselXX following jnz implementation pattern. Testing: standard QBE, cproc, harec, hare, roland
2026-01-06 | arm64_apple: fix argxbh support | Quentin Carbonneaux
2026-01-06 | arm64: prevent bogus IP1 clobbers | Quentin Carbonneaux
2025-04-16 | fix fp constants on big endian hosts | Quentin Carbonneaux
2025-03-15 | arm64: use IP1 as scratch register | Quentin Carbonneaux
On Apple platforms x18 is not guaranteed to be preserved across context switches. So we now use IP1 as scratch register. En passant, one dubious use of IP0 in arm64/emit.c fixarg() was transitioned to IP1. I believe the previous code could clobber a user value if IP0 was live.
2025-03-14 | Re-use (vgrow) b->ins vector in backend xxx_abi() fn's. | Roland Paterson-Jones
Removes last re-allocation of b->ins.
2025-03-14 | idup(Ins **, Ins *, ulong) -> idup(Blk *, Ins *, ulong) | Roland Paterson-Jones
Always used this way and factors setting b->nins. Makes b->ins vector contract more obvious.
2025-03-14 | Blk::ins is a vector | Roland Paterson-Jones
Scratching an itch - avoid unnecessary re-allocation in idup() which is called often in the optimisation chain. Blk::ins is reallocated in xxx_abi() - needs further fiddling.
2024-12-19 | handle large hfas correctly on arm64 | Quentin Carbonneaux
2024-10-01 | fix various codegen bugs on arm64 | Quentin Carbonneaux
- dynamic allocations could generate bad 'and' instructions (for the and with -16 in salloc()).
- symbols used in w context would generate adrp and add instructions on wN registers while they seem to only work on xN registers.

Thanks to Rosie for reporting them.
2024-08-15 | arm64/isel: Avoid signed overflow when handling immediates | Alexey Yerin
Clang incorrectly optimizes this negation with -O2 and causes QBE to emit 0 in place of INT64_MIN.
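The hazard described here is that negating INT64_MIN with a plain `-n` is undefined behavior in C, so an optimizer is free to produce anything. A minimal sketch of one common way to avoid it, negating through an unsigned type; the helper name `neg_wrap` is hypothetical and this is not necessarily the change the commit made:

```c
#include <stdint.h>

/* Negate a 64-bit value with wrap-around semantics.
 * Unsigned arithmetic is defined modulo 2^64, so the subtraction is
 * well defined for every input, including INT64_MIN.
 * Converting the out-of-range result back to int64_t is
 * implementation-defined rather than undefined; mainstream compilers
 * wrap, which is the assumption made here. */
static int64_t
neg_wrap(int64_t n)
{
	return (int64_t)(0 - (uint64_t)n);
}
```

With this helper, `neg_wrap(INT64_MIN)` yields INT64_MIN again instead of invoking undefined behavior, and every other input negates as usual.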
2024-06-16 | revert 4bc4c958 | Quentin Carbonneaux
Hopefully the right time now!
2024-05-28 | replace asm keyword | Erica Z
when applying a custom set of CFLAGS under clang that does not include -std=c99, asm is treated as a keyword and as such can not be used as an identifier. this prevents the issue by renaming the offending variables.
2024-04-22 | revert 1b7770e271 | Quentin Carbonneaux
Quotes are used on Apple target variants to flag that we must not add the _ symbol prefix.
2024-03-26 | Drop quotes around floating point constant labels | Michael Forney
This is incompatible with binutils gas older than 2.26.
2024-01-02 | dbgloc: add column argument | Drew DeVault
dbgloc line [col]
This is implemented in a backwards-compatible manner.
2024-01-02 | revert 5af33410 | Quentin Carbonneaux
Causes errors with stock toolchain on OpenBSD.
2023-12-30 | Fix IBT/BTI by instrumenting function calls | Tobias Heider
2023-08-18 | file,loc become dbgfile,dbgloc | Quentin Carbonneaux
2023-06-06 | implement line number info tracking | Thomas Bracht Laumann Jespersen
Support "file" and "loc" directives. "file" takes a string (a file name), assigns it a number, sets the current file to that number and records the string for later. "loc" takes a single number and outputs location information with a reference to the current file.
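The bookkeeping this message describes can be sketched roughly as follows. Everything here is hypothetical and for illustration only: the names `setfile` and `fmtloc`, the fixed-size table, reusing the number of a file name seen before, and rendering the location as a gas-style `.loc` directive are all assumptions, not QBE's actual code.

```c
#include <stdio.h>
#include <string.h>

enum { MaxFiles = 64 };  /* arbitrary limit for the sketch */

static const char *files[MaxFiles]; /* recorded names; pointers are not copied */
static int nfiles;
static int curfile;                 /* number of the current file, 0 if none */

/* "file": assign the name a number (starting at 1, reusing the number
 * of a name seen before), make it the current file, and record the
 * string. Returns the number, or 0 if the table is full. */
static int
setfile(const char *name)
{
	int i;

	for (i = 0; i < nfiles; i++)
		if (strcmp(files[i], name) == 0)
			return curfile = i + 1;
	if (nfiles == MaxFiles)
		return 0;
	files[nfiles++] = name;
	return curfile = nfiles;
}

/* "loc": format location information for one line number, referring
 * to the current file by its number. */
static int
fmtloc(char *buf, size_t n, int line)
{
	return snprintf(buf, n, "\t.loc %d %d", curfile, line);
}
```

Keeping only a number in each location record means the file name string is stored once, however many locations refer to it.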
2023-05-09 | fix sub-word returns on arm64_apple | Quentin Carbonneaux
2023-03-22 | rename blknew() to newblk() | Quentin Carbonneaux
This is consistent with newtmp() and newcon().
2023-03-19 | naming nit | Quentin Carbonneaux
2023-03-16 | silence format warning more reliably | Quentin Carbonneaux
2023-03-15 | silence some warnings | Quentin Carbonneaux
2023-03-11 | Emit .type and .size directives on RISC-V and ARM | Alexey Yerin
To match x86
2022-12-14 | new blit instruction | Quentin Carbonneaux
2022-12-12 | new rsval() helper for signed Refs | Quentin Carbonneaux
The .val field is signed in RSlot. Add a new dedicated function to fetch it as a signed int.
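Reading a signed quantity out of an unsigned bit-field needs an explicit sign extension. The sketch below shows one way to do that; the struct `SketchRef`, the 29-bit width, and the function name `sketch_rsval` are assumptions made for illustration, since the message only states that the value is signed in RSlot.

```c
enum { ValBits = 29 };  /* assumed field width, for illustration only */

typedef struct {
	unsigned int type: 3;
	unsigned int val: ValBits;  /* stored unsigned, may encode a negative value */
} SketchRef;

/* Fetch val as a signed int by sign-extending from bit ValBits-1.
 * XOR-ing with the sign bit and then subtracting it maps the upper
 * half of the unsigned range onto the negative numbers. */
static int
sketch_rsval(SketchRef r)
{
	int sign = 1 << (ValBits - 1);

	return ((int)r.val ^ sign) - sign;
}
```

A dedicated accessor like this keeps the sign extension in one place, so callers that expect negative slot values cannot forget it.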
2022-11-27 | new hlt block terminator | Quentin Carbonneaux
It is handy to express when the end of a block cannot be reached. If a hlt terminator is executed, it traps the program. We don't go the llvm way and specify execution semantics as undefined behavior.
2022-11-22 | use a new struct for symbols | Quentin Carbonneaux
Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in global memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
2022-10-12 | thread-local storage for amd64_apple | Quentin Carbonneaux
It is quite similar to arm64_apple. Probably, the call that needs to be generated also provides extra invariants on top of the regular abi, but I have not checked that.

Clang generates code that is a bit neater than qbe's because, on x86, a load can be fused in a call instruction! We do not bother with supporting these since we expect only sporadic use of the feature.

For reference, here is what clang might output for a store to the second entry of a thread-local array of ints:

    movq _x@TLVP(%rip), %rdi
    callq *(%rdi)
    movl %ecx, 4(%rax)
2022-10-12 | thread-local storage for arm64_apple | Quentin Carbonneaux
It is documented nowhere how this is supposed to work. It is also quite easy to have assertion failures pop in the linker when generating asm slightly different from clang's!

The best source of information is found in LLVM's source code (AArch64ISelLowering.cpp). I paste it here for future reference:

    /// Darwin only has one TLS scheme which must be capable of dealing with the
    /// fully general situation, in the worst case. This means:
    ///     + "extern __thread" declaration.
    ///     + Defined in a possibly unknown dynamic library.
    ///
    /// The general system is that each __thread variable has a [3 x i64] descriptor
    /// which contains information used by the runtime to calculate the address. The
    /// only part of this the compiler needs to know about is the first xword, which
    /// contains a function pointer that must be called with the address of the
    /// entire descriptor in "x0".
    ///
    /// Since this descriptor may be in a different unit, in general even the
    /// descriptor must be accessed via an indirect load. The "ideal" code sequence
    /// is:
    ///     adrp x0, _var@TLVPPAGE
    ///     ldr x0, [x0, _var@TLVPPAGEOFF]  ; x0 now contains address of descriptor
    ///     ldr x1, [x0]                    ; x1 contains 1st entry of descriptor,
    ///                                     ; the function pointer
    ///     blr x1                          ; Uses descriptor address in x0
    ///                                     ; Address of _var is now in x0.
    ///
    /// If the address of _var's descriptor *is* known to the linker, then it can
    /// change the first "ldr" instruction to an appropriate "add x0, x0, #imm" for
    /// a slight efficiency gain.

The call 'blr x1' above is actually special in that it trashes fewer registers than what the abi would normally permit. In qbe, I don't take advantage of this and lower the call like a regular call. We can revise this later on. Again, the source for this information is LLVM's source code:

    // TLS calls preserve all registers except those that absolutely must be
    // trashed: X0 (it takes an argument), LR (it's a call) and NZCV (let's not be
    // silly).
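As an illustration of the "ideal" sequence quoted above, here is a small sketch of a helper that formats those four instructions for a given symbol. The function name `fmt_tls_addr` and its buffer-based interface are hypothetical; only the instruction sequence itself comes from the quoted LLVM comment, and this is not QBE's emitter.

```c
#include <stdio.h>

/* Writes the four-instruction Darwin TLS access sequence for sym into
 * buf. sym is the assembler-level name, including any leading
 * underscore. After the sequence runs, x0 holds the variable's address.
 * Returns the snprintf result: the length the text needs, which is
 * >= n when buf was too small. */
static int
fmt_tls_addr(char *buf, size_t n, const char *sym)
{
	return snprintf(buf, n,
		"\tadrp\tx0, %s@TLVPPAGE\n"        /* page of the descriptor pointer */
		"\tldr\tx0, [x0, %s@TLVPPAGEOFF]\n" /* x0 = address of descriptor */
		"\tldr\tx1, [x0]\n"                 /* x1 = resolver function pointer */
		"\tblr\tx1\n",                      /* call it with descriptor in x0 */
		sym, sym);
}
```

Calling `fmt_tls_addr(buf, sizeof buf, "_var")` fills `buf` with the sequence for `_var`, which a caller could then write to its assembly output.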
2022-10-08 | mark apple targets with a boolean | Quentin Carbonneaux
It is more natural to branch on a flag than have different function pointers for high-level passes.
2022-10-08 | "rel" fields become "reloc" | Quentin Carbonneaux
2022-10-08 | add support for thread-local storage | Quentin Carbonneaux
The apple targets are not done yet.
2022-10-03 | fix case of Pool constants | Quentin Carbonneaux
2022-10-03 | new arm64_apple target | Quentin Carbonneaux
Should make qbe work on apple arm-based hardware.
2022-10-03 | add new target-specific abi0 pass | Quentin Carbonneaux
The general idea is to give abis a chance to talk before we've done all the optimizations. Currently, all targets eliminate {par,arg,ret}{sb,ub,...} during this pass. The forthcoming arm64_apple will, however, insert proper extensions during abi0. Moving forward abis can, for example, lower small-aggregates passing there so that memory optimizations can interact better with function calls.
2022-09-01 | remove two unsigned | Quentin Carbonneaux
We have a uint alias that we use everywhere else. I also added a todo about unhandled large offsets in arm64/emit.
2022-09-01 | use direct bl calls on arm64 | Quentin Carbonneaux
This generates tidier code and is pic friendly because it lets the linker trampoline calls to dynlinked libs.
2022-08-31 | drop -G flag and add target amd64_apple | Quentin Carbonneaux
apple support is more than assembly syntax in case of arm64 machines, and apple syntax is currently useless in all cases but amd64; rather than having a -G option that only makes sense with amd64, we add a new target amd64_apple
2022-05-10 | arm64: fix maximum immediate size for small loads/stores | Michael Forney
The maximum immediate size for 1, 2, 4, and 8 byte loads/stores is 4095, 8190, 16380, and 32760 respectively[0][1][2].

[0] https://developer.arm.com/documentation/dui0802/a/A64-Data-Transfer-Instructions/LDRB--immediate-
[1] https://developer.arm.com/documentation/dui0802/a/A64-Data-Transfer-Instructions/LDRH--immediate-
[2] https://developer.arm.com/documentation/dui0802/a/A64-Data-Transfer-Instructions/LDR--immediate-
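The limits listed above follow from the unsigned-offset form of these instructions, which encodes a 12-bit immediate scaled by the access size: 4095 × 1, 4095 × 2, 4095 × 4 and 4095 × 8. A minimal sketch of the corresponding range check; the function name `fits_scaled_imm12` is hypothetical and this is not the code from the commit:

```c
#include <stdint.h>

/* Returns 1 when off can be encoded as the unsigned, size-scaled
 * 12-bit immediate of an A64 load/store, 0 otherwise.
 * size is the access width in bytes: 1, 2, 4 or 8.
 * The offset must be non-negative, a multiple of size, and at most
 * 4095 * size. */
static int
fits_scaled_imm12(int64_t off, int size)
{
	if (off < 0 || off % size != 0)
		return 0;
	return off / size <= 4095;
}
```

Offsets that fail the check cannot use this addressing form directly and have to be handled some other way, for example by computing the address into a register first.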
2022-03-17 | fix return for big aggregates | Quentin Carbonneaux
The recent changes in arm and riscv typclass() set ngp to 1 when a struct is returned via a caller-provided buffer. This interacts bogusly with selret() that ends up declaring a gp register live when none is set in the returning sequence. The fix is simply to set cty to zero (all registers dead) in case a caller-provided buffer is used.
2022-03-15 | new -t? flag to print default target | Quentin Carbonneaux
2022-03-15 | support env calls on arm64 | Quentin Carbonneaux
The x9 register is used for the env parameter.
2022-03-14 | dynamic stack allocs for arm64 | Quentin Carbonneaux
I also moved some isel logic that would have been repeated a third time in util.c.
2022-03-14 | improve consistency in abis | Quentin Carbonneaux
2022-03-14 | arm64/abi: fix big aggregates passed on the stack | Quentin Carbonneaux
The riscv test abi8.ssa caught a bug in the arm backend. It turns out we were using the wrong class when loading pointers to aggregates from the stack. The fix is simple and mirrors what is done in the riscv abi.
2022-03-08 | flag types defined as unions | Quentin Carbonneaux
The risc-v abi needs to know if a type is defined as a union or not. We cannot use nunion to obtain this information because the risc-v abi made the unfortunate decision of treating union { int i; } differently from int i; So, instead, I introduce a single bit flag 'isunion'.
2022-03-08 | cosmetics | Quentin Carbonneaux