| Age | Commit message | Author |
|
On x86_64, direct calls are always PC-relative. This means that
in order to call an absolute address, the call must be indirect.
To accomplish this, update fixarg to introduce a temporary before
emitting.
|
|
|
|
Replacement of tiny conditional jump graphlets with
conditional move instructions.
Currently enabled only for x86. Arm64 support using cselXX
will be essentially identical.
Adds (internal) frontend sel0/sel1 ops with flag-specific
backend xselXX following jnz implementation pattern.
Testing: standard QBE, cproc, harec, hare, roland
|
|
|
|
- Many stylistic nits.
- Removed blkmerge().
- Some minor bug fixes.
- GCM reassoc is now "sink"; a pass that
moves trivial ops into their target block
with the same goal of reducing register
pressure, but starting from instructions
that benefit from having their inputs
close.
|
|
Removes last re-allocation of b->ins.
|
|
Always used this way and factors setting b->nins.
Makes b->ins vector contract more obvious.
|
|
Scratching an itch - avoid unnecessary re-allocation in idup()
which is called often in the optimisation chain.
Blk::ins is reallocated in xxx_abi() - needs further fiddling.
|
|
When rbp is not necessary to compile
a leaf function, we skip saving and
restoring it.
|
|
This was cute to do, but it is
largely inconsequential, as shown
by the rough timings below:
benchmarking mul8_lea
3.9 ticks ± 0.88 (min: 3)
benchmarking mul8_imul
3.3 ticks ± 0.27 (min: 3)
benchmarking div8_udiv
6.5 ticks ± 0.52 (min: 6)
benchmarking div8_shr
3.3 ticks ± 0.34 (min: 3)
|
|
|
|
|
|
Hopefully the right time now!
|
|
When applying a custom set of CFLAGS under clang that does not include
-std=c99, asm is treated as a keyword and so cannot be used as an
identifier. This avoids the issue by renaming the offending variables.
|
|
Quotes are used on Apple target
variants to flag that we must
not add the _ symbol prefix.
|
|
|
|
|
|
|
|
This is incompatible with binutils gas older than 2.26.
|
|
dbgloc line [col]
This is implemented in a backwards-compatible manner.
|
|
Causes errors with stock toolchain
on OpenBSD.
|
|
|
|
|
|
signed int can't represent all the values of unsigned int, so we
need to do the conversion to signed long, and use the lower 32 bits
as the result.
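
A minimal C sketch of the idea, assuming the conversion in question
is from floating point to an unsigned 32-bit integer (the message does
not say so explicitly). The helper name dtouw is hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: convert a double to an unsigned 32-bit value
 * by going through a signed 64-bit integer first. Assumes the input,
 * once truncated, lies in the uint32_t range. */
static uint32_t
dtouw(double d)
{
	int64_t wide;

	wide = (int64_t)d;      /* every uint32_t value fits in int64_t */
	return (uint32_t)wide;  /* keep only the lower 32 bits */
}
```

A direct conversion to a signed 32-bit int would be out of range for
inputs such as 3000000000.0, which is why the wider type is used.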
|
|
|
|
Support "file" and "loc" directives. "file" takes a string (a file name)
assigns it a number, sets the current file to that number and records
the string for later. "loc" takes a single number and outputs location
information with a reference to the current file.
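
The bookkeeping described above could be sketched in C as follows.
This is not the actual implementation: setfile, locstr, the fixed-size
table, the reuse of a number for a name already seen, and the
".loc file line" output shape are all assumptions made for
illustration.

```c
#include <stdio.h>
#include <string.h>

enum { MaxFiles = 64 };  /* arbitrary limit for the sketch */

static const char *files[MaxFiles];  /* recorded names; number = index + 1 */
static int nfiles;
static int curfile;  /* 0 means no file has been set yet */

/* "file": give name a number (reusing it if the name was already
 * seen), make it the current file, and record the string for later.
 * The pointer is stored as is, so the caller must keep it alive.
 * Returns the file number, or -1 if the table is full. */
static int
setfile(const char *name)
{
	int i;

	for (i = 0; i < nfiles; i++)
		if (strcmp(files[i], name) == 0)
			return curfile = i + 1;
	if (nfiles == MaxFiles)
		return -1;
	files[nfiles++] = name;
	return curfile = nfiles;
}

/* "loc": format location information for a line number, referring
 * to the current file. Returns a static buffer. */
static const char *
locstr(int line)
{
	static char buf[64];

	snprintf(buf, sizeof buf, "\t.loc %d %d", curfile, line);
	return buf;
}
```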
|
|
We now treat thread-local
symbols in Mems properly.
|
|
Non-store/load instructions were
not lowered correctly for thread-
local symbols. This is an attempt
at a fix (cannot test for now).
|
|
|
|
Thanks to Lassi Pulkkinen for
flagging the issue and pointing
me to Ulrich Drepper's extensive
doc [1].
[1] https://people.redhat.com/drepper/tls.pdf
|
|
This is consistent with newtmp()
and newcon().
|
|
|
|
|
|
|
|
Crashing loads of uninitialized memory
proved to be a problem when implementing
unions using qbe. This patch introduces
a new UNDEF Ref to represent data that is
known to be uninitialized. Optimization
passes can make use of it to eliminate
some code. In the last compilation stages,
UNDEF is treated as the constant 0xdeaddead.
|
|
|
|
The .val field is signed in RSlot.
Add a new dedicated function to
fetch it as a signed int.
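
For illustration, here is one way to read a narrow unsigned field
as a signed int in C. The 29-bit field width and the name sext29 are
assumptions of this sketch, not taken from the patch:

```c
/* Hypothetical helper: sign-extend the low 29 bits of val.
 * Bit 28 is the sign bit of the field; flipping it and then
 * subtracting its weight maps 0x10000000..0x1fffffff to the
 * negative range and leaves smaller values unchanged. */
static int
sext29(unsigned val)
{
	return ((int)(val & 0x1fffffff) ^ 0x10000000) - 0x10000000;
}
```

A plain cast of the unsigned field would yield a large positive
number instead of a negative slot value.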
|
|
It is handy to express when
the end of a block cannot be
reached. If a hlt terminator
is executed, it traps the
program.
We don't go the llvm way and
specify execution semantics as
undefined behavior.
|
|
Symbols are a useful abstraction
that occurs in both Con and Alias.
In this patch they get their own
struct. This new struct packages
a symbol name and a type; the type
tells us where the symbol name
must be interpreted (currently, in
global memory or in thread-local
storage).
The refactor fixed a bug in
addcon(), proving the value of
packaging symbol names with their
type.
|
|
It is quite similar to arm64_apple.
Probably, the call that needs to be
generated also provides extra
invariants on top of the regular
abi, but I have not checked that.
Clang generates code that is a bit
neater than qbe's because, on x86,
a load can be fused in a call
instruction! We do not bother with
supporting these since we expect
only sporadic use of the feature.
For reference, here is what clang
might output for a store to the
second entry of a thread-local
array of ints:
movq _x@TLVP(%rip), %rdi
callq *(%rdi)
movl %ecx, 4(%rax)
|
|
It is more natural to branch on a
flag than have different function
pointers for high-level passes.
|
|
|
|
The apple targets are not done yet.
|
|
|
|
The general idea is to give abis a
chance to talk before we've done all
the optimizations. Currently, all
targets eliminate {par,arg,ret}{sb,ub,...}
during this pass. The forthcoming
arm64_apple will, however, insert
proper extensions during abi0.
Moving forward abis can, for example,
lower small-aggregates passing there
so that memory optimizations can
interact better with function calls.
|
|
Apple support is more than assembly syntax
in the case of arm64 machines, and Apple syntax
is currently useless in all cases but amd64;
rather than having a -G option that only
makes sense with amd64, we add a new target
amd64_apple.
|
|
This may cause invalid assembly to be generated
and is not all that useful anyway after constant
folding has run.
|
|
|
|
|
|
I also moved some isel logic
that would have been repeated
a third time in util.c.
|