path: root/ir/be
Commit message (Author, Date)
* x86: add modern architecture variants and improve cpu detection [amd64-fma] (Johannes Bucher, 2021-03-22)
    Added Intel and AMD x86 architecture variants up to Alder Lake and Zen3. The variants can be selected via the -march and -mtune backend options. Improved CPU architecture and feature detection for -march=native. All features defined in x86_architecture.h are now detected using cpuid. Detection of SIMD instruction extensions now extends up to AVX2.
* ia32/amd64: split up architecture variant and cpu features into different bitsets (Johannes Bucher, 2021-03-22)
    This allows adding support for more architecture variants and cpu features, as the current bitset was nearly full.
* add basic cpu architecture autodetection for amd64 (Johannes Bucher, 2021-03-22)
    Existing code from the ia32 backend for cpuid autodetection is now used for both x86 backends. Similar to ia32, the -march and -mtune options are now available for amd64 (limited to 'generic' and 'native' at the moment). FMA3 support is now only available if the target machine supports it.
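To illustrate the cpuid-based feature detection that the entries above describe, the following C sketch decodes a few well-known feature bits from raw CPUID register values. It is not taken from libfirm: the struct and function names are hypothetical, and only the bit positions (as documented in the Intel and AMD manuals) are fixed. A real detector would also have to confirm OS support for the AVX register state (OSXSAVE and XGETBV) before treating AVX, FMA or AVX2 as usable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature record; not libfirm's x86_architecture.h. */
typedef struct {
	bool sse2, sse3, ssse3, sse4_1, sse4_2, avx, fma, avx2;
} cpu_features_t;

/* Test a single bit of a CPUID output register. */
static bool has_bit(uint32_t reg, unsigned n)
{
	return ((reg >> n) & 1u) != 0;
}

/* Decode feature flags from CPUID leaf 1 (ECX, EDX) and
 * leaf 7, subleaf 0 (EBX). The caller supplies the raw values. */
cpu_features_t decode_features(uint32_t leaf1_ecx, uint32_t leaf1_edx,
                               uint32_t leaf7_ebx)
{
	cpu_features_t f;
	f.sse2   = has_bit(leaf1_edx, 26);
	f.sse3   = has_bit(leaf1_ecx, 0);
	f.ssse3  = has_bit(leaf1_ecx, 9);
	f.sse4_1 = has_bit(leaf1_ecx, 19);
	f.sse4_2 = has_bit(leaf1_ecx, 20);
	f.fma    = has_bit(leaf1_ecx, 12);
	f.avx    = has_bit(leaf1_ecx, 28);
	f.avx2   = has_bit(leaf7_ebx, 5);
	return f;
}
```

On GCC and Clang the raw register values could be obtained with `__get_cpuid` and `__get_cpuid_count` from `<cpuid.h>`; the decoder is kept separate here so that it compiles on any host.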
* amd64: support scalar fused-multiply-add instructions (FMA3) (Johannes Bucher, 2021-03-22)
    Adds support for fused multiply-add of scalar double- and single-precision floating point values from the FMA3 instruction set. Comprises the instructions VFMADD132SD, VFMADD213SD, VFMADD231SD, VFMADD132SS, VFMADD213SS, VFMADD231SS. This feature can be enabled with the -mfma option.
* be2addr: fix copy-after case for modes with mode_T (Johannes Bucher, 2021-03-22)
* Recognize AArch64 as host cpu type. (Manuel Mohr, 2021-03-04)
* amd64: add pxor_0 instruction before cvtsi2sd to break dependency chain (Johannes Bucher, 2020-02-21)
* amd64: peephole: remove consecutive zero extensions (Johannes Bucher, 2020-02-21)
* default to -fPIC on OpenBSD (Johannes Bucher, 2019-12-06)
* riscv: support soft-float and the -march and -mabi switches (Johannes Bucher, 2019-11-29)
* riscv: correctly lower aggregate parameters (Johannes Bucher, 2019-11-08)
    Function parameter aggregates are lowered according to the RISC-V ILP32 ABI. When lowering builtin va_arg, take into account that small structs are passed by value.
* riscv: lowering of builtin va_arg takes alignment rules into account (Johannes Bucher, 2019-11-08)
* Fix handling of array-typed struct members in AMD64 ABI. (Andreas Fried, 2019-11-08)
    Arrays need to be considered for a slice even if their starting offset is outside the range in question (e.g. struct { long x[2]; };).
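The array case in the entry above comes down to the difference between "the member starts inside the slice" and "the member overlaps the slice". The following C sketch contrasts the two tests. It is an illustration only, not libfirm's ABI code. The function names are hypothetical, and the example assumes an 8-byte long and 8-byte slices.

```c
#include <stdbool.h>
#include <stddef.h>

/* Too narrow: only looks at where the member begins.
 * Slices are half-open byte ranges [begin, end). */
bool member_starts_in_slice(size_t offset, size_t begin, size_t end)
{
	return begin <= offset && offset < end;
}

/* Interval overlap: the member occupies [offset, offset + size)
 * and counts for the slice if the two ranges intersect. */
bool member_overlaps_slice(size_t offset, size_t size,
                           size_t begin, size_t end)
{
	return offset < end && begin < offset + size;
}
```

For struct { long x[2]; } the array starts at offset 0 and is 16 bytes long. For the second slice, bytes 8 to 15, the start-only test rejects the member while the overlap test accepts it.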
* riscv: simplify frame pointer relative addressing (Johannes Bucher, 2019-10-25)
    Make use of the 'begin' parameter of be_layout_frame_types instead of fixing the offsets manually using a backend node flag.
* riscv: add support for variadic functions (Johannes Bucher, 2019-10-24)
    Lowering of builtin va_arg still uses the be_default_lower_va_arg function, which is not correct due to the alignment requirements of variadic arguments; a RISC-V specific implementation is needed.
* Set immediate kind for ia32_FldCW (Sebastian Buchwald, 2019-08-09)
    This fixes x86code/float2int.c.
* riscv: add missing dump after lower_calls (Johannes Bucher, 2019-06-25)
* riscv: lower aggregate types at calls by replacing them with a pointer to the actual data (Johannes Bucher, 2019-06-19)
* riscv: add emit function for be_MemPerm nodes (Johannes Bucher, 2019-06-19)
    Uses a simple approach similar to the arm backend: save registers on the stack, load the MemPerm ins into registers, write them back, and restore the registers.
* riscv: add a peephole optimization for consecutive shift operations (Johannes Bucher, 2019-06-19)
* riscv: support right shift for modes smaller than 32 bit (Johannes Bucher, 2019-06-11)
* riscv: rename register s0 -> fp (Johannes Bucher, 2019-06-11)
    fp is an alternative ABI name for register s0.
* riscv: fix function prologue + epilogue (Johannes Bucher, 2019-06-11)
* riscv: support frame pointer relative addressing (Johannes Bucher, 2019-06-11)
* riscv: support Alloc nodes (Johannes Bucher, 2019-06-11)
    Introduced riscv backend nodes SubSP and SubSPimm for stack allocations.
* riscv: support extension from mode Hu (16 bit) to machine size (Johannes Bucher, 2019-05-17)
* riscv: do not emit IncSP nodes with offset 0 (Johannes Bucher, 2019-05-17)
* riscv: fix calculation of hi lo immediate (remove undefined behavior) (Johannes Bucher, 2019-05-17)
    Fixes a runtime error in sanitize builds.
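The entry above does not say which operation was undefined. As a rough illustration of the underlying task, the C sketch below shows the usual way to split a 32-bit constant into the upper 20 bits for lui and a sign-extended 12-bit immediate for addi, using only unsigned arithmetic so that no signed overflow or signed shift can occur. The type and function names are hypothetical and this is not the libfirm implementation.

```c
#include <stdint.h>

/* Hypothetical result type: hi20 goes into lui, lo12 into addi. */
typedef struct {
	uint32_t hi20; /* upper part, 20 bits */
	int32_t  lo12; /* lower part, range -2048..2047 */
} hi_lo_t;

hi_lo_t split_hi_lo(uint32_t value)
{
	hi_lo_t  r;
	uint32_t low = value & 0xFFFu;

	/* Sign-extend the low 12 bits: subtract 4096 if bit 11 is set. */
	r.lo12 = (int32_t)low - (int32_t)((low & 0x800u) << 1);

	/* addi adds a sign-extended value, so round the upper part up
	 * whenever lo12 is negative. Unsigned wraparound is well defined. */
	r.hi20 = ((value + 0x800u) >> 12) & 0xFFFFFu;
	return r;
}
```

Recombining as (hi20 << 12) + lo12, computed modulo 2^32, gives back the original value.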
* ir: Allow ASM nodes as control flow nodes. (Christoph Mallon, 2019-04-05)
    This is done by treating them as fragile.
* beasm: Allow be_Asm nodes as control flow nodes. (Christoph Mallon, 2019-04-05)
    This is done by treating them as fragile.
* beasm: Give label constraints the "register" class 'exec'. (Christoph Mallon, 2019-04-05)
* beasm: Do not confuse the additional register pressure handling with exec outputs. (Christoph Mallon, 2019-04-05)
* beasm: Tell the backends how to handle the fallthrough exec output of be_Asm. (Christoph Mallon, 2019-04-05)
* be, ir: Give be_Asm and ASM a fallthrough exec output. (Christoph Mallon, 2019-04-05)
* beasm: Handle operand modifier 'l' in all backends. (Christoph Mallon, 2019-04-05)
* beasm: Add BE_ASM_OPERAND_LABEL and tell the backends how to emit it. (Christoph Mallon, 2019-04-05)
* api: Pass ir_cons_flags to new_*_ASM(), so the pin state is set atomically. (Christoph Mallon, 2019-04-05)
* api: Pass the asm text before the constraints and clobbers to new_*_ASM(). (Christoph Mallon, 2019-04-05)
    This better fits the syntax of an inline asm statement and also matches the order in irio.
* be: Dump the text template of be_Asm. (Christoph Mallon, 2019-04-01)
* be: A block needs no label, if it is only reachable by fallthrough from a regular X Proj. (Christoph Mallon, 2019-03-31)
* amd64: Also determine the frame offset for memory operands of be_Asm. (Christoph Mallon, 2019-03-31)
    This fixes backend/asm_memory_access.c on amd64.
* amd64: Factor out code to determine the frame offset for an x86_addr_t. (Christoph Mallon, 2019-03-31)
* ia32: Also determine the frame offset for memory operands of be_Asm. (Christoph Mallon, 2019-03-31)
    This fixes backend/asm_memory_access.c on ia32.
* ia32: Factor out code to determine the frame offset for an x86_addr_t. (Christoph Mallon, 2019-03-31)
* ia32: Set {base,index,mem}_input directly in init_ia32_attributes(). (Christoph Mallon, 2019-03-31)
    It was only done temporarily in ia32_emit_am(). Now prepare for other users.
* ia32: Remove the enum constant 'IA32_ATTR_ia32_asm_attr_t'. (Christoph Mallon, 2019-03-31)
    This is unused since ia32_Asm was replaced by be_Asm.
* amd64, ia32: Support all address modes in inline asm. (Christoph Mallon, 2019-03-25)
* Zero out the result struct in x86_create_address_mode() instead of in each caller. (Christoph Mallon, 2019-03-25)
* Use MAX. (Christoph Mallon, 2019-03-24)
* be: Refine modelling of additional register pressure. (Christoph Mallon, 2019-03-24)
    Now additional pressure is applied to the register pressure either before (positive value) or after (negative value) the instruction.
    So far the value was applied to both the register pressure before and after the instruction. This leads to overapproximation, e.g. for cltd (in: eax, out: edx). When the input lives through, the register pressure after the instruction is 2, but +1 additional pressure unnecessarily increases it to 3. Now the additional pressure is applied to either the register pressure before or after the instruction. For cltd, applying it only before the instruction is optimal, because the output can never be paired with the input. Typical symptom was overspilling around cltd+idiv.
    This can still overapproximate the actual register demand when in/out pairing depends on whether an input lives through. E.g. in: eax+reg, out: edx. Then 3 registers are needed when the reg input lives through (additional pressure before: 1), but only 2 registers are needed when the reg input dies (no additional pressure).
    This fixes lit/overspill_cltd.c.
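The difference between the old and the new pressure model in the entry above can be sketched in a few lines of C. This is a simplified illustration under stated assumptions, not the libfirm code: the type and function names are hypothetical, and pressure is reduced to two counters per instruction.

```c
/* Register pressure immediately before and after one instruction. */
typedef struct {
	unsigned before;
	unsigned after;
} pressure_t;

/* Old model: the additional demand is charged on both sides. */
pressure_t apply_additional_old(pressure_t p, unsigned add)
{
	p.before += add;
	p.after  += add;
	return p;
}

/* New model: a positive value is charged before the instruction,
 * a negative value after it. Values are assumed to be small. */
pressure_t apply_additional_new(pressure_t p, int add)
{
	if (add > 0)
		p.before += (unsigned)add;
	else
		p.after += (unsigned)(-add);
	return p;
}
```

For cltd with a live-through input, take a base pressure of 1 before (eax) and 2 after (eax and edx). With an additional pressure of 1, the old model yields 2 before and 3 after, while the new model yields 2 before and 2 after, which matches the numbers in the commit message.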