path: root/ir/be/ia32
Commit message | Author | Age
* x86: add modern architecture variants and improve cpu detection [amd64-fma] | Johannes Bucher | 2021-03-22
  Added Intel and AMD x86 architecture variants up to Alder Lake and Zen3. The variants can be selected via the -march and -mtune backend options.
  Improved CPU architecture and feature detection for -march=native. All features defined in x86_architecture.h are now detected using cpuid. SIMD instruction extensions detection extended up to AVX2.
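To illustrate what cpuid-based feature detection involves, here is a minimal sketch that decodes a few feature bits from raw cpuid register values. The bit positions are the documented ones (leaf 1 and leaf 7, subleaf 0); the type and function names (`feat_t`, `cpuid_regs_t`, `decode_features`) are hypothetical and are not taken from x86_architecture.h. A real detector would also have to confirm operating system support for the AVX register state before reporting AVX, AVX2 or FMA as usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature flags for illustration only. */
typedef enum {
	FEAT_SSE2 = 1u << 0,
	FEAT_SSE3 = 1u << 1,
	FEAT_FMA  = 1u << 2,
	FEAT_AVX  = 1u << 3,
	FEAT_AVX2 = 1u << 4,
} feat_t;

/* Raw register values as returned by the cpuid instruction. */
typedef struct {
	uint32_t leaf1_ecx;
	uint32_t leaf1_edx;
	uint32_t leaf7_ebx; /* leaf 7, subleaf 0 */
} cpuid_regs_t;

/* Test a single bit of a 32 bit register value. */
static bool bit(uint32_t reg, unsigned pos)
{
	assert(pos < 32);
	return (reg >> pos) & 1u;
}

/* Translate raw cpuid bits into the hypothetical feature set. */
static unsigned decode_features(cpuid_regs_t const *regs)
{
	unsigned feats = 0;
	if (bit(regs->leaf1_edx, 26)) feats |= FEAT_SSE2; /* CPUID.1:EDX[26] */
	if (bit(regs->leaf1_ecx,  0)) feats |= FEAT_SSE3; /* CPUID.1:ECX[0]  */
	if (bit(regs->leaf1_ecx, 12)) feats |= FEAT_FMA;  /* CPUID.1:ECX[12] */
	if (bit(regs->leaf1_ecx, 28)) feats |= FEAT_AVX;  /* CPUID.1:ECX[28] */
	if (bit(regs->leaf7_ebx,  5)) feats |= FEAT_AVX2; /* CPUID.7.0:EBX[5] */
	return feats;
}
```

Keeping the decoder separate from the instruction that reads the registers makes it possible to exercise the logic with fixed inputs on any host.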
* ia32/amd64: split up architecture variant and cpu features into different bitsets | Johannes Bucher | 2021-03-22
  This allows adding support for more architecture variants and CPU features, as the current bitset was nearly full.
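The idea of the split can be sketched as follows. This is an assumption-laden illustration, not the libfirm data structure: all names (`cpu_info_t`, `ARCH_*`, `F_*`, `cpu_has`) are hypothetical. With one shared word, every new variant and every new feature competes for the same bits; with two words, each set can grow independently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical architecture variant bits (own bitset). */
enum {
	ARCH_GENERIC = 1u << 0,
	ARCH_HASWELL = 1u << 1,
	ARCH_ZEN3    = 1u << 2,
};

/* Hypothetical CPU feature bits (separate bitset). */
enum {
	F_CMOV = 1u << 0,
	F_SSE2 = 1u << 1,
	F_AVX2 = 1u << 2,
	F_FMA  = 1u << 3,
};

/* Two independent bitsets instead of one shared word. */
typedef struct {
	uint32_t arch;     /* architecture variant bits */
	uint32_t features; /* CPU feature bits */
} cpu_info_t;

/* True if all requested feature bits are set. */
static bool cpu_has(cpu_info_t const *cpu, uint32_t feats)
{
	assert(cpu);
	return (cpu->features & feats) == feats;
}
```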
* add basic cpu architecture autodetection for amd64 | Johannes Bucher | 2021-03-22
  Existing code from the ia32 backend for cpuid autodetection is now used for both x86 backends.
  Similar to ia32, the -march and -mtune options are now available for amd64 (limited to 'generic' and 'native' at the moment).
  FMA3 support is now only available if the target machine supports it.
* Set immediate kind for ia32_FldCW | Sebastian Buchwald | 2019-08-09
  This fixes x86code/float2int.c.
* ir: Allow ASM nodes as control flow nodes. | Christoph Mallon | 2019-04-05
  This is done by treating them as fragile.
* beasm: Tell the backends how to handle the fallthrough exec output of be_Asm. | Christoph Mallon | 2019-04-05
* beasm: Handle operand modifier 'l' in all backends. | Christoph Mallon | 2019-04-05
* beasm: Add BE_ASM_OPERAND_LABEL and tell the backends how to emit it. | Christoph Mallon | 2019-04-05
* api: Pass ir_cons_flags to new_*_ASM(), so the pin state is set atomically. | Christoph Mallon | 2019-04-05
* api: Pass the asm text before the constraints and clobbers to new_*_ASM(). | Christoph Mallon | 2019-04-05
  This better fits the syntax of an inline asm statement and also matches the order in irio.
* ia32: Also determine the frame offset for memory operands of be_Asm. | Christoph Mallon | 2019-03-31
  This fixes backend/asm_memory_access.c on ia32.
* ia32: Factor out code to determine the frame offset for an x86_addr_t. | Christoph Mallon | 2019-03-31
* ia32: Set {base,index,mem}_input directly in init_ia32_attributes(). | Christoph Mallon | 2019-03-31
  It was only done temporarily in ia32_emit_am(). Now prepare for other users.
* ia32: Remove the enum constant 'IA32_ATTR_ia32_asm_attr_t'. | Christoph Mallon | 2019-03-31
  This is unused since ia32_Asm was replaced by be_Asm.
* amd64, ia32: Support all address modes in inline asm. | Christoph Mallon | 2019-03-25
* Zero out the result struct in x86_create_address_mode() instead of in each caller. | Christoph Mallon | 2019-03-25
* be: Refine modelling of additional register pressure. | Christoph Mallon | 2019-03-24
  Now additional pressure is applied to the register pressure either before (positive value) or after (negative value) the instruction.

  So far the value was applied to both the register pressure before and after the instruction. This leads to overapproximation, e.g. for cltd (in: eax, out: edx). When the input lives through then the register pressure after the instruction is 2, but +1 additional pressure unnecessarily increases it to 3.

  Now the additional pressure is applied to either the register pressure before or after the instruction. For cltd applying it only before the instruction is optimal, because the output can never be paired with the input. Typical symptom was overspilling around cltd+idiv.

  This still can overapproximate the actual register demand when in/out pairing depends on whether an input lives through. E.g. in: eax+reg, out: edx. Then 3 registers are needed when the reg input lives through (additional pressure before 1). But only 2 registers are needed when the reg input dies (no additional pressure).

  This fixes lit/overspill_cltd.c.
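The cltd numbers from this message can be reproduced with a small sketch. It assumes a simplified model in which the pressure before and after an instruction is just a count of live values; `add_pressure_t`, `pressure_t` and both functions are hypothetical stand-ins, not the libfirm implementation (the real typedef is be_add_pressure_t).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for be_add_pressure_t: a signed amount. */
typedef int add_pressure_t;

typedef struct {
	unsigned before; /* register pressure before the instruction */
	unsigned after;  /* register pressure after the instruction */
} pressure_t;

/* Old model: the additional pressure is added on both sides. */
static pressure_t apply_add_pressure_old(unsigned live_before, unsigned live_after, unsigned add)
{
	pressure_t p = { live_before + add, live_after + add };
	return p;
}

/* New model: positive values count before, negative values after. */
static pressure_t apply_add_pressure(unsigned live_before, unsigned live_after, add_pressure_t add)
{
	assert(add != INT_MIN); /* negating INT_MIN would overflow */
	pressure_t p = { live_before, live_after };
	if (add > 0) {
		p.before += (unsigned)add;
	} else if (add < 0) {
		p.after += (unsigned)-add;
	}
	return p;
}
```

For cltd with a live-through eax input, one value is live before and two are live after. The old model yields 2 before and 3 after; the new model with an additional pressure of +1 yields 2 before and 2 after.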
* Add assertion for value that would lead to undefined behavior | Sebastian Buchwald | 2019-03-22
* be: Add the typedef 'be_add_pressure_t' for additional register pressure. | Christoph Mallon | 2019-03-19
* beasm: Factor out common code to add an immediate operand. | Christoph Mallon | 2019-03-06
* beasm: Add helper function to check for occurrence of modifiers. | Christoph Mallon | 2019-03-04
* be: Factor out code to emit an unconditional jump in each backend. | Christoph Mallon | 2019-03-04
* improved readability/code quality according to clang-tidy readability checks | Johannes Bucher | 2019-01-24
  resolved warnings for these checks:
  - readability-non-const-parameter
  - readability-avoid-const-params-in-decls
  - readability-named-parameter
* ia32: / ifconv: do not generate cmov constructs | Johannes Bucher | 2018-10-10
  In cases with larger if cascades, the if-conversion tried to optimize by building quite large blocks containing multiple mux nodes, which will be lowered to cmov instructions. This impacted the performance of the compiled program heavily.

  cmov generation can be enabled using the -mcmov option.

  fehler218 now fails (infinite loop in jump threading), but this bug was only hidden and is not caused by disabling cmov generation.
* ia32: Set mode_T for a Conv_I2I loading from memory right after creating it, instead of later in gen_Proj_Load(). | Christoph Mallon | 2018-09-29
  All other cases have mode_T already.
* ia32: Ensure correct translation of Proj M -> Load as part of a RMW operation. | Christoph Mallon | 2018-09-29
  This fixes backend/am_test6.c and backend/ia32_mode_t3.c.
* ia32: Use correct format specifier for int. | Christoph Mallon | 2018-09-26
* ia32: Simplify gen_Proj_Load(). | Christoph Mallon | 2018-09-24
  All transformed nodes need the same kind of Projs.
* ia32: Remove impossible case from gen_Proj_Load(). | Christoph Mallon | 2018-09-24
  Either a proxy Proj was generated above or the Load just was properly transformed.
* Remove dead assignment. | Christoph Mallon | 2018-09-24
* ia32: Remove redundant check. | Christoph Mallon | 2018-09-24
  A Proj of a Load is always attached to a Load.
* ia32: Correctly use X86_ADDR_REG instead of X86_ADDR_INVALID. | Christoph Mallon | 2018-09-22
* ia32: Simplify handling of nodes in create_proj_for_store(), which have mode M already. | Christoph Mallon | 2018-09-22
* x86: Set the address mode variant in eat_shl() right away instead of doing it later in the caller. | Christoph Mallon | 2018-09-19
* ia32: Use set_indexed_ent(). | Christoph Mallon | 2018-09-19
* ia32: Remove redundant 'set_ia32_op_type(..., ia32_Normal)'. | Christoph Mallon | 2018-09-19
  This is the default value when creating the node.
* Reduce code duplication a bit. | Christoph Mallon | 2018-09-19
* Use set_am_const_entity(). | Christoph Mallon | 2018-09-03
* ia32: Remove redundant set_irn_pinned(n, false). | Christoph Mallon | 2018-09-02
  All the nodes are not pinned initially.
* ia32: Remove pointless state 'exc_pinned' from node specifications. | Christoph Mallon | 2018-09-02
  These nodes cannot raise an exception.
* ia32: Do not unnecessarily attach a Proj to an fadd. | Christoph Mallon | 2018-09-01
  Even though the node loads from memory, it is from a constant pool and there will be no memory user. So making preparations for one is pointless.
* ia32: Factor out and simplify code to select elements from a float array. | Christoph Mallon | 2018-09-01
* ia32: Remove the unused function clear_ia32_commutative(). | Christoph Mallon | 2018-09-01
  It was not used in 10 years.
* ia32: Remove non-sensical assignment to mem_proj. | Christoph Mallon | 2018-09-01
  It was set to NoMem. The only user (fix_mem_proj()) expects it to be a Proj and is not even used here.
* ia32: Reduce code duplication by using gen_unop_AM() in gen_popcount(). | Christoph Mallon | 2018-09-01
* Directly use ${ARCH}_single_reg_req_${CLS}_${REG} instead of ${ARCH}_registers[REG_${REG}].single_req. | Christoph Mallon | 2018-08-23
* be: Factor out code to get an input pos for a given register requirement. | Christoph Mallon | 2018-08-23
* Fix typos message and comment. | Christoph Mallon | 2018-08-23
* x86: Do not fold multiple Consts into an immediate. | Christoph Mallon | 2018-06-23
  The code only checked that each constant was small enough to encode into an immediate. But adding multiple constants could overflow the 32 bit immediate. On ia32 this is no issue, because registers are 32 bit wide, but on AMD64 this could lead to wrong results.

  Usually this only happens when local optimizations are deactivated, otherwise multiple Consts would be combined into one anyway.

  This fixes opt/add_consts.c on AMD64.
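The gap between the per-constant check and the combined value can be shown with a short sketch. It only illustrates the problem described above; the helper names are hypothetical, and the actual fix in this commit is to stop folding multiple Consts, not to add a sum check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the value can be encoded as a signed 32 bit immediate. */
static bool fits_imm32(int64_t v)
{
	return INT32_MIN <= v && v <= INT32_MAX;
}

/* The insufficient check: every constant on its own fits. */
static bool each_fits_imm32(int64_t const *vals, size_t n)
{
	assert(vals || n == 0);
	for (size_t i = 0; i < n; ++i) {
		if (!fits_imm32(vals[i]))
			return false;
	}
	return true;
}

/* The property that folding actually needs: the sum fits.
 * Assumes each value already fits, so the 64 bit sum cannot
 * overflow for any realistic number of constants. */
static bool sum_fits_imm32(int64_t const *vals, size_t n)
{
	assert(vals || n == 0);
	int64_t sum = 0;
	for (size_t i = 0; i < n; ++i)
		sum += vals[i];
	return fits_imm32(sum);
}
```

For the pair INT32_MAX and 1, `each_fits_imm32` succeeds while `sum_fits_imm32` fails, which is the case that produced wrong results on AMD64.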
* ia32: Rename 'Return' to 'Ret' to match the instruction name. | Christoph Mallon | 2018-06-01