12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485 |
- //===---------------------------------------------------------------------===//
- // Random ideas for the X86 backend: FP stack related stuff
- //===---------------------------------------------------------------------===//
- //===---------------------------------------------------------------------===//
- Some targets (e.g. athlons) prefer freep to fstp ST(0):
- http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html
- //===---------------------------------------------------------------------===//
- This should use fiadd on chips where it is profitable:
- double foo(double P, int *I) { return P+*I; }
- We have fiadd patterns now but the followings have the same cost and
- complexity. We need a way to specify the later is more profitable.
- def FpADD32m : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
- [(set RFP:$dst, (fadd RFP:$src1,
- (extloadf64f32 addr:$src2)))]>;
- // ST(0) = ST(0) + [mem32]
- def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
- [(set RFP:$dst, (fadd RFP:$src1,
- (X86fild addr:$src2, i32)))]>;
- // ST(0) = ST(0) + [mem32int]
- //===---------------------------------------------------------------------===//
- The FP stackifier should handle simple permutates to reduce number of shuffle
- instructions, e.g. turning:
- fld P -> fld Q
- fld Q fld P
- fxch
- or:
- fxch -> fucomi
- fucomi jl X
- jg X
- Ideas:
- http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html
- //===---------------------------------------------------------------------===//
- Add a target specific hook to DAG combiner to handle SINT_TO_FP and
- FP_TO_SINT when the source operand is already in memory.
- //===---------------------------------------------------------------------===//
- Open code rint,floor,ceil,trunc:
- http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html
- http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html
- Opencode the sincos[f] libcall.
- //===---------------------------------------------------------------------===//
- None of the FPStack instructions are handled in
- X86RegisterInfo::foldMemoryOperand, which prevents the spiller from
- folding spill code into the instructions.
- //===---------------------------------------------------------------------===//
- Currently the x86 codegen isn't very good at mixing SSE and FPStack
- code:
- unsigned int foo(double x) { return x; }
- foo:
- subl $20, %esp
- movsd 24(%esp), %xmm0
- movsd %xmm0, 8(%esp)
- fldl 8(%esp)
- fisttpll (%esp)
- movl (%esp), %eax
- addl $20, %esp
- ret
- This just requires being smarter when custom expanding fptoui.
- //===---------------------------------------------------------------------===//
|