//===---------------------------------------------------------------------===//
// Random ideas for the X86 backend: FP stack related stuff
//===---------------------------------------------------------------------===//

//===---------------------------------------------------------------------===//
Some targets (e.g. Athlons) prefer ffreep to fstp ST(0):
http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html
//===---------------------------------------------------------------------===//
This should use fiadd on chips where it is profitable:

double foo(double P, int *I) { return P+*I; }

We have fiadd patterns now, but the following two patterns have the same cost
and complexity. We need a way to specify that the latter is more profitable.

def FpADD32m  : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (extloadf64f32 addr:$src2)))]>;
                // ST(0) = ST(0) + [mem32]

def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (X86fild addr:$src2, i32)))]>;
                // ST(0) = ST(0) + [mem32int]
//===---------------------------------------------------------------------===//
The FP stackifier should handle simple permutations to reduce the number of
shuffle instructions, e.g. turning:

fld P   ->   fld Q
fld Q        fld P
fxch

or:

fxch    ->   fucomi
fucomi       jl X
jg X

Ideas:
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html
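
The first rewrite above can be checked with a toy model of the x87 register
stack. This is only an illustrative sketch: the FPStack type and the helper
names are hypothetical and are not part of the stackifier. It shows that
loading Q before P leaves the stack in the same state as loading P, then Q,
then exchanging the top two slots.

```c
#include <assert.h>
#include <string.h>

/* Toy model of the x87 register stack; st[0] plays the role of ST(0).
   All names here are hypothetical and exist only for illustration. */
enum { FP_STACK_DEPTH = 8 };

typedef struct {
    char st[FP_STACK_DEPTH];  /* symbolic values such as 'P' and 'Q' */
    int depth;                /* number of live stack slots */
} FPStack;

/* fld: push a value; the old ST(0) becomes ST(1), and so on. */
static void fld(FPStack *s, char value) {
    assert(s->depth < FP_STACK_DEPTH);
    memmove(&s->st[1], &s->st[0], (size_t)s->depth);
    s->st[0] = value;
    s->depth++;
}

/* fxch: swap ST(0) and ST(1). */
static void fxch(FPStack *s) {
    assert(s->depth >= 2);
    char tmp = s->st[0];
    s->st[0] = s->st[1];
    s->st[1] = tmp;
}

/* Two stacks match when they hold the same values in the same slots. */
static int same_state(const FPStack *a, const FPStack *b) {
    return a->depth == b->depth &&
           memcmp(a->st, b->st, (size_t)a->depth) == 0;
}

/* Returns 1 when "fld P; fld Q; fxch" and "fld Q; fld P" agree. */
static int reordered_loads_match(void) {
    FPStack before = {{0}, 0};
    FPStack after = {{0}, 0};

    fld(&before, 'P');
    fld(&before, 'Q');
    fxch(&before);

    fld(&after, 'Q');
    fld(&after, 'P');

    return same_state(&before, &after);
}
```

Both sequences end with P in ST(0) and Q in ST(1), so the fxch is redundant
whenever the two loads can be reordered.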
//===---------------------------------------------------------------------===//
Add a target-specific hook to the DAG combiner to handle SINT_TO_FP and
FP_TO_SINT when the source operand is already in memory.
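
Two small test cases that should exercise these paths are sketched below.
They are hypothetical and not taken from the LLVM test suite. In each one the
operand being converted is loaded straight from memory, which is the
situation the hook would need to recognize.

```c
#include <assert.h>

/* SINT_TO_FP with the integer operand in memory. On the x87 stack, fild
   can load and convert an integer directly from memory. */
double int_in_memory_to_double(const int *p) { return (double)*p; }

/* FP_TO_SINT with the floating-point operand in memory. */
int double_in_memory_to_int(const double *p) { return (int)*p; }
```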
//===---------------------------------------------------------------------===//
Open code rint, floor, ceil, trunc:
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html

Open code the sincos[f] libcall.
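
As a rough illustration of what "open coding" means here, the sketch below
computes floor inline instead of calling libm. It is an assumption-laden
example written in portable C, not the x87 sequence from the linked patches.
The name floor_open_coded is hypothetical, and the result for -0.0 is +0.0
rather than the -0.0 that floor() returns.

```c
#include <assert.h>

/* Hypothetical helper: floor() without a libm call, using a truncating
   double -> integer -> double round trip. */
static double floor_open_coded(double x) {
    /* At or beyond 2^52 in magnitude every double is already integral.
       Written this way, the test also passes NaN and infinities through. */
    if (!(x > -4503599627370496.0 && x < 4503599627370496.0))
        return x;

    double t = (double)(long long)x;  /* the cast truncates toward zero */

    /* Truncation rounds negative non-integers up, so step back down. */
    return (t > x) ? t - 1.0 : t;
}
```

ceil and trunc can be written the same way by changing only the final
adjustment.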
//===---------------------------------------------------------------------===//
None of the FPStack instructions are handled in
X86RegisterInfo::foldMemoryOperand, which prevents the spiller from
folding spill code into the instructions.
//===---------------------------------------------------------------------===//
Currently the x86 codegen isn't very good at mixing SSE and FPStack
code:

unsigned int foo(double x) { return x; }

foo:
        subl $20, %esp
        movsd 24(%esp), %xmm0
        movsd %xmm0, 8(%esp)
        fldl 8(%esp)
        fisttpll (%esp)
        movl (%esp), %eax
        addl $20, %esp
        ret

This just requires being smarter when custom expanding fptoui.
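
One possible direction is sketched in C below. The first function mirrors the
code above: convert to a signed 64-bit integer and keep the low 32 bits. The
second uses only signed 32-bit conversions, the kind cvttsd2si provides, so
the value would not need to move to the FP stack. Both function names are
hypothetical, the second assumes 0 <= x < 2^32, and this is not necessarily
the expansion the backend should adopt.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the generated code above: a signed 64-bit conversion (as
   fisttpll performs), followed by taking the low 32 bits. */
static uint32_t fptoui_via_i64(double x) {
    return (uint32_t)(int64_t)x;
}

/* Alternative built from signed 32-bit conversions only.
   Assumes 0 <= x < 2^32; other inputs are outside this sketch. */
static uint32_t fptoui_via_i32(double x) {
    const double two31 = 2147483648.0;  /* 2^31 */

    if (x < two31)
        return (uint32_t)(int32_t)x;    /* fits in a signed 32-bit value */

    /* Shift the input into signed range, convert, then add the bias back.
       The subtraction is exact for doubles in [2^31, 2^32). */
    return (uint32_t)(int32_t)(x - two31) + 0x80000000u;
}
```

The two functions agree on every input in the assumed range, including values
on either side of 2^31.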
//===---------------------------------------------------------------------===//