123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173 |
- //===-- README.txt - Notes for WebAssembly code gen -----------------------===//
- The object format emitted by the WebAssembly backed is documented in:
- * https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md
- The C ABI is described in:
- * https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md
- For more information on WebAssembly itself, see the home page:
- * https://webassembly.github.io/
- Emscripten provides a C/C++ compilation environment based on clang which
- includes standard libraries, tools, and packaging for producing WebAssembly
- applications that can run in browsers and other environments.
- wasi-sdk provides a more minimal C/C++ SDK based on clang, llvm and a libc based
- on musl, for producing WebAssemmbly applictions that use the WASI ABI.
- Rust provides WebAssembly support integrated into Cargo. There are two
- main options:
- - wasm32-unknown-unknown, which provides a relatively minimal environment
- that has an emphasis on being "native"
- - wasm32-unknown-emscripten, which uses Emscripten internally and
- provides standard C/C++ libraries, filesystem emulation, GL and SDL
- bindings
- For more information, see:
- * https://www.hellorust.com/
- The following documents contain some information on the semantics and binary
- encoding of WebAssembly itself:
- * https://github.com/WebAssembly/design/blob/main/Semantics.md
- * https://github.com/WebAssembly/design/blob/main/BinaryEncoding.md
- Some notes on ways that the generated code could be improved follow:
- //===---------------------------------------------------------------------===//
- Br, br_if, and br_table instructions can support having a value on the value
- stack across the jump (sometimes). We should (a) model this, and (b) extend
- the stackifier to utilize it.
- //===---------------------------------------------------------------------===//
- The min/max instructions aren't exactly a<b?a:b because of NaN and negative zero
- behavior. The ARM target has the same kind of min/max instructions and has
- implemented optimizations for them; we should do similar optimizations for
- WebAssembly.
- //===---------------------------------------------------------------------===//
- AArch64 runs SeparateConstOffsetFromGEPPass, followed by EarlyCSE and LICM.
- Would these be useful to run for WebAssembly too? Also, it has an option to
- run SimplifyCFG after running the AtomicExpand pass. Would this be useful for
- us too?
- //===---------------------------------------------------------------------===//
- Register stackification uses the VALUE_STACK physical register to impose
- ordering dependencies on instructions with stack operands. This is pessimistic;
- we should consider alternate ways to model stack dependencies.
- //===---------------------------------------------------------------------===//
- Lots of things could be done in WebAssemblyTargetTransformInfo.cpp. Similarly,
- there are numerous optimization-related hooks that can be overridden in
- WebAssemblyTargetLowering.
- //===---------------------------------------------------------------------===//
- Instead of the OptimizeReturned pass, which should consider preserving the
- "returned" attribute through to MachineInstrs and extending the
- MemIntrinsicResults pass to do this optimization on calls too. That would also
- let the WebAssemblyPeephole pass clean up dead defs for such calls, as it does
- for stores.
- //===---------------------------------------------------------------------===//
- Consider implementing optimizeSelect, optimizeCompareInstr, optimizeCondBranch,
- optimizeLoadInstr, and/or getMachineCombinerPatterns.
- //===---------------------------------------------------------------------===//
- Find a clean way to fix the problem which leads to the Shrink Wrapping pass
- being run after the WebAssembly PEI pass.
- //===---------------------------------------------------------------------===//
- When setting multiple local variables to the same constant, we currently get
- code like this:
- i32.const $4=, 0
- i32.const $3=, 0
- It could be done with a smaller encoding like this:
- i32.const $push5=, 0
- local.tee $push6=, $4=, $pop5
- local.copy $3=, $pop6
- //===---------------------------------------------------------------------===//
- WebAssembly registers are implicitly initialized to zero. Explicit zeroing is
- therefore often redundant and could be optimized away.
- //===---------------------------------------------------------------------===//
- Small indices may use smaller encodings than large indices.
- WebAssemblyRegColoring and/or WebAssemblyRegRenumbering should sort registers
- according to their usage frequency to maximize the usage of smaller encodings.
- //===---------------------------------------------------------------------===//
- Many cases of irreducible control flow could be transformed more optimally
- than via the transform in WebAssemblyFixIrreducibleControlFlow.cpp.
- It may also be worthwhile to do transforms before register coloring,
- particularly when duplicating code, to allow register coloring to be aware of
- the duplication.
- //===---------------------------------------------------------------------===//
- WebAssemblyRegStackify could use AliasAnalysis to reorder loads and stores more
- aggressively.
- //===---------------------------------------------------------------------===//
- WebAssemblyRegStackify is currently a greedy algorithm. This means that, for
- example, a binary operator will stackify with its user before its operands.
- However, if moving the binary operator to its user moves it to a place where
- its operands can't be moved to, it would be better to leave it in place, or
- perhaps move it up, so that it can stackify its operands. A binary operator
- has two operands and one result, so in such cases there could be a net win by
- preferring the operands.
- //===---------------------------------------------------------------------===//
- Instruction ordering has a significant influence on register stackification and
- coloring. Consider experimenting with the MachineScheduler (enable via
- enableMachineScheduler) and determine if it can be configured to schedule
- instructions advantageously for this purpose.
- //===---------------------------------------------------------------------===//
- WebAssemblyRegStackify currently assumes that the stack must be empty after
- an instruction with no return values, however wasm doesn't actually require
- this. WebAssemblyRegStackify could be extended, or possibly rewritten, to take
- full advantage of what WebAssembly permits.
- //===---------------------------------------------------------------------===//
- Add support for mergeable sections in the Wasm writer, such as for strings and
- floating-point constants.
- //===---------------------------------------------------------------------===//
- The function @dynamic_alloca_redzone in test/CodeGen/WebAssembly/userstack.ll
- ends up with a local.tee in its prolog which has an unused result, requiring
- an extra drop:
- global.get $push8=, 0
- local.tee $push9=, 1, $pop8
- drop $pop9
- [...]
- The prologue code initially thinks it needs an FP register, but later it
- turns out to be unneeded, so one could either approach this by being more
- clever about not inserting code for an FP in the first place, or optimizing
- away the copy later.
- //===---------------------------------------------------------------------===//
|