Commit 1fe7048 (1 parent: 62f71fc)

feat: 🎸 add BuildingAJIT raw docs

Add BuildingAJIT raw RST and Markdown files. Add machine translation for BuildingAJIT1.md.

10 files changed: +1782 −0 lines

BuildingAJIT/markdown/BuildingAJIT1.md: +198 lines (large diffs are not rendered by default)

+288 lines:
@@ -0,0 +1,288 @@

Building a JIT: Adding Optimizations -- An introduction to ORC Layers
======================================================================

**This tutorial is under active development. It is incomplete and
details may change frequently.** Nonetheless we invite you to try it out
as it stands, and we welcome any feedback.

Chapter 2 Introduction
----------------------

**Warning: This tutorial is currently being updated to account for ORC
API changes. Only Chapters 1 and 2 are up-to-date.**

**Example code from Chapters 3 to 5 will compile and run, but has not
been updated.**

Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM"
tutorial. In [Chapter 1](BuildingAJIT1.html) of this series we examined
a basic JIT class, KaleidoscopeJIT, that could take LLVM IR modules as
input and produce executable code in memory. KaleidoscopeJIT was able to
do this with relatively little code by composing two off-the-shelf *ORC
layers*, IRCompileLayer and ObjectLinkingLayer, to do much of the heavy
lifting.

In this chapter we'll learn more about the ORC layer concept by using a
new layer, IRTransformLayer, to add IR optimization support to
KaleidoscopeJIT.

Optimizing Modules using the IRTransformLayer
---------------------------------------------

In [Chapter 4](LangImpl04.html) of the "Implementing a language with
LLVM" tutorial series the LLVM *FunctionPassManager* is introduced as a
means for optimizing LLVM IR. Interested readers may read that chapter
for details, but in short: to optimize a Module we create an
llvm::FunctionPassManager instance, configure it with a set of
optimizations, then run the PassManager on a Module to mutate it into a
(hopefully) more optimized but semantically equivalent form. In the
original tutorial series the FunctionPassManager was created outside the
KaleidoscopeJIT and modules were optimized before being added to it. In
this chapter we will make optimization a phase of our JIT instead. For
now this will provide us with a motivation to learn more about ORC
layers, but in the long term making optimization part of our JIT will
yield an important benefit: when we begin lazily compiling code (i.e.
deferring compilation of each function until the first time it's run),
having optimization managed by our JIT will allow us to optimize lazily
too, rather than having to do all our optimization up-front.

To add optimization support to our JIT we will take the KaleidoscopeJIT
from Chapter 1 and compose an ORC *IRTransformLayer* on top. We will
look at how the IRTransformLayer works in more detail below, but the
interface is simple: the constructor for this layer takes a reference to
the execution session and the layer below (as all layers do), plus an
*IR optimization function* that it will apply to each Module that is
added via addModule:

``` c++
class KaleidoscopeJIT {
private:
  ExecutionSession ES;
  RTDyldObjectLinkingLayer ObjectLayer;
  IRCompileLayer CompileLayer;
  IRTransformLayer TransformLayer;

  DataLayout DL;
  MangleAndInterner Mangle;
  ThreadSafeContext Ctx;

public:

  KaleidoscopeJIT(JITTargetMachineBuilder JTMB, DataLayout DL)
      : ObjectLayer(ES,
                    []() { return std::make_unique<SectionMemoryManager>(); }),
        CompileLayer(ES, ObjectLayer, ConcurrentIRCompiler(std::move(JTMB))),
        TransformLayer(ES, CompileLayer, optimizeModule),
        DL(std::move(DL)), Mangle(ES, this->DL),
        Ctx(std::make_unique<LLVMContext>()) {
    ES.getMainJITDylib().setGenerator(
        cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
  }
```

Our extended KaleidoscopeJIT class starts out the same as it did in
Chapter 1, but after the CompileLayer we introduce a new member,
TransformLayer, which sits on top of our CompileLayer. We initialize our
TransformLayer with a reference to the ExecutionSession and the output
layer (standard practice for layers), along with a *transform function*.
For our transform function we supply our class's optimizeModule static
method.

``` c++
// ...
return TransformLayer.add(ES.getMainJITDylib(),
                          ThreadSafeModule(std::move(M), Ctx));
// ...
```

Next we need to update our addModule method to replace the call to
`CompileLayer::add` with a call to `TransformLayer::add` instead.
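
Putting the pieces together, a sketch of the updated method (assuming
addModule keeps the signature it had in Chapter 1) might look like:

``` c++
// Sketch only: identical to Chapter 1's addModule except that modules
// are now handed to TransformLayer, so each one is optimized before it
// reaches the CompileLayer below.
Error addModule(std::unique_ptr<Module> M) {
  return TransformLayer.add(ES.getMainJITDylib(),
                            ThreadSafeModule(std::move(M), Ctx));
}
```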

``` c++
static Expected<ThreadSafeModule>
optimizeModule(ThreadSafeModule M, const MaterializationResponsibility &R) {
  // Create a function pass manager.
  auto FPM = std::make_unique<legacy::FunctionPassManager>(M.getModule());

  // Add some optimizations.
  FPM->add(createInstructionCombiningPass());
  FPM->add(createReassociatePass());
  FPM->add(createGVNPass());
  FPM->add(createCFGSimplificationPass());
  FPM->doInitialization();

  // Run the optimizations over all functions in the module being added to
  // the JIT.
  for (auto &F : *M.getModule())
    FPM->run(F);

  return M;
}
```

At the bottom of our JIT we add a private method to do the actual
optimization: *optimizeModule*. This function takes the module to be
transformed as input (as a ThreadSafeModule) along with a reference to a
new class: `MaterializationResponsibility`. The
MaterializationResponsibility argument can be used to query JIT state
for the module being transformed, such as the set of definitions in the
module that JIT'd code is actively trying to call/access. For now we
will ignore this argument and use a standard optimization pipeline. To
do this we set up a FunctionPassManager, add some passes to it, run it
over every function in the module, and then return the mutated module.
The specific optimizations are the same ones used in [Chapter
4](LangImpl04.html) of the "Implementing a language with LLVM" tutorial
series. Readers may visit that chapter for a more in-depth discussion of
these, and of IR optimization in general.
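
Although we ignore the MaterializationResponsibility argument here, it
is what a smarter transform would consult. To give a flavor of what it
offers, here is a hypothetical variant (a sketch, not part of the
tutorial; it assumes the getRequestedSymbols query) that logs the
symbols JIT'd code is actively waiting on, then returns the module
untouched:

``` c++
// Hypothetical sketch: query JIT state through R instead of ignoring it.
static Expected<ThreadSafeModule>
logRequestedSymbols(ThreadSafeModule M,
                    const MaterializationResponsibility &R) {
  // Each entry is a symbol in this module that JIT'd code is waiting on.
  for (const auto &Sym : R.getRequestedSymbols())
    errs() << "Requested symbol: " << *Sym << "\n";
  return M;
}
```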

And that's it in terms of changes to KaleidoscopeJIT: when a module is
added via addModule the TransformLayer will call our optimizeModule
function before passing the transformed module on to the CompileLayer
below. Of course, we could have called optimizeModule directly in our
addModule function and not gone to the bother of using the
IRTransformLayer, but doing so gives us another opportunity to see how
layers compose. It also provides a neat entry point to the *layer*
concept itself, because IRTransformLayer is one of the simplest layers
that can be implemented.

``` c++
// From IRTransformLayer.h:
class IRTransformLayer : public IRLayer {
public:
  using TransformFunction = std::function<Expected<ThreadSafeModule>(
      ThreadSafeModule, const MaterializationResponsibility &R)>;

  IRTransformLayer(ExecutionSession &ES, IRLayer &BaseLayer,
                   TransformFunction Transform = identityTransform);

  void setTransform(TransformFunction Transform) {
    this->Transform = std::move(Transform);
  }

  static ThreadSafeModule
  identityTransform(ThreadSafeModule TSM,
                    const MaterializationResponsibility &R) {
    return TSM;
  }

  void emit(MaterializationResponsibility R, ThreadSafeModule TSM) override;

private:
  IRLayer &BaseLayer;
  TransformFunction Transform;
};

// From IRTransformLayer.cpp:

IRTransformLayer::IRTransformLayer(ExecutionSession &ES,
                                   IRLayer &BaseLayer,
                                   TransformFunction Transform)
    : IRLayer(ES), BaseLayer(BaseLayer), Transform(std::move(Transform)) {}

void IRTransformLayer::emit(MaterializationResponsibility R,
                            ThreadSafeModule TSM) {
  assert(TSM.getModule() && "Module must not be null");

  if (auto TransformedTSM = Transform(std::move(TSM), R))
    BaseLayer.emit(std::move(R), std::move(*TransformedTSM));
  else {
    R.failMaterialization();
    getExecutionSession().reportError(TransformedTSM.takeError());
  }
}
```

This is the whole definition of IRTransformLayer, from
`llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h` and
`llvm/lib/ExecutionEngine/Orc/IRTransformLayer.cpp`. This class is
concerned with two very simple jobs: (1) running every IR Module that is
emitted via this layer through the transform function object, and (2)
implementing the ORC `IRLayer` interface (which itself conforms to the
general ORC Layer concept, more on that below). Most of the class is
straightforward: a typedef for the transform function, a constructor to
initialize the members, a setter for the transform function value, and a
default no-op transform. The most important method is `emit`, as this is
half of our IRLayer interface. The emit method applies our transform to
each module that it is called on and, if the transform succeeds, passes
the transformed module to the base layer. If the transform fails, our
emit function calls `MaterializationResponsibility::failMaterialization`
(this lets JIT clients who may be waiting on other threads know that the
code they were waiting for has failed to compile) and logs the error
with the execution session before bailing out.
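
Since the transform is stored as a plain std::function, it can also be
swapped out after construction via setTransform. For example (a
hypothetical snippet, assuming access to our JIT's TransformLayer
member), optimization could be disabled at run-time by installing a
pass-through lambda:

``` c++
// Hypothetical: replace the transform with a lambda that forwards
// modules unmodified, effectively turning optimization off.
TransformLayer.setTransform(
    [](ThreadSafeModule TSM, const MaterializationResponsibility &R)
        -> Expected<ThreadSafeModule> { return std::move(TSM); });
```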

The other half of the IRLayer interface we inherit unmodified from the
IRLayer class:

``` c++
Error IRLayer::add(JITDylib &JD, ThreadSafeModule TSM, VModuleKey K) {
  return JD.define(std::make_unique<BasicIRLayerMaterializationUnit>(
      *this, std::move(K), std::move(TSM)));
}
```

This code, from `llvm/lib/ExecutionEngine/Orc/Layer.cpp`, adds a
ThreadSafeModule to a given JITDylib by wrapping it up in a
`MaterializationUnit` (in this case a
`BasicIRLayerMaterializationUnit`). Most layers derived from IRLayer can
rely on this default implementation of the `add` method.

These two operations, `add` and `emit`, together constitute the layer
concept: a layer is a way to wrap a portion of a compiler pipeline (in
this case the "opt" phase of an LLVM compiler) whose API is opaque to
ORC in an interface that allows ORC to invoke it when needed. The add
method takes a module in some input program representation (in this case
an LLVM IR module) and stores it in the target JITDylib, arranging for
it to be passed back to the layer's emit method when any symbol defined
by that module is requested. Layers can compose neatly by calling the
'emit' method of a base layer to complete their work. For example, in
this tutorial our IRTransformLayer calls through to our IRCompileLayer
to compile the transformed IR, and our IRCompileLayer in turn calls our
ObjectLayer to link the object file produced by our compiler.
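
To make the pattern concrete, here is a minimal hypothetical layer in
the same mold (a sketch, not part of LLVM): it inherits `add` from
IRLayer unchanged, and its emit method simply logs each module's name
before delegating to the layer below.

``` c++
// Hypothetical example layer: logs every module it emits, then hands
// it straight to the base layer. add() is inherited from IRLayer.
class LoggingIRLayer : public IRLayer {
public:
  LoggingIRLayer(ExecutionSession &ES, IRLayer &BaseLayer)
      : IRLayer(ES), BaseLayer(BaseLayer) {}

  void emit(MaterializationResponsibility R, ThreadSafeModule TSM) override {
    errs() << "Emitting " << TSM.getModule()->getName() << "\n";
    BaseLayer.emit(std::move(R), std::move(TSM));
  }

private:
  IRLayer &BaseLayer;
};
```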

So far we have learned how to optimize and compile our LLVM IR, but we
have not focused on when compilation happens. Our current REPL is eager:
each function definition is optimized and compiled as soon as it is
referenced by any other code, regardless of whether it is ever called at
runtime. In the next chapter we will introduce fully lazy compilation,
in which functions are not compiled until they are first called at
run-time. At this point the trade-offs get much more interesting: the
lazier we are, the quicker we can start executing the first function,
but the more often we will have to pause to compile newly encountered
functions. If we only code-gen lazily, but optimize eagerly, we will
have a longer startup time (as everything is optimized) but relatively
short pauses as each function just passes through code-gen. If we both
optimize and code-gen lazily we can start executing the first function
more quickly, but we will have longer pauses as each function has to be
both optimized and code-gen'd when it is first executed. Things become
even more interesting if we consider interprocedural optimizations like
inlining, which must be performed eagerly. These are complex trade-offs,
and there is no one-size-fits-all solution to them, but by providing
composable layers we leave the decisions to the person implementing the
JIT, and make it easy for them to experiment with different
configurations.

[Next: Adding Per-function Lazy Compilation](BuildingAJIT3.html)

Full Code Listing
-----------------

Here is the complete code listing for our running example with an
IRTransformLayer added to enable optimization. To build this example,
use:

``` bash
# Compile
clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
# Run
./toy
```

Here is the code:

::: {.literalinclude}
../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
:::
