The Ethereum Virtual Machine is a bit different from most other virtual machines. In my previous post I explained how it is used and described some of its characteristics.
The Ethereum Virtual Machine (EVM) is a simple yet powerful, Turing-complete 256-bit virtual machine that allows anyone to execute arbitrary EVM bytecode.
The go-ethereum project includes two implementations of the EVM: a simple and straightforward bytecode VM and a more sophisticated JIT-VM. In this post, I will explain some of the differences between the two implementations, describe some of the characteristics of the JIT EVM, and explain why it can be much faster than the bytecode EVM.
Go-ethereum’s bytecode virtual machine
The internals of the EVM are very simple. There is a single execution loop that attempts to execute the instruction at the current program counter (PC for short). Within this loop, the gas is calculated for each instruction, memory is expanded if necessary, and the instruction is executed if this preamble succeeds. This continues until the VM either finishes gracefully or throws an exception and returns with an error (e.g. out of gas).
for op = contract[pc] {
    if !sufficientGas(op) {
        return error("insufficient gas for op:", op)
    }
    switch op {
    case ...:
        /* execute */
    case RETURN:
        return memory[stack[-1], stack[-2]]
    }
    pc++
}
At the end of the execution loop, the program counter is incremented so that the next instruction runs, and the loop continues until the program has finished.
The EVM has another way to change the program counter, through so-called jump instructions (JUMP and JUMPI). Instead of letting the program counter increment (pc++), the EVM can also jump to an arbitrary position in the contract code. The EVM knows two jump instructions: a normal jump, which reads as “jump to position X”, and a conditional jump, which reads as “jump to position X if condition Y is true”. When either kind of jump occurs, it must always land on a jump destination. If the program lands on an instruction other than a jump destination, the program fails. In other words, for a jump to be valid, the instruction at the position it jumps to (when the condition is true) must be a jump-destination instruction.
Before executing an Ethereum program, the EVM iterates over the code to find all possible jump destinations, then puts them in a map that can be looked up by program counter. Every time the EVM encounters a jump instruction, the validity of the jump is checked.
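To make the jump-destination analysis concrete, here is a minimal sketch in Go. It is not the go-ethereum code: the names analyseJumpDests and validJump are made up for illustration, and only the opcode values (JUMPDEST 0x5b, PUSH1 to PUSH32 0x60 to 0x7f) come from the EVM instruction set. The sketch skips over push data so that a 0x5b byte inside a pushed value is not mistaken for a destination.

```go
package main

import "fmt"

// Opcode values from the EVM instruction set.
const (
	JUMPDEST = 0x5b
	PUSH1    = 0x60
	PUSH32   = 0x7f
)

// analyseJumpDests (hypothetical name) walks the code once and records
// every position that holds a JUMPDEST opcode. The immediate data of
// PUSH instructions is skipped, because it is data, not code.
func analyseJumpDests(code []byte) map[uint64]struct{} {
	dests := make(map[uint64]struct{})
	for pc := uint64(0); pc < uint64(len(code)); pc++ {
		op := code[pc]
		switch {
		case op >= PUSH1 && op <= PUSH32:
			// PUSHn carries n bytes of data after the opcode.
			pc += uint64(op-PUSH1) + 1
		case op == JUMPDEST:
			dests[pc] = struct{}{}
		}
	}
	return dests
}

// validJump (hypothetical name) is the check done on every jump:
// the target must be one of the recorded destinations.
func validJump(dests map[uint64]struct{}, to uint64) bool {
	_, ok := dests[to]
	return ok
}

func main() {
	// PUSH1 0x04, JUMP, STOP, JUMPDEST, PUSH1 0x5b, STOP
	code := []byte{0x60, 0x04, 0x56, 0x00, 0x5b, 0x60, 0x5b, 0x00}
	dests := analyseJumpDests(code)
	fmt.Println(validJump(dests, 4)) // true: a real JUMPDEST
	fmt.Println(validJump(dests, 6)) // false: 0x5b here is push data
}
```

Note that the map lookup in validJump happens at runtime for every jump, which is exactly the cost the JIT tries to avoid for static jumps.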
As you can see, the executing code is relatively simple and is simply interpreted by the bytecode VM. We might even conclude that, through its sheer simplicity, it is actually pretty dumb.
Welcome to the JIT VM
The JIT-EVM takes a different approach to running EVM bytecode and is, by definition, initially slower than the bytecode VM. Before the VM can run any code, it must first compile the bytecode into components that the JIT VM can understand.
The initialization and execution procedure is performed in three steps:
- We check whether a JIT program is ready to run, using the hash of the code; H(C) serves as the identifier of the program.
- If a program is found, we run it and return the result.
- If no program is found, we run the bytecode and compile a JIT program in the background.
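The three steps can be sketched roughly as follows. This is an illustration under assumptions, not the go-ethereum implementation: Program, programCache, compile and run are hypothetical names, SHA-256 from the standard library stands in for H(C), and run only reports which VM would handle the call instead of actually executing anything.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// Program stands in for a compiled JIT program (hypothetical type).
type Program struct{ size int }

// programCache maps H(C) to finished JIT programs.
type programCache struct {
	mu       sync.RWMutex
	programs map[[32]byte]*Program
	pending  sync.WaitGroup // only here so the demo can wait for compiles
}

func newProgramCache() *programCache {
	return &programCache{programs: make(map[[32]byte]*Program)}
}

// compile is a placeholder for the real bytecode-to-JIT compilation.
func compile(code []byte) *Program { return &Program{size: len(code)} }

// run reports which VM would execute the code, following the three steps.
func (c *programCache) run(code []byte) string {
	// Step 1: identify the program by the hash of its code.
	h := sha256.Sum256(code) // assumption: stand-in for H(C)

	c.mu.RLock()
	_, ready := c.programs[h]
	c.mu.RUnlock()

	// Step 2: a finished program exists, so the JIT runs it.
	if ready {
		return "jit"
	}

	// Step 3: fall back to the bytecode VM, compile in the background.
	c.pending.Add(1)
	go func() {
		defer c.pending.Done()
		p := compile(code)
		c.mu.Lock()
		c.programs[h] = p
		c.mu.Unlock()
	}()
	return "bytecode"
}

func main() {
	cache := newProgramCache()
	code := []byte{0x60, 0x01, 0x00} // PUSH1 0x01, STOP

	fmt.Println(cache.run(code)) // bytecode
	cache.pending.Wait()         // let the background compile finish
	fmt.Println(cache.run(code)) // jit
}
```

The sketch may start more than one compilation if the same code is called again before the first compile finishes; a real implementation would likely track in-flight compilations as well.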
Initially, I tried checking whether the JIT program had finished compiling and then moving the execution over to the JIT. All of this happened at runtime, in the same loop, using Go's atomic package. Unfortunately, this turned out to be slower than simply letting the bytecode VM run and using the JIT program for every subsequent call once compilation had finished.
By compiling the bytecode into logical pieces, the JIT is able to analyze the code more precisely and optimize it where and whenever necessary.
For example, one incredibly simple optimization I made was compiling several push operations into a single instruction. Take the CALL instruction: it requires 7 push instructions (gas, address, value, input offset, input size, return offset, and return size) before it can be executed. Instead of looping through these 7 instructions and executing them one by one, I optimized this away by taking the 7 instructions and appending their 7 values to a single slice. Now, whenever the start of the 7 push instructions is reached, the VM executes the one optimized instruction instead, immediately appending the static slice to the VM stack. Of course, this only works for static values (e.g. push 0x10), but these are present in the code quite a lot.
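A rough sketch of the push-merging idea, assuming a made-up instruction type and compile function rather than the real JIT structures. Consecutive PUSH opcodes are collapsed into one instruction that holds all their values in a slice, and executing it is a single append to the stack.

```go
package main

import (
	"fmt"
	"math/big"
)

// Opcode values from the EVM instruction set.
const (
	PUSH1  = 0x60
	PUSH32 = 0x7f
	CALL   = 0xf1
)

// instruction is a hypothetical compiled unit. For a merged push,
// pushes holds all static values and op is left unused.
type instruction struct {
	op     byte
	pushes []*big.Int
}

func isPush(op byte) bool { return op >= PUSH1 && op <= PUSH32 }

// compile (hypothetical) turns bytecode into instructions, merging
// every run of consecutive pushes into a single instruction.
func compile(code []byte) []instruction {
	var prog []instruction
	for pc := 0; pc < len(code); {
		if !isPush(code[pc]) {
			prog = append(prog, instruction{op: code[pc]})
			pc++
			continue
		}
		var merged instruction
		for pc < len(code) && isPush(code[pc]) {
			end := pc + 2 + int(code[pc]-PUSH1) // opcode + n data bytes
			if end > len(code) {
				end = len(code) // truncated push data is simply clipped here
			}
			merged.pushes = append(merged.pushes, new(big.Int).SetBytes(code[pc+1:end]))
			pc = end
		}
		prog = append(prog, merged)
	}
	return prog
}

// execute runs a merged push: one append instead of one step per push.
func execute(stack []*big.Int, ins instruction) []*big.Int {
	return append(stack, ins.pushes...)
}

func main() {
	// Seven PUSH1 instructions followed by CALL (values are arbitrary).
	code := []byte{
		0x60, 0x00, 0x60, 0x00, 0x60, 0x00, 0x60, 0x00,
		0x60, 0x00, 0x60, 0x10, 0x60, 0xff, CALL,
	}
	prog := compile(code)
	fmt.Println(len(prog), len(prog[0].pushes)) // 2 7
	fmt.Println(len(execute(nil, prog[0])))     // 7
}
```

Merging is safe with respect to jumps because a jump may only land on a jump destination, and a jump-destination instruction between two pushes ends the run.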
I have also optimized static jump instructions. A static jump is a jump that always goes to the same position (e.g. push 0x1, jump) and never changes under any circumstances. By determining which jumps are static, we can pre-check whether a jump is valid and lies within the bounds of the contract, and if so, we create a new instruction that replaces both the push and the jump instruction and is flagged as valid. This saves the VM from having to execute two instructions, and from having to check whether the jump is valid with an expensive hash-map lookup of valid jump positions.
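The static jump optimization might look something like the sketch below. Again, the instruction type and the helper names (jumpDests, toUint64, compile) are assumptions made for this example and not the go-ethereum code. A PUSH that is directly followed by JUMP is replaced by one pre-validated instruction; if the target is not a valid destination, the pair is left alone so that the program fails at runtime as usual.

```go
package main

import "fmt"

// Opcode values from the EVM instruction set.
const (
	JUMP     = 0x56
	JUMPDEST = 0x5b
	PUSH1    = 0x60
	PUSH32   = 0x7f
)

// instruction is a hypothetical compiled unit.
type instruction struct {
	op         byte
	data       []byte // push data, if any
	staticJump bool   // true: push+jump merged and already validated
	target     uint64 // bytecode position of the static jump target
}

func isPush(op byte) bool { return op >= PUSH1 && op <= PUSH32 }

// jumpDests records every JUMPDEST position, skipping push data.
func jumpDests(code []byte) map[uint64]bool {
	dests := make(map[uint64]bool)
	for pc := 0; pc < len(code); pc++ {
		if isPush(code[pc]) {
			pc += int(code[pc]-PUSH1) + 1
		} else if code[pc] == JUMPDEST {
			dests[uint64(pc)] = true
		}
	}
	return dests
}

// toUint64 converts short push data; longer values are not optimised here.
func toUint64(b []byte) (uint64, bool) {
	if len(b) > 8 {
		return 0, false
	}
	var v uint64
	for _, x := range b {
		v = v<<8 | uint64(x)
	}
	return v, true
}

// compile replaces "PUSH x, JUMP" with one instruction when x is valid.
func compile(code []byte) []instruction {
	dests := jumpDests(code)
	var prog []instruction
	for pc := 0; pc < len(code); pc++ {
		op := code[pc]
		if !isPush(op) {
			prog = append(prog, instruction{op: op})
			continue
		}
		end := pc + 2 + int(op-PUSH1)
		if end > len(code) {
			end = len(code)
		}
		data := code[pc+1 : end]
		pc = end - 1 // last byte of the push data
		if end < len(code) && code[end] == JUMP {
			if to, ok := toUint64(data); ok && dests[to] {
				prog = append(prog, instruction{op: JUMP, staticJump: true, target: to})
				pc = end // the JUMP opcode is consumed as well
				continue
			}
		}
		prog = append(prog, instruction{op: op, data: data})
	}
	return prog
}

func main() {
	// PUSH1 0x03, JUMP, JUMPDEST, STOP
	prog := compile([]byte{0x60, 0x03, 0x56, 0x5b, 0x00})
	fmt.Println(len(prog), prog[0].staticJump, prog[0].target) // 3 true 3
}
```

At runtime, an instruction flagged this way needs neither the stack push nor the map lookup. A real compiler would also translate the target from a bytecode position to an index in the compiled program, which this sketch leaves out.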
Next steps
Full stack and memory analysis would also fit nicely into this model, where large chunks of code could fit into single instructions. Further, I would like to add symbolic execution and turn the JIT into a proper JIT-VM. I think this would be the logical next step once programs get large enough to take advantage of these optimizations.
Conclusion
Our JIT-VM is a whole lot smarter than the bytecode VM, but it is still far from being completely done. There are many more clever tricks we could add with this structure, but they simply aren't realistic at the moment. The runtime is within the bounds of being “reasonably” speedy. Should the need arise to optimize the VM further, we have the tools to do so.
Additional code reading
Cross-post location – https://medium.com/@jeff.ethereum/go-ethereums-jit-evm-27ef88277520#.1ed9lj7dz