This is the second part of the article series.
If you haven't read the introduction, please take a look at the first part.
We are deconstructing the EVM bytecode of a simple Solidity smart contract.
Today, let us begin attacking the disassembled code of our contract with a "divide and conquer" strategy. As I said in the introduction, this disassembled code is very low level, but it is far more readable than the raw bytecode.
Please make sure you have followed the steps I covered in the introduction and that the BasicToken code is deployed in Remix.
Disclaimer: all the explanations in this article are my own interpretation and do not represent the official views of Ethereum.
Now, let us focus exclusively on the JUMP, JUMPI, JUMPDEST, RETURN and STOP opcodes, and ignore all the others. Whenever we find an opcode that is not one of these, we will skip to the next instruction as if nothing had happened.
When the EVM executes code, it does so from the top down. There are no other entry points: execution always starts at the top. JUMP and JUMPI make the code jump around. JUMP takes the topmost value from the stack and moves execution to that position. The target position must contain a JUMPDEST opcode, otherwise execution fails; that is the sole purpose of JUMPDEST, to mark a position as a valid jump target. JUMPI is exactly the same, except that the second position of the stack must not hold a "0", otherwise there is no jump. So this is a conditional jump. STOP halts execution of the contract completely. RETURN halts execution too, but it also returns data from a portion of the EVM's memory, which is very convenient.
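To make these rules concrete, here is a minimal sketch of a control-flow-only interpreter. The opcode byte values are the standard EVM ones, but the `run` helper and the two tiny programs are made up for illustration, and the sketch does not check whether a jump target sits inside PUSH data, which the real EVM does.

```python
# Minimal sketch of EVM control flow, covering only the opcodes discussed above.
# Opcode byte values are the real EVM ones; run() itself is a hypothetical helper.
PUSH1, JUMP, JUMPI, JUMPDEST, STOP = 0x60, 0x56, 0x57, 0x5B, 0x00


def run(code: bytes) -> list[int]:
    """Execute code from position 0 and return the list of visited positions."""
    stack, pc, visited = [], 0, []
    while pc < len(code):
        op = code[pc]
        visited.append(pc)
        if op == STOP:
            break
        if op == PUSH1:
            stack.append(code[pc + 1])  # the byte after PUSH1 is data
            pc += 2
            continue
        if op in (JUMP, JUMPI):
            target = stack.pop()
            condition = stack.pop() if op == JUMPI else 1
            if condition != 0:
                if code[target] != JUMPDEST:
                    raise RuntimeError("invalid jump destination")
                pc = target
                continue
        pc += 1  # JUMPDEST, a JUMPI that did not jump, or an opcode we ignore
    return visited


# PUSH1 1, PUSH1 6, JUMPI, STOP, JUMPDEST, STOP
taken = run(bytes([0x60, 0x01, 0x60, 0x06, 0x57, 0x00, 0x5B, 0x00]))
# PUSH1 0, PUSH1 6, JUMPI, STOP, JUMPDEST, STOP
not_taken = run(bytes([0x60, 0x00, 0x60, 0x06, 0x57, 0x00, 0x5B, 0x00]))
```

With a non-zero condition the trace skips the STOP at position 5 and lands on the JUMPDEST at position 6; with a zero condition it falls through to position 5 and halts there.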
So, let us begin to interpret the code with all of this in mind. In the Remix debugger, drag the "transaction" slider all the way to the left. You can use the Step Into button (it looks like a small downward arrow) and keep track of the instructions as you go.
The first few instructions can be ignored. At instruction 11 we find our first JUMPI. If it doesn't jump, execution continues through instructions 12 to 15 and ends in a REVERT, which halts execution. But if it does jump, it skips those instructions and goes to position 16 (hex 0x0010, which was pushed onto the stack at instruction 8). Instruction 16 is a JUMPDEST.
Keep stepping through the opcodes until the "transaction" slider is all the way to the right. A lot just happened, but only at position 68 do we find a RETURN opcode (as well as a STOP opcode at instruction 69, just in case). This is rather curious. If you think about it, the control flow of this contract will always end at instruction 15 or 68. We have just walked through it and found no other possible flows, so what are the remaining instructions for? (If you scroll the "instructions" panel, you will see that the code ends at position 566.)
The set of instructions we just traversed (0 to 69) is known as the contract's "creation code". It never becomes part of the contract's code itself; the EVM executes it only once, during the transaction that creates the contract. As we will soon discover, this code is in charge of setting the initial state of the created contract and of returning a copy of its runtime code. The remaining 497 instructions (70 to 566), which as we saw are never reached by the execution flow, are precisely the code that becomes part of the deployed contract.
Next, let us make our first split: separating the creation code from the runtime code.
Now, we will study the creation code in depth.
Figure 1. Deconstruction of the creation EVM bytecode of BasicToken.sol.
This is the most important concept to understand right now. The creation code is executed in a transaction, which returns a copy of the runtime code, and that copy is the actual code of the contract. As we will see, the constructor is part of the creation code, not of the runtime code. In other words, a contract's constructor belongs to the creation code; once the contract is deployed, the constructor no longer appears in its code.
How does this magic happen? That is what we are going to analyze now, step by step.
OK. So now the problem is reduced to understanding the creation code that corresponds to these 70 instructions.
Let us take a top-down approach again, this time understanding every instruction without skipping any. First, let us focus on instructions 0 to 2, which use the PUSH1 and MSTORE opcodes.
Figure 2. Free memory pointer EVM bytecode structure.
PUSH1 simply pushes one byte onto the top of the stack, and MSTORE grabs the last two items from the stack and stores one of them in memory:
mstore(0x40, 0x80)
         |     |
         |     What to store
         Where to store
Note: the snippet above is Yul-ish code. Notice how it consumes elements from the stack from left to right, always consuming the topmost element first.
This stores the number 0x80 (128 in decimal) at position 0x40 (64 in decimal) in memory.
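If it helps to see it outside the debugger, here is a small sketch that models EVM memory as a growable byte array. The `mstore` and `mload` helpers are my own illustration of the MSTORE and MLOAD opcodes (which work on 32-byte big-endian words), not anything Remix or Solidity provides.

```python
# Sketch of EVM memory as a growable byte array. mstore/mload mirror the
# MSTORE and MLOAD opcodes, which always operate on 32-byte big-endian words.
memory = bytearray()


def mstore(offset: int, value: int) -> None:
    """Write value as a 32-byte word at offset, growing memory if needed."""
    end = offset + 32
    if len(memory) < end:
        memory.extend(bytes(end - len(memory)))
    memory[offset:end] = value.to_bytes(32, "big")


def mload(offset: int) -> int:
    """Read the 32-byte word at offset (unwritten memory reads as zero)."""
    word = bytes(memory[offset : offset + 32]).ljust(32, b"\x00")
    return int.from_bytes(word, "big")


mstore(0x40, 0x80)  # instructions 0 to 4: PUSH1 80, PUSH1 40, MSTORE
```

After the call, the word starting at 0x40 holds 0x80, so the byte 0x80 itself ends up at position 0x5f, the last byte of that word. This matches what the Memory panel in Remix shows.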
Why? Don't worry about that for now; there is a reason, and I'll explain it later.
Now, open the Stack and Memory panels in the Debugger tab of Remix, so you can visualize these instructions as you step through them.
You may be wondering: what happened to instructions 1 and 3? PUSH instructions are the only EVM instructions that consist of two or more bytes. So PUSH1 80 really takes up two positions. That solves the mystery: instruction 1 is 0x80, and instruction 3 is 0x40.
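The numbering becomes obvious once you write a tiny disassembler. The sketch below only knows the opcodes named in this article (the byte values are the standard EVM ones), and the hex string is my assumed encoding of instructions 0 to 17 as described here, so treat it as an illustration rather than the exact output of your compiler.

```python
# Sketch of a tiny disassembler. Only the opcodes named in this article are
# listed; the byte values are the standard EVM ones.
OPCODES = {
    0x34: "CALLVALUE", 0x15: "ISZERO", 0x50: "POP", 0x51: "MLOAD",
    0x52: "MSTORE", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0x80: "DUP1",
    0xFD: "REVERT",
}


def disassemble(code: bytes) -> list[tuple[int, str]]:
    """Return (position, text) pairs; PUSHn consumes n extra bytes as data."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1 .. PUSH32
            n = op - 0x5F
            data = code[pc + 1 : pc + 1 + n]
            out.append((pc, f"PUSH{n} {data.hex()}"))
            pc += 1 + n
        else:
            out.append((pc, OPCODES.get(op, f"UNKNOWN_{op:02x}")))
            pc += 1
    return out


# ASSUMPTION: encoding of instructions 0 to 17 as described in the text.
prefix = bytes.fromhex("608060405234801561001057600080fd5b50")
listing = disassemble(prefix)
for position, text in listing:
    print(f"{position:03d} {text}")
```

Because each PUSH1 swallows one data byte and PUSH2 swallows two, the printed positions jump from 0 to 2 to 4, and from 8 to 11, exactly like the gaps in the Remix instructions panel.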
Next up are instructions 5 to 15.
Figure 3. Non-payable check EVM bytecode structure.
Here we have a bunch of new opcodes: CALLVALUE, DUP1, ISZERO, PUSH2 and REVERT. CALLVALUE pushes the amount of wei sent with the creation transaction. DUP1 duplicates the first element on the stack. ISZERO takes the topmost value of the stack and pushes a 1 if it is zero (and a 0 otherwise). PUSH2 is just like PUSH1, but it pushes two bytes onto the stack. REVERT halts execution.
So what is going on here? In Solidity, we could write this chunk of assembly like this:
if (msg.value != 0) revert();
This code is not actually part of our original Solidity source. The compiler injected it because we did not declare the constructor as payable. In recent versions of Solidity, functions that are not explicitly declared payable cannot receive ether. Going back to the assembly code, the JUMPI at instruction 11 skips instructions 12 to 15 and jumps to 16 if no ether is involved. Otherwise, REVERT executes with both of its parameters set to 0 (meaning that no useful data is returned).
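Here is the same check traced by hand as a sketch. Each line mimics one instruction between positions 5 and 15; the PUSH1 00 and DUP1 at positions 12 and 14 are my assumption about how the two zero parameters of REVERT get onto the stack, and `non_payable_check` is a made-up helper name.

```python
def non_payable_check(callvalue: int) -> tuple[str, list[int]]:
    """Trace instructions 5 to 15; return the outcome and the remaining stack."""
    stack: list[int] = []
    stack.append(callvalue)                      # 005 CALLVALUE
    stack.append(stack[-1])                      # 006 DUP1
    stack.append(1 if stack.pop() == 0 else 0)   # 007 ISZERO
    stack.append(0x0010)                         # 008 PUSH2 0010
    target = stack.pop()                         # 011 JUMPI takes the target...
    condition = stack.pop()                      # ...and then the condition
    if condition != 0:
        return f"jump to {target}", stack
    stack.append(0x00)                           # 012 PUSH1 00 (assumed)
    stack.append(stack[-1])                      # 014 DUP1 (assumed)
    stack.pop()                                  # 015 REVERT consumes an offset
    stack.pop()                                  #     and a length, both 0
    return "revert", stack


no_ether = non_payable_check(0)
some_ether = non_payable_check(1)
```

With no ether attached the trace jumps to position 16 and leaves the original call value on the stack; with any non-zero value it runs into the REVERT.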
Alright! Let's take a break and have a cup of coffee.
(The next part can be a little tricky, so it is best to take a break for a few minutes. Make yourself a good cup of coffee before you focus again, and make sure you understand what we have seen so far, because the next part is a bit more complicated.)
If you want a different way to visualize what we just did, try this simple tool I built: solmap. It lets you compile Solidity code and then click on EVM opcodes to highlight the related Solidity code. Its disassembly is a bit different from Remix's, but you should be able to follow it by comparison.
Ready to move on? Next up are instructions 16 to 37. Keep following along in the Remix debugger. (Remember, Remix is your good friend ^^.)
Figure 4. EVM bytecode structure for retrieving constructor parameters from code appended at the end of the contract's bytecode.
The first four instructions (17 to 20) read whatever is at position 0x40 in memory and push it onto the stack. If you remember, that should be 0x80. The following instructions push 0x20 (32 in decimal) onto the stack (instruction 21), duplicate that value (instruction 23), push 0x0217 (535 in decimal) (instruction 24), and finally duplicate the fourth value on the stack (instruction 27), which should be 0x80.
When looking at EVM instructions like these, it is fine not to understand what is happening right away. Don't worry, it will click eventually.
At instruction 28, CODECOPY is executed. It takes three arguments: the target memory position to copy the code to, the position in the code to copy from, and the number of bytes of code to copy. So, in this case, the target is memory position 0x80, copying from byte position 535 in the code, with a length of 32 bytes.
If you look at the whole disassembled code, it ends at position 566. So why is this code trying to copy the last 32 bytes of code? In fact, when you deploy a contract whose constructor takes parameters, the arguments are appended to the end of the code as raw hexadecimal data (scroll all the way down in the "instructions" panel to see this). In this case, the constructor takes one uint256 parameter, so all this code does is copy the argument value appended at the end of the code into memory.
These 32 instructions make no sense as disassembled code, but they do as raw hexadecimal: 0x0000000000000000000000000 0000000000000000000002710… This is, of course, the decimal value 10000 that we passed to the constructor when we deployed the contract!
You can repeat this part step by step in Remix to make sure you understand what just happened. The end result should be the number 0x00..002710 at position 0x80 in memory.
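The copy can be sketched in a few lines. The "code" here is just a stand-in: 535 zero bytes where the real creation and runtime code would be, followed by the 32-byte encoding of the argument. Only the offsets, the lengths and the value 10000 come from the walkthrough above; the `codecopy` helper is my own illustration.

```python
# Hypothetical stand-in for the deployed bytes: 535 zero bytes in place of the
# real creation and runtime code, followed by the encoded uint256 argument.
initial_supply = 10000
code = bytes(0x0217) + initial_supply.to_bytes(32, "big")

memory = bytearray(0x80 + 0x20)


def codecopy(dest: int, offset: int, size: int) -> None:
    """Copy size bytes of code starting at offset into memory at dest."""
    chunk = code[offset : offset + size].ljust(size, b"\x00")
    memory[dest : dest + size] = chunk


codecopy(0x80, 0x0217, 0x20)  # instruction 28
argument = int.from_bytes(memory[0x80:0xA0], "big")
```

The total length works out to 535 + 32 = 567 bytes, which is why the last position shown in the instructions panel is 566.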
Alright. Before starting the next part, I suggest a glass of whisky and a rest.
Why whisky? Because from here on, it is all downhill.
The next set of instructions is 29 to 35. These update the value at memory address 0x40 from 0x80 to 0xa0; as we can see, they offset the value by 0x20 (32) bytes.
Now we can begin to make sense of instructions 0 to 2. Solidity keeps track of something called the "free memory pointer": a place in memory that we can use to store things, with the guarantee that nobody will overwrite it (unless we make a mistake, of course). So, since we stored the number 10000 at the old free memory position, we update the free memory pointer by moving it 32 bytes forward.
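You can think of the free memory pointer as a very simple bump allocator. The sketch below is my own simplification: the `allocate` helper does not exist in Solidity, and it reserves the space before writing, whereas the bytecode we just read copies the argument first and bumps the pointer afterwards. The end state is the same.

```python
# Sketch of a bump allocator driven by the free memory pointer at 0x40.
memory = bytearray(0x80)
memory[0x40:0x60] = (0x80).to_bytes(32, "big")  # mstore(0x40, 0x80)


def allocate(size: int) -> int:
    """Reserve size bytes and return the offset where they start."""
    start = int.from_bytes(memory[0x40:0x60], "big")
    memory[0x40:0x60] = (start + size).to_bytes(32, "big")
    if len(memory) < start + size:
        memory.extend(bytes(start + size - len(memory)))
    return start


slot = allocate(0x20)  # room for one uint256
memory[slot : slot + 0x20] = (10000).to_bytes(32, "big")
new_pointer = int.from_bytes(memory[0x40:0x60], "big")
```

The first allocation hands out 0x80 and leaves 0xa0 in the pointer, which is exactly the update performed by instructions 29 to 35.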
Even experienced Solidity developers can get confused when they see the expression "free memory pointer" or the code mstore(0x40, 0x80). These simply say: "we will write to memory from this point on, and keep a record of the offset every time we write a new entry".
Each function in Solidity, when compiled into EVM bytecode, will initialize the pointer.
You may be wondering what is in memory between 0x00 and 0x40. Nothing. It is just a chunk of memory that Solidity reserves for calculating hashes, which, as we will soon see, are needed for mappings and other types of dynamic data.
Now, at instruction 37, MLOAD reads from memory position 0x80, where the argument was copied, and basically loads our 10000 value from memory onto the stack, where it is fresh and available for the next set of instructions.
This is a common pattern in the EVM bytecode generated by Solidity: before the body of a function is executed, its parameters are loaded onto the stack (whenever possible), so that the upcoming code can consume them. And that is exactly what happens next.
Let us continue with instructions 38 to 55.
Figure 5. Constructor body EVM bytecode.
These instructions are nothing more than the body of the constructor, that is, this Solidity code:
totalSupply_ = _initialSupply;
balances[msg.sender] = _initialSupply;
The first four instructions (38 to 42) are pretty self-explanatory. First, 0 is pushed onto the stack. Then the second item on the stack is duplicated (this is our number 10000). Then the number 0 is duplicated and pushed onto the stack; it is the storage slot of totalSupply_. Now SSTORE can consume these values and still leave 10000 around for future use:
sstore(0x00, 0x2710)
         |     |
         |     What to store
         Where to store
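If the duplications are hard to follow, this sketch traces them with a Python list as the stack (the top is the end of the list). The starting stack and the use of DUP2 at positions 40 and 41 are my assumptions, inferred from the description above; check them against your own Remix session.

```python
def dup(stack: list[int], n: int) -> None:
    """DUPn: push a copy of the n-th item counted from the top of the stack."""
    stack.append(stack[-n])


# ASSUMPTION: stack after instruction 37, listed bottom to top.
stack = [0x40, 10000]
storage: dict[int, int] = {}

stack.append(0x00)        # 038 PUSH1 00, the slot of totalSupply_
dup(stack, 2)             # 040 DUP2, copies 10000
dup(stack, 2)             # 041 DUP2, copies the slot number 0
slot = stack.pop()        # 042 SSTORE takes where to store...
value = stack.pop()       # ...and then what to store
storage[slot] = value
```

After SSTORE, storage slot 0 holds 10000, and a copy of 10000 is still sitting on the stack for the balances assignment.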
Look! We have stored the number 10000 in the variable totalSupply_. Isn't that amazing?
Be sure to visualize this value in the Debugger tab of Remix. You can find it in the Storage completely loaded panel.
The next set of instructions (43 to 54) is a bit tricky, but what it basically does is store 10000 in the balances mapping under the key msg.sender. Before continuing, please make sure you understand the part of the Solidity documentation that explains how mappings are saved in storage.
In short, it concatenates the slot of the mapping (in this case the number 1, because it is the second variable declared in the contract) with the key being used (in this case msg.sender, obtained with the CALLER opcode), hashes the result with the SHA3 opcode, and uses that hash as the target position in storage. In the end, storage is just a simple dictionary or hash table.
Instructions 43 to 45 store the msg.sender address in memory (at position 0x00), and then instructions 46 to 50 store the value 1 (the slot of the mapping) at memory position 0x20. Finally, the SHA3 opcode calculates the Keccak256 hash of whatever is in memory from position 0x00 to position 0x40, that is, the concatenation of the key and the mapping's slot. This hash is the position in storage where the value 10000 is saved for our mapping:
sstore(hash, 0x2710…)
         |     |
         |     What to store
         Where to store
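The sketch below shows how the 64 bytes that get hashed are laid out, and how storage behaves like a dictionary. One big caveat: Python's standard library has no Keccak-256 (hashlib.sha3_256 uses different padding and gives different results), so I use it only as a clearly labelled placeholder to keep the sketch runnable. The sender address and all helper names are hypothetical.

```python
import hashlib
from typing import Callable

storage: dict[int, int] = {}


def mapping_preimage(key: int, slot: int) -> bytes:
    """64 bytes: the key at 0x00..0x20 followed by the mapping slot at 0x20..0x40."""
    return key.to_bytes(32, "big") + slot.to_bytes(32, "big")


def placeholder_hash(data: bytes) -> bytes:
    # ASSUMPTION: stand-in only. The EVM uses Keccak-256, which is NOT the same
    # function as hashlib.sha3_256, so the positions computed here will not
    # match the ones Remix shows.
    return hashlib.sha3_256(data).digest()


def store_in_mapping(key: int, slot: int, value: int,
                     hash_fn: Callable[[bytes], bytes] = placeholder_hash) -> int:
    """Mimic sstore(hash, value) and return the storage position used."""
    position = int.from_bytes(hash_fn(mapping_preimage(key, slot)), "big")
    storage[position] = value
    return position


storage[0] = 10000  # sstore(0x00, 0x2710): totalSupply_
sender = 0xCA35B7D915458EF540ADE6068DFE2F44E8FA733C  # hypothetical msg.sender
position = store_in_mapping(sender, 1, 10000)  # balances[msg.sender]
```

To get the real storage position, pass a genuine Keccak-256 implementation as `hash_fn`; the preimage layout stays the same.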
At this point, the body of the constructor has been fully executed.
All of this may be a bit overwhelming at first, but it is a fundamental part of how storage works in Solidity. If you didn't get it, I suggest you go through it a few more times in the Remix debugger, keeping an eye on the Stack and Memory panels.
Also, feel free to ask questions below. This pattern is used widely in the EVM bytecode generated by Solidity, and you will quickly learn to recognize it. In the end, all it does is calculate where in storage to save the value for a given key of a mapping.
Figure 6. Runtime code copy structure.
In instructions 56 to 65, we perform a code copy again. Only this time, we are not copying the last 32 bytes of the code to memory; we are copying 0x01d1 (465 in decimal) bytes, starting from position 0x0046 (70 in decimal), into memory at position 0. That is a big chunk of code to copy!
If you drag the slider all the way to the right again, you will notice that position 70 comes right after our creation EVM code, where execution stopped. The runtime bytecode is contained in those 465 bytes. This is the part of the code that will be saved on the blockchain as the contract's runtime code, and it is the code that will be executed every time someone or something interacts with the contract. (We will cover the runtime code later in this series.)
And that is exactly what instructions 66 to 69 do: we return the code that we copied to memory.
Figure 7. Runtime code return EVM bytecode structure.
RETURN grabs the code that was copied to memory and hands it over to the EVM. If this creation code is executed in the context of a transaction sent to the 0x0 address, the EVM executes the code and stores the return value as the runtime code of the created contract.
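Putting the whole layout together, here is a sketch of what the creation code hands back. The byte contents are placeholders (0xAA for creation code, 0xBB for runtime code); only the offsets and lengths are taken from the walkthrough, and `run_creation` is a made-up name that mimics just the CODECOPY and RETURN steps.

```python
# Layout of the deployed bytes as described in the article. The contents are
# placeholders; only the offsets and lengths come from the text.
CREATION_LEN = 0x0046  # 70 bytes, positions 0 to 69
RUNTIME_LEN = 0x01D1   # 465 bytes, positions 70 to 534
ARGS_LEN = 0x20        # 32 bytes, positions 535 to 566

creation = bytes([0xAA]) * CREATION_LEN
runtime = bytes([0xBB]) * RUNTIME_LEN
args = (10000).to_bytes(ARGS_LEN, "big")
deployed_input = creation + runtime + args


def run_creation(code: bytes) -> bytes:
    """Mimic instructions 56 to 68: CODECOPY the runtime code to memory
    position 0, then RETURN that memory range."""
    memory = bytearray(RUNTIME_LEN)
    memory[0:RUNTIME_LEN] = code[CREATION_LEN : CREATION_LEN + RUNTIME_LEN]
    return bytes(memory[0:RUNTIME_LEN])  # what the EVM stores as contract code


stored_code = run_creation(deployed_input)
```

Notice that neither the creation code nor the appended argument survives: only the 465 bytes in the middle end up as the contract's code.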
By now, our BasicToken code has created and deployed a contract instance, ready to use with its initial state and runtime code. If you step back and look at Figure 2, you will see that all the EVM bytecode structures we analyzed are generic, except for the one highlighted in purple; that is, they will be present in any creation bytecode generated by the Solidity compiler. Constructors differ only in the purple part: the actual constructor body. The structure that retrieves the parameters embedded at the end of the bytecode, and the structure that copies the runtime code and returns it, can be considered boilerplate code and generic EVM bytecode structures. You should now be able to look at any constructor and have a general idea of its components before studying it instruction by instruction.
In the next article, we will look at the actual runtime code, starting with how it is possible to interact with a contract's EVM code through different entry points. For now, give yourself a well-deserved pat on the back, because you have just digested the most difficult part of the series. You should also have developed a strong ability to read and debug EVM bytecode, to understand generic structures and, most importantly, to understand the difference between creation and runtime EVM bytecode. This is what makes a contract's constructor so special in Solidity.
We will continue the deconstruction in the next articles of the series!
* Originally published by Alejandro Santander on Medium; translated by Cheetah Blockchain Security *