What is the EVM?
The EVM, or Ethereum Virtual Machine, is the building block of most smart contract blockchains. It is the virtual machine that all the nodes on Ethereum and other EVM blockchains run to execute contract bytecode and process data.
It can be thought of as a set of instructions for how to compute on given data: a commonly agreed set of instructions that everyone uses to process smart contract transactions and their input data.
Given that it is such a fundamental part of web3, it is important to understand the inner workings of the EVM. In this article, I've compiled my notes on how I understand the EVM to work, both to share them with others and to be corrected where I'm wrong. Feel free to correct me or add extra points in the comments below.
So, coming back to the topic at hand: how do contracts in the EVM manage the data given to them, and, in the case of tokens (like ERC20, ERC721, etc.), how do they keep track of who owns how much?
In other words, how do smart contracts manage their data? To understand that, we first need to understand where a piece of data can live at any given point while a transaction is being processed.
There are 4 places where data can be stored while a transaction is being processed:
- Storage
- Memory
- Stack
- Transient storage
Storage
Storage is where data is stored permanently. It can be deleted or updated by the contract code. State variables in Solidity are stored in storage.
Storage is further divided into slots. Each slot can store up to 32 bytes.
The EVM writes to and reads from storage slots using the `sstore` and `sload` opcodes (opcode is the technical name for an instruction).
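To make the slot idea concrete, here is a minimal Python sketch of one contract's storage as a mapping from slot number to a 32-byte value. The `Storage` class and its method names are my own illustration (named after the opcodes), not how any real client implements it, and it ignores gas entirely.

```python
WORD_MAX = 2**256 - 1  # largest value that fits in a 32-byte slot


class Storage:
    """Toy model of one contract's persistent storage: slot number -> 32-byte word."""

    def __init__(self):
        self.slots = {}  # slots that were never written read as zero

    def sstore(self, slot, value):
        if not 0 <= value <= WORD_MAX:
            raise ValueError("value does not fit in a 32-byte slot")
        if value == 0:
            self.slots.pop(slot, None)  # writing zero is how a slot gets "deleted"
        else:
            self.slots[slot] = value

    def sload(self, slot):
        return self.slots.get(slot, 0)


storage = Storage()
storage.sstore(0, 42)
print(storage.sload(0))  # 42
print(storage.sload(7))  # 0, never written
```

The important property is that the `slots` dictionary outlives any single call or transaction; everything else in this article is shorter-lived.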
Storage Layout
State variables of contracts are stored in storage in a compact way such that multiple values sometimes use the same storage slot. Except for dynamically-sized arrays and mappings, data is stored contiguously, item after item, starting with the first state variable, which is stored in slot 0.
Storage Layout Packing
For each state variable, a size in bytes is determined according to its type. Multiple, contiguous items that need less than 32 bytes are packed into a single storage slot if possible, according to the following rules:
- The first item in a storage slot is stored lower-order aligned (i.e. the first item occupies the least significant bytes, which are the right-most bytes when the slot is written out as a hex word)
- Value types use only as many bytes as are necessary to store them
- If a value type does not fit the remaining part of a storage slot, it is stored in the next storage slot
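The packing rules above can be sketched as a small Python function. This is only an illustration under simplifying assumptions: it handles plain value types given as `(name, size_in_bytes)` pairs, and the `layout` function and the example variables are hypothetical. Structs, arrays and mappings follow extra rules that are not modelled here.

```python
SLOT_BYTES = 32


def layout(variables):
    """Assign (slot, offset) to each (name, size_in_bytes) pair, in declaration order.

    offset counts bytes from the low-order (least significant) end of the slot.
    """
    positions = {}
    slot, offset = 0, 0
    for name, size in variables:
        if offset + size > SLOT_BYTES:  # does not fit in what is left: use the next slot
            slot, offset = slot + 1, 0
        positions[name] = (slot, offset)
        offset += size
    return positions


# Hypothetical state variables: uint128 a; uint128 b; uint8 c; uint256 d; uint8 e;
example = [("a", 16), ("b", 16), ("c", 1), ("d", 32), ("e", 1)]
for name, (slot, offset) in layout(example).items():
    print(f"{name}: slot {slot}, byte offset {offset}")
```

Here `a` and `b` share slot 0, while `c`, `d` and `e` each end up in their own slot because the 32-byte `d` sits between the two small values. Declaring `c` and `e` next to each other would let them share a slot.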
Memory
This is temporary data that lives for the duration of a single transaction call (the top-level call of a transaction or any internal call made during it); at the end of the call it is deleted.
Memory is considerably cheaper to read and write than storage because it is discarded when the transaction call ends. Unlike storage, which is the same across all calls to the contract, a new instance of memory is created for each transaction call. The data stored in one instance can't be accessed by another call, or vice versa. So if a function or contract is re-entered in the middle of a transaction, there will be a separate instance of memory for each of those calls.
Memory is read and written in 32-byte words, much like storage slots, but it is addressed by byte offset rather than by slot number, so a word can start at any byte.
Memory is written to and read from with the opcodes `mstore` and `mload` respectively.
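Here is a rough Python sketch of memory as I understand it: a zero-filled byte array that grows as needed, with `mstore` and `mload` moving 32 bytes at a time from a byte offset. The `Memory` class is my own toy model and leaves out gas costs for expansion.

```python
class Memory:
    """Toy model of one call's memory: a zero-initialised byte array that grows on demand."""

    def __init__(self):
        self.data = bytearray()

    def _expand(self, end):
        # Grow in whole 32-byte words so that bytes [0, end) exist.
        if end > len(self.data):
            new_size = -(-end // 32) * 32  # round up to a multiple of 32
            self.data.extend(bytes(new_size - len(self.data)))

    def mstore(self, offset, value):
        self._expand(offset + 32)
        self.data[offset:offset + 32] = value.to_bytes(32, "big")

    def mload(self, offset):
        self._expand(offset + 32)
        return int.from_bytes(self.data[offset:offset + 32], "big")


outer_call = Memory()
outer_call.mstore(0, 0xFF)
inner_call = Memory()  # a nested call starts with its own empty memory

print(outer_call.mload(0))  # 255
print(inner_call.mload(0))  # 0
```

Creating a separate `Memory` object per call is the point of the example: nothing written by the outer call is visible to the inner one.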
Stack
The stack is the place in the EVM where data has to be placed before you can do any kind of computation. The stack can be thought of as the workstation of the EVM. If you want to do anything (and I mean literally anything) with a value, it has to be on the stack.
For example, if you want to add two numbers, you have to push those numbers to the stack first. You want to multiply? Push them to the stack first. Want to store a variable in storage or memory? Push it to the stack first. Even the above-mentioned `sload` and `mload` are used to copy the data at a given storage slot or memory location and push it to the top of the stack. And `mstore` and `sstore` take a value from the stack and write it to memory and storage.
The stack, as the name suggests, should be thought of as data elements stacked on top of one another. Each element can be at most 32 bytes.
It's a last in, first out mechanism. Anything you push is always placed on top of the stack, and to get at items lower down you generally have to remove the ones above them first (the `dup` and `swap` opcodes can reach only a limited number of items just below the top).
I've added a diagram from one of my notes for better visualisation.
![[EVM Stack Diagram]]
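The same idea in code: a minimal Python sketch of a last in, first out stack with an `add` operation that behaves like the `ADD` opcode (pop two values, push their sum). The `Stack` class is my own illustration; the 1024-item depth limit and the wrap-around at 2**256 are properties of the EVM, the rest is simplified.

```python
class Stack:
    """Toy model of the EVM stack: last in, first out, 32-byte items."""

    MAX_ITEMS = 1024  # the EVM's stack depth limit

    def __init__(self):
        self.items = []

    def push(self, value):
        if len(self.items) >= self.MAX_ITEMS:
            raise OverflowError("stack overflow")
        self.items.append(value % 2**256)  # keep each item within 32 bytes

    def pop(self):
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()

    def add(self):
        # Like the ADD opcode: pop two operands, push their sum (wrapping at 2**256).
        a, b = self.pop(), self.pop()
        self.push(a + b)


stack = Stack()
stack.push(2)
stack.push(3)
stack.add()
print(stack.items)  # [5]
```

Notice that `add` never takes arguments: its operands have to be on the stack already, which is exactly the "push it to the stack first" rule.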
Transient Storage
Transient storage is the most recently added type of storage in the EVM. It borrows aspects from both memory and storage. Let's start with the similarities first.
Just like storage, transient storage consists of slots, each of which can hold 32 bytes, the same word size that memory uses. And that is where the commonality for all three ends.
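Transient storage is erased at the end of every full transaction, unlike memory, which is deleted at the end of every transaction call. Remember that a transaction can contain multiple internal calls, sometimes even to the same contract, and every one of those calls into a contract sees that contract's transient storage until the transaction ends.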
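To compare the three lifetimes side by side, here is a small Python sketch. Everything in it is a simplification of my own: a single contract, plain dictionaries with string keys instead of numbered slots, and hypothetical `outer` and `inner` functions standing in for contract code. In the real EVM, transient storage is accessed with the `tstore` and `tload` opcodes introduced by EIP-1153.

```python
class Transaction:
    """Toy model of the lifetimes of storage, transient storage and memory."""

    def __init__(self, storage):
        self.storage = storage   # persists across transactions
        self.transient = {}      # shared by every call in this transaction

    def call(self, body):
        memory = {}              # fresh for each call, discarded when it returns
        return body(self, memory)

    def finish(self):
        self.transient.clear()   # wiped when the whole transaction ends


def outer(tx, memory):
    memory["scratch"] = 1
    tx.transient["lock"] = 1
    tx.storage["counter"] = tx.storage.get("counter", 0) + 1
    return tx.call(inner)        # nested call, e.g. re-entering the same contract


def inner(tx, memory):
    # The inner call sees the transient value but not the outer call's memory.
    return tx.transient.get("lock", 0), memory.get("scratch", 0)


storage = {}
tx = Transaction(storage)
print(tx.call(outer))  # (1, 0)
tx.finish()
print(tx.transient, storage)  # {} {'counter': 1}
```

The inner call can read the `lock` value set by the outer call, which is why transient storage is a natural fit for things like re-entrancy locks, but once the transaction finishes only `storage` is left.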
Conclusion
In brief, contracts mainly use four types of data stores: stack, memory, storage and transient storage.
- Storage - Permanent until the contract updates or deletes it.
- Memory - A new instance is created for every transaction call and deleted when that call ends.
- Transient Storage - Deleted at the end of every transaction and accessible to all transaction calls within it.
- Stack - The place data has to be put before you can start doing anything with it.