In the previous blog post, we showed how to verify account state with a state proof.
In this blog post, I will explain how data is stored in an Ethereum smart contract, and how to verify the storage state with a proof.
What to store in smart contract storage
An Ethereum smart contract is also an account, with its own account state such as balance and nonce. In addition, it has code and its own storage.
The storage of a smart contract is essentially a key-value store. So that proofs can be generated for the existence of a key-value pair, the pairs are stored in a Merkle Patricia Trie.
What are the actual key-value pairs stored in storage?
To explain that, let’s use a smart contract as an example. For simplicity, I will use the default Migration contract from Truffle:
This contract is for storing migration steps. It has two state variables: owner and last_completed_migration.
How are they stored as key-value pairs in storage? Is the owner stored under a key named owner, and the uint value under a key named last_completed_migration?
No. The variable names are not stored in smart contract storage.
Each state variable is stored in a storage unit called a slot. Slots are indexed from 0.
So in the above example, the owner value is stored at slot 0, and the last_completed_migration value is stored at slot 1.
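These slots are determined at compile time, based on the order in which the variables are declared in the contract code. Solidity can pack several variables smaller than 32 bytes into one slot, but that does not apply here: a 20-byte address and a 32-byte uint cannot share a slot.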
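To make the layout rule concrete, here is a minimal Python sketch of how declaration order maps to slot indexes. It is a simplified model that only covers fixed-size value types. The helper name assign_slots and the byte sizes passed to it are my own illustration, not part of any compiler API.

```python
# Simplified model (an assumption, not the compiler's code): variables take
# slots in declaration order, and a variable starts a new slot when it does
# not fit into the space left in the current one.
SLOT_SIZE = 32  # bytes per storage slot


def assign_slots(variables):
    """variables: list of (name, size_in_bytes) in declaration order."""
    layout = {}
    slot, used = 0, 0
    for name, size in variables:
        if used + size > SLOT_SIZE:
            slot, used = slot + 1, 0
        layout[name] = slot
        used += size
    return layout


# address is 20 bytes; uint is an alias for uint256, which is 32 bytes
migration = [("owner", 20), ("last_completed_migration", 32)]
print(assign_slots(migration))
```

For the Migration contract this places owner in slot 0 and last_completed_migration in slot 1.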
Query the contract state value in storage
The contract source code shows it has two state variables, and they are stored in storage slots. How do we query them?
We can query them from a full node using the RPC method eth_getStorageAt.
It takes the contract address, the slot index, and a block identifier as parameters. In our example, the contract is deployed at address 0xcca577ee56d30a444c73f8fc8d5ce34ed1c7da8b. We can double check the contract’s source code on etherscan.io:
Since the owner value is stored at slot 0, we can send the following request to query it.
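As a rough sketch, the request body could be built like this in Python. The "latest" block tag is an assumption on my part (a specific block number in hex works too), and storage_at_request is a hypothetical helper name. The resulting JSON would be sent as an HTTP POST to your full node's JSON-RPC endpoint.

```python
import json

CONTRACT = "0xcca577ee56d30a444c73f8fc8d5ce34ed1c7da8b"


def storage_at_request(address, slot, block="latest", request_id=1):
    """Build the JSON-RPC body for eth_getStorageAt (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getStorageAt",
        # params: contract address, slot index as hex, block identifier
        "params": [address, hex(slot), block],
        "id": request_id,
    }


owner_request = storage_at_request(CONTRACT, 0)
print(json.dumps(owner_request))
```

Calling storage_at_request(CONTRACT, 1) builds the same request for slot 1.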
The result shows the stored owner address is 0xde74da73d5102a796559933296c73e7d1c6f37fb. Note that the result is left-padded with 0s because each slot stores 32 bytes of data.
Likewise, we can query the last_completed_migration value stored at slot 1.
The result is 0x0000000000000000000000000000000000000000000000000000000000000002, which is the uint value 2 in left-padded hex format.
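Here is a small sketch of how the two 32-byte results can be decoded in Python. The slot 0 word below is the owner address from above, left-padded to 32 bytes; decode_address and decode_uint are hypothetical helper names.

```python
def decode_address(word_hex):
    """An address occupies the low-order 20 bytes of the 32-byte word."""
    word = bytes.fromhex(word_hex[2:])
    if len(word) != 32:
        raise ValueError("expected a 32-byte storage word")
    return "0x" + word[-20:].hex()


def decode_uint(word_hex):
    """A uint is the whole word read as a big-endian integer."""
    return int(word_hex, 16)


slot0 = "0x000000000000000000000000de74da73d5102a796559933296c73e7d1c6f37fb"
slot1 = "0x0000000000000000000000000000000000000000000000000000000000000002"

print(decode_address(slot0))
print(decode_uint(slot1))
```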
We don’t have to trust these values, because later we will query proof for them and verify the proof ourselves.
Storage Merkle Trie
With the results from the eth_getStorageAt method, we have seen which values are stored at slots 0 and 1:
In order to generate a proof for the storage state, the full node needs to store these key-value pairs in a Merkle trie.
Before they are stored in the Merkle trie, the key-value pairs need to be converted.
The slot index, which is the key, is first encoded as a hex string left-padded with 0s to 32 bytes, then hashed with keccak256.
The value is encoded as a hex string with leading zero bytes removed, and then RLP encoded.
These are the actual converted key-value pairs stored in the contract storage Merkle trie:
We can construct this Merkle trie locally from these two key-value pairs and compute its root hash. Later, we can compare this root hash with the hash in the proof.
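The conversion and the local trie construction can be sketched in Python as follows. This is not the test code from this post's repository; it is a minimal standalone version under a few assumptions. Python's hashlib does not ship Keccak-256 (sha3_256 uses different padding), so the block includes a compact pure-Python Keccak that follows the structure of the Keccak team's compact reference implementation. The helpers rlp_encode, hex_prefix, build and ref are my own names, and build assumes distinct keys of equal length.

```python
def rol64(a, n):
    return ((a >> (64 - n % 64)) + (a << (n % 64))) % (1 << 64)

def keccak_f1600(state):
    """Keccak-f[1600] permutation on a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    r = 1
    for _ in range(24):
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        x, y, cur = 1, 0, lanes[1][0]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, lanes[x][y] = lanes[x][y], rol64(cur, (t + 1) * (t + 2) // 2)
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return b"".join(lanes[x][y].to_bytes(8, "little") for y in range(5) for x in range(5))

def keccak256(data):
    """Keccak-256 as used by Ethereum (rate 136 bytes, padding 0x01 ... 0x80)."""
    rate = 136
    padded = bytearray(data) + b"\x01" + bytes(rate - 1 - len(data) % rate)
    padded[-1] |= 0x80
    state = bytes(200)
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate] + bytes(200 - rate)
        state = keccak_f1600(bytes(a ^ b for a, b in zip(state, block)))
    return state[:32]

def rlp_encode(item):
    """RLP-encode bytes or a (nested) list of bytes."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        payload, offset = item, 0x80
    else:
        payload, offset = b"".join(rlp_encode(i) for i in item), 0xC0
    n = len(payload)
    if n < 56:
        return bytes([offset + n]) + payload
    n_bytes = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(n_bytes)]) + n_bytes + payload

def hex_prefix(nibbles, is_leaf):
    """Compact (hex-prefix) encoding of a nibble path."""
    flag = 2 if is_leaf else 0
    nibbles = [flag + 1] + nibbles if len(nibbles) % 2 else [flag, 0] + nibbles
    return bytes(16 * a + b for a, b in zip(nibbles[::2], nibbles[1::2]))

def build(pairs):
    """Trie node for a non-empty list of (nibbles, value) pairs."""
    if len(pairs) == 1:
        return [hex_prefix(pairs[0][0], True), pairs[0][1]]
    n = 0
    while len({k[n] for k, _ in pairs}) == 1:
        n += 1
    if n:  # shared prefix: extension node
        return [hex_prefix(pairs[0][0][:n], False), ref(build([(k[n:], v) for k, v in pairs]))]
    children = [[(k[1:], v) for k, v in pairs if k[0] == i] for i in range(16)]
    return [ref(build(c)) if c else b"" for c in children] + [b""]

def ref(node):
    """Nodes whose encoding is 32 bytes or longer are referenced by hash."""
    encoded = rlp_encode(node)
    return keccak256(encoded) if len(encoded) >= 32 else node

def storage_key(slot):
    return keccak256(slot.to_bytes(32, "big"))

def storage_value(word_hex):
    return rlp_encode(bytes.fromhex(word_hex[2:]).lstrip(b"\x00"))

slots = {
    0: "0x000000000000000000000000de74da73d5102a796559933296c73e7d1c6f37fb",
    1: "0x0000000000000000000000000000000000000000000000000000000000000002",
}
pairs = [(storage_key(s), storage_value(w)) for s, w in slots.items()]
for key, value in pairs:
    print(key.hex(), "=>", value.hex())
root = keccak256(rlp_encode(build([([int(c, 16) for c in k.hex()], v) for k, v in pairs])))
print("root:", root.hex())
```

The script prints the two converted pairs followed by the root hash of the local trie.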
The above test passes.
If we visualize the Merkle trie we built locally, this is what it looks like:
Verify contract state with storage proof
We have computed the root hash of the storage trie locally. Now let’s check it against the deployed contract by querying the account proof using the eth_getProof RPC method.
Note that the parameter 0xcca577ee56d30a444c73f8fc8d5ce34ed1c7da8b is the contract address, and 0xA8894B is block 11045195:
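A minimal sketch of that request body in Python, assuming the standard eth_getProof parameter order (address, list of storage keys, block). The helper name get_proof_request is hypothetical. The storage key list is left empty here because only the account state is needed at this point.

```python
import json

CONTRACT = "0xcca577ee56d30a444c73f8fc8d5ce34ed1c7da8b"
BLOCK = hex(11045195)  # same block as 0xA8894B, in lowercase hex


def get_proof_request(address, storage_keys, block, request_id=1):
    """Build the JSON-RPC body for eth_getProof (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getProof",
        "params": [address, storage_keys, block],
        "id": request_id,
    }


account_proof_request = get_proof_request(CONTRACT, [], BLOCK)
print(json.dumps(account_proof_request))
```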
The result shows that the storageHash is 0x7317ebbe7d6c43dd6944ed0e2c5f79762113cb75fa0bed7124377c0814737fb4, which matches the root hash of the Merkle trie that we created from the two key-value pairs for the two contract variables. This means the storage state of that contract is identical to the Merkle trie we created locally.
But in order to trust that the storageHash is valid, we should first verify the full account state with the account proof. I will skip this part here, since we covered how to do it in the previous blog post. You can also check out the test case (see the link at the end).
However, a contract might have many state variables. What if I only care about the state of a specific one? Is it possible to query and verify a proof for a single contract variable, for instance the owner variable?
Yes, we can.
Verify contract variable with proof
The eth_getProof method can also return the proof for a single storage slot (or several). Its second parameter accepts an array of storage keys, which can be slot indexes.
For instance, to query the proof for the owner variable at slot 0, we can specify the storage key as 0x0000000000000000000000000000000000000000000000000000000000000000 or 0x0 for short:
The result includes a non-empty storageProof field. That is the proof for the state of the owner variable.
The proof contains two hex strings, which are the two encoded trie nodes along the path from the root node of the trie to the leaf node that stores the value.
It can be verified against the storageHash in the account state:
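A minimal sketch of such a check in Python is shown here. It walks the proof nodes from the root, checks each node's hash against the reference held by its parent, and follows the hashed slot key nibble by nibble. It is a simplified verifier under stated assumptions: it handles inclusion proofs only and does not support child nodes embedded inline. Since the proof strings returned by the node are not reproduced in this text, the demo rebuilds the two nodes by hand from the values shown earlier, assuming the trie holds only those two pairs. A compact pure-Python Keccak-256 is included because hashlib does not provide it. The names verify_storage_proof and rlp_decode are hypothetical.

```python
def rol64(a, n):
    return ((a >> (64 - n % 64)) + (a << (n % 64))) % (1 << 64)

def keccak_f1600(state):
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    r = 1
    for _ in range(24):
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        x, y, cur = 1, 0, lanes[1][0]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            cur, lanes[x][y] = lanes[x][y], rol64(cur, (t + 1) * (t + 2) // 2)
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return b"".join(lanes[x][y].to_bytes(8, "little") for y in range(5) for x in range(5))

def keccak256(data):
    rate = 136
    padded = bytearray(data) + b"\x01" + bytes(rate - 1 - len(data) % rate)
    padded[-1] |= 0x80
    state = bytes(200)
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate] + bytes(200 - rate)
        state = keccak_f1600(bytes(a ^ b for a, b in zip(state, block)))
    return state[:32]

def rlp_decode(data):
    """Decode one RLP item; returns (item, remaining_bytes)."""
    b0 = data[0]
    if b0 < 0x80:
        return data[:1], data[1:]
    is_list, base = b0 >= 0xC0, (0xC0 if b0 >= 0xC0 else 0x80)
    if b0 - base < 56:
        start, n = 1, b0 - base
    else:
        start = 1 + b0 - base - 55
        n = int.from_bytes(data[1:start], "big")
    payload, rest = data[start:start + n], data[start + n:]
    if not is_list:
        return payload, rest
    items = []
    while payload:
        item, payload = rlp_decode(payload)
        items.append(item)
    return items, rest

def verify_storage_proof(storage_hash, slot, proof):
    """Return the RLP-encoded value proven for slot, or raise ValueError."""
    nibbles = [int(c, 16) for c in keccak256(slot.to_bytes(32, "big")).hex()]
    expected = storage_hash
    for encoded in proof:
        if keccak256(encoded) != expected:
            raise ValueError("node hash mismatch")
        node = rlp_decode(encoded)[0]
        if len(node) == 17:  # branch: follow the child for the next nibble
            expected, nibbles = node[nibbles[0]], nibbles[1:]
            continue
        path = [int(c, 16) for c in node[0].hex()]  # hex-prefix encoded path
        flag, path = path[0], (path[1:] if path[0] & 1 else path[2:])
        if nibbles[:len(path)] != path:
            raise ValueError("path mismatch")
        nibbles = nibbles[len(path):]
        if flag & 2 and not nibbles:  # leaf reached with the whole key consumed
            return node[1]
        expected = node[1]  # extension: continue with the child it points to
    raise ValueError("proof ended before reaching the leaf")

STORAGE_HASH = bytes.fromhex("7317ebbe7d6c43dd6944ed0e2c5f79762113cb75fa0bed7124377c0814737fb4")
OWNER = bytes.fromhex("de74da73d5102a796559933296c73e7d1c6f37fb")
k0, k1 = keccak256(bytes(32)), keccak256((1).to_bytes(32, "big"))
# Hand-assembled RLP for the two leaves and the root branch (demo assumption)
leaf0 = b"\xf7\xa0" + bytes([0x30 | (k0[0] & 0x0F)]) + k0[1:] + b"\x95\x94" + OWNER
leaf1 = b"\xe2\xa0" + bytes([0x30 | (k1[0] & 0x0F)]) + k1[1:] + b"\x02"
children = [b"\x80"] * 17
children[k0[0] >> 4] = b"\xa0" + keccak256(leaf0)
children[k1[0] >> 4] = b"\xa0" + keccak256(leaf1)
branch = b"\xf8\x51" + b"".join(children)
value = verify_storage_proof(STORAGE_HASH, 0, [branch, leaf0])
print("owner:", "0x" + value[1:].hex())  # drop the one-byte RLP prefix
```

With a real response, the hex strings in the storageProof field would be converted with bytes.fromhex and passed in as the proof list.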
Verification passed. It means we can trust that at block 11045195, the owner value of the Migration contract, which is deployed at address 0xcca577ee56d30a444c73f8fc8d5ce34ed1c7da8b, is 0xde74da73d5102a796559933296c73e7d1c6f37fb.
Summary
Let’s summarize what we’ve learned so far:
- An Ethereum contract has its own storage for its state variables.
- State variables are stored in storage slots; each slot is 32 bytes long.
- Storage slots are indexed from 0.
- A slot index is a uint256 number, so the key space is very large.
- A light client can query a full node for the storage proof of a specific state variable of a contract using the eth_getProof method.
- Once it has received the storage proof, a light client can verify it on its own.
- A light client uses the nonce in the block header to verify the block. If the block header is valid, it can trust the stateRoot hash included in the block header.
- A light client can query the state proof for a specific contract by address, and verify the account state against the stateRoot hash that it trusts. If the account state is valid, it can trust the account state data, which includes the storageHash.
- A light client can query the storage proof for a specific state variable of a contract. The storage proof is another Merkle proof that can be verified against the storageHash of the account state. If the storage proof is valid, the light client can trust the value of the state variable.
Next
In this tutorial, we used a simple contract to explain how contract state is stored. Its state variables are fixed-size values, such as address and uint.
An Ethereum smart contract can also store dynamically sized values, such as arrays and mappings, which have many useful applications.
In the next blog post, I will use the USDC contract as an example to explain how a smart contract stores dynamically sized values, and how to verify a USDC account balance, or any other ERC20 token balance, with a proof.
Stay tuned!
References
- https://ethereum.github.io/yellowpaper/paper.pdf
- The source code of all the code and test cases used in this blog post