Implementing EIP-712 on CKB-VM


In a typical signature scheme, we usually take the witness and the hash of a transaction as parameters and use a hashing algorithm to generate a transaction signature request (This code demonstrates the process).

The result is a hexadecimal string as in the Message section below:

Figure 1. Request to sign a humanly unreadable message

This message string is virtually impossible for an ordinary user to decipher, and signing something unreadable is inherently dangerous. If there were an encoding standard that let signers understand the signature request and let third-party tools verify the information, transaction security would improve greatly.

EIP-712 was proposed in 2017 as a procedure for hashing and signing typed structured data to solve this problem. Instead of byte strings, EIP-712 shows the user a human-readable view of the data to be signed, so that users know exactly what they are dealing with. Below is the user interface for signing with EIP-712:

Figure 2. Signature request interface improved by EIP-712

This post introduces the basic structure of EIP-712 and the signature principle of typed structured data (Part I), as well as how it is implemented on Nervos CKB (Part II). Part II covers: 1) the design of the tree structure for data storage, 2) hash calculation by type encoding, and 3) second encapsulation. Additionally, I include a link to several test cases for common use cases. The section on fuzz testing (Part III) follows with four common errors and my brief analyses and solutions.

I EIP-712 Basics Explained: Hashing and Signing Typed Structured Data

We see from the picture above that EIP-712 has two components: Domain and Message. They constitute the entirety of EIP-712.

Domain is a fixed value. Message is customizable. You need to first define the structure, then fill in the values accordingly.

We can derive the following JSON code from Figure 2:

    "name":"Decentralised Exchange",

According to EIP-712’s specification, the typed structured data must be a JSON object that has the following properties:

  • Types: defines the names and types of the variables stored in EIP-712.

  • Message: holds the data of the struct that primaryType names in types.

  • Domain: holds the data of the EIP712Domain struct in types. In general, it contains only the fields needed for the signature.

  • PrimaryType: the name of the struct that the message field holds.

In types, each valid identifier that names a definition is regarded as a struct name. A struct may contain several member variables, or none at all. Each member variable must have a name and a type, and the type can be a basic type, a struct, or an array. The basic types include bytes1 to bytes32, uint8 to uint256, int8 to int256, bool, address, bytes, and string.
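To make the list of basic types concrete, here is a minimal sketch of a check that decides whether a type name is a basic type. The function name is_basic_type and the helper parse_suffix are hypothetical and not part of CKB-EIP-712; the sketch also assumes that integer widths come in steps of 8 bits, as the EIP-712 specification defines them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Parses the decimal suffix of names such as "uint32" or "bytes4".
 * Returns -1 if the suffix is empty, has a leading zero, or is not numeric. */
static int parse_suffix(const char *s) {
  if (*s < '1' || *s > '9') return -1;
  int n = 0;
  for (; *s != '\0'; s++) {
    if (*s < '0' || *s > '9' || n > 1000) return -1; /* also bounds n */
    n = n * 10 + (*s - '0');
  }
  return n;
}

/* Hypothetical helper: true if type names one of the basic types. */
bool is_basic_type(const char *type) {
  assert(type != NULL);
  if (!strcmp(type, "bool") || !strcmp(type, "address") ||
      !strcmp(type, "bytes") || !strcmp(type, "string")) {
    return true;
  }
  if (!strncmp(type, "bytes", 5)) { /* bytes1 .. bytes32 */
    int n = parse_suffix(type + 5);
    return n >= 1 && n <= 32;
  }
  if (!strncmp(type, "uint", 4) || !strncmp(type, "int", 3)) {
    /* uint8 .. uint256 and int8 .. int256, in steps of 8 bits */
    int n = parse_suffix(type + (type[0] == 'u' ? 4 : 3));
    return n >= 8 && n <= 256 && n % 8 == 0;
  }
  return false;
}
```

Anything that fails this check would have to be a struct name defined in types, or an array of one of these types.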

EIP-712 hashStruct is defined as:

hashStruct(s : 𝕊) = keccak256(typeHash ‖ encodeData(s)) 

typeHash = keccak256(encodeType(typeOf(s)))

in which ‖ denotes concatenation of byte strings, similar to what strcat( ) does for C strings.

The encodeType function encodes the type of the struct as a string, which typeHash then hashes. The string is structured as:

name ‖ "(" ‖ member₁ ‖ "," ‖ member₂ ‖ "," ‖ … ‖ memberₙ ")"

where each member is written as type ‖ " " ‖ name.

Let's take the above data as an example, where the value of primaryType is msg. Since each member is written as its type followed by its name, the struct can be encoded as:

msg(string orderHash,uint32 amount,address address,uint32 nonce)

If the struct references other structs, the encodings of those structs, sorted by name, must be appended to this string in the same format.
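The encodeType rule can be sketched in C as plain string building. This is only an illustration under my own assumptions: member_def, append, and encode_type are hypothetical names, each member is written as its type followed by its name (as the EIP-712 specification requires), and referenced structs are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical member descriptor: the type and name of one struct member. */
typedef struct {
  const char *type;
  const char *name;
} member_def;

/* Appends s to out at *pos, always leaving room for the terminating '\0'. */
static int append(char *out, size_t out_len, size_t *pos, const char *s) {
  size_t n = strlen(s);
  if (n >= out_len - *pos) return -1;
  memcpy(out + *pos, s, n);
  *pos += n;
  out[*pos] = '\0';
  return 0;
}

/* Writes name(type1 name1,type2 name2,...) into out.
 * Returns the string length, or -1 if out is too small. */
int encode_type(const char *name, const member_def *members, size_t count,
                char *out, size_t out_len) {
  assert(name != NULL && out != NULL && out_len > 0);
  size_t pos = 0;
  out[0] = '\0';
  if (append(out, out_len, &pos, name) || append(out, out_len, &pos, "(")) {
    return -1;
  }
  for (size_t i = 0; i < count; i++) {
    if ((i > 0 && append(out, out_len, &pos, ",")) ||
        append(out, out_len, &pos, members[i].type) ||
        append(out, out_len, &pos, " ") ||
        append(out, out_len, &pos, members[i].name)) {
      return -1;
    }
  }
  if (append(out, out_len, &pos, ")")) return -1;
  return (int)pos;
}
```

Called with the name msg and the four members orderHash, amount, address, and nonce, this produces the string that is then hashed with keccak256 to obtain typeHash.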

The encodeData in hashStruct encodes struct data as:

enc(value₁) ‖ enc(value₂) ‖ ... ‖ enc(valueₙ)

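Here "enc" encodes each member value as exactly 32 bytes: fixed-size basic values (integers, bool, address, and bytes1 to bytes32) are padded to 32 bytes, bytes and string values are replaced by their keccak256 hash, and nested structs are replaced by their hashStruct result.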
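As an illustration of enc for fixed-size values, the sketch below follows the EIP-712 rules that unsigned integers are stored big-endian in a 32-byte word, bool is encoded as 0 or 1, and address is treated as a 160-bit integer. The function names are hypothetical, and the sketch deliberately leaves out signed integers, bytes, string, and nested structs, which need sign extension or keccak256.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WORD_SIZE 32

/* Unsigned integer: big-endian, left-padded with zeros to 32 bytes. */
void enc_uint64(uint64_t value, uint8_t out[WORD_SIZE]) {
  assert(out != NULL);
  memset(out, 0, WORD_SIZE);
  for (int i = 0; i < 8; i++) {
    out[WORD_SIZE - 1 - i] = (uint8_t)(value >> (8 * i));
  }
}

/* bool: encoded as the 256-bit integer 0 or 1. */
void enc_bool(bool value, uint8_t out[WORD_SIZE]) {
  enc_uint64(value ? 1 : 0, out);
}

/* address: 20 bytes, left-padded with zeros to 32 bytes. */
void enc_address(const uint8_t addr[20], uint8_t out[WORD_SIZE]) {
  assert(addr != NULL && out != NULL);
  memset(out, 0, WORD_SIZE);
  memcpy(out + 12, addr, 20);
}
```

For example, enc_uint64(9, word) sets only the last byte of the word to 9, which is the same layout as a 32-byte chain ID whose byte at index 31 is 9.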

Based on the struct and data provided, the struct hash can be calculated as follows:

  keccak256(
    typeHash(msg) ‖
    enc(orderHash) ‖
    enc(amount) ‖
    enc(address) ‖
    enc(nonce)
  )

We need to calculate the hashes of two structs in EIP-712: one for domain, the other for message. The structure of domain is already defined, so its hash only requires filling in the values according to the description. For message, the struct name is given by the value of primaryType.

Finally, by hashing the domain hash and message hash once again, we obtain the result:

keccak256(0x19 ‖ 0x01 ‖ hashStruct(domain) ‖ hashStruct(message))
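The input to this final hash is a fixed 66-byte buffer: the two prefix bytes followed by the two 32-byte struct hashes. A minimal sketch of assembling it is shown below; build_signing_preimage is a hypothetical name, and the keccak256 call over the buffer is left to whatever hash implementation is available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASH_SIZE 32
#define PREIMAGE_SIZE (2 + 2 * HASH_SIZE) /* 0x19, 0x01, two hashes */

/* Lays out 0x19 ‖ 0x01 ‖ hashStruct(domain) ‖ hashStruct(message). */
void build_signing_preimage(const uint8_t domain_hash[HASH_SIZE],
                            const uint8_t message_hash[HASH_SIZE],
                            uint8_t out[PREIMAGE_SIZE]) {
  assert(domain_hash != NULL && message_hash != NULL && out != NULL);
  out[0] = 0x19;
  out[1] = 0x01;
  memcpy(out + 2, domain_hash, HASH_SIZE);
  memcpy(out + 2 + HASH_SIZE, message_hash, HASH_SIZE);
}
```

Hashing the resulting buffer with keccak256 gives the digest that the user actually signs.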

II Implement EIP-712 on CKB Virtual Machine

The official EIP-712 implementation is in Rust, while CKB requires a C implementation. Since there is no open-source C implementation available, we developed CKB-EIP-712 ourselves.

EIP-712 implementation has two prerequisites:

  • Implement a struct for raw data

  • Implement hash calculation of data

Tree And Memory Management

A key feature of EIP-712 is that the types field needs to be user-defined; the data in the message field must correspond to the types defined in types. This requires a tree structure for data storage.

Since many EIP-712 implementations use JSON, it was our first choice as well. However, there were some issues with JSON and CKB’s virtual machine, CKB-VM:

  • It consumes a large number of cycles to build and parse JSON.

  • JSON involves a lot of serialization and deserialization features, for which ckb-c-stdlib does not provide relevant functions.

  • JSON by itself does not guarantee the order of object members. As a result, some libraries may change the order of struct members during encoding and decoding. (Generally, the order can be adjusted manually.)

Given these disadvantages, we designed the e_item tree to store the data for hashing, shown below:

typedef union _e_item_value {
  uint8_t *data_number;  // bytesx, uintx, intx, address,
  bool data_bool;
  struct {
    uint8_t *data;
    size_t len;
  } data_bytes;
  const char *data_string;
  struct _e_item *data_struct;
} e_item_value;

typedef struct _e_item {
  const char *key;
  e_item_value value;
  e_type type;

  struct _e_item *sibling;
} e_item;

e_item is the tree structure, in which:

  • e_item_value is designed as a union for memory-saving purposes.

  • e_type defines enumeration types to store types and specifies the basic data types in EIP-712.

Note that the struct _e_item *sibling of e_item stores its sibling nodes instead of child nodes, since the child nodes may store a basic type rather than a struct.
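To show how a first-child pointer plus a sibling pointer is enough to walk the whole tree, here is a simplified sketch. The demo_item type is a hypothetical stand-in for e_item: it uses separate fields instead of the union and only two made-up type tags, so it is not the library's definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; the real e_type enumerators are not shown here. */
typedef enum { DEMO_STRING, DEMO_STRUCT } demo_type;

typedef struct demo_item {
  const char *key;
  demo_type type;
  const char *text;              /* used when type == DEMO_STRING */
  struct demo_item *first_child; /* used when type == DEMO_STRUCT */
  struct demo_item *sibling;     /* next member of the same parent */
} demo_item;

/* Counts item, its following siblings, and all of their descendants. */
size_t count_items(const demo_item *item) {
  size_t n = 0;
  for (; item != NULL; item = item->sibling) {
    n++;
    if (item->type == DEMO_STRUCT) {
      n += count_items(item->first_child);
    }
  }
  return n;
}
```

A struct node only points at its first member, and the remaining members are reached by following sibling, so a member that holds a basic type needs no child pointer at all.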

Because many strings are involved, memory allocation becomes a concern, so we built a simple memory manager to use when constructing e_item:

typedef struct _e_mem {
  uint8_t *buffer;
  size_t buffer_len;
  size_t pos;
} e_mem;
e_mem eip712_gen_mem(uint8_t *buffer, size_t len);
void *eip712_alloc(e_mem *mem, size_t len);

When in use, call eip712_alloc to get a block of the required length. The returned memory lives inside the buffer passed to eip712_gen_mem, so you need to pay attention to the lifetime of e_mem and its buffer.
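One way to implement these two declarations is a bump allocator over the caller's buffer. The bodies below are my own sketch and not the code from CKB-EIP-712; the 8-byte rounding (E_MEM_ALIGN) is an assumption, and it only keeps returned pointers aligned if the buffer itself starts on an 8-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct _e_mem {
  uint8_t *buffer;
  size_t buffer_len;
  size_t pos;
} e_mem;

#define E_MEM_ALIGN 8u /* assumed alignment, not taken from the library */

e_mem eip712_gen_mem(uint8_t *buffer, size_t len) {
  assert(buffer != NULL);
  e_mem mem = {buffer, len, 0};
  return mem;
}

void *eip712_alloc(e_mem *mem, size_t len) {
  assert(mem != NULL);
  if (len > SIZE_MAX - (E_MEM_ALIGN - 1)) return NULL;
  /* Round up so that the next allocation starts on an aligned offset. */
  size_t rounded = (len + E_MEM_ALIGN - 1) & ~(size_t)(E_MEM_ALIGN - 1);
  if (rounded > mem->buffer_len - mem->pos) return NULL; /* out of space */
  void *p = mem->buffer + mem->pos;
  mem->pos += rounded;
  return p;
}
```

There is no free function in this sketch: everything is released at once when the buffer goes out of scope, which is why the lifetime of the buffer matters.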

Let me show you the conversion from JSON to e_item:

        "capacity":"999.99 CKB",
        "capacity":"9.99 CKB",

Below is a mathematical "tree" that reinterprets how the above structure is stored in memory:

Figure 3. Tree structure used in EIP-712 for data storage

Hash Calculation by Types Encoding

According to the EIP-712 specification, types need to be converted to a string in order to be included in the final hash calculation. This encoding may also be used in nested structs later on.

In the EIP-712 protocol, we need to hash the structs and data of domain and message separately, then calculate according to the following formula:

keccak256(0x19 ‖ 0x01 ‖ hashStruct(domain) ‖ hashStruct(message))

The encode_impl interface computes the hash from an e_item:

int encode_impl(e_item *data, uint8_t *hash_ret);

This function computes the hash and writes it to hash_ret.

Second Encapsulation: Signature Hash Generation For CKB Transactions

To get the signature hash for CKB transactions, we need a second encapsulation based on CKB-EIP-712.

Since CKB uses fixed types, the corresponding data types are fixed. Here we use eip712_data to generate a hash according to the EIP-712 specification.

The code for interface encapsulation is as follows:

typedef struct _eip712_data {
  eip712_domain domain;
  eip712_active active;
  char* transaction_das_message;
  char* inputs_capacity;
  char* outputs_capacity;
  char* fee;
  uint8_t digest[32];

  eip712_cell* inputs;
  size_t inputs_len;
  eip712_cell* outputs;
  size_t outputs_len;
} eip712_data;

int get_eip712_hash(const eip712_data* data, uint8_t* out_hash);

Test Cases

I tested the library for several common scenarios while writing it; you can find some test cases here.

Here I attach an example:

eip712_data data = {0};
  data.domain.chain_id[31] = 9;
  // ... = "";
  uint8_t verifying_contract[20] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                                    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                                    0x00, 0x00, 0x20, 0x21, 0x07, 0x22};
  memcpy(data.domain.verifying_contract, verifying_contract, 20);
  data.domain.version = "1";
  // ... = "withdraw_from_wallet";
  // ... = "0x00";

  data.inputs_capacity = "551.39280335 CKB";
  data.outputs_capacity = "551.39270335 CKB";
  data.fee = "0.0001 CKB";

  uint8_t digest_data[32] = {0xa7, 0x1c, 0x9b, 0xf1, 0xcb, 0x16, 0x86, 0xb3,
                             0x5a, 0x6c, 0x2e, 0xe4, 0x59, 0x32, 0x02, 0xbc,
                             0x13, 0x27, 0x9a, 0xae, 0x96, 0xe6, 0xea, 0x27,
                             0x4d, 0x91, 0x94, 0x44, 0xf1, 0xe3, 0x74, 0x9f};
  memcpy(data.digest, digest_data, 32);

  data.transaction_das_message =
      "TRANSFER FROM 0x9176acd39a3a9ae99dcb3922757f8af4f94cdf3c(551.39280335 "
      "CKB) TO 0x9176acd39a3a9ae99dcb3922757f8af4f94cdf3c(551.39270335 CKB)";

  eip712_cell inputs[2] = {0};
  inputs[0].capacity = "999.99 CKB";
  inputs[0].lock = "das-lock,0x01,0x0000000000000000000000000000000000000011";
  inputs[0].type = "account-cell-type,0x01,0x";
  inputs[0].data = "{ account: das00001.bit, expired_at: 1642649600 }";
  inputs[0].extra_data =
      "{ status: 0, "
      "55478d76900611eb079b22088081124ed6c8bae21a05dd1a0d197efcc7c114ce }";

  inputs[1].capacity = "9.99 CKB";
  inputs[1].lock = "das-lock,0x01,0x0000000000000000000000000000000000000021";
  inputs[1].type = "account-cell-type,0x01,0x";
  inputs[1].data = "{ account: das00001.bit, expired_at: 1642649600 }";
  inputs[1].extra_data =
      "{ status: 0, "
      "55478d76900611eb079b22088081124ed6c8bae21a05dc1a0d197efcc7c114ce }";

  data.inputs = inputs;
  data.inputs_len = sizeof(inputs) / sizeof(eip712_cell);

  eip712_cell outputs[1] = {0};
  outputs[0].capacity = "119.99 CKB";
  outputs[0].lock = "das-lock,0x02,0x0000000000000000000000000000000000000021";
  outputs[0].type = "account-cell-type,0x01,0x";
  outputs[0].data = "{ account: das00001.bit, expired_at: 1642649600 }";
  outputs[0].extra_data =
      "{ status: 0, "
      "55478d76900711eb079b22088081124ed6c8bae21a05dc1a0d197efcc7c114ce }";

  data.outputs = outputs;
  data.outputs_len = sizeof(outputs) / sizeof(eip712_cell);

  uint8_t hash[32] = {0};  // receives the resulting 32-byte hash
  int ref = get_eip712_hash(&data, hash);

In this test, there are a few points that require special attention:

  • If strings are not hard-coded, pay attention to when their memory is released. This memory can be managed using e_mem. (digest and the cells may have the same issue.)

  • Strings may be empty (containing only the terminating '\0'), but must not be NULL (null pointer), because a null pointer may break the hashing logic.

  • If ref is 0, the execution is successful; if not, the corresponding error code can be used to locate the problem.

  • Pointers are used to pass parameters because cells vary in length. Memory’s lifetime must be taken into consideration.

III Fuzzing: Errors and Analyses

Once the development is completed, we must ensure that the code is secure, stable, and capable of producing the correct output. The tests consist primarily of fuzz testing and code coverage.

Since our goal is to hash CKB transaction data, the fuzzing is performed on get_eip712_hash.

Given that the data input of the fuzz testing varies in length, we design a struct to parse the data according to lengths, and then generate eip712_data from the parsed data. The number of strings needed by the members’ parameters is huge, but they are only used in hashing as binary data. To improve performance, we only generate variable-length strings with fixed values.
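Parsing variable-length fuzz input by lengths can be sketched as a small bounds-checked reader. The wire format here (a 1-byte length followed by that many bytes) and the names fuzz_reader and fuzz_next_field are assumptions for illustration; the actual fuzzing harness may lay out its input differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over the raw fuzz input. */
typedef struct {
  const uint8_t *data;
  size_t len;
  size_t pos;
} fuzz_reader;

/* Reads one field: a 1-byte length followed by that many bytes.
 * Returns 0 on success, -1 if the input is exhausted or truncated. */
int fuzz_next_field(fuzz_reader *r, const uint8_t **field, size_t *field_len) {
  assert(r != NULL && field != NULL && field_len != NULL);
  if (r->pos >= r->len) return -1;
  size_t n = r->data[r->pos];
  if (n > r->len - r->pos - 1) return -1; /* length runs past the input */
  *field = r->data + r->pos + 1;
  *field_len = n;
  r->pos += 1 + n;
  return 0;
}
```

Each successfully read field can then be turned into one member of eip712_data, and a -1 result simply ends the current fuzz iteration instead of reading out of bounds.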

Output correctness is just as important as code security and stability in extreme cases. To check it, we wrote a function that converts the eip712_data into JSON strings and then calls the Node.js EIP-712 module to verify the result.

The following are some of the most common fuzz testing errors, along with a brief explanation of the causes or solutions.

Unaligned Memory

This one occurs in the memory allocation of e_mem. Unaligned memory causes strange errors in CKB-VM. For example, the program crashes occasionally, but the cause is located in a line of code that has no errors at all.

Null Pointer in String

A null pointer will be reported as a runtime exception. It occurs mostly when eip712_data is converted to e_item. This can be fixed by adding a null check before the conversion.
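Such a check might look like the sketch below. The demo_cell struct only mirrors the string fields used in the test case (capacity, lock, type, data, extra_data); the function name and the error code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cell: five string fields. */
typedef struct {
  const char *capacity;
  const char *lock;
  const char *type;
  const char *data;
  const char *extra_data;
} demo_cell;

#define ERR_NULL_STRING (-1) /* made-up error code */

/* Returns 0 if every string field is non-NULL (empty strings are fine). */
int check_cell_strings(const demo_cell *cell) {
  assert(cell != NULL);
  const char *fields[] = {cell->capacity, cell->lock, cell->type,
                          cell->data, cell->extra_data};
  for (size_t i = 0; i < sizeof fields / sizeof fields[0]; i++) {
    if (fields[i] == NULL) return ERR_NULL_STRING;
  }
  return 0;
}
```

Running a check like this on every cell before building the tree turns a crash inside the hashing code into an error code at the boundary.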

Garbled Code In Cells

The reason for this is that inputs and outputs in eip712_data are both pointers. The cells were constructed on the stack inside one function. After that function returns, its stack memory is released, but the pointers still hold the old addresses, so reading through them yields garbage.
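The sketch below contrasts the buggy pattern with one possible fix, where the caller provides storage that outlives the call. All names (demo_cell, demo_data, fill_inputs) are hypothetical simplifications of eip712_cell and eip712_data.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  const char *capacity;
} demo_cell;

typedef struct {
  demo_cell *inputs;
  size_t inputs_len;
} demo_data;

/* Buggy pattern (do not use): the local array dies when the function
 * returns, so data->inputs is left dangling.
 *
 *   void fill_bad(demo_data *data) {
 *     demo_cell cells[2] = {{"999.99 CKB"}, {"9.99 CKB"}};
 *     data->inputs = cells;   // dangling after return
 *     data->inputs_len = 2;
 *   }
 */

/* Safer pattern: the caller passes in storage that outlives this call.
 * Returns 0 on success, -1 if the storage is too small. */
int fill_inputs(demo_data *data, demo_cell *storage, size_t capacity) {
  assert(data != NULL && storage != NULL);
  if (capacity < 2) return -1;
  storage[0].capacity = "999.99 CKB";
  storage[1].capacity = "9.99 CKB";
  data->inputs = storage;
  data->inputs_len = 2;
  return 0;
}
```

The storage can be an array in the caller's frame or a block obtained from e_mem, as long as it stays valid until the hash has been computed.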

Miscalculated Struct In Hashing

Because EIP-712 involves many nested structs, the final hash may turn out to be incorrect even though each step looks right (perhaps because of a problem with encoding the input types when calculating a struct hash). Such problems are tricky. My solution is to print out the entire hashing data of all the parameters, then use the EIP-712 Rust library to print out the hashing data in the same format. By comparing these two sets of results, we can determine where the problem lies.
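Printing the intermediate data in a fixed format makes the two outputs easy to diff. A minimal sketch of such helpers is below; to_hex and dump_step are hypothetical names, and the "label: hex" line format is just one convenient choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes bytes as lowercase hex into out, which needs 2 * len + 1 chars.
 * Returns 0 on success, -1 if out is too small. */
int to_hex(const uint8_t *bytes, size_t len, char *out, size_t out_len) {
  static const char digits[] = "0123456789abcdef";
  assert(bytes != NULL && out != NULL);
  if (out_len < 2 * len + 1) return -1;
  for (size_t i = 0; i < len; i++) {
    out[2 * i] = digits[bytes[i] >> 4];
    out[2 * i + 1] = digits[bytes[i] & 0x0f];
  }
  out[2 * len] = '\0';
  return 0;
}

/* Prints one labelled line per hashing step, e.g. "typeHash: 1901ab". */
void dump_step(const char *label, const uint8_t *bytes, size_t len) {
  assert(label != NULL && bytes != NULL);
  printf("%s: ", label);
  for (size_t i = 0; i < len; i++) {
    printf("%02x", bytes[i]);
  }
  printf("\n");
}
```

Calling dump_step for every typeHash, every encoded member, and every struct hash gives a log in which the first line that differs from the reference output points at the faulty step.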

✍🏻 Written by Zishuang Han
