How to run Ethereum Mainnet node on AWS

Fri, May 25, 2018 •16 min read

Blockchain

AWS recently published CloudFormation templates [1] to launch a private Ethereum testing network. Their solution comes with an app for viewing blocks, a miner, etc. As great as it is, this solution isn’t easy to adapt for running a mainnet node.

First, let’s try to answer why anyone would need a privately hosted mainnet Ethereum node instead of relying on Infura. At Rumble Fish we have a few projects where a backend component interacts with the Ethereum blockchain. In some cases it reacts to Event notifications, in others it’s responsible for closing financial transactions and needs to act fast.

It’s never a good idea to make a business-critical process rely on an external service operating well. Whatever happens with Infura is beyond our control. If it goes down, we could only wait. This may not be a problem for many applications, but in our case we had to eliminate this risk.

TL;DR summary

To just get the node up, follow the steps below. It’s assumed that you have an AWS account set up with programmatic access.

0. Clone this repository.

git clone https://github.com/rumblefishdev/cf-parity-mainnet.git

1. Open terminal. Install aws and jq if not installed.

2. Set up which account and region you work with.

export AWS_DEFAULT_REGION=eu-central-1
# set to matching entry in ~/.aws/credentials
export AWS_DEFAULT_PROFILE=...

3. Create repository, build image and push it.

bash build_and_upload.sh

This command creates an ECR repository under your account and pushes a slightly customized image of the Parity client to it.

4. Specify parameters of the node.

cd cloudformation
cp stack-parameters.default.json stack-parameters.json
$EDITOR stack-parameters.json

In here you need to specify:

VpcId to run the chain
DNSName to register for your node (eg. mainnet.rumblefishdev.com)

5. Create CloudFormation stack.

bash -x create-stack

6. Go to the CloudFormation console and wait for the stack creation to complete. Get the exported NameServer output and put it as an NS entry in the DNS config of your domain.

7. Wait ~3 days for synchronization and verification process to finish.

8. Go to AWS console EC2 | Volumes section and take a volume snapshot.

9. When the snapshot is ready, put its ID as the ChainSnapshodId parameter in stack-parameters.json and update the stack.

bash -x update-stack
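Steps 8 and 9 can also be done from the terminal instead of the console. The following is a minimal sketch, assuming the AWS CLI is configured as in step 2. `snapshot_chain_volume` is a hypothetical helper name that is not part of the repository, and the volume ID in the usage line is a placeholder you would look up yourself.

```shell
# Hypothetical helper: snapshot the EBS volume holding the chain data
# and print the snapshot ID once AWS reports the snapshot as completed.
snapshot_chain_volume() {
  local volume_id="$1"
  local snapshot_id
  # create-snapshot returns immediately; --query extracts just the ID.
  snapshot_id=$(aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "Parity mainnet chain data" \
    --query SnapshotId --output text) || return 1
  # The waiter polls until the snapshot is completed or it gives up.
  aws ec2 wait snapshot-completed --snapshot-ids "$snapshot_id" >&2 || return 1
  echo "$snapshot_id"
}

# Usage (placeholder volume ID):
#   snapshot_chain_volume vol-0123456789abcdef0
```

The waiter only polls for a limited time, so for a large first snapshot it may have to be run again before the printed ID can go into stack-parameters.json.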

Challenges of running mainnet node

Blockchain data persistence

The biggest challenge in getting a mainnet node up is getting it synced. Syncing a newly connected node from zero takes 2–3 hours to reach the current blocks. Then it takes an additional 2–3 days to complete the cryptographic verification of all the blocks. The verification process uses all available IOPS, which makes the node not very responsive. During this time the node is often observed to fall behind by up to 30 blocks, which makes it unusable. Only after the verification process is completed does the node become stable and reliable. The stacks offered by AWS don’t address this problem: each node starts with a clean slate. This is fine as long as you don’t have a lot of data to sync from other nodes.

Since it “costs” ~3 days to get a new node up, it’s necessary to preserve data between restarts of the node. There is surely more than one way to accomplish this. The solution we suggest is to store the chain data on an EBS volume and take a snapshot of it once the blockchain is fully synced. If a node gets terminated and a new one takes over, it will only have to sync the blocks since the snapshot and not the whole 3 years of chain history.

Clearly it’s not ideal, and we would prefer to have persistent data and never have to re-sync any blocks, but so far we haven’t found an ideal solution.

Approaches that we’ve dismissed

Single persistent EBS

One thing that we’ve tried was to create a persistent EBS volume outside of the context of the EC2 machine and connect it to a node on startup. This approach has its upsides. When a machine gets terminated and a new one is spun up, it starts where the other one left off. That’s a great feature, because it makes the re-sync delay minimal.

On the downside, this approach doesn’t play well with scaling the number of instances up and down. In a scenario where we would like to have more nodes for failover or to balance the load, we would need to add an additional layer to decide which EBS drive to use or possibly spawn a new one. We’ve dismissed it as too complicated.

Elastic File System (EFS)

Another interesting attempt to solve the scaling problem was using EFS. Unlike EBS, it can be connected to multiple instances, which share it using an NFS-like protocol. Unfortunately we’ve seen that nodes with chain data on EFS took forever to synchronize. Parity uses a lot of IOPS and EFS offers much lower performance than EBS.

Public network access for synchronization layer

For the node to synchronize, it needs to be able to accept connections from other nodes. Well, to be completely strict, it’s only required that one side of the connection can accept connections, so technically we could work without it. However, if we skipped public access, then our node could only work with nodes offering public access, which eliminates a big chunk of the peer pool.

To ensure public access we’ve used the following steps.

1. Parity is run in a docker container. Port 30303 is bridged using the following part of the CloudFormation stack.

Resources:
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ...
      ContainerDefinitions:
        ...
        PortMappings:
          - ContainerPort: 30303
            HostPort: 30303
            Protocol: tcp

2. The node needs to know its public IP, as this is used as the enode identifier broadcast to other nodes. The solution is specific to EC2 and relies on an internal API available from the machine. From docker/run_parity.sh:

PUBLIC_IP=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
/parity/parity --config config.toml --nat "extip:$PUBLIC_IP"

3. For the port of the EC2 machine to be accessible, it also needs to be opened in the security group configuration. The stack’s SecurityGroupIngress settings take care of this.

Private access to json-rpc and websocket endpoints

Parity has two more network interfaces for accessing blockchain data.

port 8545 is used for the json-rpc api: posting transactions and getting all sorts of information
port 8546 can be used to receive notifications from the node about new blocks and/or events

First, let’s discuss why we think json-rpc shouldn’t be publicly available. Depending on the particular use case, it may not be an issue to have json-rpc open. However, at Rumble Fish we believe anything that can be hidden should remain hidden.

Leaving the json-rpc endpoint open doesn’t put any funds in jeopardy, at least not unless there is some fundamental bug in Parity that is yet to be detected. Nevertheless, it’s easy to imagine that an attacker could simply run a lot of queries on the node just to prevent its legitimate use. Therefore we believe it’s worth taking the extra effort to make this part more secure.

Our approach for private access consists of the following.

1. The CloudFormation stack creates and exports a special SecurityGroup used for accessing the node. You can import it in another stack using:

!ImportValue MainnetParity-AccessSecurityGroup

2. This group is given access to the instance through the SecurityGroupIngress settings in the SecurityGroup of the EC2 instance. The excerpt below shows only the public rule for port 30303.

Resources:
  ECSSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      ...
      SecurityGroupIngress:
        - FromPort: 30303
          ToPort: 30303
          CidrIp: 0.0.0.0/0
          IpProtocol: tcp

These ports are routed to the docker container, similarly to what we've done before with port 30303.
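For comparison, a rule that opens ports 8545 and 8546 only to members of the access group could look like this when expressed with the AWS CLI. This is a sketch of the kind of rule described in step 2, not the stack’s actual definition. `allow_rpc_from_group` is a hypothetical helper name and both group IDs are placeholders.

```shell
# Hypothetical helper: allow members of the access security group to reach
# the json-rpc (8545) and websocket (8546) ports of the node's security group.
allow_rpc_from_group() {
  local node_group_id="$1"    # security group of the EC2 instance
  local access_group_id="$2"  # the exported access security group
  aws ec2 authorize-security-group-ingress \
    --group-id "$node_group_id" \
    --protocol tcp \
    --port 8545-8546 \
    --source-group "$access_group_id"
}

# Usage (placeholder IDs):
#   allow_rpc_from_group sg-0aaaaaaaaaaaaaaaa sg-0bbbbbbbbbbbbbbbb
```

Using a source group instead of a CIDR range means that only instances carrying the access group can connect, regardless of their IP addresses.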

3. Clients connecting to the json-rpc / websocket api need to do so using the private IP of the instance. We accomplish this by creating a Route53 HostedZone and registering the instance’s IP there on startup.

The CloudFormation stack exports the nameservers of this zone. They can be imported into another stack or looked up in the AWS console exports.

You should put this value as NS entry in the configuration of your DNS domain.
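The exported value can also be read from the terminal. Below is a minimal sketch, assuming the AWS CLI is configured for the account and region of the stack. `cf_export_value` is a hypothetical helper name, and the export name has to be taken from the stack’s outputs since it is not given here.

```shell
# Hypothetical helper: print the value of a CloudFormation export by name.
cf_export_value() {
  local export_name="$1"
  aws cloudformation list-exports \
    --query "Exports[?Name=='${export_name}'].Value" \
    --output text
}

# Usage (the export name is a placeholder; check the stack's outputs):
#   cf_export_value "<name of the NameServer export>"
# Once the NS entry is in place, the delegation can be checked with:
#   dig +short NS mainnet.rumblefishdev.com
```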

Monitoring and logging

The stack is configured to gather interesting files from the machine and push them to CloudWatch log stream named MainnetParity-logs.
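The logs can be read from the terminal as well as from the console. The sketch below assumes that MainnetParity-logs is the name of a CloudWatch Logs log group. If it turns out to be a stream inside another group, the group name has to be adjusted. `recent_parity_logs` is a hypothetical helper name.

```shell
# Hypothetical helper: print log messages from the last N minutes (default 10).
# Assumes MainnetParity-logs is the name of a CloudWatch Logs log group.
recent_parity_logs() {
  local minutes="${1:-10}"
  # CloudWatch expects the start time in milliseconds since the epoch.
  local start_ms=$(( ($(date +%s) - minutes * 60) * 1000 ))
  aws logs filter-log-events \
    --log-group-name MainnetParity-logs \
    --start-time "$start_ms" \
    --query 'events[].message' \
    --output text
}

# Usage:
#   recent_parity_logs 30
```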

Sync and verification process

Here, the interesting bits are the files named /parity/parity/... which contain the output of the parity process. The first time you launch the stack, it will use warp sync to download the blockchain history using the bulk download protocol of Parity.

The process of syncing snapshots takes about 3 hours. After the snapshots are synced, Parity downloads all the blocks created between the last snapshot and the current head of the blockchain. This stage takes about another hour.

When this phase is completed, the log output changes: a new type of log line starting with the block number (#40653 ..) appears. It comes from the verification of downloaded blocks. In this process Parity verifies each block cryptographically and ensures that no one has tampered with the data.

This process takes about 3 days to complete when run on a t2.medium machine with a gp2 EBS volume at 300 IOPS. While it’s running, you can observe in the monitoring of the EBS volume that all available IOPS are being consumed. The screenshot below shows the moment when the verification process ends. You can see the difference in usage pattern.

[Figure: Read IOPS and Write IOPS of the EBS volume]

Since the verification process is IO-bound, it’s possible to make it faster by provisioning the EBS volume with extra IOPS. In our CloudFormation stack we use the gp2 VolumeType with a size of 100 GB. AWS provisions 300 baseline IOPS for such a volume. If you need to make verification faster, you can change the VolumeType to io1 and give it 1200 IOPS. At this level we observe that the verification process is no longer constrained by available IOPS but by CPU power. Therefore you can push it further by changing the EC2 machine size from t2.medium to c5.large.
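The 300 IOPS figure follows from the gp2 sizing rule of 3 IOPS per GiB. A small sketch of that arithmetic is shown below. The floor and ceiling are the limits AWS documents for gp2 at the time of writing this example and may change, and `gp2_baseline_iops` is a hypothetical helper name.

```shell
# Baseline IOPS of a gp2 volume: 3 IOPS per GiB,
# with a floor of 100 and a ceiling of 16000 (assumed current limits).
gp2_baseline_iops() {
  local size_gib="$1"
  local iops=$(( size_gib * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi
  if [ "$iops" -gt 16000 ]; then iops=16000; fi
  echo "$iops"
}

gp2_baseline_iops 100   # prints 300
```

By the same rule, a gp2 volume would need to be 400 GiB to reach a baseline of 1200 IOPS, which is why switching to io1 is the more direct way to raise IOPS on a 100 GB volume.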

Running on c5.large, we’ve observed that Parity uses 2000 IOPS during verification and can finish the whole process in about 7 hours, so it’s a good shortcut if you need results fast. Just keep in mind that provisioned IOPS are not cheap: the monthly cost of keeping a drive of this size and IOPS will be in the range of $100, so be careful.
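To estimate the bill before changing the volume type, the io1 cost can be computed from its two components: storage and provisioned IOPS. The sketch below takes both unit prices as arguments because they differ per region and change over time. The prices in the usage line are illustrative assumptions, not a quote, and `io1_monthly_cost` is a hypothetical helper name.

```shell
# Monthly cost of an io1 volume:
#   size * price per GB-month + provisioned IOPS * price per IOPS-month
io1_monthly_cost() {
  local size_gb="$1" iops="$2" usd_per_gb="$3" usd_per_iops="$4"
  awk -v s="$size_gb" -v i="$iops" -v g="$usd_per_gb" -v p="$usd_per_iops" \
    'BEGIN { printf "%.2f\n", s * g + i * p }'
}

# Illustrative prices only (assumptions): 0.125 USD per GB-month,
# 0.065 USD per provisioned IOPS-month.
io1_monthly_cost 100 1200 0.125 0.065   # prints 90.50
```

The IOPS term dominates for a volume of this size, so downsizing after the snapshot is taken removes most of the cost.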

The idea is that once the synchronization and verification are complete, you can make a snapshot and use it to restart the cluster with a downsized disk and machine type.

Staying in sync

Once the node is fully synced and verified, it generally stays in sync with the head of the chain :-)

Parity diff to Infura

The image above presents the effect of calling eth_blockNumber on our node and on Infura. Most of the time the nodes are in sync. Occasionally either our node or Infura falls 1-4 blocks behind.
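A comparison like this can be reproduced by hand with curl and jq. The sketch below is a stand-in for illustration, not the Lambda used to gather the metrics. The function names are hypothetical and both endpoint URLs in the usage line are placeholders.

```shell
# Ask an Ethereum json-rpc endpoint for its latest block number (hex string).
eth_block_number_hex() {
  local rpc_url="$1"
  curl -s -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "$rpc_url" | jq -r '.result'
}

# Convert a 0x-prefixed hex quantity to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Positive result: the first endpoint is ahead; negative: it is behind.
block_diff() {
  local ours theirs
  ours=$(hex_to_dec "$(eth_block_number_hex "$1")")
  theirs=$(hex_to_dec "$(eth_block_number_hex "$2")")
  echo $(( ours - theirs ))
}

# Usage (placeholder URLs):
#   block_diff "http://<private node address>:8545" "<Infura endpoint URL>"
```

Run periodically, the difference should hover around zero, with occasional excursions of a few blocks in either direction.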

Please note that this repository currently doesn’t include the Lambda responsible for gathering the metrics above. It will be included in future articles.

[1] https://docs.aws.amazon.com/blockchain-templates/latest/developerguide/blockchain-templates-ethereum.html

Would you like to learn how to run a mainnet Ethereum node on AWS, but prefer to read about it in Polish? See the translation of this article published by justgeek.it: https://geek.justjoin.it/uruchomic-mainnetowy-node-ethereum-aws/