Installing Sneller in EKS using Terraform

Introduction

The previous example showed how to set up a local Kubernetes cluster and run Sneller. Running the cluster inside AWS EKS is a much more practical solution and provides the following advantages:

  • More flexible scaling
  • Use S3 for object storage
  • Use IAM roles for least-privileged operation

Instead of running all scripts manually, it is much better to provision your infrastructure using code. AWS does provide CloudFormation, but we’ll use Terraform instead. Terraform also works for other cloud environments, such as Microsoft Azure or Google Cloud Platform. Additionally, it has providers for Kubernetes, Helm and other systems.1

Creating the complete infrastructure consists of two steps:

  1. Create the VPC and EKS cluster.
  2. Provision the EKS cluster to run Sneller.

You can also find all the scripts in the https://github.com/SnellerInc/examples/tree/master/terraform/install-eks repository.

Step 1: Create the VPC and EKS cluster

The first step will set up the VPC and EKS cluster. This is based on the Provision an EKS Cluster (AWS) example in the Hashicorp documentation. Make sure to check it out to get more in-depth knowledge about setting up the VPC and EKS cluster.

Setting up Terraform

The first step is to set up Terraform. It uses the AWS provider, which uses the current user’s AWS credentials, so make sure you have sufficient rights.

This script uses the following variables:

  • region specifies the AWS region where to deploy the cluster.
  • instance_type specifies the instance type of the EKS nodes. Make sure to choose instance types that support AVX-512 and have enough memory to cache data in memory.
  • prefix specifies a prefix that is used for all global resources. Some resources need globally unique names (e.g. S3 buckets). If you don’t specify a prefix, then a random 4-character prefix is generated instead.

To ensure that we always have a prefix, we need some “magic” to create a randomized prefix if none was set. Note that Terraform stores the random prefix in its state, so it won’t change between runs.
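The actual script can be found in the example repository; a minimal sketch of what this setup might look like is shown below. The provider versions and the c6i.2xlarge default instance type are assumptions, not values taken from the original scripts.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source = "hashicorp/random"
    }
  }
}

provider "aws" {
  region = var.region
}

variable "region" {
  description = "AWS region where the cluster is deployed"
  type        = string
  default     = "us-east-2"
}

variable "instance_type" {
  description = "EKS node instance type (must support AVX-512)"
  type        = string
  default     = "c6i.2xlarge" # assumption: any AVX-512 capable type works
}

variable "prefix" {
  description = "Prefix used for all global resources (random if empty)"
  type        = string
  default     = ""
}

# Generate a random 4-character prefix when none was set. The random value
# is kept in the Terraform state, so it is stable between runs.
resource "random_string" "prefix" {
  length  = 4
  lower   = true
  upper   = false
  numeric = false
  special = false
}

locals {
  prefix = var.prefix != "" ? var.prefix : random_string.prefix.result
}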

Creating the VPC

The EKS cluster will run in its own private VPC. The VPC uses up to 3 availability zones to ensure high availability and creates both a public and a private subnet in each availability zone.
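The VPC is created with the terraform-aws-modules/vpc/aws module. A minimal sketch, in which the CIDR ranges and subnet layout are assumptions:

data "aws_availability_zones" "available" {}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${local.prefix}-vpc"
  cidr = "10.0.0.0/16"

  # Use up to 3 availability zones of the selected region
  azs             = slice(data.aws_availability_zones.available.names, 0, min(3, length(data.aws_availability_zones.available.names)))
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  # Subnet tags used by Kubernetes/ELB to discover the subnets
  public_subnet_tags  = { "kubernetes.io/role/elb" = 1 }
  private_subnet_tags = { "kubernetes.io/role/internal-elb" = 1 }
}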

Create the Kubernetes cluster (EKS)

The EKS cluster is created using the following script. There is only a single node group that uses on-demand instances. The eks_managed_node_groups map can be extended to also make use of spot instances; refer to the terraform-aws-modules/eks/aws documentation for more information.
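A minimal reconstruction of that script is shown below; the Kubernetes version and node-group sizes are assumptions, the real values are in the example repository.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0"

  cluster_name    = "${local.prefix}-eks"
  cluster_version = "1.27" # assumption

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  cluster_endpoint_public_access = true

  eks_managed_node_groups = {
    # A single node group with on-demand instances; add extra entries
    # here to also use spot instances.
    default = {
      instance_types = [var.instance_type]
      capacity_type  = "ON_DEMAND"

      min_size     = 1
      desired_size = 3
      max_size     = 5
    }
  }
}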

Output

We need some information from this step in the next step, so all relevant values are exposed as Terraform outputs.
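A sketch of the outputs that the next step relies on (the exact set of outputs in the repository may be larger):

output "region" {
  value = var.region
}

output "prefix" {
  value = local.prefix
}

output "cluster_name" {
  value = module.eks.cluster_name
}

output "oidc_provider_arn" {
  value = module.eks.oidc_provider_arn
}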

Create the infrastructure

This is all we need to create the VPC and EKS cluster, so you can now initialize Terraform and apply the script:

export TF_VAR_prefix=test      # make sure to use your own prefix here
export TF_VAR_region=us-east-2 
terraform init                 # only needed once
terraform apply

If everything is fine, then Terraform will show you a detailed plan of the required infrastructure changes. If you want to make changes, then alter the variables or Terraform scripts and run terraform apply again.

Note that some changes may require a rebuild of the VPC and/or EKS cluster. If they are recreated, then you lose all data inside the cluster. ALWAYS check the Terraform plan before applying any changes!

If you need to destroy the VPC and EKS cluster, then run terraform destroy. Make sure you first destroy the infrastructure in step 2 before destroying the infrastructure of step 1.

Step 2: Deploy Sneller

Setting up Terraform

The first step is to set up Terraform again for this step. It uses the AWS provider, but now it also uses the Kubernetes and Helm providers to provision the cluster.

This script reads the output from the previous step (via the data.terraform_remote_state.step1 resource), so it knows about the prefix, EKS cluster, etc. It also introduces the following new variables:

  • namespace specifies the namespace within the Kubernetes cluster (defaults to sneller).
  • database / table specifies the name of the database and table.
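A minimal sketch of this setup is shown below. It assumes that step 1 uses a local state file at ../step1/terraform.tfstate and that the default database and table names are demo and sample; adjust these to match your own setup.

variable "namespace" {
  description = "Kubernetes namespace for Sneller"
  type        = string
  default     = "sneller"
}

variable "database" {
  description = "Name of the Sneller database"
  type        = string
  default     = "demo" # assumption
}

variable "table" {
  description = "Name of the Sneller table"
  type        = string
  default     = "sample" # assumption
}

# Read the outputs of step 1 (prefix, region, EKS cluster, ...)
data "terraform_remote_state" "step1" {
  backend = "local"
  config = {
    path = "../step1/terraform.tfstate"
  }
}

locals {
  prefix       = data.terraform_remote_state.step1.outputs.prefix
  cluster_name = data.terraform_remote_state.step1.outputs.cluster_name
}

data "aws_eks_cluster" "cluster" {
  name = local.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = local.cluster_name
}

provider "aws" {
  region = data.terraform_remote_state.step1.outputs.region
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}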

Creating the namespace

It is good practice to group resources for a specific service within a namespace, so a new namespace is created using the following resource:
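A minimal sketch of that resource:

resource "kubernetes_namespace" "sneller" {
  metadata {
    name = var.namespace
  }
}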

Creating the S3 buckets

Sneller uses two kinds of buckets:

  1. Source buckets that hold the data that will be ingested. The data can either stay in these buckets or it can be removed after ingestion.
  2. Ingestion bucket that holds the data that has been ingested by Sneller. The query engine always uses this data, so make sure it isn’t deleted (it’s not a cache). You can always export data back to the original JSON format.

Source bucket

First we’ll create the source bucket and make sure public access is denied. In this example we’ll also upload three small ND-JSON encoded data files to the bucket, so there is some trial data to ingest.
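A minimal sketch of the source bucket and the sample uploads; the bucket name and the local sample_data/ directory are assumptions:

resource "aws_s3_bucket" "source" {
  bucket        = "${local.prefix}-sneller-source" # assumption
  force_destroy = true
}

resource "aws_s3_bucket_public_access_block" "source" {
  bucket                  = aws_s3_bucket.source.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Upload the ND-JSON sample files from a local directory
resource "aws_s3_object" "sample" {
  for_each = fileset(path.module, "sample_data/*.ndjson")

  bucket = aws_s3_bucket.source.id
  key    = each.value
  source = "${path.module}/${each.value}"
  etag   = filemd5("${path.module}/${each.value}")
}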

Ingestion bucket

The ingestion bucket should also disallow public access. It holds the table definition file, stored at s3://<ingestion-bucket>/db/<dbname>/<tablename>/definition.json, which points to the sample data files.
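A minimal sketch of the ingestion bucket and the table definition; the bucket name and the exact shape of definition.json are assumptions, so check the Sneller documentation for the authoritative format:

resource "aws_s3_bucket" "ingest" {
  bucket        = "${local.prefix}-sneller-ingest" # assumption
  force_destroy = true
}

resource "aws_s3_bucket_public_access_block" "ingest" {
  bucket                  = aws_s3_bucket.ingest.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Table definition that points at the sample data in the source bucket
# (field names are an approximation of the Sneller definition format)
resource "aws_s3_object" "definition" {
  bucket = aws_s3_bucket.ingest.id
  key    = "db/${var.database}/${var.table}/definition.json"
  content = jsonencode({
    input = [
      {
        pattern = "s3://${aws_s3_bucket.source.bucket}/sample_data/*.ndjson"
        format  = "json"
      }
    ]
  })
}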

Service Accounts

Kubernetes uses service accounts to give workloads an identity inside the cluster with custom security policies. In EKS, a service account can be mapped to an IAM role, as explained in detail in IAM roles for service accounts. This ensures that the services within the pod run with the security policies of that IAM role.

We have two services and we’ll apply the principle of least privilege to both services.

  • sdb requires read-only rights to the source bucket and should be able to read/write to the ingestion bucket.
  • snellerd requires only read-only rights to the ingestion bucket.

Both services will use their own dedicated service account. Mapping a service account to an IAM role requires creating the Kubernetes service account and an IAM role (with some IAM policies attached). Additionally, there must be a trust relationship, so that when EKS requests a certain service account, the pod is granted the identity of the associated IAM role.

The service account itself is decorated with the eks.amazonaws.com/role-arn annotation that holds the requested IAM role for the service account. The IAM role itself is created and a trust relationship is set up to allow the cluster’s OIDC provider to use this role for specific service accounts. This is a rather complicated setup, but Terraform makes it simple by using the iam-role-for-service-accounts-eks submodule.

The service account definition with IAM role for sdb looks like this:
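(Reconstructed sketch; the actual definition is in the example repository. The service-account name sneller-sdb, the module version and the exact S3 actions are assumptions.)

module "sdb_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.0"

  role_name = "${local.prefix}-sneller-sdb"

  oidc_providers = {
    main = {
      provider_arn               = data.terraform_remote_state.step1.outputs.oidc_provider_arn
      namespace_service_accounts = ["${var.namespace}:sneller-sdb"]
    }
  }

  role_policy_arns = {
    sdb = aws_iam_policy.sdb.arn
  }
}

# Read-only access to the source bucket, read/write to the ingestion bucket
resource "aws_iam_policy" "sdb" {
  name = "${local.prefix}-sneller-sdb"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket", "s3:GetObject"]
        Resource = [aws_s3_bucket.source.arn, "${aws_s3_bucket.source.arn}/*"]
      },
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
        Resource = [aws_s3_bucket.ingest.arn, "${aws_s3_bucket.ingest.arn}/*"]
      }
    ]
  })
}

resource "kubernetes_service_account" "sdb" {
  metadata {
    name      = "sneller-sdb"
    namespace = kubernetes_namespace.sneller.metadata[0].name
    annotations = {
      "eks.amazonaws.com/role-arn" = module.sdb_role.iam_role_arn
    }
  }
}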

The service account definition with IAM role for snellerd looks like this:
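(Reconstructed in the same way; the service-account name sneller-snellerd and the exact S3 actions are assumptions.)

module "snellerd_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.0"

  role_name = "${local.prefix}-sneller-snellerd"

  oidc_providers = {
    main = {
      provider_arn               = data.terraform_remote_state.step1.outputs.oidc_provider_arn
      namespace_service_accounts = ["${var.namespace}:sneller-snellerd"]
    }
  }

  role_policy_arns = {
    snellerd = aws_iam_policy.snellerd.arn
  }
}

# Read-only access to the ingestion bucket
resource "aws_iam_policy" "snellerd" {
  name = "${local.prefix}-sneller-snellerd"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket", "s3:GetObject"]
        Resource = [aws_s3_bucket.ingest.arn, "${aws_s3_bucket.ingest.arn}/*"]
      }
    ]
  })
}

resource "kubernetes_service_account" "snellerd" {
  metadata {
    name      = "sneller-snellerd"
    namespace = kubernetes_namespace.sneller.metadata[0].name
    annotations = {
      "eks.amazonaws.com/role-arn" = module.snellerd_role.iam_role_arn
    }
  }
}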

The great thing about mapping service accounts to IAM roles is that we don’t need static AWS access keys/secrets anymore. The service account uses an automatically rotating web identity that can be exchanged for temporary AWS credentials, which are only valid for about an hour.

Deploy Sneller

The final step is to actually deploy Sneller. We have created a Helm package to allow an easy installation. The Helm package will install three pods that run the Sneller daemon and schedule an sdb job that ingests new data every minute.

The index file is protected using a 256-bit index key that should be treated as a secret. It is used by sdb to sign the index file and by the Sneller daemon to verify that the index file hasn’t been tampered with. Note that the index file also contains the ETag (hash) of the ingested data files, so tampering with the data files will immediately render them invalid.

The Helm chart’s service accounts are mapped to the service accounts that we created before, and the S3 bucket is set to the ingestion bucket.
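A minimal sketch of the Helm release is shown below. The chart repository URL and the value names (secrets.indexKey, snellerd.serviceAccountName, sdb.serviceAccountName, s3.bucket) are assumptions; check the Sneller Helm chart’s values.yaml for the real keys.

# 256-bit index key, generated once and kept in the Terraform state
resource "random_bytes" "index_key" {
  length = 32
}

resource "helm_release" "sneller" {
  name       = "sneller"
  namespace  = kubernetes_namespace.sneller.metadata[0].name
  repository = "https://charts.sneller.io" # assumption: actual chart location may differ
  chart      = "sneller"

  set_sensitive {
    name  = "secrets.indexKey"
    value = random_bytes.index_key.base64
  }

  set {
    name  = "snellerd.serviceAccountName"
    value = kubernetes_service_account.snellerd.metadata[0].name
  }

  set {
    name  = "sdb.serviceAccountName"
    value = kubernetes_service_account.sdb.metadata[0].name
  }

  set {
    name  = "s3.bucket"
    value = aws_s3_bucket.ingest.bucket
  }
}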

Output

We need some information from this step when querying Sneller, so all relevant values are exposed as Terraform outputs again.
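A sketch of the outputs used later on when querying Sneller:

output "namespace" {
  value = var.namespace
}

output "database" {
  value = var.database
}

output "table" {
  value = var.table
}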

Create the infrastructure

This is all we need to actually deploy Sneller, so you can now initialize Terraform and apply the script:

terraform init                 # only needed once
terraform apply

If everything is fine, then Terraform will show you a detailed plan of the required infrastructure changes. If you want to make changes, then alter the variables or Terraform scripts and run terraform apply again.

Note that some changes may require re-ingesting the data, or you may even lose your source data. ALWAYS check the Terraform plan before applying any changes!

Using Sneller

Using this setup, the Sneller daemon can by default only be accessed from within the Kubernetes cluster. First we need access to the cluster, so we’ll issue the following commands to update the kubeconfig file and select the right namespace:

cd ../step1
export EKS_NAME=$(terraform output -json cluster_name | jq -r '.')
export EKS_REGION=$(terraform output -json region | jq -r '.')
aws eks update-kubeconfig --region $EKS_REGION --name $EKS_NAME
cd ../step2
export SNELLER_NAMESPACE=$(terraform output -json namespace | jq -r '.')
export SNELLER_DATABASE=$(terraform output -json database | jq -r '.')
export SNELLER_TABLE=$(terraform output -json table | jq -r '.')
kubectl config set-context --current --namespace=$SNELLER_NAMESPACE

Sneller always requires a bearer token for access. The token can be fetched like this:

export SNELLER_TOKEN=$(kubectl get secret sneller-token --template={{.data.snellerToken}} | base64 --decode)

Accessing Sneller within the cluster

The Helm script exposes Sneller as a service, so it can be accessed in the standard way. You can fire up a pod inside the cluster using the following command (make sure you have already read the token into SNELLER_TOKEN):

kubectl run test --restart=Never --rm -i --tty --image=curlimages/curl:latest \
        --env "SNELLER_TOKEN=$SNELLER_TOKEN" \
        --env "SNELLER_NAMESPACE=$SNELLER_NAMESPACE" \
        --env "SNELLER_DATABASE=$SNELLER_DATABASE" \
        --env "SNELLER_TABLE=$SNELLER_TABLE" \
        --command -- /bin/sh

You should now be running inside a pod in the cluster and you can invoke a query using the following statement:

curl -H "Authorization: Bearer $SNELLER_TOKEN" \
      -H "Accept: application/json" \
      -s "http://sneller-snellerd.$SNELLER_NAMESPACE.svc.cluster.local:8000/query?database=$SNELLER_DATABASE" \
      --data-raw "SELECT COUNT(*) FROM $SNELLER_TABLE"

Once you are done, you can simply enter exit and the pod is gone.

Accessing Sneller locally (using port forwarding)

It’s also possible to use kubectl to set up port forwarding to your local machine:

kubectl port-forward service/sneller-snellerd 8000 > /dev/null &
SNELLERD_PID=$!

The Sneller daemon port-forwarding is running in the background and can be stopped again using kill $SNELLERD_PID when it’s not needed anymore. For now we’ll keep it running. Now that port-forwarding is active, we should be able to access the Sneller daemon:

curl http://localhost:8000

Now you can invoke a query using:

curl -H "Authorization: Bearer $SNELLER_TOKEN" \
     -H "Accept: application/json" \
     "http://localhost:8000/query?database=$SNELLER_DATABASE" \
     --data-raw "SELECT COUNT(*) FROM $SNELLER_TABLE"

Using ingress to expose the Sneller daemon

The typical way to expose REST endpoints to the outside world is by using an Ingress controller. Kubernetes supports different Ingress controllers, but in this example we’ll use the AWS Load Balancer Controller.2

Allowing ingress using AWS Load Balancer Controller

A full walk-through about installing the AWS Load Balancer Controller can be found in the AWS documentation.

Creating the service account

The Load Balancer Controller needs to create AWS resources, so it requires some IAM permissions. The iam-role-for-service-accounts-eks submodule creates an IAM role with the proper policies attached and ensures that the role can be assumed by the specified Kubernetes service account.
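A minimal sketch of the role and the matching service account; the module version is an assumption:

module "lb_controller_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.0"

  role_name                              = "${local.prefix}-aws-load-balancer-controller"
  attach_load_balancer_controller_policy = true

  oidc_providers = {
    main = {
      provider_arn               = data.terraform_remote_state.step1.outputs.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-load-balancer-controller"]
    }
  }
}

resource "kubernetes_service_account" "lb_controller" {
  metadata {
    name      = "aws-load-balancer-controller"
    namespace = "kube-system"
    annotations = {
      "eks.amazonaws.com/role-arn" = module.lb_controller_role.iam_role_arn
    }
  }
}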

Installing the AWS Load Balancer Controller

With the appropriate service account, the AWS Load Balancer Controller can be installed using a Helm script:
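(Reconstructed sketch, based on the public aws-load-balancer-controller chart; the actual script is in the example repository.)

resource "helm_release" "lb" {
  name       = "aws-load-balancer-controller"
  namespace  = "kube-system"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"

  set {
    name  = "clusterName"
    value = data.terraform_remote_state.step1.outputs.cluster_name
  }

  # Use the service account (with IAM role) that was created above
  set {
    name  = "serviceAccount.create"
    value = "false"
  }

  set {
    name  = "serviceAccount.name"
    value = kubernetes_service_account.lb_controller.metadata[0].name
  }
}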

The Helm script requires several values, but they are straightforward. Check the documentation to view all the chart values.

Choosing a hostname in your domain

When the service is accessed from the internet, it needs a fully qualified domain name (FQDN). The hostname and domain variables together make up the FQDN:
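A minimal sketch of these variables and the resulting FQDN (the defaults are only examples):

variable "hostname" {
  description = "Hostname part of the FQDN (e.g. sneller)"
  type        = string
  default     = "sneller"
}

variable "domain" {
  description = "Domain part of the FQDN (e.g. example.com)"
  type        = string
}

locals {
  fqdn = "${var.hostname}.${var.domain}"
}

output "fqdn" {
  value = local.fqdn
}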

Changing the service type

The default Sneller service type is ClusterIP. This only allows access from within the cluster, so we need to change the service type to NodePort instead. The service type can be set using the snellerd.serviceType value in the Helm chart.

Creating the ingress resource

The AWS Load Balancer Controller works in a straightforward way: each ingress resource with the proper annotations is exposed via an AWS load balancer. Check the documentation for a complete list of all annotations.

First, we’ll enable ingress by setting the Helm chart value ingress.enabled to true and ingress.hosts.0 to the FQDN of the service. To expose the service to the internet, the following annotations should be set too:

  • kubernetes.io/ingress.class is set to alb to create an application load balancer.
  • alb.ingress.kubernetes.io/scheme is set to internet-facing to ensure a public load balancer that is accessible from the internet.
  • alb.ingress.kubernetes.io/certificate-arn can optionally be set to the ARN of the certificate that is used to enable TLS.

Now that the Sneller configuration depends on the AWS Load Balancer Controller, it shouldn’t be created before the controller has been installed. This can be done by adding depends_on = [helm_release.lb] to the Sneller Helm release.

The updated sneller.tf now has some additional settings:
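(Reconstructed excerpt; only the new settings are shown and the earlier ones stay in place. The exact chart value names for ingress and serviceType are assumptions based on the values mentioned above.)

resource "helm_release" "sneller" {
  # ... settings from the initial deployment remain unchanged ...

  # Don't create the ingress before the controller is installed
  depends_on = [helm_release.lb]

  # Expose the service via a NodePort so the ALB can reach it
  set {
    name  = "snellerd.serviceType"
    value = "NodePort"
  }

  set {
    name  = "ingress.enabled"
    value = "true"
  }

  set {
    name  = "ingress.hosts.0"
    value = local.fqdn
  }

  set {
    name  = "ingress.annotations.kubernetes\\.io/ingress\\.class"
    value = "alb"
  }

  set {
    name  = "ingress.annotations.alb\\.ingress\\.kubernetes\\.io/scheme"
    value = "internet-facing"
  }

  # Optional: reference the ACM certificate created further below
  set {
    name  = "ingress.annotations.alb\\.ingress\\.kubernetes\\.io/certificate-arn"
    value = aws_acm_certificate.sneller.arn
  }
}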

Create a DNS entry

The AWS load balancer that is created for the ingress has a complicated name that is generated by AWS, such as k8s-sneller-snellers-a9299ac223-552088442.us-east-1.elb.amazonaws.com, so it should be mapped to the FQDN that has been defined using the hostname and domain variables.

Domain hosted in same AWS account

If the domain is hosted in the same AWS account, then you can use the following Terraform script to create an alias to the AWS load balancer:
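A minimal sketch, assuming the hosted zone for the domain lives in this account and the ingress is named sneller-snellerd (as shown by kubectl get ingress further below):

# Look up the hosted zone and the load balancer created for the ingress
data "aws_route53_zone" "domain" {
  name = var.domain
}

data "kubernetes_ingress_v1" "sneller" {
  metadata {
    name      = "sneller-snellerd"
    namespace = kubernetes_namespace.sneller.metadata[0].name
  }
  depends_on = [helm_release.sneller]
}

# Regional hosted zone ID used for ELB/ALB alias records
data "aws_elb_hosted_zone_id" "main" {}

resource "aws_route53_record" "sneller" {
  zone_id = data.aws_route53_zone.domain.zone_id
  name    = local.fqdn
  type    = "A"

  alias {
    name                   = data.kubernetes_ingress_v1.sneller.status[0].load_balancer[0].ingress[0].hostname
    zone_id                = data.aws_elb_hosted_zone_id.main.id
    evaluate_target_health = false
  }
}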

Domain hosted outside the AWS account

If the domain is not hosted in this AWS account, then you should create a CNAME entry in your domain that points to the AWS load balancer. Once the load balancer has been created, it can be obtained using the following command:

kubectl get ingress sneller-snellerd

Make sure you update your DNS configuration, so the actual FQDN points to the load balancer.

If your DNS provider also provides a Terraform plug-in, then you can use Terraform to create this CNAME entry too.

Create a certificate

AWS automates certificate management using AWS Certificate Manager (ACM), which can issue and renew certificates automatically. In this setup, ACM validates the certificate using a DNS challenge. If you want to automate this process, then ensure that the required DNS entries are created.

Domain hosted in same AWS account

If the domain is hosted in the same AWS account, then you can use the following Terraform script to generate a certificate and the required DNS entries that are used for validation:
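A minimal sketch, following the standard ACM DNS-validation pattern and reusing the hosted zone looked up above:

resource "aws_acm_certificate" "sneller" {
  domain_name       = local.fqdn
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# DNS records that ACM uses to validate domain ownership
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.sneller.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      type   = dvo.resource_record_type
      record = dvo.resource_record_value
    }
  }

  zone_id = data.aws_route53_zone.domain.zone_id
  name    = each.value.name
  type    = each.value.type
  records = [each.value.record]
  ttl     = 60
}

# Wait until the certificate has actually been validated
resource "aws_acm_certificate_validation" "sneller" {
  certificate_arn         = aws_acm_certificate.sneller.arn
  validation_record_fqdns = [for r in aws_route53_record.cert_validation : r.fqdn]
}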

Domain hosted outside the AWS account

If the domain is not hosted in this AWS account, then you should create the required DNS entries in your domain by hand (or automate it using your DNS provider’s Terraform plug-in).

You may also want to use another ingress controller (e.g. nginx) and use cert-manager to issue the certificates. cert-manager can also validate certificates using an HTTP challenge, which may be easier in some situations.

Create the infrastructure

Now that the ingress and certificate are set up, we’ll update the cluster with the new configuration:

export TF_VAR_hostname=sneller
export TF_VAR_domain=example.com # make sure to use your own domain 
terraform apply

If everything is fine, then Terraform will show you a detailed plan of the required infrastructure changes. If you want to make changes, then alter the variables or Terraform scripts and run terraform apply again.

Run some queries

Export the following variables to obtain the proper database, table, Sneller end-point and token:

export SNELLER_DATABASE=$(terraform output -json database | jq -r '.')
export SNELLER_TABLE=$(terraform output -json table | jq -r '.')
export SNELLER_ENDPOINT=$(terraform output -json fqdn | jq -r '.')
export SNELLER_TOKEN=$(kubectl get secret sneller-token --template={{.data.snellerToken}} | base64 --decode)

Now you can run the query directly on the Sneller end-point using TLS:

curl -H "Authorization: Bearer $SNELLER_TOKEN" \
      -H "Accept: application/json" \
      -s "https://$SNELLER_ENDPOINT/query?database=$SNELLER_DATABASE" \
      --data-raw "SELECT COUNT(*) FROM $SNELLER_TABLE"

Adding data

Data can be added simply by uploading it to the source bucket. It will automatically be picked up by the sdb cronjob that runs every minute.


  1. The Sneller Cloud infrastructure is also managed using Terraform. ↩︎

  2. If you deploy Kubernetes in an on-premise Kubernetes cluster, then you may want to try the Traefik or Nginx ingress controllers instead. Check here for a complete list of ingress controllers. ↩︎