Sneller Endpoint Reference

Introduction

The Sneller query engine has a simple API with the following endpoints:

  • / to obtain version information.
  • /query to execute a SQL query.
  • /databases to list the databases.
  • /tables to list the tables in a specific database.
  • /inputs to list the ingested input files of a specific table.

The Sneller service can be reached at https://snellerd-production.<aws-region>.sneller.ai.

Authorization

All endpoints except the version information endpoint require the use of the Authorization header. The correct format for the Authorization header is:

Authorization: Bearer YOUR_TOKEN

where YOUR_TOKEN is replaced with a token generated from the Sneller Cloud console. If no Authorization header is specified, the endpoints return 401 Unauthorized. If the Authorization header is incorrect or the token is not valid, they return 403 Forbidden.
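As a rough illustration of the header format, the following Python sketch builds (but does not send) an authorized request using only the standard library. The helper names service_url, authorized_request and explain_auth_status are hypothetical and not part of any Sneller client; the region and token are values you supply.

```python
import urllib.request


def service_url(aws_region: str) -> str:
    """Return the service base URL for the given AWS region."""
    return f"https://snellerd-production.{aws_region}.sneller.ai"


def authorized_request(base_url: str, path: str, token: str,
                       method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the Bearer token; nothing is sent here."""
    req = urllib.request.Request(base_url + path, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    return req


def explain_auth_status(status: int) -> str:
    """Map the two authorization-related status codes to their documented meaning."""
    if status == 401:
        return "no Authorization header was specified"
    if status == 403:
        return "the Authorization header is incorrect or the token is not valid"
    return "not an authorization error"
```

Passing the resulting object to urllib.request.urlopen would send it. Note that urlopen raises urllib.error.HTTPError for 401 and 403 responses, so the status code is read from the exception's code attribute.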

Version endpoint (/)

The / endpoint only allows the GET and HEAD methods. The endpoint returns 200 OK and the GET method also returns the following information:

Sneller daemon date: <build-date>, revision: <git-revision> (cluster size: <cluster-size> nodes)

Request Headers

The version endpoint supports the following request headers.

Accept

Sneller uses the Accept header to determine how to format responses. Acceptable values are as follows:

  • text/plain: return the version as a text string (default).
  • application/json: return the result as a JSON object that holds the cluster_size, date and revision fields.

The Accept header is ignored for HEAD requests.
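The text/plain response can be split into its parts with a regular expression. This is a minimal sketch that assumes the response follows the template shown above exactly; parse_version is a hypothetical helper name, and the tolerance for a singular "node" is a guess rather than documented behavior.

```python
import re

# Pattern derived from the documented text/plain template.
VERSION_RE = re.compile(
    r"Sneller daemon date: (?P<date>.+?), revision: (?P<revision>\S+) "
    r"\(cluster size: (?P<cluster_size>\d+) nodes?\)"
)


def parse_version(text: str) -> dict:
    """Extract build date, git revision and cluster size from the version line."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return {
        "date": match["date"],
        "revision": match["revision"],
        "cluster_size": int(match["cluster_size"]),
    }
```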

Query endpoint (/query)

The /query endpoint allows GET, HEAD, and POST methods and is used to execute queries on Sneller.

GET and HEAD

GET requests should encode the query to be executed in the URL query parameter named query, and the request body should be empty.

HEAD requests should encode the query that would be executed in the same manner as GET requests. The query will not be executed, but informational HTTP headers will be included in the response. (In other words, HEAD requests do not incur any billable usage.)

POST

POST requests should send the SQL query to be executed as the request body as UTF-8 encoded text.

POST should be used when the query is too large to fit in a URL query parameter, so it is the preferred method if you need to execute arbitrary queries.

URL Query Parameters

  • query: for GET and HEAD requests, the raw SQL query text to execute.
  • json: for GET and POST requests, encode the response as NDJSON.
  • database: for all requests, the default database to use in SQL queries for which no database is specified explicitly.
  • dry: for POST requests to perform a dry run (like HEAD).
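The parameters above can be combined as in the following Python sketch, which builds a GET URL and a POST request without sending either. It assumes that json and dry are presence-only flags, as json is in the curl example in the Conditional Request Example section; the explicit text/plain Content-Type on the POST is also an assumption, added so that urllib does not default to a form-encoded content type. The function names are hypothetical.

```python
import urllib.parse
import urllib.request
from typing import Optional


def query_get_url(base_url: str, sql: str, database: Optional[str] = None,
                  json_output: bool = False) -> str:
    """Build the URL for a GET (or HEAD) query request."""
    params = {"query": sql}
    if database is not None:
        params["database"] = database
    # quote_via=quote encodes spaces as %20 instead of '+'.
    url = base_url + "/query?" + urllib.parse.urlencode(
        params, quote_via=urllib.parse.quote)
    if json_output:
        url += "&json"  # assumption: flag without a value
    return url


def query_post_request(base_url: str, sql: str, token: str,
                       database: Optional[str] = None,
                       dry: bool = False) -> urllib.request.Request:
    """Build a POST request with the SQL text as the UTF-8 encoded body."""
    flags = []
    if database is not None:
        flags.append("database=" + urllib.parse.quote(database, safe=""))
    if dry:
        flags.append("dry")  # assumption: flag without a value
    url = base_url + "/query" + ("?" + "&".join(flags) if flags else "")
    req = urllib.request.Request(url, data=sql.encode("utf-8"), method="POST")
    req.add_header("Authorization", f"Bearer {token}")
    # Assumption: the documentation does not name a required Content-Type.
    req.add_header("Content-Type", "text/plain; charset=utf-8")
    return req
```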

Request Headers

The query endpoint supports the following request headers. Note that the Authorization header is also required.

Accept

Sneller uses the Accept header to determine how to format responses. Acceptable values are as follows:

  • application/ion: results are returned as an Ion data stream. This is the default format if no Accept header is specified. It cannot be used in combination with the json query parameter.
  • application/json: results are returned as a JSON array.
  • application/x-ndjson or application/x-jsonlines: results are returned as NDJSON records.

The Accept header is ignored for HEAD requests.
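To show how the two text-based formats differ on the client side, here is a small Python sketch that decodes a response body according to its content type. decode_body is a hypothetical helper; Ion decoding is left out because it requires a third-party library.

```python
import json


def decode_body(content_type: str, body: str) -> list:
    """Decode a query response body into a list of records."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        # The whole body is a single JSON array.
        return json.loads(body)
    if media_type in ("application/x-ndjson", "application/x-jsonlines"):
        # One JSON value per line. Split on "\n" only, because
        # str.splitlines() would also break on characters that may
        # legally appear inside JSON strings.
        return [json.loads(line) for line in body.split("\n") if line.strip()]
    raise ValueError(f"unsupported content type: {content_type}")
```

For example, the NDJSON body {"count": 4968003} from the playground example in this document decodes to a list holding one record.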

If-None-Match

The If-None-Match header can be used to make query execution conditional on the table contents changing between requests.

The value of the If-None-Match header should be the Etag header value from a previous query response. If the If-None-Match header is present and the Etag of the response would be equal to the value of the If-None-Match header, the server will respond with 304 Not Modified for GET and HEAD requests and 412 Precondition Failed for POST requests. Since no query is executed for 304 Not Modified or 412 Precondition Failed responses, no billable activity is accrued to your user account.
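The rule above can be sketched in Python as follows. The helper names are hypothetical; the sketch only builds the request and interprets a status code, it does not contact the service.

```python
import urllib.request


def conditional_request(url: str, etag: str,
                        method: str = "GET") -> urllib.request.Request:
    """Attach a previously returned Etag value as If-None-Match."""
    req = urllib.request.Request(url, method=method)
    # The Etag value is passed through unchanged, including its quotes.
    req.add_header("If-None-Match", etag)
    return req


def result_unchanged(method: str, status: int) -> bool:
    """Return True if the status means the cached result is still current."""
    if method in ("GET", "HEAD"):
        return status == 304  # Not Modified
    if method == "POST":
        return status == 412  # Precondition Failed
    return False
```

When such a request is sent with urllib.request.urlopen, both 304 and 412 surface as urllib.error.HTTPError, so the status would be taken from the exception's code attribute before calling result_unchanged.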

If-Modified-Since

The If-Modified-Since header can be used to make query execution conditional on the table contents having been modified after a particular time. (The If-Modified-Since header can only be used on GET and HEAD requests.)

If the If-Modified-Since value is a timestamp that is strictly after the latest modification time of any of the data referenced by the query, then the query will not be executed and the server will respond with 304 Not Modified and an empty response body.
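HTTP date headers use a fixed GMT format, which the Python standard library can produce and parse. The sketch below formats a timestamp for If-Modified-Since and mirrors the "strictly after" comparison described above; the function names are hypothetical, and the comparison only illustrates the rule, since the server makes the actual decision.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(ts: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date header value."""
    return format_datetime(ts.astimezone(timezone.utc), usegmt=True)


def would_skip_query(if_modified_since: datetime, last_modified: datetime) -> bool:
    """True if the timestamp is strictly after the latest modification time."""
    return if_modified_since > last_modified


# A Last-Modified value taken from a response can be parsed and reused.
last_modified = parsedate_to_datetime("Fri, 28 Apr 2023 05:30:05 GMT")
```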

Response Headers

Content-Type

The Content-Type response header indicates the format of the response. (The json query parameter and the Accept request header determine which Content-Type is used for the response.)

Content-Type will be one of these values:

  • application/x-ndjson for NDJSON responses.
  • application/json for JSON responses.
  • application/ion for Ion-encoded responses.
  • text/plain (only used when returning errors).

X-Sneller-Query-ID

The X-Sneller-Query-ID response header is a unique ID assigned to the HTTP request.

X-Sneller-Max-Scanned-Bytes

The X-Sneller-Max-Scanned-Bytes response header indicates the maximum number of bytes that could be scanned by the query in the original request.

The HEAD request method can be used in conjunction with X-Sneller-Max-Scanned-Bytes to compute the maximum cost of a query without actually executing it.
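A client could use this header as a guard before running an expensive query. The following Python sketch reads the header from a response's headers and compares it with a caller-chosen limit. The helper names and the idea of a byte budget are illustrative assumptions; no pricing is implied.

```python
import urllib.request


def head_request(url: str, token: str) -> urllib.request.Request:
    """Build a HEAD request for a query URL; the query itself is not executed."""
    req = urllib.request.Request(url, method="HEAD")
    req.add_header("Authorization", f"Bearer {token}")
    return req


def max_scanned_bytes(headers) -> int:
    """Read X-Sneller-Max-Scanned-Bytes from any mapping of header names to values."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return int(lowered["x-sneller-max-scanned-bytes"])


def within_budget(headers, byte_limit: int) -> bool:
    """True if the query can scan at most byte_limit bytes."""
    return max_scanned_bytes(headers) <= byte_limit
```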

Last-Modified

The Last-Modified response header indicates the time at which the most-recently-updated data block was changed. (This timestamp is computed only over data blocks that the query would actually scan. It is not necessarily computed over all the data blocks in the tables referenced by the query.)

The HEAD request method can be used in conjunction with Last-Modified to determine if the results of a previously-executed query would be different if the query were executed again.

Etag

The Etag response header is a hash computed over the current state of all the tables referenced by the query and the query text itself.

As with Last-Modified, the HEAD request method can be used in conjunction with Etag to determine if the results of a previously-executed query would be different if the query were executed again.

A returned Etag can also be used as the If-None-Match header in a subsequent request to make the request conditional on the underlying data for a query having changed.

Conditional Request Example

Here’s a GET request to the Sneller playground executed via curl. It returns the Etag "DJjx2iOpLUjBf4O0NtRXwjXOMQIpH4ZSqe4jHknKyjU".

$ curl -i -G 'https://play.sneller.ai/query?database=demo&json' --data-urlencode 'query=SELECT count(*) from gha where created_at > `2021-12-30T00:00:00Z`'
HTTP/2 200 
date: Wed, 03 May 2023 21:35:37 GMT
content-type: application/x-ndjson
content-length: 19
access-control-allow-credentials: true
access-control-allow-headers: Accept, Authorization
access-control-allow-methods: GET, POST
access-control-allow-origin: *
access-control-expose-headers: Etag, X-Sneller-Max-Scanned-Bytes, X-Sneller-Query-ID, X-Sneller-Total-Table-Bytes, X-Sneller-Version
cache-control: private, must-revalidate
etag: "DJjx2iOpLUjBf4O0NtRXwjXOMQIpH4ZSqe4jHknKyjU"
last-modified: Fri, 28 Apr 2023 05:30:05 GMT
vary: Accept, Authentication
x-sneller-max-scanned-bytes: 64381517824
x-sneller-query-id: 4d3d8a55-98a1-4be9-98e5-3d21f226e441
x-sneller-version: date: 2023-04-28T09:47:25Z, revision: c0b24b18b8b13eef56b30a6746b431d288230274

{"count": 4968003}

If we run the query a second time with If-None-Match set to the Etag returned from the first query, we get a 304 Not Modified response with no body:

$ curl -i -H 'If-None-Match: "DJjx2iOpLUjBf4O0NtRXwjXOMQIpH4ZSqe4jHknKyjU"' -G 'https://play.sneller.ai/query?database=demo&json' --data-urlencode 'query=SELECT count(*) from gha where created_at > `2021-12-30T00:00:00Z`'
HTTP/2 304 
date: Wed, 03 May 2023 21:36:12 GMT
access-control-allow-credentials: true
access-control-allow-headers: Accept, Authorization
access-control-allow-methods: GET, POST
access-control-allow-origin: *
access-control-expose-headers: Etag, X-Sneller-Max-Scanned-Bytes, X-Sneller-Query-ID, X-Sneller-Total-Table-Bytes, X-Sneller-Version
cache-control: private, must-revalidate
etag: "DJjx2iOpLUjBf4O0NtRXwjXOMQIpH4ZSqe4jHknKyjU"
last-modified: Fri, 28 Apr 2023 05:30:05 GMT
vary: Accept, Authentication
x-sneller-max-scanned-bytes: 64381517824
x-sneller-query-id: 3a1783f9-daf7-4013-b041-cc25cac9b2a7
x-sneller-version: date: 2023-04-28T09:47:25Z, revision: c0b24b18b8b13eef56b30a6746b431d288230274

Databases endpoint (/databases)

The /databases endpoint only allows the GET and HEAD methods. The endpoint returns 200 OK and the GET method also returns the list of databases as a JSON array:

[{"name":"db1"},{"name":"db2"}]

The HEAD method only checks if the request is valid and if the bucket holding the databases can be accessed.

URL Query Parameters

  • pattern (optional) restricts the listing to databases that match the specified pattern. Patterns can use the _ (any single character) or % (zero or more characters) wildcards.
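Because % is also the URL escape character, a pattern has to be percent-encoded when placed in the URL. A minimal Python sketch of building the request URL and reading the documented response shape follows; the function names are hypothetical.

```python
import json
import urllib.parse
from typing import Optional


def databases_url(base_url: str, pattern: Optional[str] = None) -> str:
    """Build the /databases URL, percent-encoding the optional pattern."""
    url = base_url + "/databases"
    if pattern is not None:
        url += "?" + urllib.parse.urlencode({"pattern": pattern})
    return url


def database_names(body: str) -> list:
    """Extract the names from a response such as [{"name":"db1"},{"name":"db2"}]."""
    return [entry["name"] for entry in json.loads(body)]
```

With the pattern db%, the wildcard is sent as %25 in the URL.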

Tables endpoint (/tables)

The /tables endpoint only allows the GET and HEAD methods.

The endpoint returns 200 OK and the GET method also returns the list of tables for the specified database as a JSON array:

["table1","table2","table3"]

If the specified database doesn’t exist, then 404 Not Found is returned. The HEAD method only checks if the request is valid and if the database exists.

URL Query Parameters

  • database (mandatory) specifies the database whose tables should be listed. The endpoint returns 400 Bad Request if this parameter is not set.
  • pattern (optional) restricts the listing to tables that match the specified pattern. Patterns can use the _ (any single character) or % (zero or more characters) wildcards.
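To make the wildcard semantics concrete, the Python sketch below emulates them on the client by translating a pattern into a regular expression. This is an illustration only: the matching is done by the server, and details such as case sensitivity or escaping of literal _ and % are not documented here and are assumed away. The function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate _ (one character) and % (zero or more characters) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")
        elif ch == "%":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern: str, name: str) -> bool:
    """True if the whole name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None
```

Under these assumptions, table_ matches table1 but not table12, while table% matches both.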

Inputs endpoint (/inputs)

The /inputs endpoint lists the ingested source files for a specific database table and only allows the GET and HEAD methods. The HEAD method only checks if the request is valid and if the index can be read. This endpoint is limited to an execution duration of 30 seconds to avoid “endless” listing.

URL Query Parameters

  • database: the database that holds the table (mandatory).
  • table: the table for which the source files should be listed (mandatory).
  • max: the maximum number of source files to return. When max is not set, all source files are listed, which can take a long time for large tables with many source files. When max is set to 0, the request behaves like a HEAD request and only checks if the index can be opened.
  • start: start listing from the specified source file (see also next). The specified source file is included in the results again.
  • next: start listing after the specified source file (see also start). The specified source file is not included in the results.
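The parameters above can be assembled as in this Python sketch, which builds an /inputs URL for one page of results. The function name and the keyword names max_files and next_file are hypothetical (chosen to avoid shadowing Python built-ins). The form of a source file identifier is not specified in this document, so the value passed as start or next should be taken from a previous response.

```python
import urllib.parse
from typing import Optional


def inputs_url(base_url: str, database: str, table: str,
               max_files: Optional[int] = None,
               start: Optional[str] = None,
               next_file: Optional[str] = None) -> str:
    """Build the /inputs URL; database and table are mandatory."""
    if not database or not table:
        raise ValueError("database and table are mandatory")
    params = {"database": database, "table": table}
    if max_files is not None:
        params["max"] = str(max_files)
    if start is not None:
        params["start"] = start      # listing includes this file
    if next_file is not None:
        params["next"] = next_file   # listing begins after this file
    return base_url + "/inputs?" + urllib.parse.urlencode(params)
```

A paging loop would request a bounded page with max, then pass the last source file of that page as next to fetch the following page.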

Request Headers

The inputs endpoint supports the following request headers. Note that the Authorization header is also required.

Accept

Sneller uses the Accept header to determine how to format responses. Acceptable values are as follows:

  • application/json: results are returned as a JSON array.
  • application/x-ndjson or application/x-jsonlines: results are returned as NDJSON records.

The Accept header is ignored for HEAD requests. If no Accept header is specified, then the results are returned as application/x-ndjson.

Response Headers

Content-Type

The Content-Type response header indicates the format of the response. (The Accept request header determines which Content-Type is used for the response.)

Content-Type will be one of these values:

  • application/x-ndjson for NDJSON responses.
  • application/x-jsonlines for NDJSON responses (when explicitly asked using the Accept header).
  • text/plain (only used when returning errors).