Henk-Jan Lebbink

Branchless Code With AVX-512

Sneller uses 16 parallel data lanes for almost all tasks, including loading and decompressing data, all without branches. To achieve this, we rely heavily on the predicated instruction execution provided by the AVX-512 instruction set. In this post, we walk through a simple example: converting a string to uppercase, an operation used frequently in our string processing functions.
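To give a feel for the idea, here is a minimal scalar sketch in Python. It is not Sneller's AVX-512 code; it only emulates predicated execution for 16 lanes of ASCII bytes. The function name `to_upper_ascii_lanes` and the constant `LANES` are made up for illustration, and non-ASCII (multi-byte UTF-8) input is deliberately passed through untouched. Each lane computes a 0/1 predicate with plain arithmetic and then applies a masked update, so no decision depends on an `if`.

```python
LANES = 16  # hypothetical constant: one entry per parallel lane


def to_upper_ascii_lanes(lanes):
    """Uppercase 16 ASCII bytes using a per-lane predicate instead of an if."""
    assert len(lanes) == LANES
    out = []
    for c in lanes:  # the loop stands in for 16 lanes processed at once
        # Distance from 'a', wrapped to an unsigned byte (0..255).
        d = (c - 0x61) & 0xFF
        # Predicate: 1 when d < 26 (c is 'a'..'z'), else 0.
        # Subtracting 26 in 16 bits sets bit 15 only when d is below 26.
        is_lower = ((d - 26) & 0xFFFF) >> 15
        # Masked update: clear bit 5 (0x20) only where the predicate is 1.
        out.append(c & ~(is_lower << 5) & 0xFF)
    return out


if __name__ == "__main__":
    text = b"Sneller avx-512!"  # exactly 16 bytes, one per lane
    print(bytes(to_upper_ascii_lanes(list(text))))
```

In real AVX-512 code the predicate would live in a mask register and the update would be a single masked instruction over all lanes; the sketch only mirrors that compare-then-masked-update shape.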


Accelerating Fuzzy Search using AVX-512

We present our SQL fuzzy string comparison and fuzzy contains functions, which process data at multiple GiB/s without any preprocessing or indexing. Yes, that is right: fuzzy matching, yet no planning needed!


Sneller Regex vs Ripgrep

We compare the speed of ripgrep and Sneller’s SQL regular expression engine. We conclude that Sneller is faster on large text files because it uses multiple threads and makes better use of the hardware, despite the performance penalty it pays for decompressing data.


Accelerating Regular Expressions with AVX-512

We present a high-performance regular expression engine that uses 16 parallel lanes and needs neither branching nor backtracking. The engine is developed for the Intel Ice Lake processor and is written in AVX-512 assembly.


Blazing Fast Unicode-aware ILIKE in AVX-512

We present a method for case-insensitive comparison of UTF-8 encoded strings that uses 16 parallel lanes and no branching. This method implements the ILIKE operator for the Intel Skylake-X processor and is written in AVX-512 assembly.
