00:00:00When you're spinning up a new project and you need a database, what's the first option that
00:00:03comes to mind? SQLite? It's probably SQLite, right? I mean, it's great and it's reliable,
00:00:09it has zero config, it's an industry standard. But as our local data gets heavier and our queries
00:00:14get more complex, we're starting to hit that ceiling of what a single threaded file locking
00:00:20engine can do. But now there's a new player in town that's trying to solve these issues.
00:00:25It's called Stulab and it's a database engine written entirely in Rust and it just released
00:00:31a native Node.js driver that is quite frankly showing some really strong performance. The
00:00:36database is 138 times faster than SQLite. So in this video, we're going to take a look under the
00:00:43hood of Stulab, see how it works and run a live benchmarking test to see if it's really as powerful
00:00:50as it claims to be. It's going to be a lot of fun, so let's dive into it. So what exactly is Stulab?
00:01:00Well, at its core, Stulab is an embedded OLAP or online analytical processing database. Now if
00:01:06you're used to standard databases like SQLite or Postgres, those are typically OLTP or online
00:01:14transaction processing databases, which are, as the name suggests, optimized for transactions. But
00:01:20Stulab is different. It's designed for analytical workloads and it's built from the ground up in Rust
00:01:27focusing on high-speed data processing. Think of it as having the portability of a SQLite file,
00:01:33but with the raw analytical power of something like a DuckDB or BigQuery. But the coolest thing
00:01:39is that you can now use it with Node.js thanks to its native Node driver. Now when I say it's
00:01:45a native driver, I'm not talking about some standard wrapper. Usually when a database is
00:01:49written in a different language like Rust or C++, your Node.js app has to talk to it over a bridge.
00:01:56Often that means converting your data into JSON or some other format, sending it over a local network
00:02:02socket and then converting it back on the other side. That is called serialization overhead and
00:02:08it's a massive performance killer. But Stulab Node does things differently. It uses NAPI-RS,
00:02:15which is a framework that allows the Rust engine to be compiled into a native binary that loads
00:02:21directly into your Node.js process. So there's no bridge and no translator in between. When you send
00:02:27a query, Node.js and Rust are essentially sharing the same memory space. And there are three big
00:02:33reasons why Stulab is blazingly fast. First, it's using MVCC or multi-version concurrency control.
00:02:40Unlike SQLite where a single writer can log the entire database, Stulab allows multiple readers
00:02:47and writers to work at the same time. Secondly, it uses parallel execution. Stulab uses a scheduler
00:02:53called Rayon. And with Rayon, when you run a massive query, instead of sending on one CPU core, it breaks
00:03:00that query apart and utilizes every core your machine has. And thirdly, it uses a cost-based
00:03:06optimizer. So it doesn't just execute your SQL blindly, it actually analyzes your data, estimates
00:03:13the cost of different paths, and picks the fastest way to get to your results. So that's what supposedly
00:03:19makes Stulab a much faster option than SQLite. But let's test it out to see if that's actually true.
00:03:25For this test, we'll be using a simple Node.js project and we will install both Stulab and
00:03:30SQLite as dependencies. So one of the biggest wins where Stulab really shines is using count distinct.
00:03:37So I'm really curious to know if that really is the case. So I set up this simple script where we spin
00:03:43up an in-memory version of each database and create a symbol sales table. We then populate this table
00:03:49with 10,000 rows of random sales data where each row represents a sale from a user whose ID can vary
00:03:56from 0 to 1,000 and which has one of the specific categories. We will then batch insert the data in
00:04:03both databases and then run the benchmark where we select a distinct count of sales made by a specific
00:04:10user on a specific category and calculate the performance of each database. Now I do have to
00:04:16note that it is a bit frustrating that as of now the package installed does not work. If we run the
00:04:22benchmark test now, we see that it's complaining about a missing native binding. So the project's
00:04:28author clearly forgot to add the binaries or link the correct ones to the package. So what I had to
00:04:34do is build it from the source. So I cloned the repo, ran the build inside of it and then hopped
00:04:39back to my benchmark project and linked my source directory as the dependency. This is a bit
00:04:44frustrating at the moment, so I hope the authors of the project will fix this in the future. But
00:04:49nonetheless, after doing so, we're now finally able to run the benchmark. So let's do that now.
00:04:54As you can see, the count distinct operation is indeed a lot faster on Stulap, although not as
00:05:01much as advertised. It's only four times faster. So what if we add another zero to the number of
00:05:07data we want to populate and then run the test again with 1,000,000 rows? So let's try that now.
00:05:12So even for a million rows, Stulap is only six times faster, not 138 times. But nevertheless,
00:05:20it's still a great result. So that's the count distinct test. I decided to do another benchmark
00:05:26test to test out the distinct + order by operation. And in this second test, I did a setup where we
00:05:33ingest some random logs with different IP addresses and status codes, and then try to find distinct
00:05:39logs by IP address and status code pair, and then order them by IP ascending and status code
00:05:47descending. And as you can see, once we run this test, Stulap still performs better than SQLite,
00:05:53but not by 14 times, just 1 to 1.5 times faster. So the measurements listed here are a bit inflated,
00:06:01in my opinion. But nevertheless, Stulap is indeed faster than SQLite, as we just saw in the tests.
00:06:08To be fair, the author also mentions areas where SQLite still performed better than Stulap. And
00:06:13these are mostly situations where you perform single row operations. So Stulap is great for
00:06:19analytical and complex queries. So is Stulap the SQLite killer? Well, honestly, no. They're built
00:06:26for totally different things. SQLite is still your reliable daily driver for transactions,
00:06:32but Stulap might be your high performance race car for data analysis. But the fact that we now
00:06:38have a pure Rust analytical engine that we can now drop into a Node.js project with nappy RS
00:06:45is pretty great. The project owners just got to make sure they patch their current NPM package,
00:06:50so we don't have to build it from source. So there you have it, folks. That is Stulap in a nutshell.
00:06:55But what do you think about it? Is the performance boost worth switching to Stulap, or are you sticking
00:07:01with the reliability of SQLite? Let us know in the comments down below. And folks, if you found this
00:07:06video useful, please let me know by smashing that like button underneath the video. And also don't
00:07:11forget to subscribe to our channel. This has been Andris from Better Stack and I will see you in the
00:07:17next videos.