Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,493
|
Comments: 51,044
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 103 words

A couple of months ago I had the joy of giving an internal lecture to our developer group about Voron, RavenDB’s dedicated storage engine. In the lecture, I’m going over the design and implementation of our storage engine.

If you ever had an interest on how RavenDB’s transactional and high performance storage works, that is the lecture for you. Note that this is aimed at our developers, so we are going deep.

You can find the slides here and here is the full video.

time to read 1 min | 99 words

One of the most fun things that I do at work is share knowledge about how various things work. A few months ago I talked internally about how Certificates work. Instead of just describing the mechanism of that, I decided to actually walk our developers through the process of building the certificate infrastructure from scratch.

You can find the slides here and the full video is available online, it’s just over an hour of both lecture and discussion.

time to read 1 min | 127 words

I’m trying to pay a SaaS bill online, and I run into the following issue. I have insufficient permissions to pay the invoice on the account. No insufficient funds, which is something that you’ll routinely run into when dealing with payment processing. But insufficient permissions!

Is… paying something an act that requires permissions? That something that happens? Can I get more vulnerabilities like that? When I get people to drive-by pay for my bills?

I can’t think of a scenario where you are prevented from paying to the provider. That is… weird.

And now I’m in this “nice” position where I have to chase after the provider to give them money, because otherwise they’ll close the account.

time to read 3 min | 566 words

We got an interesting question in the RavenDB Discussion:

We have Polo (shirts) products. Some customers search for Polo and others search for Polos. The term Polos exists in only a few of the descriptions and marketing info so the results are different.

Is there a way to automatically generate singular and plural forms of a term or would I have to explicitly add those?

What is actually requested here is to perform a process known as stemming. Turning a word into its root. That is a core concept in full-text search, and RavenDB allows you to make use of that.

The idea is that during indexing and queries, RavenDB will transform the search terms into a common stem and search on that. Let’s look at how this works, shall we?

The first step is to make an index named Products/Search with the following definition:


from p in docs.Products
select new { p.Name }

That is about as simple an index as you can get, but we still need to configure the indexing of the Name field on the index, like so:

You can see that I customized the Name field and marked it for full-text search using the SnowballAnalyzer, which is responsible for properly stemming the terms.

However, if you try to create this index, you’ll get an error. By default, RavenDB doesn’t include the SnowballAnalyzer, but that isn’t going to stop us. This is because RavenDB allows users to define custom analyzers.

In the database “Settings”, go to “Custom Analyzers”:

And there you can  add a new analyzer. You can find the code for the analyzer in question in this Gist link.

You can also register analyzers by compiling them and placing the resulting DLLs in the RavenDB binaries directory. I find that having it as a single source file that we push to RavenDB in this manner is far cleaner.

Registering the analyzer via source means that you don’t need to worry about versioning, deploying to all the nodes in the cluster, or any such issues. It’s the responsibility of RavenDB to take care of this.

I produced the analyzer file by simply concatenating the relevant classes into a single file, basically creating a consolidated version containing everything required. That is usually done for C or C++ projects, but it is very useful in this case as well. Note that the analyzer in question must have a parameterless constructor. In this case, I just selected an English stemmer as the default one.

With the analyzer properly registered, we can create the index and start querying on it.

As you can see, we are able to find both plural and singular forms of the term we are searching for.

To make things even more interesting, this functionality is available with both Lucene and Corax indexes, as Corax is capable of consuming Lucene Analyzers.

The idea behind full-text search in RavenDB is that you have a full-blown indexing engine at your fingertips, but none of the complexity involved. At the same time, you can utilize advanced features without needing to move to another solution, everything is in a single box.

time to read 2 min | 270 words

RavenDB is typically accessed directly by your application, using an X509 certificate for authentication. The same applies when you are connecting to RavenDB as a user.

Many organizations require that user authentication will not use just a single factor (such as a password or a certificate) but multiple. RavenDB now supports the ability to define Two Factor Authentication for access.

Here is how this looks like in the RavenDB Studio:

You are able to generate a certificate as well as register the Authenticator code in your device.

When using the associated certificate, you’ll not be able to access RavenDB. Instead, you’ll get an error message saying that you need to complete the Two Factor Authentication process. Here is what that looks like:

Once you complete the two factor authentication process, you can select for how long we’ll allow access with the given certificate and whatever to allow just accesses from the current browser window (because you are accessing it directly) or from any client (you want to access RavenDB from another device or via code).

Once the session duration expires, you’ll need to provide the authentication code again, of course.

This feature is meant specifically for certificates that are used by people directly. It is not meant for APIs or programmatic access. Those should either have a manual step to allow the certificate or utilize a secrets manager that can have additional steps and validations based on your actual requirements.

You can read more about this feature in the feature announcement.

time to read 1 min | 101 words

When Oren Eini originally developed RavenDB, he used the Lucene library to implement indexing. Eventually, his team encountered limitations with this strategy, so they created the Corax search engine, which improved query execution time significantly. Oren discusses the challenges involved in creating this engine and the approaches they took to overcome these challenges.

Part 1:

Part 2:

time to read 2 min | 399 words

One of the interesting components of RavenDB Cloud is status reporting. It turns out that when you offer X as a Service, people really care about your operational status.

For RavenDB Cloud, we have https://status.ravendb.net/, which will give you some insights into the overall health of the system. Here are some details from the status page:

 

The interesting thing about this page is that it shows global status, indicating issues affecting large swaths of users. For instance, Azure having issues in a whole region in the image above is a great example of one such scenario. Regular maintenance, which we carry over the span of days, is something that we report, but you’ll usually never notice (due to the High Availability features of RavenDB).

It gets more complicated when we start talking about individual instances. There are many scenarios where the overall system health is great, but a particular database may suffer. The easiest example is if you run out of disk space. That affects that particular instance only.

For that scenario, we are reporting Production Monitoring Alerts within the RavenDB Cloud portal. Here is what this looks like:

 

 

As you can see, we report specific problems on those instances, raising that to your awareness. That was actually needed because, for the most part, RavenDB itself handles those sorts of things via High Availability, which means that even if there are issues, you’re likely to not feel them for a while.

Resilience at the cluster level means that even pretty severe problems are papered over and the system moves on. But there is only so much limping that you can do. If you are running at the bare edge of capacity, eventually you’ll trip over the line.

Those Production Monitoring Alerts allow you to detect and act upon those issues when they happen, not when they bring down production.

This aligns with our vision for RavenDB, the kind of system where you don’t need to have a full-time babysitter monitoring the system. Instead, if there is a problem that the database cannot solve on its own, it will explicitly notify you, in advance.

That leads to a system that is far healthier all around and means that you can focus on building your system, rather than managing database minutiae.

time to read 4 min | 792 words

RavenDB can run on the Raspberry Pi, it is actually an important use case for us when our users are deploying RavenDB as part of Internet of Things systems. We wanted to showcase RavenDB’s performance and decided that instead of scaling up and showing you how well RavenDB does ridiculous loads, we’ll go the other way around. We’ll go small, and let you directly experience how efficient RavenDB is.

You can look at the demo unit directly on this page.

We decided to dial it down yet further, and run RavenDB on the Raspberry Pi Zero.

This tiny computer is about the size of a cigarette lighter and is small enough to comfortably fit on your keychain. Most Raspberry Pis are impressive machines given their cost, more than powerful enough to power real applications.

Here is what this actually looks like, with me as a reference for size 🙂.

However, just installing RavenDB on the Zero isn't much of a challenge or particularly interesting, to be honest. We wanted to do something that would be both fun and useful. One of the features we want users to explore is the ability to run RavenDB in appliance mode. The question is, what sort of an appliance will we build?

A key part of our thinking was that we wanted to show something that works with realistic data sizes. We wanted to have an actual use case for this, beyond just showing a toy. One of the things that I always find maddening about being disconnected is that I feel like half my brain has been cut away.

We set out to fix that, the project is to create a knowledge system inside the Pi Zero that would be truly Plug & Play. That turned out to be quite a challenge, but I think we met it in a very nice manner.

We went to archive.org and got some of the Stack Exchange data sets. In particular, we got the datasets that are most interesting for DevOps scenarios. In particular, we have raspberrypi.stackexchange.com, unix.stackexchange.com, serverfault.com, and superuser.com.

I find it deliciously recursive that we can use the Raspberry Pi Zero to store the dataset about the Raspberry Pi itself. We loaded all those datasets into the Zero, for a total of about 7.5 GB, and over 4.2 million documents were stored there.

Note that this is using RavenDB’s document compression, which reduced the total size by over 50% over the original dataset size.

Next was the time to actually make this accessible. Just working with RavenDB directly to query the data is cool, for sure, but we wanted to be useful.

So we built a portal to access the data. Here is what it looks like when you enter it for the first time:

We offer full search capabilities and complete offline access to all those data sets. Perfect when you are stuck in the middle of nowhere and urgently need to remember that awk syntax or how to configure networking on a stubborn device.

Another aspect that we have to consider is how this can work? The Raspberry Pi Zero is a tiny device, and actually working with it can be annoying. It needs Micro-USB power but has no ethernet or standard USB ports. For display, it uses a mini HDMI port. That means that you can safely assume that you’re likely to have a power cable for it, but not much else.

We want to provide a good solution, so what do we do? The Raspberry Pi Zero we use does have a wifi chip, so we took things further and set it up as an access point with a captive portal.

You can read exactly how we configured that in this post.

In other words, the expected deployment model is to plug this into power, wait 30 seconds for the machine to boot, and then connect to the “Hugin” wireless network. You will then land directly into the application, able to deep dive into the questions of your choice.

We have been giving away those appliances at the DevWeek conference, and we got a really good reaction from users. Beyond the coolness factor, the fact that we can run a high-performance system on top of a… challenging hardware platform (512MB RAM, 1Ghz RAM, SD Card for disk) and still provide sub-100ms response times is quite amazing.

You can view the project page here, the entire thing is Open Source, and you can explore how we are able to do that on GitHub.

time to read 6 min | 1075 words

I’m really happy to announce that RavenDB Cloud is now offering NVMe based instances on the Performance tier. In short, that means that you can deploy RavenDB Cloud clusters to handle some truly high workloads.

You can learn more about what is actually going on in our documentation. For performance numbers and costs, feel free to skip to the bottom of this post.

I’m assuming that you may not be familiar with everything that a database needs to run fast, and this feature deserves a full explanation of what is on offer. So here are the full details of what you can now do.

RavenDB is a transactional database that often processes far more data than the memory available on the machine. Consequently, it needs to read from and write to  the disk. In fact, as a database, you can say that it is its primary role. This means that one of the most important factors for database performance is the speed of your disk. I have written about the topic before in more depth, if you are interested in exploring the topic.

When running on-premises, it’s easy to get the best disks you can. We recommend at least good SSDs and prefer NVMe drives for best results. When running on the cloud, the situation is quite different. Machines in the cloud are assumed to be transient, they come and go. Disks, on the other hand, are required to be persistent. So a typical disk on the cloud is actually a remote storage device (typically replicated). That means that disk I/O on the cloud is… slow. To the point where you could get better performance from off-the-shelf hardware from 20 years ago.

There are higher tiers of high-performance disks available in the cloud, of course. If you need them, you are paying quite a lot for that additional performance. There is another option, however. You can use NVMe disks on the cloud as well. Well, you could, but do you want to?

The reason you’d want to use an NVMe disk in the cloud is performance, of course. But the problem with achieving this performance on the cloud is that you lose the safety of “this disk is persistent beyond the machine”. In other words, this is literally a disk that is attached to the physical server hosting your VM. If something goes wrong with that machine, you lose the disk. Traditionally, that means that you can only use that for transient data, not as the backend store for a database.

However, RavenDB has some interesting options to deal with this. RavenDB Cloud runs RavenDB clusters with 3 copies of the data by default, operating in a full multi-master configuration. Given that we already have multiple copies of the data, what would happen if we lost a machine?

The underlying watchdog would realize that something happened and initiate recovery, which will effectively spawn the instance on another node. That works, but what about the data? All of that data is now lost. The design of RavenDB treats that as an acceptable scenario, the cluster would detect such an issue, move the affected node to rehabilitation mode, and start pumping all the data from the other nodes in the cluster to it.

In short, now we’ve shifted from a node failure being catastrophic to having a small bump in the data traffic bill at the end of the month. Thanks to its multi-master setup, RavenDB can recover even if two nodes go down at the same time, as we’ll recover from the third one. RavenDB Cloud runs the nodes in the cluster in separate availability zones specifically to handle such failure scenarios.

We have run into this scenario multiple times, both as part of our testing and as actual production events. I am happy to say that everything works as expected, the failed node comes up empty, is filled by the rest of the cluster, and then seamlessly resumes its work. The users were not even aware that something happened.

Of course, there is always the possibility that the entire region could go down, or that three separate instances in three separate availability zones would fail at the same time. What happens then? That is expected to be a rare scenario, but we are all about covering our edge cases.

In such a scenario, you would need to recover from backup. Clusters using NVMe disks are configured to run using Snapshot backups, which consume slightly more disk space than normal but can be restored more quickly.

RavenDB Cloud also blocks the user's ability to scale up or down such clusters from the portal and requires a support ticket to perform them. This is because special care is needed when performing such operations on NVMe machines. Even with those limitations, we are able to perform such actions with zero downtime for the users.

And after all this story, let’s talk numbers. Take a look at the following table illustrating the costs vs. performance on AWS (us-east-1):

Type# of coresMemoryDiskCost ($ / hour)
P40 (Premium disk)1664 GB2 TB, 10,000 IOPS, 360 MB/s8.790
PN30 (NVMe)864 GB2 TB, 110,000 IOPS, 440 MB/s6.395
PN40 (NVMe)16128 GB4 TB, 220,000 IOPS, 880 MB/s12.782

The situation is even more blatant when looking at the numbers on Azure (eastus):

Type# of coresMemoryDiskCost ($ / hour)
P40 (Premium disk)1664 GB2 TB, 7,500 IOPS, 250 MB/s7.089
PN30 (NVMe)864 GB2 TB, 400,000 IOPS, 2 GB/s6.485
PN40 (NVMe)16128 GB4 TB, 800,000 IOPS, 4 GB/s12.956

In other words, you can upgrade to the NVMe cluster and actually reduce the spend if you are stalled on I/O. We expect most workloads to see both higher performance and lower cost from a move from P40 with premium disk to PN30 (same amount of memory, fewer cores). For most workloads, we have found that IO rate matters even more than core count or CPU horsepower.

I’m really excited about this new feature, not only because it can give you a big performance boost but also because it leverages the same behavior that RavenDB already exhibits for handling issues in the cluster and recovering from unexpected failures.

In short, you now have some new capabilities at your fingertips, being able to use RavenDB Cloud for even more demanding workloads. Give it a try, I hear it goes vrooom 🙂.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. Recording (13):
    05 Mar 2024 - Technology & Friends - Oren Eini on the Corax Search Engine
  2. Meta Blog (2):
    23 Jan 2024 - I'm a JS Developer now
  3. Production postmortem (51):
    12 Dec 2023 - The Spawn of Denial of Service
  4. Challenge (74):
    13 Oct 2023 - Fastest node selection metastable error state–answer
  5. Filtering negative numbers, fast (4):
    15 Sep 2023 - Beating memcpy()
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}