Database API

Before we dive into the theory and implementation of my toy database, let's first look at the toy database's API, which consists of the following methods:

set_time
begin_txn
write
read
read_without_txn
abort_txn
commit_txn
run_txn

Here is an example of using the database:

#![allow(unused)]
fn main() {
let db = DB::new("./tmp/data", Timestamp::new(10))
let txn1 = db.begin_txn().await;
let value = db.read::<String>("foo", txn1).await.unwrap();
if value == "bar" {
	db.write("baz", 20, txn1).await.unwrap();
}
let commit_result = db.commit_txn(txn1).await;
}

In the code snippet above, we created a database by providing a path to specify where to store the records. We then began a transaction, performed a write and a read, then committed the transaction.

An alternative way to perform transactions is with the run_txn method. In the snippet below, the run_txn function automatically begins a transaction and commits the transaction at the end of the function scope. It would also abort the transaction if the inner function panics.

#![allow(unused)]
fn main() {
db.run_txn(|txn_context| async move {
		let value = txn_context.read::<i32>("foo").await;
		if value == "bar" {
	    txn_context.write("foo", 12).await.unwrap();
    }
})
}

For more examples, feel free to check out the unit tests I wrote for my database.

Thread-safe

The database is thread-safe. If you wrap the database instance around an Arc, you can safely use it across different threads. For example:

#![allow(unused)]
fn main() {
let db = Arc::new(DB::new("./tmp/data", Timestamp::new(10)));

let db_1 = Arc::clone(db);
let key1 = "foo";
let key2 = "bar";
let task_1 = tokio::spawn(async move {
    db_1.run_txn(|txn_context| async move {
      txn_context.write(key1, 1).await.unwrap();
      txn_context.write(key2, 10).await.unwrap();
    })
    .await
});

let db_2 = Arc::clone(db);
let task_2 = tokio::spawn(async move {
    db_2.run_txn(|txn_context| async move {
      txn_context.write(key1, 2).await.unwrap();
      txn_context.write(key2, 20).await.unwrap();
    })
    .await;
});
tokio::try_join!(task_1, task_2).unwrap();
}

In the example above, the serializability of the database guarantees that either all of task1 is executed first or all of task2 is executed first.

Database Clock

The database is powered by a Hybrid Logical Clock (which we will cover later). The developer can choose to create a database instance that uses the system's time or a manual clock. A manual clock requires the developer to manually increment the physical time with the set_time function. This is useful for writing unit tests.

Building a Transactional Key-Value Database

Database API

Thread-safe

Database Clock