LowLevelers
Published on

SolomonDB Weekly Update (#2): Palmon

Authors
Thumbnail

Support my project by visiting SolomonDB Github and give me a star ⭐️

Weekly Update #2: Palmon

Date: November 2022 (1-7/11/2022)

fig

Palmon is one of the Kings of Hell, more obedient to Lucifer than other kings are, and has two hundred legions of demons under his rule. He has a great voice and roars as soon as he comes, speaking in this manner for a while until the conjurer compels him and then he answers clearly the questions he is asked. When a conjurer invokes this demon he must look towards the northwest, the direction of Paimon's house, and when Paimon appears he must be allowed to ask the conjurer what he wishes and be answered, in order to obtain the same from him. - Wikipedia

Commits on Nov 1, 2022

Description

The prior implementation of RocksDBDriver was using the database instance initialized using DB object. And all methods are called from the object DB instance too. However, this approach has some limitations in terms of concurrency operations. Briefly, I migrate it from using DB to handling operations using transactions - a better control flow approach. Instead of using TransactionDB, I choose OptimisticTransactionDB which is more efficient for read contention database, correct me if I am wrong.

You can read more about this in RocksDB Wiki

Detail explanation

Even though there is only one commit and only today, this commit has a lot of code refractors. An obvious refractor is migrating from Driver to Adapter which follows adapter pattern.

  • StorageDriver to StorageAdapter
  • RocksDBDriver to RocksDBAdapter

Walk through the implementation of StorageAdapter, we have

pub struct StorageAdapter<T> {
    pub name: StorageAdapterName,
    pub db_instance: Pin<Arc<T>>,
    pub variant: StorageVariant,
}

You may notice the use of generic type Pin<Arc<T>>. Explain more about why I wrote this. Pin trait will pin the object in a memory, which means it can't be move to a different location in a memory, for example, using std::mem::swap. Inside the Pin type, we have Arc or Atomically Reference Counted.Arc type shares ownership between threads, which is different from single threaded type Rc. Hence, this is usually used for handling multi-threaded operations and it is suitable for our distributed database design.

New method added to RocksDBAdapter to create a transaction

pub fn transaction(
        self: &'static Self,
        w: bool,
        r: bool,
    ) -> Result<Transaction<TxType>, Error> {
        let inner = self.get_inner();
        let db_instance = &inner.db_instance;
        let tx = db_instance.transaction();

        // The database reference must always outlive
        // the transaction. If it doesn't then this
        // is undefined behaviour. This unsafe block
        // ensures that the transaction reference is
        // static, but will cause a crash if the
        // datastore is dropped prematurely.
        let tx = unsafe {
            std::mem::transmute::<
                rocksdb::Transaction<'_, OptimisticTransactionDB>,
                rocksdb::Transaction<'static, OptimisticTransactionDB>,
            >(tx)
        };

        Ok(Transaction::<TxType>::new(tx, w, r))
    }

There is a head-aching concept in this method, we have an unsafe method use std::mem::transmute. This is not recommended to use as it transform the lifetime of an object. The reason why we use this method here is because, we need to cast the original lifetime of OptimisticTransactionDB to static as we want the transaction remains as long as it can until the program stops. This is referenced from the source code of SurrealDB.

On the other hand, we have an implementation for internal transaction

impl Transaction<TxType>

Any storage adapter can be understand as a bridge to create transaction, it does not store any transaction value. This separation provides the ability to control a single transaction each operation instead of a whole db_instance.

Contributors


Commits on Nov 2, 2022

Description

  • Enhancing the development experience by configuring the CI/CD pipeline using Github Actions.
  • Implement basic methods of RocksDB OptimisticDB transaction

Detail explanation

There are two workflows added:

  • Formatter (check + apply) workflow: Use cargo clippy and cargo fmt to format and lint the repo.
  • Test: Run cargo test on the workspace whenever there's an update to master branch

Every datastore transaction will be marked generically as DBTransaction or Distributed Database Transaction. This is implied by Solomon DB technical directory. Transaction will implement a trait that requires these below method

// Check if closed
fn closed(&self) -> bool;
// Cancel a transaction
fn cancel(&mut self) -> Result<(), Error>;
// Commit a transaction
fn commit(&mut self) -> Result<(), Error>;
// Check if a key exists
fn exi<K>(&mut self, key: K) -> Result<bool, Error>
where
K: Into<Key>;
// Fetch a key from the database
fn get<K>(&mut self, key: K) -> Result<Option<Val>, Error>;
// Insert or update a key in the database
fn set<K, V>(&mut self, key: K, val: V) -> Result<(), Error>;
// Insert a key if it doesn't exist in the database
fn put<K, V>(&mut self, key: K, val: V) -> Result<(), Error>;
// Delete a key
fn del<K>(&mut self, key: K) -> Result<(), Error>;

Contributors


Commits on Nov 4, 2022

Description

Write a new macro to auto generate test code for datastore adapter. Whenever there's a new datastore implemented, we can add it to the test suite easily by

full_test_impl!(RocksDBAdapter::default());

This code implementation is referenced from IndraDB. On the other hand, these commits add a new feature tag called test-suite which must be declared to allow all test runs.

[features]
default = ["kv-rocksdb"]
kv-rocksdb = ["dep:rocksdb"]
+ test-suite = []

To run cargo test or cargo nextest enabling the test-suite feature, can follow these commands to run

cargo test --features test-suite
cargo nextest run --features test-suite

Detail explanations

The logic behind the new macro is not too complicated, the macro define_test! receive any datastore_adapter as an input along with the name for the test. This name is also a name of methods exported from crate tests. This approach is required to overpass the type strictness of DatastoreAdapter as we will support multiple types of datastore adapter.

/// Defines a unit test function.
#[macro_export]
macro_rules! define_test {
	($name:ident, $datastore_adapter:expr) => {
		#[tokio::test]
		async fn $name() {
			let datastore_adapter = $datastore_adapter;
			$crate::tests::$name(datastore_adapter).await;
		}
	};
}

/// Use this macro to enable the entire standard test suite.
#[macro_export]
macro_rules! full_test_impl {
	($code:expr) => {
		#[cfg(test)]
		define_test!(should_delete_key, $code);
	};
}

Contributors


Commits on Nov 5, 2022

Description

Introducing a new struct DatastoreManager which is generated using macro impl_datastore_manager. The idea behinds this implementation is to manage all datastore adapter in the cleanest way.

  • Adding method to generate a random datastore path
/// Generate a path to store data for RocksDB
fn generate_path(id: Option<i32>) -> String {
	 let random_id: i32 = generate_random_i32();
	 let id = &id.unwrap_or(random_id).to_string();
- String::from("./.temp/rocks-") + id + ".db"
+ let path = if cfg!(target_os = "linux") {
+	 "/dev/shm/".into()
+ } else {
+	 temp_dir()
+ }
+ .join(format!("solomon-rocksdb-{}", id));
+
+	path_to_string(&path).unwrap()
}
  • Adding column family to RocksDB

Adding models for graph related struct: Node - Label - Property - Relationship

Detail explanation

RocksDB Column Family

As being stated in the RocksDB Wiki:

Each key-value pair in RocksDB is associated with exactly one Column Family.

In SolomonDB design, there will be multiple column families specified for different set of data. For example: vertex, vertex-property, edge, edge-property and label. All methods of RocksDB transaction will include a column family attribute. For example, the get method

/// Fetch a key from the database
- async fn get<K: Into<Key> + Send>(&mut self, key: K) -> Result<Option<Val>, Error>;
+ async fn get<K: Into<Key> + Send>(&mut self, cf: CF, key: K) -> Result<Option<Val>, Error>;

Property Graph Design

Not sure but I remember this has been mentioned in a very first commit log, our graph database will follow the structure of Property Graph.

  • Vertex: Common object in every graph model. In Property Graph, vertex is more versatile. It has multiple properties, it can be considered as a document in NoSQL database. Vertex can also have multiple labels to identify itself.
  • Relationship: Or edge in single relational database. Indicates the link between two vertex. However, relationship can have properties too. It also has a type, for example, if two LOVER nodes connect, the relationship type should be LOVE. Defined as source_vertex -> relationship -> target_vertex.
  • Property: Define attribute type and name of object field. For example, vertex (or node) can have name, age, birthday and relationship can also have name. Structure of property is uuid | name | type. Property of each core objects (node and relationship) will be stored in a HashMap<Uuid, Vec<u8>> where Uuid = Property ID and Vec<u8> = Byte value for that property
  • Label: Vertex can have multiple labels. For example, one vertex can be a Person, Programmer and Employee at the same time. This can be misunderstood with Property. However, they are not the same. Labels are used for marking node instead of defining attributes.

Contributors


Commits on Nov 6, 2022

Description

  • Write DEVLOG for 2022 October CHANGLOG and November 1st commits

Detail explanation

In facts, I did try to add a Github Actions workflows to auto generate Github CHANGELOG. However, it did not work as expected so I just decide to write CHANGELOG on my own.

name: Changelog
on:
  release:
    types:
      - created
jobs:
  changelog:
    runs-on: ubuntu-20.04
    steps:
      - name: '✏️ Generate release changelog'
        uses: heinrichreimer/github-changelog-generator-action@v2.3
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

Contributors


Commits on Nov 7, 2022

Description

  • Updating Rust compiler version to support new feature in Rust 1.65.0: Generic Associated Type.
  • Adding CassandraDB
  • Write two new methods for Vertex controller

Detail explanation

Generic Associated Type is not new and it is common in Rust nightly channel. However, it was only officially released to stable channel in the latest version. Based on the given definition:

Generic associated types are an extension that permit associated types to have generic parameters. This allows associated types to capture types that may include generic parameters that came from a trait method. These can be lifetime or type parameters.

Here is how GAT used in SolomonDB code. We no longer have to care about Generic Type passed between structs. On the other hand, GAT Transaction force all implemented Transaction type to be SimpleTransaction.

#[async_trait]
- pub trait DatastoreAdapter<T: SimpleTransaction> {
+ pub trait DatastoreAdapter {
+   type Transaction: SimpleTransaction;
	// # Create new database transaction
	// Set `rw` default to false means readable but not readable
- fn transaction(&self, rw: bool) -> Result<T, Error>;
+ fn transaction(&self, rw: bool) -> Result<Self::Transaction, Error>;

	fn default() -> Self;

With GAT, we can make a fully reusable struct DatastoreAdapter and get rid of declaring Transaction generic type like RocksDBTransaction. The implementation of DatastoreAdapter instance for RocksDB will be

#[async_trait]
- impl DatastoreAdapter<RocksDBTransaction> for RocksDBAdapter {
+ impl DatastoreAdapter for RocksDBAdapter {
+ type Transaction = RocksDBTransaction;

	fn default() -> Self {
		let path = &RocksDBAdapter::generate_path(None);
		RocksDBAdapter::new(path, None).unwrap()

Super clean right? Worth a try if you have not updated yet.

Contributors