I am developing a database server similar to Cassandra.
Development started in C, but things became very complicated without classes.
I made a test port to Java, but Java is not well suited to this task: in C / C++ I can pass pointers around without copying the objects, and in Java I cannot.
I have now ported everything to C++11, but I am still learning "modern" C++ and have doubts about a lot of things.
I asked lots of questions on Stack Overflow and got good answers, but it seems I have an XY problem :). So I will try to explain what I need to do, and perhaps someone can suggest a good solution.
So:
The database works with key / value pairs. Every pair carries some extra information: when it was created and when it will expire (0 if it never expires).
The key is a C string and the value is a void *, but at least for the moment I treat the value as a C string as well.
There is an abstract "List" class. Three classes inherit from it: something like a vector, a linked list and a skip list. In the past I did tests with a hash table, and in the future I might add a red-black tree as well.
Each List contains zero or more *pointers* to pairs, sorted by key.
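To make the hierarchy concrete, here is a minimal sketch of what such an interface could look like. It is not the HM3 code: `IList`, `VectorList`, `PairPtr` and the simplified `Pair` (two `std::string` members instead of the variable-size blob) are hypothetical names chosen only for illustration. Only the vector-backed variant is shown; a linked list or skip list would derive from the same base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the real variable-size pair (hypothetical).
struct Pair {
    std::string key;
    std::string val;
};

using PairPtr = std::shared_ptr<const Pair>;

// Abstract interface; each container type would derive from it.
class IList {
public:
    virtual ~IList() = default;
    virtual void put(PairPtr p) = 0;                        // insert or overwrite
    virtual PairPtr get(const std::string &key) const = 0;  // nullptr if missing
    virtual std::size_t size() const = 0;
};

// Vector-backed list that keeps its pointers sorted by key.
class VectorList : public IList {
public:
    void put(PairPtr p) override {
        assert(p != nullptr);
        auto it = std::lower_bound(items_.begin(), items_.end(), p->key, keyLess);
        if (it != items_.end() && (*it)->key == p->key)
            *it = std::move(p);               // same key: replace the pointer
        else
            items_.insert(it, std::move(p));  // new key: insert at sorted position
    }

    PairPtr get(const std::string &key) const override {
        auto it = std::lower_bound(items_.begin(), items_.end(), key, keyLess);
        if (it != items_.end() && (*it)->key == key)
            return *it;
        return nullptr;
    }

    std::size_t size() const override { return items_.size(); }

private:
    // Comparator for std::lower_bound: element on the left, search key on the right.
    static bool keyLess(const PairPtr &p, const std::string &key) {
        return p->key < key;
    }

    std::vector<PairPtr> items_;
};
```

Because the list stores only shared pointers, replacing or moving an entry never copies the pair itself.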
If a list becomes too long, it can be saved to disk in a special file. This special file is a kind of read-only list.
To search for a key, the in-memory list is searched first (skip list, vector or linked list). Then the search is sent to the files, ordered by date (newest file first, oldest file last). All files are mmap-ed into memory.
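The search order can be sketched as follows. This is only an illustration under simplifying assumptions: `Table` is a plain `std::map` standing in for both the in-memory list and the mmap-ed files, and `lookup` is a hypothetical name, not a function from the project.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for any sorted key/value table, in memory or on disk (assumption).
using Table = std::map<std::string, std::string>;

// Search the in-memory table first, then the disk tables from newest to oldest.
// Returns a pointer to the first value found, or nullptr if no table has the key.
const std::string *lookup(const Table &memory,
                          const std::vector<Table> &disk_newest_first,
                          const std::string &key) {
    auto it = memory.find(key);
    if (it != memory.end())
        return &it->second;

    for (const Table &table : disk_newest_first) {
        auto jt = table.find(key);
        if (jt != table.end())
            return &jt->second;   // a newer file shadows older ones
    }
    return nullptr;
}
```

The first hit wins, so a key stored in memory hides the same key in any file, and a newer file hides older ones.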
-------------
What is currently puzzling me is the following:
The pairs have different sizes. They are allocated with new, and std::shared_ptr instances point to them.

    struct Pair::Blob{
        uint64_t created;   // 8 bytes
        uint32_t expires;   // 4 bytes; 136 years, not that bad
        uint32_t vallen;    // 4 bytes
        uint16_t keylen;    // 2 bytes
        uint8_t  checksum;  // 1 byte
        char     buffer[1]; // dynamic size
    };

The "buffer" member variable is the one whose size varies.
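For readers unfamiliar with this layout, here is a rough sketch of how such a variable-size object can be allocated and owned by a std::shared_ptr with a custom deleter. It is an assumption-laden illustration, not the project's code: `makeBlob`, `blobKey` and `blobVal` are hypothetical helpers, the struct is a free-standing copy of the one above, and storing the key and value back to back with '\0' terminators is my guess at the buffer contents. Writing past the declared one-element array is the classic "struct hack"; compilers generally accept it, but the C++ standard does not formally guarantee it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>

struct Blob {
    uint64_t created;
    uint32_t expires;
    uint32_t vallen;
    uint16_t keylen;
    uint8_t  checksum;
    char     buffer[1];   // assumed contents: key '\0' value '\0'
};

// Allocate one block large enough for the header plus both strings.
std::shared_ptr<const Blob> makeBlob(const char *key, const char *val) {
    const std::size_t keylen = std::strlen(key);
    const std::size_t vallen = std::strlen(val);
    assert(keylen <= UINT16_MAX && vallen <= UINT32_MAX);

    const std::size_t header = offsetof(Blob, buffer);
    const std::size_t bytes  = header + keylen + 1 + vallen + 1;

    // Never allocate less than sizeof(Blob), so the placement new below is safe.
    void *raw = ::operator new(std::max(bytes, sizeof(Blob)));
    Blob *b = new (raw) Blob{};   // zero-initialised header
    b->keylen = static_cast<uint16_t>(keylen);
    b->vallen = static_cast<uint32_t>(vallen);

    // Copy both strings, including their terminators, right after the header.
    char *dst = static_cast<char *>(raw) + header;
    std::memcpy(dst, key, keylen + 1);
    std::memcpy(dst + keylen + 1, val, vallen + 1);

    // The deleter releases the raw block; Blob has a trivial destructor.
    return std::shared_ptr<const Blob>(b, [](Blob *p) { ::operator delete(p); });
}

const char *blobKey(const Blob &b) {
    return reinterpret_cast<const char *>(&b) + offsetof(Blob, buffer);
}

const char *blobVal(const Blob &b) {
    return blobKey(b) + b.keylen + 1;
}
```

The key sits in the same allocation as the header, so checking it costs a single dereference.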
The same layout is used on disk as well, so I can do something like:

    Pair *p = reinterpret_cast<Pair *>(&mmaped_array[pos]);

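A self-contained sketch of this "trick", with a std::vector<char> standing in for the mmap-ed file. Everything here is illustrative: `appendRecord`, `viewAt`, `keyOf` and `valOf` are hypothetical names, and padding each record to an `alignof(Blob)` boundary is my assumption, not necessarily what the real file format does. Reading a struct through a cast of raw bytes also relies on compiler behaviour rather than on a strict standard guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct Blob {
    uint64_t created;
    uint32_t expires;
    uint32_t vallen;
    uint16_t keylen;
    uint8_t  checksum;
    char     buffer[1];   // assumed contents: key '\0' value '\0'
};

// Append one record to the image and return the offset where it starts.
std::size_t appendRecord(std::vector<char> &image,
                         const std::string &key, const std::string &val) {
    // Assumption: pad so that every header starts on an aligned offset.
    while (image.size() % alignof(Blob) != 0)
        image.push_back('\0');

    const std::size_t pos    = image.size();
    const std::size_t header = offsetof(Blob, buffer);

    Blob h{};
    h.keylen = static_cast<uint16_t>(key.size());
    h.vallen = static_cast<uint32_t>(val.size());

    // resize() zero-fills, which also provides both '\0' terminators.
    image.resize(pos + header + key.size() + 1 + val.size() + 1);
    std::memcpy(&image[pos], &h, header);
    std::memcpy(&image[pos + header], key.data(), key.size());
    std::memcpy(&image[pos + header + key.size() + 1], val.data(), val.size());
    return pos;
}

// Reinterpret the bytes at base + pos as a record, without copying anything.
const Blob *viewAt(const char *base, std::size_t pos) {
    assert(reinterpret_cast<std::uintptr_t>(base + pos) % alignof(Blob) == 0);
    return reinterpret_cast<const Blob *>(base + pos);
}

const char *keyOf(const Blob *b) {
    return reinterpret_cast<const char *>(b) + offsetof(Blob, buffer);
}

const char *valOf(const Blob *b) {
    return keyOf(b) + b->keylen + 1;
}
```

Pointers returned by `viewAt` stay valid only while the underlying bytes do not move, which holds for a read-only mapping but not for a vector that is still growing.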
However, this variable size is a problem in lots of places in the C++ code.
For example, I cannot use std::make_shared.
On the other hand, if I make "buffer" a pointer to a dynamic array (e.g. new char[123]), I lose the mmap "trick", and I also need two dereferences whenever I want to check the key.
-------------
In case anyone wants to help, the full source code is here:
https://github.com/nmmmnu/HM3