Database server development

I am developing a database server similar to Cassandra.

Development started in C, but things became very complicated without classes.

I made a test port to Java, but Java is not well suited to this task: in C / C++ I can pass pointers around without copying the objects, and in Java I cannot.

I have now ported everything to C++11, but I am still learning "modern" C++ and have doubts about a lot of things.

I asked lots of questions on Stack Overflow and got good answers, but it seems I have an XY problem :) . So I will try to explain what I need to do, and perhaps someone can suggest a good solution.

So:

The database works with key / value pairs. Every pair carries some extra information: when it was created and when it will expire (0 if it never expires).

The key is a C string and the value is a void *, but at least for the moment I treat the value as a C string as well.

There is an abstract "List" class. Three classes inherit from it: something like a vector, a linked list and a skip list. In the past I ran tests with a hash table, and in the future I might add a red-black tree as well.

Each List contains zero or more *pointers* to pairs, sorted by key.

If a list becomes too long, it can be saved to disk in a special file. This file is a kind of read-only list.

To search for a key, the in-memory list is searched first (skip list, vector or linked list). Then the search goes to the files, ordered by date (newest file first, oldest file last). All files are mmap-ed into memory.
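
A minimal sketch of what such an abstract list with a vector-backed implementation could look like. The `Pair` here is a simplified stand-in with `std::string` members, and the method names `put`, `get` and `count` are assumptions for illustration, not the names used in the real project.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the real variable-size pair.
struct Pair {
    std::string key;
    std::string val;
};

// Abstract interface; the method names are hypothetical.
class List {
public:
    virtual ~List() = default;
    virtual void put(std::shared_ptr<Pair> p) = 0;
    virtual const Pair *get(const std::string &key) const = 0;
    virtual std::size_t count() const = 0;
};

// Vector-backed implementation that keeps the pointers sorted by key.
class VectorList : public List {
public:
    void put(std::shared_ptr<Pair> p) override {
        auto it = std::lower_bound(data_.begin(), data_.end(), p->key, keyLess);
        if (it != data_.end() && (*it)->key == p->key)
            *it = std::move(p);              // same key: replace the old pair
        else
            data_.insert(it, std::move(p));  // new key: insert at sorted position
    }

    const Pair *get(const std::string &key) const override {
        auto it = std::lower_bound(data_.begin(), data_.end(), key, keyLess);
        return (it != data_.end() && (*it)->key == key) ? it->get() : nullptr;
    }

    std::size_t count() const override { return data_.size(); }

private:
    // Orders a stored pointer against a plain key for binary search.
    static bool keyLess(const std::shared_ptr<Pair> &p, const std::string &key) {
        return p->key < key;
    }

    std::vector<std::shared_ptr<Pair>> data_;
};
```

A linked list or skip list would implement the same three functions, so the calling code does not need to know which container is behind the interface.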
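
The lookup order described above can be sketched roughly as follows. `Table` is a hypothetical stand-in (a `std::map`) for both the in-memory list and an mmap-ed file, and `lookup` is an illustrative name; the only point is the order of the searches.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for any sorted list (in-memory list or mmap-ed file).
using Table = std::map<std::string, std::string>;

// Returns a pointer to the value, or nullptr if no table holds the key.
// `files` is assumed to be ordered newest first.
const std::string *lookup(const Table &memory,
                          const std::vector<Table> &files,
                          const std::string &key) {
    // 1. The in-memory list always wins.
    auto it = memory.find(key);
    if (it != memory.end())
        return &it->second;

    // 2. Then the files, newest first, so newer data shadows older data.
    for (const Table &file : files) {
        auto f = file.find(key);
        if (f != file.end())
            return &f->second;
    }
    return nullptr;
}
```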

-------------

What currently puzzles me is the following:

The pairs have different sizes! They are allocated with new and have std::shared_ptr pointing to them.

struct Pair::Blob{
	uint64_t	created;	// 8
	uint32_t	expires;	// 4, 136 years, not that bad.
	uint32_t	vallen;		// 4
	uint16_t	keylen;		// 2
	uint8_t		checksum;	// 1
	char		buffer[1];	// dynamic
};


"buffer" is the member variable whose size varies.

The same layout is used on disk as well, so I can do something like:

 
Pair *p = (Pair *) & mmaped_array[pos];


However, this variable size is a problem in lots of places in C++ code.

For example, I cannot use std::make_shared.

On the other hand, if I make "buffer" a dynamically allocated array (e.g. new char[123]), I will lose the mmap "trick", and I will also need two dereferences to check the key.
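
One possible way to keep the single allocation and still hand out a std::shared_ptr is to allocate raw memory of the right size and give the shared_ptr a custom deleter. This is only a sketch under assumptions: the names `makeBlob`, `blobKey` and `blobVal` are hypothetical, the buffer is assumed to hold the key, a '\0', the value and a '\0', and the code relies on the same "oversized struct" trick as the original, which compilers accept in practice but the C++ standard does not formally guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>

struct Blob {
    uint64_t created;
    uint32_t expires;
    uint32_t vallen;
    uint16_t keylen;
    uint8_t  checksum;
    char     buffer[1];  // assumed layout: key '\0' value '\0'
};

// Start of the variable part, computed from the object's base address.
inline char *payload(Blob *b) {
    return reinterpret_cast<char *>(b) + offsetof(Blob, buffer);
}
inline const char *blobKey(const Blob &b) {
    return reinterpret_cast<const char *>(&b) + offsetof(Blob, buffer);
}
inline const char *blobVal(const Blob &b) {
    return blobKey(b) + b.keylen + 1;
}

// Hypothetical factory: one allocation holds header, key and value.
std::shared_ptr<Blob> makeBlob(const char *k, const char *v) {
    const std::size_t keylen = std::strlen(k);
    const std::size_t vallen = std::strlen(v);
    assert(keylen <= 0xFFFF);  // keylen must fit in uint16_t

    // Header + key + '\0' + value + '\0', but never less than sizeof(Blob).
    const std::size_t size = std::max(sizeof(Blob),
        offsetof(Blob, buffer) + keylen + 1 + vallen + 1);

    void *mem = ::operator new(size);
    Blob *b = new (mem) Blob();  // value-initialization zeroes the header
    b->keylen = static_cast<uint16_t>(keylen);
    b->vallen = static_cast<uint32_t>(vallen);
    std::memcpy(payload(b), k, keylen + 1);
    std::memcpy(payload(b) + keylen + 1, v, vallen + 1);

    // The deleter mirrors the allocation: destroy, then free the raw memory.
    return std::shared_ptr<Blob>(b, [](Blob *p) {
        p->~Blob();
        ::operator delete(p);
    });
}
```

The shared_ptr control block is still a second, small allocation here; avoiding that as well would need std::allocate_shared with a custom allocator or an intrusive reference count.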

-------------

In case anyone wants to help, the full source code is here:

https://github.com/nmmmnu/HM3
I would add a size_t to your Blob to indicate the length of buffer. Or, better, you should use a std::string for buffer.
You have to be careful with casts. In C, it's necessary to cast all the time.

In C++, this is not the case. Don't use C style casts. Every time a cast is used, think.

For example, is the cast necessary in:
 
Pair *p = (Pair *) & mmaped_array[pos];


This business with that char buffer[1] should be wrapped in a class. Once it's properly encapsulated, you can then use that type without worrying about the mess inside it. That's the whole point of a class.
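
As a rough illustration of that encapsulation, here is a sketch of a read-only view over the mmap-ed bytes. The class name `PairView`, the field offsets (derived from the struct layout shown earlier with natural alignment), host byte order, and the key '\0' value '\0' buffer layout are all assumptions. Fields are read with memcpy, so no pointer cast to the struct type is needed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical read-only view of one pair stored in a byte buffer.
class PairView {
private:
    // Assumed offsets: created@0, expires@8, vallen@12, keylen@16,
    // checksum@18, buffer@19.
    enum { HeaderSize = 19 };

    // Copies a field out of the buffer; safe for unaligned data.
    template <typename T>
    T read(std::size_t offset) const {
        T v;
        std::memcpy(&v, data_ + offset, sizeof v);
        return v;
    }

    const unsigned char *data_;

public:
    explicit PairView(const unsigned char *data) : data_(data) {}

    uint64_t created()  const { return read<uint64_t>(0); }
    uint32_t expires()  const { return read<uint32_t>(8); }
    uint32_t vallen()   const { return read<uint32_t>(12); }
    uint16_t keylen()   const { return read<uint16_t>(16); }
    uint8_t  checksum() const { return read<uint8_t>(18); }

    const char *key() const {
        return reinterpret_cast<const char *>(data_) + HeaderSize;
    }
    const char *val() const { return key() + keylen() + 1; }

    // Total bytes occupied, i.e. the distance to the next pair.
    std::size_t bytes() const {
        return static_cast<std::size_t>(HeaderSize) + keylen() + 1 + vallen() + 1;
    }
};
```

Code that uses the view only calls key(), val() and the other accessors, so the variable-size detail stays in one place.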