Need ideas for a data storage structure

I will be storing DNA sequence matches to organize data in a program, and the structure HAS to be built piece by piece.

What I'm currently thinking of doing is using a nested group of deques that can be accessed as follows: matches[matchsize][matchID][sourcefileID][matchnum].
It should return a position number of a match within a sequence.
I'm not sure this is the best way to do it. I need to be able to tell matches apart, and to know which sequence each match came from as well as the size (in nucleotides) of each match. Each sequence file may contain more than one match to a given sequence, which the last layer accounts for.
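As a rough sketch of the layout described above, the four index levels can be written as nested deque types and filled one match at a time. The type aliases and the `add_match` helper are hypothetical names introduced for illustration only, and the sketch assumes every ID is a small zero-based integer so it can be used directly as an index.

```cpp
#include <cstddef>
#include <deque>

// Innermost layer: positions of repeated hits (indexed by matchnum).
using Positions  = std::deque<unsigned long long>;
using BySource   = std::deque<Positions>;  // indexed by sourcefileID
using ByMatch    = std::deque<BySource>;   // indexed by matchID
using MatchTable = std::deque<ByMatch>;    // indexed by matchsize

// Hypothetical helper: grow each layer on demand, then record one position.
void add_match(MatchTable& matches, std::size_t matchsize, std::size_t matchID,
               std::size_t sourcefileID, unsigned long long position)
{
    if (matches.size() <= matchsize) matches.resize(matchsize + 1);
    ByMatch& by_match = matches[matchsize];

    if (by_match.size() <= matchID) by_match.resize(matchID + 1);
    BySource& by_source = by_match[matchID];

    if (by_source.size() <= sourcefileID) by_source.resize(sourcefileID + 1);
    by_source[sourcefileID].push_back(position);  // next matchnum slot
}
```

After `add_match(matches, 12, 0, 3, 4521)`, the expression `matches[12][0][3][0]` yields 4521. Note that indexing by match size leaves empty deques for every size that never occurs; if sizes are sparse, a keyed container such as `std::map` may be a better fit for the outer layer.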

If I DO use the deque structures this way, I have a question: if I allocate a new deque and use push_back() to put it into the higher-level structure, is it actually being PUT in there, or is the function copying the allocated structure?
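For reference, a small sketch that checks what push_back() does with a nested deque. The function names are made up for this example, and the second function assumes a C++11 (or later) compiler, since `std::move` and `emplace_back` are not available before that.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Returns true if push_back stored an independent copy of `inner`.
bool push_back_copies()
{
    std::deque<std::deque<int> > outer;
    std::deque<int> inner;
    inner.push_back(1);

    outer.push_back(inner);  // copies inner into outer
    inner.push_back(2);      // changes only the local object

    return outer[0].size() == 1 && inner.size() == 2;
}

// C++11 and later: hand over the contents or build in place to avoid the copy.
std::size_t build_without_copy()
{
    std::deque<std::deque<int> > outer;
    std::deque<int> inner;
    inner.push_back(1);
    inner.push_back(2);
    inner.push_back(3);

    outer.push_back(std::move(inner));  // transfers the elements instead of copying
    outer.emplace_back();               // constructs an empty inner deque in place
    outer.back().push_back(42);         // fill the new inner deque directly

    return outer[0].size() + outer[1].size();
}
```

The first function returns true because the element stored in `outer` is a separate object. The second shows two ways to build the structure piece by piece without copying a filled deque.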

The sequences are being stored in the following structure:
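#include <deque>
using std::deque;

class sequence
{
public:
	char *seq;                  // nucleotide characters, allocated later as an array
	unsigned long long size;
	unsigned long long num;
	sequence(void) : seq(nullptr), size(0), num(0)  // null pointer keeps the destructor safe
	{
	}
	~sequence()
	{
		delete[] seq;           // array form, because seq points to a character array
	}
};
deque <sequence> sequences;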

The char pointer is later pointed at a newly allocated character array of the appropriate size.
If the deque is declared to hold objects rather than pointers to objects, push_back() puts a copy of the object into the deque. The copy is made by calling the copy constructor of the object being pushed. In your case, be wary of your char *seq: write your own copy constructor that does a deep copy, allocating a new array and copying the characters into it. Otherwise only the pointer is copied, so the copy points at the array owned by the original instance of sequence, and both destructors will eventually try to delete the same array.
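A minimal sketch of such a deep copy, assuming that `size` holds the number of characters in `seq` and that the array is allocated with `new char[size]` (neither is confirmed by the code posted above). The copy-assignment operator is included as well, because a class that needs a custom copy constructor normally needs one too.

```cpp
#include <algorithm>
#include <deque>
#include <utility>

class sequence
{
public:
    char *seq;
    unsigned long long size;  // assumed: number of characters in seq
    unsigned long long num;

    sequence() : seq(nullptr), size(0), num(0) {}

    // Deep copy: allocate a separate array and copy the characters into it.
    sequence(const sequence& other)
        : seq(other.seq ? new char[other.size] : nullptr),
          size(other.size),
          num(other.num)
    {
        if (seq) std::copy(other.seq, other.seq + size, seq);
    }

    // Copy-and-swap assignment: `other` is already a deep copy (taken by value),
    // and its destructor releases the array this object used to own.
    sequence& operator=(sequence other)
    {
        std::swap(seq, other.seq);
        std::swap(size, other.size);
        std::swap(num, other.num);
        return *this;
    }

    ~sequence() { delete[] seq; }
};
```

With this in place, `sequences.push_back(s)` on a `std::deque<sequence>` stores an element that owns its own array, so changing or destroying `s` afterwards leaves the stored element intact. Holding the characters in a `std::string` or `std::vector<char>` instead of a raw pointer would make the hand-written copy code unnecessary.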