Which data structure / class should I use?

Hello guys,

I would like to ask for some advice. I have a function that outputs two values:

uint8_t *data;
int size;

I want to store this data somewhere by copying it. The size of the data can vary with each function call, and there will be multiple function calls. I would like to store 10 to 20 instances of that data in one structure (joined together so they form a contiguous memory block) and then reuse this structure, to avoid new memory allocations.

I don't know which C++ class best suits my needs. Should I use one of the streams, and if so, which one?

Thank you for your time.
Can't you just have an array of 20 structures to do what you need? I don't see why you would need a class. With an array you can then access each of the members of the structure and change them as you wish.

A pointer and an integer? A pair?

If so, std::vector<std::pair<uint8_t*,int>>


Then you start to talk about copying something of varying size. You use the word "data" as if it were not the pointer "data". Does the "size" have some relation to the "size" too?

It does sound like you actually get an array of uint8_t from the function and you want to concatenate those arrays.

Let's say that you have an array foo and have copied the results of two function calls to it, K and N elements, respectively. Now you want to replace/reuse the first subset. Alas, this time the function returns K+1 elements.

If you copy that to the beginning of the array, then you will overwrite the first element of the second subset. In order to avoid that you would have to shift the N elements first.
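
To make the shifting concrete, here is a minimal sketch, assuming the concatenated results live in a std::vector<uint8_t>. The helper name replace_front and its parameters are made up for illustration; they are not part of your code or of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: replace the first `old_count` elements of `buf`
// with `new_count` elements read from `src`.
void replace_front(std::vector<std::uint8_t>& buf, std::size_t old_count,
                   const std::uint8_t* src, std::size_t new_count)
{
    assert(old_count <= buf.size());
    // Drop the old first subset; everything after it shifts left.
    buf.erase(buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(old_count));
    // Insert the new subset at the front; the remaining elements shift right.
    buf.insert(buf.begin(), src, src + new_count);
}
```

Both steps move the N elements of the second subset, so replacing a subset in the middle of a concatenated block is never free, even when no reallocation happens.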

Are you sure that you are approaching the problem from the right angle, and that what you perceive as a problem is a real problem?
Thank you for taking the time to respond to my problem (or "problem").
Sorry if I was not able to clarify what I am trying to achieve.

Can't you just have an array of 20 structures to do what you need? I don't see why you would need a class. With an array you can then access each of the members of the structure and change them as you wish.


@hoogo

The thing is that I need to copy the memory that data points to, because it will be overwritten in the next function call. I could have 20 structures, but that would mean 20 allocations, and I am trying to avoid allocations.
@keskiverto

Sorry, it seems I still haven't learned how to describe what I want in a precise way.

Then you start to talk about copying something of varying size. You use the word "data" as if it were not the pointer "data". Does the "size" have some relation to the "size" too?


data points to a memory location that I must copy from, and size is the number of uint8_t elements (bytes) that I must copy.

I can't use a vector of pointer/size pairs as you have shown, because with each call of my function the memory pointed to by data will be overwritten.

It does sound like you actually get an array of uint8_t from the function and you want to concatenate those arrays.


Yes, I could create 20 arrays (or however many I need) and copy the results there, but that would require 20 allocations. I would prefer a single allocation into which I copy all my arrays. I need this code to be as fast as possible, and allocations take time.
I would prefer to allocate one bigger buffer that can grow if it needs to.

Please take a look at the example I've created; I hope it makes things clear:

#include <cstddef>
#include <cstdint>
#include <random>

struct WorkResult
{
  uint8_t *data;
  int size;
};

class Worker
{
public:
  Worker()
  {
	rng = std::mt19937(rd());
	uni = std::uniform_int_distribution<int>(1, 1000);
	uni2 = std::uniform_int_distribution<int>(0, 255);
	work_result.data = new uint8_t[1000];
	work_result.size = 0;
  }
  WorkResult DoWork()
  {
	// The whole point of this is to show that the pointer stays the same but the underlying memory changes with each call.
	work_result.size = uni(rng);
	for (int i = 0; i < work_result.size; i++)
	  work_result.data[i] = uni2(rng);
	return work_result;
  }
  WorkResult work_result;
  std::uniform_int_distribution<int> uni;
  std::uniform_int_distribution<int> uni2;
  std::random_device rd;
  std::mt19937 rng;
};

int main()
{
  SomeAwesomeClass tmp_class; // This is something I am looking for, should I use std::vector?
  Worker worker;
  WorkResult work_result;
  for (size_t i = 0; i < 20; i++)
  {
	work_result = worker.DoWork();
	// Here, I want to copy data to tmp_class; I want it to grow automatically. I want it to append data as well.
	tmp_class.write(work_result.data, work_result.size);
  }

  // Here I have done something with the data stored within tmp_class, and now I want to move its internal head back to the beginning.
  // I assume that moving the head to the beginning does not cause any memory reallocation.

  tmp_class.set_position(0); // Setting head to the very beginning.
  
  // And now I can again fill up my tmp_class;
  for (size_t i = 0; i < 20; i++)
  {
	work_result = worker.DoWork();
	// Here, I want to copy data to tmp_class; I want it to grow automatically. I want it to append data as well.
	tmp_class.write(work_result.data, work_result.size);
  }
}
The thing is that I need to copy the memory that data points to, because it will be overwritten in the next function call.


Can't you define the array of struct outside of your function and pass it by reference as an argument?


I could have 20 structures, but that would mean 20 allocations, and I am trying to avoid allocations.


I don't get what you mean. If you're going to store data, it will inevitably be allocated somewhere.
std::vector<uint8_t> foo;
foo.reserve( 20*1000 ); // One allocation. Prepare for worst case

// in a loop:
foo.insert( foo.end(), work_result.data, work_result.data + work_result.size );

foo.clear(); // Setting head to the very beginning; clear() keeps the reserved capacity



PS. I presume that the unmatched new[] on line 17 of your example (in the Worker constructor) is just for the example. You could replace your WorkResult with a std::vector<uint8_t>.
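
If you still want the write / "move head to the beginning" interface from your example, a thin wrapper around the vector is enough. This is only a sketch under assumptions: the names ByteBuffer and reset() are made up here (reset() plays the role of your set_position(0)), and nothing with this interface exists in the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SomeAwesomeClass: an append-only byte buffer.
class ByteBuffer
{
public:
    // Reserve the expected total size up front (one allocation).
    explicit ByteBuffer(std::size_t expected_bytes) { bytes_.reserve(expected_bytes); }

    // Append `size` bytes starting at `data`; the vector grows if needed.
    void write(const std::uint8_t* data, int size)
    {
        assert(size >= 0 && (data != nullptr || size == 0));
        bytes_.insert(bytes_.end(), data, data + size);
    }

    // Forget the contents but keep the allocated storage for reuse.
    void reset() { bytes_.clear(); }

    const std::uint8_t* data() const { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }
    std::size_t capacity() const { return bytes_.capacity(); }

private:
    std::vector<std::uint8_t> bytes_;
};
```

In your main() you would call tmp_class.write(work_result.data, work_result.size) inside the loop and tmp_class.reset() between the two loops. Note that the buffer does not remember where each chunk starts; if you need the boundaries, you have to store the sizes separately.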

EDIT: Or, as hoogo suggested:
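  std::uniform_int_distribution<size_t> uni;
  std::uniform_int_distribution<int> uni2; // 0..255; uint8_t is not an allowed result type here

void Worker::DoWork( std::vector<uint8_t>& dst )
{
  const size_t size { uni(rng) };
  // Append directly to the caller's buffer instead of returning a pointer.
  for ( size_t i = 0; i < size; ++i ) dst.emplace_back( static_cast<uint8_t>( uni2(rng) ) );
}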
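
For completeness, a self-contained version of that variant might look like the sketch below. It assumes int distributions with a cast to uint8_t, and the max_size constructor parameter is a made-up addition (your example hard-codes 1000).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

class Worker
{
public:
    // max_size is a hypothetical parameter, not part of the original example.
    explicit Worker(int max_size = 1000)
        : rng(std::random_device{}()), uni2(0, 255)
    {
        assert(max_size >= 1); // the distribution needs lower bound <= upper bound
        uni = std::uniform_int_distribution<int>(1, max_size);
    }

    // Appends between 1 and max_size random bytes to the caller's buffer.
    void DoWork(std::vector<std::uint8_t>& dst)
    {
        const int size = uni(rng);
        for (int i = 0; i < size; ++i)
            dst.push_back(static_cast<std::uint8_t>(uni2(rng)));
    }

private:
    std::mt19937 rng;
    std::uniform_int_distribution<int> uni;  // chunk size
    std::uniform_int_distribution<int> uni2; // byte value, 0..255
};
```

With this shape the worker owns no buffer at all, so there is nothing to copy and no new[] to match with a delete[]; the caller reserves once and passes the same vector to every call.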

I don't get what you mean. If you're going to store data, it will inevitably be allocated somewhere.


Yes, but it can be done in two ways:

1. Every time worker.DoWork() is called, I call new uint8_t[size] and copy the data into it. If it is called 20 times, I allocate memory 20 times.

2. I allocate one big block of memory at the beginning, only once, for example new uint8_t[approximated_size]. Then with each subsequent worker.DoWork() call I just copy the data into the preallocated space within that block. You can imagine it as a vector.push_back(data, size), if such an overload existed.

You see the difference? I save 19 allocations. I believe that is the right way of doing it, but I don't know which existing standard library class would be best for it. Perhaps vector is the way to go. I could write it myself, but I don't want to reinvent the wheel.
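
One way to check that the second approach really reuses a single block is to watch the vector's data() pointer. The sketch below assumes std::vector<uint8_t> as the buffer; the function name storage_is_reused and its parameters are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills a reserved vector twice and reports whether its storage ever moved.
bool storage_is_reused(int calls, int bytes_per_call)
{
    assert(calls > 0 && bytes_per_call > 0);
    // Dummy chunk standing in for the bytes one DoWork() call would produce.
    const std::vector<std::uint8_t> chunk(static_cast<std::size_t>(bytes_per_call), 0);

    std::vector<std::uint8_t> buf;
    buf.reserve(static_cast<std::size_t>(calls) * chunk.size()); // the one allocation
    const std::uint8_t* const storage = buf.data();

    for (int round = 0; round < 2; ++round) {
        buf.clear(); // size becomes 0, capacity stays
        for (int i = 0; i < calls; ++i)
            buf.insert(buf.end(), chunk.begin(), chunk.end());
        if (buf.data() != storage)
            return false; // a reallocation happened
    }
    return true;
}
```

As long as the total never exceeds the reserved capacity, the appends cannot reallocate, so the pointer stays the same across both rounds. If a round does exceed it, the vector grows once and later rounds reuse the larger block.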
@keskiverto

Yes, it was just for the example. OK, thanks, so vector is the way to go :)