Text Input of Variable and Unknown Size

Nov 23, 2008 at 12:18am

What is the best way to handle incoming data (via socket or stdin, whatever) when the size of the entire block of data can vary between a few hundred bytes and, potentially, many tens or hundreds of thousands of bytes?

It doesn't seem efficient to simply have an array of type char[500000], in case a chunk of data arrives that's larger. But continually resizing a chunk of memory and copying data into a new, wider buffer doesn't seem wise either.

My thoughts are of a vector of pointers or linked lists of small memory blocks extending out as far as necessary to contain the data.

Opinions?
Last edited on Nov 23, 2008 at 12:18am
Nov 23, 2008 at 1:33am
std::string...?
Nov 23, 2008 at 2:17am
When not handling text, the most efficient way to allocate space of unknown size for sequential reading and writing is a linked list of pointers to dynamic arrays (note, however, that it is very rare to be unable to know the size of an incoming stream beforehand). It is especially efficient if the data is expected to grow large, since it saves time by not copying the whole structure over and over again.
Nov 23, 2008 at 2:41am
Could a stringstream work for you? I think that may be your best bet if you are dealing with chars or byte chunks of data ( http://www.cplusplus.com/reference/iostream/stringstream/ ).
Or, based on helios' post, and using the STL, how about
list< vector< your_data_type > > buffer; (a linked list of dynamic arrays)?
The vector container also has a reserve() member you can use to set aside memory in advance if you have an estimate of how much data you will be handling. list has no equivalent, since it allocates one node at a time.
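
For what it's worth, here is a rough sketch of how the list-of-vectors idea could look for plain bytes. The class name ChunkedBuffer, its member names, and the 4096-byte default block size are all made up for illustration; nothing here comes from a library. Each block is reserved once and never reallocated, so earlier data is never copied as the buffer grows.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: a linked list of fixed-capacity blocks.
class ChunkedBuffer {
public:
    explicit ChunkedBuffer(std::size_t block_size = 4096)
        : block_size_(block_size), total_(0) {
        assert(block_size > 0);
    }

    // Appends n bytes, opening a new block whenever the last one is full.
    void append(const char* src, std::size_t n) {
        while (n > 0) {
            if (blocks_.empty() || blocks_.back().size() == block_size_) {
                blocks_.emplace_back();
                blocks_.back().reserve(block_size_);  // one allocation per block
            }
            std::vector<char>& last = blocks_.back();
            const std::size_t room = block_size_ - last.size();
            const std::size_t take = n < room ? n : room;
            last.insert(last.end(), src, src + take);
            src += take;
            n -= take;
            total_ += take;
        }
    }

    std::size_t size() const { return total_; }
    std::size_t block_count() const { return blocks_.size(); }

    // Copies all blocks into one contiguous vector, for callers that need it.
    std::vector<char> flatten() const {
        std::vector<char> out;
        out.reserve(total_);
        for (const std::vector<char>& block : blocks_) {
            out.insert(out.end(), block.begin(), block.end());
        }
        return out;
    }

private:
    std::size_t block_size_;
    std::size_t total_;
    std::list<std::vector<char>> blocks_;
};
```

You would call append() with whatever each read from the socket or stdin hands you, and only call flatten() if some later step really needs the data in one piece.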
Last edited on Nov 23, 2008 at 2:44am
Nov 23, 2008 at 1:41pm

std::string I think will work; in this specific situation there will not be any nulls coming in the stream. There may be control characters, but the std::string should handle those.
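
A minimal sketch of the std::string route, assuming the data arrives through a std::istream (std::cin, or a stream wrapped around the socket). The function name read_all and the 4096-byte chunk size are illustrative choices, not anything from the posts above. The string grows itself as data is appended, so no size has to be known up front.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads everything from `in` until end-of-stream and returns it as one string.
// `chunk_size` is an illustrative tuning knob.
std::string read_all(std::istream& in, std::size_t chunk_size = 4096) {
    assert(chunk_size > 0);
    std::string data;
    std::string chunk(chunk_size, '\0');  // reusable scratch space
    while (in) {
        in.read(&chunk[0], static_cast<std::streamsize>(chunk.size()));
        // gcount() is how many bytes the last read() actually delivered,
        // which is less than chunk_size on the final, partial read.
        data.append(chunk, 0, static_cast<std::size_t>(in.gcount()));
    }
    return data;
}

// Example: feed the reader from an in-memory stream instead of stdin.
std::size_t example_read_all() {
    std::istringstream in("a few hundred bytes, or many thousands");
    return read_all(in).size();
}
```

Because the length is tracked separately from the contents, a std::string filled this way also keeps control characters and embedded null bytes intact, as long as you use size() rather than strlen() on the result.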

However, as I thought about this, I could not be certain that only text would be handled, and that was my reason for thinking of a linked list or vector.

Thanks for the input.
Nov 23, 2008 at 1:55pm
I used stdlib (malloc, realloc, free) for this. Start with a block of 1000 bytes, and if there is more data use realloc to make the block of memory 1000 bytes larger. If you think there will be more than 1000 bytes very often then you could start/increase with 2000 or 10000 or some other value instead of 1000.
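
Roughly, the realloc approach described above could look like the sketch below. The struct GrowBuffer, the function grow_append, and the default step of 1000 bytes are names and values picked for illustration. The result of realloc goes into a temporary pointer first, because on failure realloc returns a null pointer and leaves the old block valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical growable buffer; initialise it as {nullptr, 0, 0}.
struct GrowBuffer {
    char* data;
    std::size_t size;      // bytes in use
    std::size_t capacity;  // bytes allocated
};

// Appends n bytes, enlarging the block in multiples of `step` when needed.
// Returns false on allocation failure and leaves the buffer unchanged.
bool grow_append(GrowBuffer& b, const char* src, std::size_t n,
                 std::size_t step = 1000) {
    assert(step > 0);
    if (n == 0) return true;
    if (b.size + n > b.capacity) {
        std::size_t new_cap = b.capacity;
        while (new_cap < b.size + n) new_cap += step;
        void* p = std::realloc(b.data, new_cap);
        if (p == nullptr) return false;  // old block is still valid
        b.data = static_cast<char*>(p);
        b.capacity = new_cap;
    }
    std::memcpy(b.data + b.size, src, n);
    b.size += n;
    return true;
}
```

When you are done, release the block with std::free(b.data). With a fixed step the number of realloc calls grows in proportion to the data size, so if large inputs are common, doubling the capacity each time instead of adding a constant is a frequent alternative.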

Another method is to read the data twice: first read it to get the size, then allocate memory, then read it again to store the data.
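
A sketch of a close variant of the two-pass idea, with one substitution stated plainly: instead of literally reading everything once to count it, this version seeks to the end of the stream to learn the size, then seeks back and reads. That only works on seekable sources such as regular files (opened in binary mode) or in-memory streams; a socket or a pipe on stdin cannot be rewound, so the function reports failure there. The name read_two_pass is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Pass 1 finds the size by seeking, pass 2 reads into an exactly sized buffer.
// Returns false if the stream cannot be sought or the read comes up short.
bool read_two_pass(std::istream& in, std::vector<char>& out) {
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();  // -1 if the position is unknown
    if (!in || size < 0) return false;
    in.seekg(0, std::ios::beg);
    if (!in) return false;
    out.resize(static_cast<std::size_t>(size));  // single allocation
    if (size == 0) return true;
    in.read(&out[0], static_cast<std::streamsize>(size));
    return static_cast<std::streamoff>(in.gcount()) == size;
}

// Example with an in-memory stream, which is seekable like a regular file.
std::size_t example_two_pass() {
    std::istringstream in("some incoming data");
    std::vector<char> bytes;
    return read_two_pass(in, bytes) ? bytes.size() : 0;
}
```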
Last edited on Nov 23, 2008 at 1:56pm