I wrote a network sniffer that re-assembles TCP streams as network packets come in. On Linux, the packets are memory-mapped from kernel to user space. In order to reassemble one TCP request/response, I need to buffer a variable number of data packets, each of variable length, with neither known in advance. What would be the most efficient way, i.e. with the fewest data copies and memory re-allocations, to re-assemble one TCP request/response?
Currently I am concatenating into a std::string, which is a bit clumsy since it has to reallocate whenever it runs out of capacity. Would it be more efficient to use std::iostreams instead? When I insert a string into an ostream, it must somehow be appended to the internal buffer, which might also run out of space and trigger a re-allocation, so I wonder whether that is really the way to go.
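For reference, here is a minimal sketch of the append-to-std::string approach I mean, assuming the capture layer hands back a pointer/length pair per TCP segment (the names are illustrative, not my actual code):

```cpp
#include <cstddef>
#include <string>

// Illustrative only: `payload`/`len` stand in for whatever the capture
// layer hands back for one TCP segment.
struct Stream {
    std::string data;  // reassembled request/response so far

    void on_segment(const char* payload, std::size_t len) {
        // Appending may reallocate and copy everything buffered so far
        // once the current capacity is exhausted.
        data.append(payload, len);
    }
};
```

If an upper bound on the message size were known, data.reserve() would amortize the reallocations, but in general I don't know the size in advance.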
If you want a container with random access, ordered by insertion, and constant-time insertion at the end, a std::deque would be a good fit, I think. Each deque element could then be a pointer to a packet, for example (this depends on how you use the data, of course); a sketch follows.
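A minimal sketch of that idea, assuming the segment data stays valid (e.g. it still lives in the memory-mapped ring) for as long as the view is held; SegmentView and on_segment are illustrative names, not a fixed API:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical view of one captured segment; with memory-mapped capture the
// pointer can refer straight into the kernel-shared ring, so buffering a
// segment copies no payload bytes.
struct SegmentView {
    const std::uint8_t* data;
    std::size_t len;
};

// push_back is constant time, elements keep insertion order, and random
// access by index is available for the reassembly pass.
std::deque<SegmentView> segments;

void on_segment(const std::uint8_t* data, std::size_t len) {
    segments.push_back({data, len});
}
```

The caveat is that the ring slots backing those pointers must not be handed back to the kernel until the request/response has been fully consumed; otherwise you have to copy each payload once into storage you own.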
I don't think there's an easy answer, but you almost certainly won't get away with a single buffer, so forget C++ streams here.
I'd recommend using a set of fixed-size buffers, each the size of an Ethernet frame: the packets are probably coming in over Ethernet, and the kernel may be storing them in buffers of that size anyway. A sketch of the idea follows.
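A rough sketch of such a chunk chain, with 1514 bytes as an assumed frame size and illustrative names throughout:

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Assumed chunk size: 1514 bytes covers a standard Ethernet frame without
// the FCS; adjust if jumbo frames or GRO-merged packets are in play.
constexpr std::size_t kFrameSize = 1514;

struct Chunk {
    std::array<unsigned char, kFrameSize> bytes;
    std::size_t used = 0;
};

// One stream is a chain of fixed-size chunks: appending a new segment never
// moves or reallocates the data buffered so far.
class ChunkChain {
public:
    void append(const unsigned char* data, std::size_t len) {
        // Assumes len <= kFrameSize, i.e. one captured frame per chunk.
        chunks_.push_back(std::make_unique<Chunk>());
        Chunk& chunk = *chunks_.back();
        std::memcpy(chunk.bytes.data(), data, len);
        chunk.used = len;
    }

private:
    std::vector<std::unique_ptr<Chunk>> chunks_;
};
```

You could also keep a free list of chunks and recycle them, so you don't hit the allocator for every frame.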
Not so much an answer, but I'm interested in how you are doing this, since I'm planning to use similar functionality for my thesis. Would you care to share your source code with me?