I'd expect iterators to be a bit quicker as you're always saving yourself the unnecessary column lookup.
|
I would expect iterators to definitely be faster for multidimensional vectors, but probably about the same for single dimension vectors.
I'd expect vector<bool> to be slower as the bits are packed into a byte. |
While that may have been true in the past, my tests have been showing almost the same speed. I am still working to determine whether it is a truly valid test, though. It is only recently that processors got fast, low-latency shift hardware: they can shift by any number of bits in a single clock. Also, since setting up the mask for bit addressing is independent of the memory access that fetches the right word, the two can proceed in parallel on any modern superscalar, out-of-order (OOO) processor.
Counting instructions to determine performance stopped working a long time ago. These days you just about have to run it and time it with representative data: there are too many cache, superscalar, and out-of-order effects. I need to boot up one of my really old machines to see how the behavior has changed.
Also, vector<bool> is 8 times smaller than the same number of flags in an array of bool, since the array has to use at least a char for each one. Being 8 times smaller makes it much more cache-friendly. I did get slightly better speed using an iterator vs. subscripting with vector<bool>, but not too much better. This probably has to do with subscripting creating a new proxy reference object each time, while an iterator mostly just shifts the mask in a pre-existing iterator.
Without optimization, the array is much faster, but at -O3 with g++ 4.0.1 I am getting about the same speed. -O3 makes the array a lot faster too, but vector obviously benefits much more, since without inlining the subscripting operations are function calls. I don't know what some of the posters are talking about regarding subscripting: vector subscripting is supposed to be unchecked access; the at() function is for checked access. I suppose some IDEs may add checking in debug mode, but that had better go away if optimization is turned on. I don't use an IDE; I just use emacs and g++.
I can't see how it depends on OS, filesystem or physical media. |
Some OSs/filesystems may read the entire file into a disk cache right from the start, which makes copying it into another string a somewhat useless copy; it is already in memory. The copy does avoid a lot of smaller read system calls, but the expense of a read() call varies from system to system. Modern processors and OSs can execute a context switch from user to kernel space very fast, so I doubt that copying 900 MB around in memory is worth it in most cases, especially if you are just making a linear pass through the data rather than seeking around.
Other systems may just read part of the file and then read more on demand; essentially the file gets mapped into pages, and pages are read in as they are touched. Getting hit with seek latency is fine on an SSD, not too bad on a hard disk, and horrible on a CD-ROM. Running over NFS or another network mount can be painfully slow compared to local disk. As files go, a couple of hundred MB isn't really that large. It is a waste to use a text file, though; a binary file would be significantly faster.