Hi, I'm writing a program that takes txt files as input and looks for strings in them.
Until now I have used the getline command
[code]
while (getline(lsass, lineFile))
{
    fileContent += lineFile;   // append each line to one big string
}
[/code]
by reading one line at a time and appending it to a string, but this method has proven to be very slow (the text files are around 500 MB). Do you know a function that lets me read the whole text file into a string at once, quickly enough that I can then filter it with string::find?
I read on the web that istream::read exists for reading "blocks" of a file, but I haven't understood how it works...
Thanks
PS: I'm new to C++ and to programming in general, so I don't know many functions yet and I'm eager to learn them.
dutch, I tried your code, but the execution speed seems the same as with the getline method...
dhayden, I also tried searching for the string directly in each line read with getline:
[code]
while (getline(lsass, lineaFile))
{
    if (lineaFile.find("RYwTiizs2trQ", 0) != string::npos)
        trovato = true;   // "trovato" = "found"
}
[/code]
but even with this method the speed does not improve.
I can't understand how other programs manage to load text files into memory so quickly...
This isn't the best way, but it demonstrates read and write. There are faster approaches than these, but they are still PDQ (pretty darn quick) for small files like yours. I get the read in about 0.2 seconds on my cheap laptop. Yes, I said small file: a good rule of thumb is that a "large" file is one that won't fit into your RAM.
The program below writes about a gigabyte to a file and then reads it back, reporting the time the read took.
Don't run it in the online shell; I don't know how that handles writing files.
[code]
#include <chrono>
#include <fstream>
#include <iostream>
using namespace std;
using namespace chrono;

// Simple high-resolution stopwatch.
struct hrtimer
{
    high_resolution_clock::time_point s;
    duration<double> time_span;
    void start() { s = high_resolution_clock::now(); }
    double stop()
    {
        time_span = duration_cast<duration<double>>(high_resolution_clock::now() - s);
        return time_span.count();
    }
};

int main()
{
    const size_t N = 1000000000ull;   // about 1 GB
    char* cp = new char[N];
    ofstream ofs("bff.txt");          // make a huge file to test with
    ofs.write(cp, N);
    ofs.close();

    hrtimer h;
    h.start();
    ifstream ifs("bff.txt");
    ifs.read(cp, N);
    ifs.close();
    cout << "read took: " << h.stop() << endl;
    delete[] cp;
}
[/code]
Depending on what you are doing, now that you have the big mess in memory, finding what you want in the middle of it may also take some time. Whether one big buffer is the right approach depends on your application.
This forum has covered the "how best to deal with files" topic a few times; there are some great resources in old threads if you want to take a look.
In the end I used a binary-read method very similar to the one dutch posted, and it turned out to be very fast: it imports and filters a 500 MB text file in under a second.
Thanks again
Hardware is getting to be amazing.
I redid mine. If you run it as posted, it's 0.2 seconds or less to read the file, apparently because the file is cached (from the write?), as noted.
Forced not to be cached, it was just over 0.5 seconds: still sub-second, though more than twice as slow.
And I don't even have an SSD; my next machine will, but the current one does not.
That is kind of annoying to test: the only way I know to ensure nothing is cached is to reboot with a hard power cycle.
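For what it's worth, on Linux there is a way to drop the page cache without rebooting (it needs root, and it's a system tweak rather than something to script casually):

```shell
# Flush dirty pages to disk first, then ask the kernel to drop
# the clean page cache, dentries, and inodes (Linux-specific).
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
```

After this, a re-read of the test file should come from disk rather than from cache, which makes timing comparisons like the one above more honest.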