The most efficient way to find a string in a text file

Hi All

I just created a little program for myself to do some tedious work: finding a particular string in a text file.

The program is done. Basically, it reads the text file line by line, and after reading each line, an if statement checks the line for the string. However, I find it quite slow. If I open the text file in Notepad and use 'Find', the search is done in a second per file, but my program can take up to a minute.

So I would like to explore alternatives and find the most efficient way to do this job, please.
Can you show us the code?
I don't have the source code at the moment, so I can't copy and paste, but the core code is:

bool Check_F310_Pass(const string& line)
{
    // Pass by const reference so the line is not copied on every call.
    string string_to_find = "F3.10 Fail";
    bool exist = line.find(string_to_find) != std::string::npos;
    return exist;
}


string textline;
int F310pass = 0, F310fail = 0, F313pass = 0, F313fail = 0, F316pass = 0, F316fail = 0;

if (inFile.is_open())
{
    // Each line is scanned six times, once per Check_xyz function.
    while (getline(inFile, textline))
    {
        if (Check_F310_Pass(textline))
            F310pass++;
        if (Check_F310_Fail(textline))
            F310fail++;
        if (Check_F313_Pass(textline))
            F313pass++;
        if (Check_F313_Fail(textline))
            F313fail++;
        if (Check_F316_Pass(textline))
            F316pass++;
        if (Check_F316_Fail(textline))
            F316fail++;
    }
}
Like Notepad, you could try loading the file in one go and doing the search afterwards.
How do I load it in one go, please?

1. Get the file size
2. Allocate a buffer
3. Read the file into the buffer
4. Process the buffer
5. Delete the buffer

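#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <Windows.h>

using namespace std;

ULONGLONG FileSize(LPCSTR filename);

int main ()
{
  char szFilename[] = "C:\\Temp\\Lorem.txt";
  size_t len = static_cast<size_t>(FileSize(szFilename));
  if (len == 0)
  {
    cerr << "File is empty or could not be found.\n\n";
    return EXIT_FAILURE;
  }
  // One extra byte for the terminating null, so the buffer is a valid C string.
  char *buffer = new char[len + 1];
  ifstream src(szFilename, ios::binary);
  if (!src.read(buffer, static_cast<streamsize>(len)))
  {
    cerr << "Could not read the file.\n\n";
    delete [] buffer;
    return EXIT_FAILURE;
  }
  buffer[len] = 0;
  // do something with buffer
  //cout << buffer;
  delete [] buffer;

  system("pause");
}

ULONGLONG FileSize(LPCSTR filename)
{
  WIN32_FIND_DATAA fd;

  HANDLE hFind = FindFirstFileA(filename, &fd);
  if (hFind == INVALID_HANDLE_VALUE)
    return 0;
  FindClose(hFind); // release the search handle

  // Combine the two 32-bit halves into one 64-bit size.
  return (static_cast<ULONGLONG>(fd.nFileSizeHigh) << 32) | fd.nFileSizeLow;
}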
However, when I do my string searches (those Check_xyz functions), I still have to go through the text line by line with getline, don't I? Does that mean reading into the buffer is just extra work?
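Not necessarily: once the whole file is in memory, the buffer can be searched directly, with no getline at all. Here is a minimal sketch of that idea. The helper names read_whole_file and count_occurrences are made up for illustration and are not from the code above. Note that it counts every occurrence of the string, which only matches the per-line counts if each marker appears at most once per line.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read a whole file into one std::string.
// Returns an empty string if the file cannot be opened.
std::string read_whole_file(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: count non-overlapping occurrences of needle in text.
std::size_t count_occurrences(const std::string& text, const std::string& needle)
{
    if (needle.empty())
        return 0;

    std::size_t count = 0;
    std::size_t pos = text.find(needle);
    while (pos != std::string::npos)
    {
        ++count;
        // Resume the search just past the match that was found.
        pos = text.find(needle, pos + needle.size());
    }
    return count;
}
```

With these, something like count_occurrences(read_whole_file(path), "F3.10 Fail") would replace the getline loop for one marker. Whether this is actually faster than the line-by-line version is worth measuring on the real files.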
The standard tool used for this is grep. It's very fast. If you knew how grep did it, you could take some ideas from there.

http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html
Another option is to use Regular expressions.
http://www.cplusplus.com/reference/regex/
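As a rough sketch of the regular expression route, one pattern can cover all six markers in a single pass over the text. This assumes the markers look like "F3.10 Pass" or "F3.13 Fail"; only "F3.10 Fail" appears in the posted code, so the other spellings are a guess. The function name count_results is made up for illustration. Be aware that std::regex is often slower than plain std::string::find, so this is more about convenience than raw speed.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: tally every "F3.xx Pass" / "F3.xx Fail" marker in text.
// The marker format is an assumption based on the "F3.10 Fail" literal.
std::map<std::string, int> count_results(const std::string& text)
{
    static const std::regex pattern(R"(F3\.(10|13|16) (Pass|Fail))");

    std::map<std::string, int> counts;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it)
    {
        // it->str() is the full matched marker, e.g. "F3.10 Fail".
        ++counts[it->str()];
    }
    return counts;
}
```

The returned map is keyed by the matched text, so counts["F3.10 Fail"] would take the place of the F310fail counter.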
This may have more to do with file buffering than with search algorithms. Try changing the code to simply read the file and see how fast it is. If this is too slow then you need a larger buffer.
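A minimal sketch of that experiment: read every line, do no searching, and time it with std::chrono. The function name time_read_only is made up for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: read every line without searching it.
// Returns the number of lines read and stores the elapsed time
// in milliseconds in elapsed_ms.
std::size_t time_read_only(const std::string& path, double& elapsed_ms)
{
    const auto start = std::chrono::steady_clock::now();

    std::ifstream in(path);
    std::string line;
    std::size_t lines = 0;
    while (std::getline(in, line))
        ++lines;

    const auto stop = std::chrono::steady_clock::now();
    elapsed_ms = std::chrono::duration<double, std::milli>(stop - start).count();
    return lines;
}
```

If this alone takes most of the minute, the searching is not the bottleneck and the I/O is what needs attention.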

I have to run, but I think there's a way to give the streambuf more space. Offhand, I suggest 64 KB.

If you can't do it with a streambuf, then try switching to C style I/O. I know that you can specify the buffer that way.
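For the C-style route, std::setvbuf is the call that lets you supply the buffer. A minimal sketch follows, using the 64 KB size suggested above. The function name count_lines_with and the 4096-byte line limit are assumptions for illustration; a longer line would be read in pieces and could be miscounted. On the C++ stream side, std::filebuf::pubsetbuf exists too, but its effect is implementation-defined, so it may or may not help with a given compiler.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical helper: count the lines containing needle, using C stdio
// with a 64 KB I/O buffer. Returns -1 if the file cannot be opened.
long count_lines_with(const char* path, const char* needle)
{
    std::FILE* fp = std::fopen(path, "r");
    if (!fp)
        return -1;

    // setvbuf must be called before any other operation on the stream.
    std::vector<char> io_buffer(64 * 1024);
    std::setvbuf(fp, io_buffer.data(), _IOFBF, io_buffer.size());

    char line[4096]; // assumption: no line is longer than 4095 characters
    long count = 0;
    while (std::fgets(line, sizeof line, fp))
    {
        if (std::strstr(line, needle))
            ++count;
    }

    std::fclose(fp); // close the stream before io_buffer goes out of scope
    return count;
}
```

This keeps the original line-by-line counting, so the result is directly comparable with the getline version when timing the two.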