Binary search

Hi, I have a task to copy a .txt files content (alphabetically sorted) to binary file (one string at each line). Then user inputs name and I've to make a Binary Search function to check if the name is on that binary file or not.
I'm stuck at binary search function as it's .bin file it's hard to wrap my head around it. Program says that name is not on the file, even tho it is.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
#include <iostream>
#include <string>
#include <fstream>
#include <cstring>
#include <algorithm>
using namespace std;
const int buffer_size = 30;

void Create_Bin_File ()
{
    ifstream fin ("example.txt");  
    ofstream fout ("Binary.bin", ios::binary);  
    const unsigned int RECORD_SIZE = 30; // was BUFFER_SIZE
    char buffer[RECORD_SIZE] = {0}; // zero init buffer
    while (fin.getline (buffer, RECORD_SIZE))
    {
    fout.write (buffer, RECORD_SIZE);
    // refill buffer with zeroes for next time round
    fill_n (buffer, RECORD_SIZE, 0);
    }
    fin.close ();
    fout.close ();
}

void Binary_Search (const string& filename, string SearchVal)
{
    ifstream file (filename.c_str(), ios::binary);
    if (file.is_open())
    {
        cout << "The file is opened"<< endl;
        cout << "\n";
    }
    else
    {
        cout << "Error opening file"<< endl;
        cout << "\n";
        return; // no point continuing Binary_Search() if file failed to open!
    }
    const unsigned int RECORD_SIZE = 30; // was BUFFER_SIZE
    char buffer[RECORD_SIZE] = {0}; // zero init buffer
    int recordCount  =  0;
    int recordWanted = -1;
    while (file.read(buffer, RECORD_SIZE))
    {
        if(SearchVal == buffer)
        {

            recordWanted = recordCount;
            // if this was just a naive search loop could bail out now...
        }
        cout << recordCount << " : " << buffer << "\n";

        // refill buffer with zeroes for next time round
        fill_n (buffer, RECORD_SIZE, 0);
        ++recordCount;
    }

    cout << "\n";
    cout << "file contains " << recordCount << " records\n";
    cout << "\n";
    if (recordWanted == -1)

        cout << "record wanted could not be found\n";
    else

        cout << "record wanted is at index " << recordWanted << " records\n";
    cout << "\n";

}
int main()
{

    Create_Bin_File();  
    string word;
    cout << "Enter name: " << endl;
    cin >> word;
    Binary_Search("Binary.bin", word);

return 0;
}
It works for me.

Enter name:
Chrissie
The file is opened
0 : Anna
1 : Bettina
2 : Chrissie
3 : Elisa
4 : Heidi
5 : Julia
6 : Katharina

file contains 7 records

record wanted is at index 2 records


What did you try ?
@Thomas1965

I tried: example.txt

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
and 
and_eq 
asm 
auto 
bitand
bitor 
bool 
break 
case 
catch
char 
class 
const 
const_cast 
continue
default 
delete 
do 
double 
dynamic_cast
else 
enum 
explicit 
export 
extern
false 
float 
for 
friend 
goto
if 
inline 
int 
long 
mutable
namespace 
new 
not 
not_eq 
operator
or 
or_eq 
private 
protected 
public
register 
reinterpret_cast 
return 
short 
signed
sizeof 
static 
static_cast 
struct 
switch
template 
this 
throw 
true 
try
typedef 
typeid 
typename 
union 
unsigned
using 
virtual 
void 
volatile 
wchar_t
while 
xor 
xor_eq
It seems some words have trailing spaces so they can't be found. Not sure if they come from copy and paste here. Might be a good idea to remove them before you write the word to the binary file.

1
2
3
4
5
6
7
8
9
10
11
 string TrimRight(const string& src)
  {
    size_t j = src.size()-1;
    
    while (j > 0 && src[j] == ' ')
      j--;
      
    string s(src, 0, j+1);
    
    return s;
  }
@Thomas1965 I actually took a look and understood, that this finding algorithm is nothing like 'BinarySearch' algorithm should be. That's not my task. I tried to modify and got till here. Same results (doesn't show that word exists, even tho it does)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
void BinarySearch(const string& filename, string SearchVal, int recordCount)
{
    int pos;
    const unsigned int RECORD_SIZE = 30; // was BUFFER_SIZE
    char buffer[RECORD_SIZE] = {0}; // zero init buffer
	int lowerLimit = 0;
	int upperLimit = recordCount; //recordCount is 73 (counted prev)
	while (lowerLimit <= upperLimit)
		{
			pos = (lowerLimit + upperLimit) / 2;
            if (buffer == SearchVal) cout << "Got it!" << endl;
			else if (buffer < SearchVal)
                lowerLimit = pos + 1;
			else if(buffer > SearchVal)
                upperLimit = pos - 1;

        }
}
Where do you read into buffer ? In the above code it will always be 0.
@Thomas1965 Sorry. Take a look again, please.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
std::string GetRecord(std::ifstream& inFile, int pos)
{
    char buffer[RECORD_SIZE];
    // clear possible flags like EOF, before moving the read position
    inFile.clear();
    // set the file read position to the requested record position
    inFile.seekg(pos * RECORD_SIZE, std::ios::beg);
    inFile.read(buffer, RECORD_SIZE);
    // note: automatic conversion from char[] to std::string
    return buffer;
}

void BinarySearch(const string& filename, string SearchVal)
{
    ifstream file (filename.c_str(), ios::binary);
    int pos = 0;
	int lowerLimit = 0;
	int recordCount = 73; // Calculated before[I'll change this part, when I get this function working]
                          // At this point, there's exactly 73 records in .bin file
    char buffer[RECORD_SIZE] = {0}; // zero init buffer (while loop will overwrite with record values)
	int upperLimit = recordCount;
	while ( (lowerLimit < upperLimit) && (buffer != SearchVal)) // Searching as long as it didn't find it
		{

			pos = (lowerLimit + upperLimit) / 2;
			std::string buffer = GetRecord(file, pos);
            if (buffer == SearchVal){ cout << "Got it!" << endl; }
            else if (SearchVal > buffer) lowerLimit = pos + 1;
			else if (SearchVal < buffer) upperLimit = pos;
        }
}


It works for same words like xor_eq but not for new. I guess you still have the problem with the trailing spaces. I also would change the BinarySearch so that it returns the position of the record or -1 if it doesn't find it. Also the condition in the while loop should be (lowerLimit <= upperLimit) see
https://en.wikipedia.org/wiki/Binary_search_algorithm#Procedure
You are also redefining buffer inside the loop as a string. You probably should just define buffer as a string before the loop instead of using the C-string.

Since all of your values have only one word you could use the extraction operator>> in your create binary file function to eliminate any problems with trailing spaces.

You should always check that your file open operations actually succeed.
@Thomas1965
I put trimRight function in before writing into file:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
void Create_Bin_File ()
{
    int recordCount = 0;
    ifstream fin ("example.txt");
    ofstream fout ("Binary.bin", ios::binary);
    char buffer[RECORD_SIZE] = {0}; // zero init buffer
    while (fin.getline (buffer, RECORD_SIZE))
    {
    recordCount++;

    fout.write (buffer, RECORD_SIZE);
    // refill buffer with zeroes for next time round
    fill_n (buffer, RECORD_SIZE, 0);
    TrimRight(buffer);
    }
    cout << "There're: " << recordCount << " records!" << endl;
    fin.close ();
    fout.close ();
}


But .bin file still has trailing spaces.
Since your fields have no spaces why not just use the extraction operator?
1
2
3
while(fin >> setw(RECORD_SIZE) >> buffer) // Using C-string so need to limit the number of characters received with setw().
{
...


And you should always, always check that the file open operations succeed.

You also don't need the stream.close() calls, just let the destructor close the streams.


Last edited on
You need to call TrimRight before you write it to the file. Also it would be usefult to add some output while testing.

1
2
3
4
5
6
7
8
9
10
11
12
while (fin.getline (buffer, RECORD_SIZE))
    {
       cout << "Read " << buffer << "Buffer len: " << strlen(buffer);
      recordCount++;
      TrimRight(buffer);
      cout << "Buffer len: " <<  strlen(buffer);
      fout.write (buffer, RECORD_SIZE);
      // refill buffer with zeroes for next time round
      fill_n (buffer, RECORD_SIZE, 0);
      cout << "There're: " << recordCount << " records!" << endl;
}
 


jlb has a point if you are sure that the fields won't have spaces in the future.
jlb and thomas, I changed while statement. When I open .bin file now, I still have like 8 spaces between every word, but exactly 8 so it's fixed. And, binary search now just crashes (nothing happens until I press 'X') Any ideas why?
Can you post the complete code now?
@Thomas1965 :

Edited few things - everything works fine, thank you guys!
Last edited on
Topic archived. No new replies allowed.