How do you read from a binary file without knowing the size of the data you're reading?

Hello guys!

I'm challenging myself to write a program that encrypts a text file with a key. Before running, the program asks the user for a password.

The way I'm setting up the password handling is by creating a binary file within the project folder (for now, I'm working with Visual Studio 2017).

When the program starts I do a few things:
_check if the password binary file exists; if it doesn't, create it
_check if the file contains a password. If it does, prompt the user to
enter the password and check validity. If it doesn't, prompt the user
to enter a password twice and compare the two entries; if they match,
write the password to the binary file, otherwise ask again until they
match and then write it to the file.


The thing I seem to have an issue with is that I'm storing the password in a string, which I write to the binary file with file.write(reinterpret_cast<char *>(&passWord), sizeof(passWord)); where file is a binary file opened in output mode and passWord is a string. That goes OK.

Now the thing I think is causing me issues is that when I run the program, it reads the file to store the password in the passWord string, yet it doesn't know the size of the data it's supposed to read from the file, and using sizeof(passWord) doesn't seem relevant since passWord is so far uninitialised.

How would you fix this? How can you tell the program to read from a binary file without knowing in advance the size of the data it should read?

Here is my code:

Main.cpp
#include <iostream>
#include "Crypt.h"


int main()
{
	Encryption encrypt;

	pause(-1);
}



Crypt.h
#ifndef CRYPT_H
#define CRYPT_H

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

void pause(int);

class Encryption
{
	private:
		fstream passWordFile;
		string passWord;
		string passWordInput;

		bool passWordSet;
		bool correctPassWord;



	public:
		Encryption();
		~Encryption();

		void checkPWFileIntegrity();

		void setPassWord();
		bool getPassWord();
};



#endif 


Crypt.cpp
#include "Crypt.h"

void pause(int num)
{
	switch (num)
	{
		case -1:
			cout << "\n\n\n Press ENTER to exit...";
			cin.sync();
			cin.get();
			cout << "\n\n\n";
			break;
		case 0 :
			cout << "\n\n\n Press ENTER to resume program...";
			cin.sync();
			cin.get();
			cout << "\n\n\n";
			break;
	}

	return;
}

Encryption::Encryption()
{
	checkPWFileIntegrity();

	if (passWordSet)
		correctPassWord = getPassWord();
	else
		setPassWord();
}

Encryption::~Encryption()
{
}

void Encryption::checkPWFileIntegrity()
{
	ifstream PWFile;
	PWFile.open("pw.bin", ios::binary);

	if (PWFile)
	{
		cout << "\nFile successfully opened!";
	}
	else
	{
		PWFile.close();

		ofstream createFile;
		createFile.open("pw.bin", ios::binary);
		createFile.close();

		PWFile.open("pw.bin", ios::binary | ios::in);

		cout << "\nSuccessfully created \"pw.bin\" file.";
	};

//I THINK THIS IS WHERE THE ISSUE LIES
	PWFile.read(reinterpret_cast<char *>(&passWord), sizeof(passWord));

	if (passWord != "")
		passWordSet = true;
	else
		passWordSet = false;

	cout << "\n\nPassword set status: " << passWordSet
		<< "\nPassword: _" << passWord << "_";

	PWFile.close();

	return;
}

void Encryption::setPassWord()
{
	while (!passWordSet)
	{
		cout << "\n\nPlease enter a password:\t_";
		getline(cin, passWord);

		cout << "Confirm password:\t\t_";
		getline(cin, passWordInput);

		if (passWord == passWordInput)
			passWordSet = true;
		else
			cout << "\n\nInput incorrect..\n";
	}

	passWordFile.open("pw.bin", ios::binary | ios::out);
	passWordFile.write(reinterpret_cast<char *>(&passWord), sizeof(passWord));
	passWordFile.close();


	return;

}

bool Encryption::getPassWord()
{
	cout << "\n\nEnter password: _";
	getline(cin, passWordInput);

	if (passWordInput == passWord)
		return true;
	else
	{
		cout << "\n\nPassword incorrect.";

		return false;
	}
}



Thanks in advance for any help or insight.

Wishing you all a great day!

Hugo.



EDIT:
The error I'm getting:
The file "iosfwd" opens and this part is highlighted:
static constexpr int_type to_int_type(const char& _Ch) _NOEXCEPT
		{	// convert character to metacharacter
		return (static_cast<unsigned char>(_Ch));


Error reads:
Exception thrown: read access violation.
_Ch was 0x2A81F0.


Note that this error is intermittent: depending on what the password is set to, the program either crashes or doesn't. It seems somewhat random.
Well password is a std::string which doesn't contain the actual string itself, just a pointer (and some other data).

string foo = "hello";
string bar = "this is a much longer string containing a quick brown fox";
cout << sizeof(foo) << " " << sizeof(bar) << endl;

The size of the object is decoupled from the size of the string it contains.

For a string, one would normally write out foo.c_str(), or possibly foo.data() if there is any chance that the string contains embedded \0 characters.
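
If the file holds nothing but the one string, a possible sketch is to write only the characters (data() plus size()) and then read the whole file back, so the file's own length decides how much is read. The helper names save_text and load_text are made up for illustration, not taken from the original program.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write only the characters the string manages,
// not the bytes of the std::string object itself.
bool save_text(const std::string& path, const std::string& text)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    out.close();           // flush now so a failed write is noticed
    return !out.fail();
}

// Hypothetical helper: read every byte of the file into a string.
// No size needs to be known in advance.
bool load_text(const std::string& path, std::string& text)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    text.assign(std::istreambuf_iterator<char>(in),
                std::istreambuf_iterator<char>());
    return true;
}
```

This only works while the string is the sole content of the file. As soon as several values share a file, each one needs either a recorded length or a fixed size.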

See also https://isocpp.org/wiki/faq/serialization
@salem c

Yes, I figured that this must be the problem with strings.
Isn't using foo.c_str() going to result in the same issue though, when you read from the file to store the data in an uninitialised variable, without knowing the size of the data you're reading?
Hello hoogo,

My experience has been that you can not use a "std::string" with a binary file.

The binary file is read and written in chunks or blocks of information and needs a fixed size for what is read from and written to the file. A "std::string" has a variable length, and although you can write a "std::string" to a binary file (which may not write the whole string), reading it back from the file is the big problem.

One possibility is to set up a C style character array and write and read this to the binary file. Or just use a regular text file.

In the line:
PWFile.read(reinterpret_cast<char *>(&passWord), sizeof(passWord));
It is very possible that in "sizeof(passWord)" "passWord" may be empty and have a zero length and that is throwing the read off.

Hope that helps,

Andy
Hey Andy,

One possibility is to set up a C style character array and write and read this to the binary file. Or just use a regular text file.


Right, I see what you mean. So would using an array of char, or a C_string with a set maximum number of characters, work?

For instance I create an empty C_string of 40 char and ask the user for a password which I store in the C_string and write it to the binary file.
Could I then read it using the size of a 40 char C_str without having initialised the variable in which I'm going to store the password?

Thanks!
Could I then read it using the size of a 40 char C_str without having initialised the variable in which I'm going to store the password?
Yes.
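
A minimal sketch of that fixed-size approach, assuming a 40-byte buffer as in the question. The names PW_SIZE, write_fixed and read_fixed are invented for the example.

```cpp
#include <cstring>
#include <fstream>
#include <string>

const std::size_t PW_SIZE = 40;  // assumed maximum, including the terminating '\0'

// Hypothetical helper: copy the password into a zero-filled fixed-size
// buffer and write the whole buffer, so the file is always PW_SIZE bytes.
bool write_fixed(const std::string& path, const std::string& pw)
{
    if (pw.size() >= PW_SIZE) return false;  // too long for the buffer
    char buffer[PW_SIZE] = {};               // every byte starts as '\0'
    std::memcpy(buffer, pw.data(), pw.size());

    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(buffer, sizeof buffer);
    out.close();
    return !out.fail();
}

// Hypothetical helper: read exactly PW_SIZE bytes. The buffer needs no
// meaningful initial content because read() overwrites all of it.
bool read_fixed(const std::string& path, std::string& pw)
{
    char buffer[PW_SIZE];
    std::ifstream in(path, std::ios::binary);
    if (!in.read(buffer, sizeof buffer)) return false;  // missing or short file
    buffer[PW_SIZE - 1] = '\0';  // guard against a file with no terminator
    pw = buffer;                 // copies up to the first '\0'
    return true;
}
```

Here sizeof is safe because it is applied to a plain char array, whose size really is the number of bytes stored.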

@Andy
The string will store the data using a pointer. When using the string that way only internal data (i.e. member variables) will be read/written which will corrupt the string object.

It is very possible that in "sizeof(passWord)" "passWord" may be empty and have a zero length and that is throwing the read off.
Actually, sizeof will return the size of the string object, which has nothing to do with the data it manages (through pointers).
Hello hoogo,

As coder777 said, yes. Two years ago when I tried the same thing I created a struct to hold all the information with the idea that I would read and write the struct to the file. I solved the "std::string" problem I had by using a fixed size character array.

Since I wrote that program I have not used it that much.

Andy
Thank you to both of you for your time and help!

Have a great day!

Hugo.
If you want to read and write whole records as a block, that is the only way to do it. You can actually have variable-length records, but that means reading and writing sub-blocks for each record, which is less efficient: the file format becomes size, data, size, data, and you have to pick off each size before reading more. And of course you can no longer seek to a position and get a record that way. The technique is really more useful for something like a JPG file, where the data isn't a bunch of records so much as one entity.
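
To illustrate the seeking point: with fixed-size records, record n starts at byte n * sizeof(Record), so it can be fetched directly. A rough sketch, where the Record layout and the helper names are all invented for the example:

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical fixed-size record: every record occupies sizeof(Record) bytes.
struct Record
{
    char name[32];
    int  score;
};

// Build a record, truncating the name if it does not fit.
Record make_record(const std::string& name, int score)
{
    Record r = {};  // zero-fill the members so name is always terminated
    std::strncpy(r.name, name.c_str(), sizeof r.name - 1);
    r.score = score;
    return r;
}

// Append one record to the end of the file.
bool append_record(const std::string& path, const Record& r)
{
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(&r), sizeof r);
    out.close();
    return !out.fail();
}

// Read record number `index` (0-based) by seeking straight to its offset.
bool read_record(const std::string& path, std::size_t index, Record& r)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.seekg(static_cast<std::streamoff>(index * sizeof(Record)), std::ios::beg);
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&r), sizeof r));
}
```

Writing a struct as raw bytes like this ties the file to one compiler and platform (padding, int size and byte order can all differ), so it suits files that are written and read by the same program.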
#include <cctype>    // std::isprint
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>

std::ofstream& write_string(std::ofstream& out, const std::string& str)
{
    auto size = str.size();
    if (out.write(reinterpret_cast<char*>(&size), sizeof size))
        out.write(reinterpret_cast<const char*>(str.data()), size);
    return out;
}

std::ifstream& read_string(std::ifstream& in, std::string& str)
{
    std::string::size_type size = 0;
    if (in.read(reinterpret_cast<char*>(&size), sizeof size))
    {
        //str.resize(size);           // For C++17, these two lines ...
        //in.read(str.data(), size);  // ... are all you should need.
                                      // But before that:
        auto carray = new std::string::value_type [size];
        in.read(reinterpret_cast<char*>(carray), size);
        str.assign(carray, size); // can assign nul chars (unlike operator=)
        delete [] carray;
    }
    return in;
}

bool dump_hex(const std::string& filename)
{
    std::ifstream in(filename, in.binary);
    if (!in) { std::cerr<<"error opening "<<filename<<'\n'; return false; }

    char byte, chars[17];
    int i = 0;
    unsigned addr = 0;
    std::cout << '\n' << std::hex << std::setfill('0');

    while (in.get(byte))
    {
        if (i == 0)
        {
            std::cout << std::setw(6) << addr << ":  ";
            addr += 16;
        }

        // (I used C-style casts here to avoid clutter.)
        std::cout << std::setw(2) << (unsigned)(unsigned char)byte << ' ';
        // std::isprint requires a value representable as unsigned char.
        chars[i++] = std::isprint(static_cast<unsigned char>(byte)) ? byte : '.';

        if (i == 16) {
            chars[i] = '\0';
            std::cout << "    " << chars << '\n';
            i = 0;
        }
    }

    if (i > 0)
    {
        chars[i] = '\0';
        std::cout << std::setfill(' ') << std::setw((16 - i) * 3) << "";
        std::cout << "    " << chars << '\n';
    }

    return true;
}

int main() {
    //// Store some strings in a binary file.
    std::string a, b, c;

    std::cout << "Enter three strings:\n";
    std::getline(std::cin, a);
    std::getline(std::cin, b);
    std::getline(std::cin, c);

    std::ofstream out("out.bin", out.binary);
    if (!out) { std::cerr<<"error opening output\n"; return 1; }

    write_string(out, a);
    write_string(out, b);
    write_string(out, c);
    out.close();

    //// Read the strings back from the binary file.
    std::string a2, b2, c2;

    std::ifstream in("out.bin", in.binary);
    if (!in) { std::cerr<<"error opening input\n"; return 1; }

    read_string(in, a2);
    read_string(in, b2);
    read_string(in, c2);
    in.close();

    std::cout << '\n';
    std::cout << '[' << a2 << "]\n";
    std::cout << '[' << b2 << "]\n";
    std::cout << '[' << c2 << "]\n";

    dump_hex("out.bin");
}


Enter three strings:
one two three four
alpha beta gamma delta epsilon zeta eta theta
a b c

[one two three four]
[alpha beta gamma delta epsilon zeta eta theta]
[a b c]

000000:  12 00 00 00 00 00 00 00 6f 6e 65 20 74 77 6f 20     ........one two 
000010:  74 68 72 65 65 20 66 6f 75 72 2d 00 00 00 00 00     three four-.....
000020:  00 00 61 6c 70 68 61 20 62 65 74 61 20 67 61 6d     ..alpha beta gam
000030:  6d 61 20 64 65 6c 74 61 20 65 70 73 69 6c 6f 6e     ma delta epsilon
000040:  20 7a 65 74 61 20 65 74 61 20 74 68 65 74 61 05      zeta eta theta.
000050:  00 00 00 00 00 00 00 61 20 62 20 63                 .......a b c
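
The 8-byte length fields in the dump come from std::string::size_type, whose width depends on the platform. A possible variation, sketched here under my own assumptions, stores the length as a fixed-width std::uint32_t and rejects implausible lengths before allocating. The names write_prefixed, read_prefixed and MAX_LEN are hypothetical, and byte order is still whatever the machine uses.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

const std::uint32_t MAX_LEN = 1024;  // assumed upper bound for one string

// Write a 4-byte length followed by the characters.
bool write_prefixed(std::ostream& out, const std::string& str)
{
    if (str.size() > MAX_LEN) return false;
    const std::uint32_t size = static_cast<std::uint32_t>(str.size());
    out.write(reinterpret_cast<const char*>(&size), sizeof size);
    out.write(str.data(), size);
    return static_cast<bool>(out);
}

// Read a 4-byte length, reject implausible values, then read that many
// characters directly into the string's own storage.
bool read_prefixed(std::istream& in, std::string& str)
{
    std::uint32_t size = 0;
    if (!in.read(reinterpret_cast<char*>(&size), sizeof size)) return false;
    if (size > MAX_LEN) return false;  // corrupt or foreign file
    std::string tmp(size, '\0');
    if (size > 0 && !in.read(&tmp[0], size)) return false;
    str.swap(tmp);                     // only replace str on success
    return true;
}
```

For the password file in the original question, one write_prefixed call when setting the password and one read_prefixed call at startup would be enough; a false return from read_prefixed on an empty file can be treated as "no password set yet".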
