Dynamic Memory Allocation Good Practice

I have written the following function to load generic data from a file. Is it bad practise to dynamically allocate memory in this function and then leave it for the user to deallocate (as I have done here)?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
char* AscendantEngine::LoadData(char* filename, unsigned size)
{
	if (!filename) throw Exceptions::BadArgument("AscendantEngine::LoadData - Null filename was passed");
	if (!size) throw Exceptions::BadArgument("AscendantEngine::LoadData - Zero data size was passed");

	std::ifstream file(filename);
	if (file.fail()) throw Exceptions::FileError("AscendantEngine::LoadData - file error");

	char* mem = new char[size];
	file.read(mem, size);

	file.close();

	return mem;
}


The only alternative I can see would be to require the user to pass the pointer to be written to: char* LoadFunction(char* filename, unsigned size, char* ptr). This, however would be cumbersome to work with as many calls would have to be prefaced by a dynamic memory allocation.
It is bad indeed, as user is likely not to free it himself. The second solution, as you said, isn't any better. It would be best to use an object which would take care of memory management. It could be an std::vector, or you can write one yourself.
This is one of the places where I think that garbage collected platforms, such as say Java and C#/.Net, truly excel. And not because they use garbage collector instead of RAII, but because the memory management has a system-wide uniform contract, while in C++ everyone is for themselves, and consequently discrepancies easily happen.

That said, I think that if the user knows how much space he should allocate, it is better to let him handle the allocation. I understand your concern about comfort and readability and it is justified, but the user may have the option to use custom allocator, memory pool, or to reuse the same buffer repeatedly with consecutive calls, and may even not allocate the buffer as separate object.

On the other hand, if the size is not known in advance, and if there is no way to reliably anticipate some tight upper bound, then allocating the memory during the call (directly or indirectly) is unavoidable. You may even provide parameter for custom allocator.

If the caller is not the one that supplies the allocator, there are several strategies to handle the deallocation:

- Create managing wrapper object - like string or vector, that manages the lifetime of the buffer inside it, with several nuances of this solution:

-- The wrapper object copies the entire buffer when you copy the wrapper (like std::vector which you can use directly). This is value semantics. Let the client worry about sharing the object efficiently.

-- The wrapper object copies around only the location of the buffer. This is reference semantics. Let the client worry about concurrent modifications. You can use smart pointer to implement reference counting, or you can implement it yourself

-- The wrapper object uses copy-on-write idiom and allows both sharing without copying and value semantics (I think std::string does that)

- Return the data in static buffer (or thread local storage). Let the caller copy the information from there if it is needed after the next call, when you may overwrite it. This approach is noones favorite, but it helps alleviate the stress on the system from frequent allocations.

- Let the caller worry about releasing the memory. Specify how this should be done.

Managing classes will not simply hide the memory management detail from the caller, but also from the callee. Also, they can provide other useful semantics and facilities. But for a simple buffer, as in your case, using smart pointer may be just as well.

I still personally think that allowing the client of the call to deal with the lifetime management is best. I mean, this will improve the cohesion of the code, because the functions will not be burdened with multiple responsibilities. But I am not firm on that.

Regards
Last edited on
Thanks for the advice. I'll have a think about which approach is best for me and see how that goes :)
In this specific case, why are you not returning a std::string? It's almost always better to return by value when you can. Strings allow you to do that.

More generally, this is solved with RAII and smart pointers. http://www.boost.org/doc/libs/1_46_1/libs/smart_ptr/smart_ptr.htm

@simeonz: I have seen just as many problems with Java garbage collection as with C++ memory leaks of late.
While I have allocated the memory as a character array, it isn't necessarily an actual string. Indeed, it could be a 'memory dump' of a possibly large class. Thus I'm not sure I can return by value since the std::string could be very large.

Howver, the smart pointer option looks pretty good and it sounds like it will also solve my original problem of where to deallocate the memory.

Thanks for the help.
Actually in this case, a vector<char> would be a reasonable option. RVO would eliminate the copy in virtually all cases.

Otherwise, have the caller pass in the destination since it already must know the size:

1
2
3
4
5
6
7
8
9
10
11
12
13
// size bytes will be copied into result.  Caller is responsible for allocating enough space to hold the data.
void AscendantEngine::LoadData(const char* filename, unsigned size, char* result)
{
	if (!filename) throw Exceptions::BadArgument("AscendantEngine::LoadData - Null filename was passed");
	if (!size) throw Exceptions::BadArgument("AscendantEngine::LoadData - Zero data size was passed");

	std::ifstream file(filename);
	if (file.fail()) throw Exceptions::FileError("AscendantEngine::LoadData - file error");

	file.read(result, size);

	file.close();
}


Also, filename should be a const char* since you are not modifying it.
So I can just create an automatic instance of vector<char> and return it by value?

I read about RVO and out of a matter of interest, how can this return by value and hence copy operation be avoided by the compiler? Would it have to change the vector<char> from being automatic?
The standard is actually clear about that:
The Standard wrote:
This elision of copy operations is permitted in the following circumstances (which may be combined to eliminate multiple copies):
— in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object with the same cv-unqualified type as the function return type, the copy operation can be omitted by constructing the automatic object directly into the function’s return value
— when a temporary class object that has not been bound to a reference (12.2) would be copied to a class object with the same cv-unqualified type, the copy operation can be omitted by constructing the temporary object directly into the target of the omitted copy

You can read some of my personal blabber here: http://www.cplusplus.com/forum/general/38647/#msg208328
You may be look into an example here: http://en.wikipedia.org/wiki/Return_value_optimization#Summary

Here is what your code might look like with attempt to use both copy elision in initialization and RVO:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
vector<char> AscendantEngine::LoadData(char* filename, unsigned size)
{
	if (!filename) throw Exceptions::BadArgument("AscendantEngine::LoadData - Null filename was passed");
	if (!size) throw Exceptions::BadArgument("AscendantEngine::LoadData - Zero data size was passed");

	std::ifstream file(filename);
	if (file.fail()) throw Exceptions::FileError("AscendantEngine::LoadData - file error");

	vector<char> mem(size);
	file.read(&mem[0], size);

	file.close();

	return mem; //Could perform RVO
}
int main()
{
	AscendantEngine engine();
	vector<char> v = engine.LoadData("haha", 1024); //Could perform copy elision for this initialization
	//...
}

Regards
For the save function, I assume I should just pass the vector by reference:
void AscendantEngine::SaveData(const char* filename, vector<char>& data, unsigned size);;

Originally, the idea was to use reinterpret_cast so I could load/save any data structure to a file by turning it into a char*.

However, to do this now would not be so simple because the objects I'm dealing with are no longer pointers. How would you suggest I go about turning an arbitrary type into a vector<char> and back? I think the following would work, but there must be something neater and safer.

1
2
3
4
5
6
7
8
9
10
11
some_data foo; // where some_data is a class/structure/etc
// etc
vector<char> bar(sizeof(foo));
for (int i=0; i<sizeof(foo); i++)
      bar[i] = (reinterpret_cast<char*>(&foo))[i]; // is this bad practice?
AscendantEngine::SaveData("foobar.txt", bar); // no longer need to provide size since vector takes care of it
// etc
vector<char> haha = AscendantEngine::LoadData("foobar.txt", sizeof(some_data));
some_data hoho;
for (int i=0; i<sizeof(some_data); i++)
      *((reinterpret_cast<char*>(&hoho))[i]) = haha[i]; // this is worse even than the other one 


Please could you suggest an improved way of dealing with this problem (and I doubt it'll be hard to improve upon the code I've just put there...)

Thanks.
Like so:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
std::ostream& operator<<(std::ostream& output, const some_data& data)
{
    output.write(reinterpret_cast<char*>(&data), sizeof(some_data));
    return output;
}

std::istream& operator>>(std::istream& input, some_data& data)
{
    input.read(reinterpret_cast<char*>(&data), sizeof(some_data));
    return input;
}

{
    some_data foo;
    std::ofstream save_data("foobar.txt");
    save_data << foo;
}

{
    some_data bar;
    std::ifstream save_data("foobar.txt");
    save_data >> bar;
}


You can avoid writing numerous insertion and extraction operators by using templates:

1
2
3
4
5
6
template <typename T>
std::ostream& operator<<(std::ostream& output, const T& data)
{
    output.write(reinterpret_cast<char*>(&data), sizeof(T));
    return output;
}


Though without a little template meta-programming, this is somewhat unsafe.

However, the proper way to do this IMO is to serialize the data. The reason is that reading and writing binary data the way you are doing is rather error prone. For example, 32-bit and 64-bit executables are likely to have different binary representations. Pad sizes change, floating points might be represented differently, etc. Your users are going to be unhappy when they upgrade their machine and find they cannot load their saved game data. Or they may want to load a game saved from their PC onto their Android phone.

Storing the data in a formalized manner (JSON, XML, or some formal binary model) avoids this issue, but requires a bit more work.

edit: fix typo in code example
Last edited on
^Don't forget about saving/loading objects that contain pointers.
I'll look up template meta programming as I've heard it a lot recently but have no idea how it differs from regular template programming.

firedraco, thanks for the tip - I intended to ensure all the classes I saved in this way no pointer members.

Thanks everyone!
Topic archived. No new replies allowed.