Force stream << >> to interpret as binary

I'm writing custom stream classes for a project of mine, and I want to change the formatted extraction operators to interpret everything as binary rather than human readable text. I can't see any easy way to do this without some extreme hacks.

The goal is for the class to be compatible with all code that deals with the abstract std::istream/std::ostream, and for the insertion and extraction operators to write and read everything as binary; that is, writing the integer 7 writes four bytes (assuming a 32-bit int, of course). The class needs to be easy to use for people who don't know how to use .read() and .write(); they will be using operator>> and operator<< for input and output.

Any suggestions?
I'm not skilled enough to help an expert like you with the standard streams, but I can speak about the Qt libraries (I use Qt heavily, so I'm no longer used to the standard facilities such as std::vector or the standard stream classes).

Qt has two stream classes: QTextStream (for text-based streams) and QDataStream (for binary streams). QDataStream can read and write Qt objects in binary form, and the encoding is platform-independent: the byte order is big-endian by default and can be switched to little-endian. You can also overload operators << and >> to handle your own types.

I don't know how to help you with std::istream/std::ostream, so I hope I wrote something useful :(
> The class needs to be easy to use for people who don't know how to use .read() and .write()

A C++ programmer who finds it difficult to understand how to use .read() and .write()?
Shouldn't be attempting to write production C++ code, IMHO.

As a purely academic exercise, you might consider writing wrappers over the standard stream classes; for instance:

#include <iostream>
#include <type_traits>

struct binary_ostream_wrapper
{
    binary_ostream_wrapper( std::ostream& s ) : stm(s) {}

    template< typename T >
    typename std::enable_if< std::is_pod<T>::value, binary_ostream_wrapper& >::type
    operator<< ( const T& v )
    { stm.write( reinterpret_cast<const char*>(&v), sizeof(T) ) ; return *this ; }

    binary_ostream_wrapper& operator<< ( const char* cstr )
    { stm << cstr ; return *this ; }

    // overloads for std::string, std::complex<> etc.

    binary_ostream_wrapper& operator<< ( std::streambuf* buf )
    { stm << buf ; return *this ; }

    binary_ostream_wrapper& operator<< ( std::ios_base& (*manip)( std::ios_base& ) )
    { manip(stm) ; return *this ; }

    // manipulators that take std::ios& (i.e. std::basic_ios<char>&)
    binary_ostream_wrapper& operator<< ( std::ios& (*manip)( std::ios& ) )
    { manip(stm) ; return *this ; }

    binary_ostream_wrapper& operator<< ( std::ostream& (*manip)( std::ostream& ) )
    { manip(stm) ; return *this ; }

    // manipulators with args (if required)

    // C++11 streams convert to bool explicitly, not to const void*
    explicit operator bool () const { return !stm.fail() ; }
    bool operator! () const { return !stm ; }

    // other ostream operations: good(), eof(), fail(), bad(), clear(), flush() etc.
    // (forward to stm)

    std::ostream& stm ;
};

// getline() etc.

int main()
{
    binary_ostream_wrapper bstm(std::clog) ;
    if( bstm << "hello world! " << 100 << std::endl ) std::cout << "ok.\n" ;
}
@Nobun: Qt is not an option.

@JLBorges: The problem is that the wrapper cannot be passed to a function expecting a std::istream/std::ostream and retain the behavior of the << >> operators.

JLBorges wrote:
> A C++ programmer who finds it difficult to understand how to use .read() and .write()?
> Shouldn't be attempting to write production C++ code, IMHO.
You're right, I just keep thinking that I have to make it simpler because the people using this code will be using it because they are new to C++. I guess I shouldn't be teaching them that >> and << are binary. I have a bad habit of this kind of thinking.
> The problem is that the wrapper cannot be passed to a function expecting a std::istream/std::ostream
> and retain the behavior of the << >> operators.

struct binary_ostream_wrapper : std::ostream { /* ... */ }; would allow that.
However, the << >> operators are not virtual, and no binary i/o would take place for << and >> if it is passed to a function expecting a std::istream or std::ostream.

(If they were virtual and were overridden in the manner suggested, it would lead to general insanity anyway - the semantics of the base class would be grossly violated by the derived class.)
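To make the point about static dispatch concrete, here is a minimal sketch. The names binary_ostream, generic_writer, bytes_via_derived and bytes_via_base are made up for illustration; the derived class adds a member operator<< for int that writes raw bytes, and the same object is then handed to a function that only knows about std::ostream.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>

// Hypothetical class: derives from std::ostream and hides operator<< for int.
struct binary_ostream : std::ostream
{
    explicit binary_ostream( std::streambuf* sb ) : std::ostream(sb) {}

    // Writes the object representation of v instead of formatted text.
    binary_ostream& operator<< ( int v )
    { write( reinterpret_cast<const char*>(&v), sizeof v ) ; return *this ; }
};

// Code that only sees the base class: overload resolution picks
// std::ostream::operator<<(int), because the operator is not virtual.
void generic_writer( std::ostream& os ) { os << 7 ; }

// Number of bytes produced when the static type is binary_ostream.
std::size_t bytes_via_derived()
{
    std::stringbuf buf ;
    binary_ostream out(&buf) ;
    out << 7 ;
    return buf.str().size() ;
}

// Number of bytes produced when the same kind of object is used as std::ostream&.
std::size_t bytes_via_base()
{
    std::stringbuf buf ;
    binary_ostream out(&buf) ;
    generic_writer(out) ;
    return buf.str().size() ;
}
```

Called directly, the derived operator writes sizeof(int) bytes; through generic_writer the stream falls back to the formatted text "7", which is one character.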

Yeah. I just thought that there was some flag or something I could set to indicate binary I/O - obviously violating the behavior constraints of the base classes would not be right.
However, I remember that when I need to write a file, I can simply use

ifstream in("filename", ios::in | ios::binary)
ofstream out("filename", ios::out | ios::binary)

I think something similar should also be possible for the parent classes "ostream" and "istream" (ios::binary would let you write your data, even the built-in types, in binary form).
No. The effect of ios::binary is to not automatically convert the various newline styles for you. It has no effect on the formatted input and output operators.
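A small sketch of the difference, using string streams as a stand-in for files (an assumption made only to keep the example self-contained; formatted_bytes and unformatted_bytes are illustrative names). Both streams are opened with ios::binary, yet operator<< still produces text; only .write() emits the raw bytes.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Formatted insertion: ios::binary does not change what operator<< produces.
std::string formatted_bytes( int v )
{
    std::ostringstream os( std::ios::out | std::ios::binary ) ;
    os << v ;
    return os.str() ;
}

// Unformatted output: write() copies the object representation of v.
std::string unformatted_bytes( int v )
{
    std::ostringstream os( std::ios::out | std::ios::binary ) ;
    os.write( reinterpret_cast<const char*>(&v), sizeof v ) ;
    return os.str() ;
}
```

For the value 7, the first function returns the one-character string "7", while the second returns sizeof(int) bytes whose order depends on the platform's endianness.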
LB wrote:
> I just thought that there was some flag or something I could set to indicate binary I/O - obviously violating the behavior constraints of the base classes would not be right.

That sounds like an I/O manipulator.

As a learning exercise, you could derive from num_put and num_get and provide your own do_put() and do_get() (which are the virtual functions to override if you want to change the behavior of operator>> or operator<< for numbers). Your do_put/do_get would either do binary I/O or regular text, based on an iword flag. Then you can write an I/O manipulator that flips that flag.
links:
num_put: http://en.cppreference.com/w/cpp/locale/num_put
num_get: http://en.cppreference.com/w/cpp/locale/num_get
iword: http://en.cppreference.com/w/cpp/io/ios_base/iword

but this is a really long way to go to replace .read() and .write()
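For what it's worth, a rough sketch of that idea for output only might look like the following. The names binary_flag_index, binary_io, text_io, binary_num_put and demo_text_then_binary are invented for this example, only the long overload of do_put is overridden, and the bytes are written in host byte order, so treat it as an illustration rather than a complete solution.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: one iword slot, allocated once, shared by all streams.
inline int binary_flag_index()
{
    static const int idx = std::ios_base::xalloc() ;
    return idx ;
}

// Hypothetical manipulators that flip the per-stream flag.
inline std::ios_base& binary_io( std::ios_base& s ) { s.iword( binary_flag_index() ) = 1 ; return s ; }
inline std::ios_base& text_io( std::ios_base& s )   { s.iword( binary_flag_index() ) = 0 ; return s ; }

// Facet that replaces std::num_put<char> and consults the flag.
struct binary_num_put : std::num_put<char>
{
protected:
    iter_type do_put( iter_type out, std::ios_base& s, char_type fill, long v ) const override
    {
        if( s.iword( binary_flag_index() ) == 0 )            // text mode: default behavior
            return std::num_put<char>::do_put( out, s, fill, v ) ;

        const char* p = reinterpret_cast<const char*>(&v) ;  // binary mode: raw bytes, host order
        for( std::size_t i = 0 ; i < sizeof v ; ++i ) *out++ = p[i] ;
        return out ;
    }
};

// Writes v once as text and once as raw bytes into the same string stream.
std::string demo_text_then_binary( long v )
{
    std::ostringstream os ;
    os.imbue( std::locale( os.getloc(), new binary_num_put ) ) ; // the locale owns the facet
    os << v << binary_io << v ;
    return os.str() ;
}
```

With v equal to 7 the result is the character '7' followed by sizeof(long) raw bytes. Note that inserting an int or a short would also go through the long overload, since that is how operator<< forwards those types to the facet.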
Thanks Cubbi! I'll check those out. I can't believe I didn't think of writing my own stream manipulator :)

Edit: This StackOverflow Q/A pretty much suits me for my purposes:
http://stackoverflow.com/questions/799599/c-custom-stream-manipulator-that-changes-next-item-on-stream
Yes, Cubbi. Thanks!
Hm, I ran into a snag. It seems that with std::num_put::do_put, I can't actually tell the original type passed to the stream insertion operators, which is crucial for binary, and even more so since std::num_get::do_get can tell in more detail which type needs to be extracted.

Any suggestions?
Usually, as you know better than I do, raw binary data doesn't record what kind of data was written; it is normally left to the reader to interpret it.
In fact, when you write a binary file you usually define a "format" that sets the rules for interpreting the data.

So, for example, if offset 0x0f must hold an int, the data at 0x0f is usually not marked as "int"; it is simply the value of the int (it is the reader's job to know that an int is expected there).

So I would like to understand whether your problem is simply the need to write the data correctly for int, long, etc., or whether you also need to "mark" the data as int, long, etc.

I don't know if I can help you solve your problem, but I would be happy to try, if it doesn't bother you. It is interesting for me to try to help regardless of my limited skills: it is good exercise for my eagerness to learn, and it would be nice to help you after you helped me. (My low skill level could be an actual obstacle for your needs, though.)
@Nobun that's not the point - the point is that do_put only has overloads for 4 and 8 byte types, whereas do_get has overloads for 1, 2, 4, and 8 byte types.

Imagine writing a short (two bytes) and having it written as a long (4 bytes), then trying to read it back as a short (2 bytes). All sorts of problems.
Ah, you are right! That's ugly...
Forcing a short to be written as 4 bytes is bad (I hadn't noticed that there is no overload for short).

For now I can see only one way to work around the problem:

Create your own class that either uses an "iostream" object internally (probably naive, but it is the way I would do it) or derives from the iostream class... but in both cases your class will not interoperate directly with code written against the iostream classes.

Ah... I can't think of a decent solution at the moment :(
I think I'll just have to pile on more abstraction layers around these classes, but it would be nice if the stream interface could be used directly and as intended. I guess that the formatted stream insertion operators really were never intended for preserving the original type of the data, but that doesn't really make sense because there are operator<< overloads for all the types.

Technically, I guess I could just make the read process read the usual four bytes even for smaller types, but it seems stupid and hacky. I was specifically asked to provide a C++ std::stream interface for this, but I think I'll have to stick with a separate class that simplifies binary I/O.
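As a rough idea of what such a separate helper could look like, here is a minimal sketch. write_raw, read_raw and round_trip_short are hypothetical names, the helpers only accept trivially copyable types, and the bytes are stored in host byte order, so the result is not portable across platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Writes the object representation of v: exactly sizeof(T) bytes.
template< typename T >
typename std::enable_if< std::is_trivially_copyable<T>::value, std::ostream& >::type
write_raw( std::ostream& os, const T& v )
{ return os.write( reinterpret_cast<const char*>(&v), sizeof(T) ) ; }

// Reads exactly sizeof(T) bytes back into v.
template< typename T >
typename std::enable_if< std::is_trivially_copyable<T>::value, std::istream& >::type
read_raw( std::istream& is, T& v )
{ return is.read( reinterpret_cast<char*>(&v), sizeof(T) ) ; }

// Round-trips a short through a string stream and reports how many bytes were stored.
bool round_trip_short( short in, short& out, std::size_t& bytes )
{
    std::stringstream ss( std::ios::in | std::ios::out | std::ios::binary ) ;
    write_raw( ss, in ) ;
    bytes = ss.str().size() ;
    return static_cast<bool>( read_raw( ss, out ) ) ;
}
```

Because the type is known at the call site, a short occupies sizeof(short) bytes on the way out and the same number on the way back in, which is exactly the information the formatted operators lose.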
You could probably abuse setw:
int n = 0x1234;
mystream << binary_io << std::setw(2) << n; // prints '\x12', '\x34'
mystream << binary_io << std::setw(4) << n; // prints '\0', '\0', '\x12', '\x34'

or even make the output size the parameter of the manipulator, which makes it even closer to .write()
mystream << raw_bytes(2) << n; // prints '\x12', '\x34'
mystream << raw_bytes(4) << n; // prints '\0', '\0', '\x12', '\x34'
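A possible sketch of the raw_bytes variant is below. raw_bytes is the name used above, but its implementation here, along with raw_width_index, raw_num_put and raw_encode, is an assumption made for illustration. The byte count is stored in an iword slot, the facet emits that many bytes most significant first, and the slot is reset after one number, so the setting applies only to the next insertion.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: iword slot holding the requested byte count (0 = text).
inline int raw_width_index()
{
    static const int idx = std::ios_base::xalloc() ;
    return idx ;
}

// Manipulator with an argument: stores the byte count in the stream.
struct raw_bytes
{
    explicit raw_bytes( int n ) : count(n) {}
    int count ;
};

inline std::ostream& operator<< ( std::ostream& os, raw_bytes rb )
{ os.iword( raw_width_index() ) = rb.count ; return os ; }

// Facet that emits 'count' bytes, most significant first, for the next integer.
struct raw_num_put : std::num_put<char>
{
protected:
    iter_type do_put( iter_type out, std::ios_base& s, char_type fill, long v ) const override
    {
        long& width = s.iword( raw_width_index() ) ;
        if( width <= 0 ) return std::num_put<char>::do_put( out, s, fill, v ) ;

        const unsigned long u = static_cast<unsigned long>(v) ;
        for( long i = width - 1 ; i >= 0 ; --i )
        {
            // positions beyond sizeof(long) are padded with zero bytes (no sign extension)
            const unsigned long byte =
                ( i < static_cast<long>( sizeof u ) ) ? ( ( u >> ( 8 * i ) ) & 0xFFul ) : 0ul ;
            *out++ = static_cast<char>(byte) ;
        }
        width = 0 ; // applies to one insertion only
        return out ;
    }
};

// Encodes value using 'count' bytes and returns what ended up in the stream.
std::string raw_encode( int value, int count )
{
    std::ostringstream os ;
    os.imbue( std::locale( os.getloc(), new raw_num_put ) ) ;
    os << raw_bytes(count) << value ;
    return os.str() ;
}
```

With this sketch, raw_encode(0x1234, 2) yields the two bytes '\x12', '\x34', and a count of 4 yields '\0', '\0', '\x12', '\x34', matching the comments above. Values that do not fit in the requested count are silently truncated.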


> do_get has overloads for 1, 2, 4, and 8 byte types.

Technically no, they are overloads for types. The byte counts may vary from platform to platform (and there is no 1 anyway).

In general, the ostream interface isn't designed to handle shorts and ints differently: it's mimicking the usual arithmetic conversions that take place when you use such operands with built-in operators. Actually that makes a good SO question, let's try: http://stackoverflow.com/questions/15908853
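One way to observe this is with a small probe facet, sketched below. probe_num_put and count_long_overload_calls are made-up names; the facet simply counts how often the long overload of do_put is reached and then defers to the default formatting.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

// Probe facet: counts calls to do_put(long), then formats as usual.
struct probe_num_put : std::num_put<char>
{
    explicit probe_num_put( int& counter ) : long_calls(counter) {}

protected:
    iter_type do_put( iter_type out, std::ios_base& s, char_type fill, long v ) const override
    {
        ++long_calls ;
        return std::num_put<char>::do_put( out, s, fill, v ) ;
    }

private:
    int& long_calls ; // refers to a counter owned by the caller
};

// Inserts a short, an int and a long, and reports how many reached do_put(long).
int count_long_overload_calls()
{
    int calls = 0 ;
    std::ostringstream os ;
    os.imbue( std::locale( os.getloc(), new probe_num_put(calls) ) ) ;

    short s = 1 ; int i = 2 ; long l = 3 ;
    os << s << i << l ;
    return calls ;
}
```

All three insertions arrive at the same overload, so the facet has no way of telling which type the caller originally passed.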
I'm not up for abusing anything. I just find it hard to believe that they would make the formatted insertion/extraction operators extensible only to the point that it could support the required manipulators of the C++ standard library and no further.

And yes, I know that it is different types and not different numbers of bytes, which I guess is a pretty good reason for my desired behavior to be considered non-portable. I'm definitely just going to use a binary I/O wrapper (mainly because .read and .write are too buffer oriented to be used practically in my situation)

Thanks for all the info, though! I learned a lot!