Hex output

I am attempting to create a program that, among other things, reads in binary data and writes it in hex format to another file (so that it is shown simply as 4B 56 0F, etc.). The data includes not only integers, but also chars and even filetime data (with which I am particularly unfamiliar). Is there a simple way to read the data out of a file, change it to hex, then write it back out to another file?

Duthomhas (13196)

Sure.

#include <algorithm>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
using namespace std;

struct hex_type
  {
  int  i;
  bool reset;
  hex_type( int i, bool r = false ): i( i ), reset( r ) { }
  };

hex_type reset_hex()
  {
  return hex_type( 0, true );
  }

std::ostream& operator << ( std::ostream& os, const hex_type& h )
  {
  static int count = 0;
  if (h.reset)
    {
    count = 0;
    return os;
    }
  os << std::setw( 2 ) << std::setfill( '0' ) << std::hex << std::uppercase << h.i;
  if (++count > 25)
    {
    count = 0;
    os << std::endl;
    }
  else os << ' ';
  return os;
  }

int main()
  {
  ifstream inf( "in.txt", ios::binary );
  ofstream outf( "out.txt" );

  inf >> noskipws;

  outf << reset_hex();

  copy(
    istream_iterator <unsigned char> ( inf ),
    istream_iterator <unsigned char> (),
    ostream_iterator <hex_type> ( outf )
    );

  return 0;
  }

The hex_type class is just a dummy class to get the cool overloaded operator, which does all the work of converting the char-->int that was read from inf into nice, two-digit, uppercase, hexadecimal values, separated by spaces and arranged in neat lines of 26.
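
As a minimal sketch of what the insertion operator does for a single byte, the helper below formats one value into a string. The name hex_byte() is made up for this illustration and is not part of the program above. It assumes the byte arrives as an unsigned char, so that values above 7Fh do not turn negative when widened to int.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// hex_byte() is a hypothetical helper used only for illustration.
// It formats one byte as two uppercase hexadecimal digits.
std::string hex_byte( unsigned char byte )
  {
  std::ostringstream os;
  // setw() applies to the next insertion only; setfill(), hex and
  // uppercase stay in effect on the stream until they are changed.
  os << std::setw( 2 ) << std::setfill( '0' ) << std::hex << std::uppercase
     << static_cast <int> ( byte );  // widen to int so a number is printed, not a character
  std::string text = os.str();
  assert( text.size() == 2 );        // a byte always fits in two hex digits
  return text;
  }
```

Because hex and uppercase are sticky, a stream that has printed a hex_type will keep printing later integers in hexadecimal until std::dec is inserted.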

I've also added in a reset mechanism in case you want to use hex_type more than once. Just be sure to call reset_hex() before you begin every output operation that starts a new line.

Hope this helps.

marlinde17 (16)

Yes, that is very helpful. That is the form I am trying to create. I have another question or two now, if you don't mind. I have added that code to my program and it does write the hex version. My next step is to separate the hex codes into blocks of, say, 125 bytes, write them to the file, translate them to text and write that (with labels), then repeat. I have tried to modify the code you provided, but I am unable to get it to stop prior to the end of file, much less write text in between sections. Do you have any insight?

Duthomhas (13196)

It is always a good idea to look up the functions you are using in the documentation. For example, the reference for copy(),
http://www.cplusplus.com/reference/algorithm/copy.html
says that it copies the entire range from the begin iterator to the end.

So there are two ways to fix it.
1) Give it a smaller input range,
2) Use an algorithm that stops before it hits the end.

We'll do both: (2) to read input that we can play with, and (1) to write output to file.

Since we are dealing with files and we want to stop on two conditions, a certain number of elements are copied or we hit the end of input, we'll write our own version of the copy() function that adds the maximum number of elements to copy as an argument. (Might as well make it a nice general-purpose algorithm that can be used elsewhere too.)

//--------------------------------------------------------------------------
// copy_max()
//   Copy a maximum number of elements in the input range to the
//   destination.
//
// returns
//   result +num_elements_copied
//
template <typename InputIterator, typename SizeType, typename OutputIterator>
inline OutputIterator copy_max(
  InputIterator  first,
  InputIterator  end,
  SizeType       count,
  OutputIterator result
  ) {
  for (; (count > 0) && (first != end); --count)
    *result++ = *first++;
  return result;
  }

I suppose we could abuse some other STL algorithm to do it, but this makes for more readable code with zero increase in bloat (well, except for about 20 lines of source code).

Now we have all we need to get a block of input.

string s;
copy_max(
  istream_iterator <unsigned char> ( inf ),
  istream_iterator <unsigned char> (),
  125,
  back_insert_iterator <string> ( s )
  );

At this point, we can do whatever we like with s and write stuff to outf.

  outf << "The following block contains the value 65h (ASCII 'e') "
       << std::count(
            s.begin(),
            s.end(),
            (char)0x65
            )
       << " times.\n";

When we're done messing with it, we can do our thing like before:

// istringstream requires #include <sstream>
istringstream iss( s );
iss >> noskipws;  // keep whitespace bytes, same as for inf
copy(
  istream_iterator <unsigned char> ( iss ),
  istream_iterator <unsigned char> (),
  ostream_iterator <hex_type> ( outf )
  );

The hex_type was designed just for this one program, not for general use. You may have noticed it has a magic number in the insertion operator:

if (++count > 25)

I chose to lay the output out in rows of 26 just because that fits nicely in 80 columns. But if you're outputting 125 characters at a time, it will look just a little funny. So you might as well change it to 25 items per row, so you'll have a nice block of (125/25) five full rows. Change that 25 to 24.

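Finally, you want everything in a loop, so you can do it for your entire input. However, there is one caveat you need to be aware of. (It stymied me for a while.) An istream_iterator reads ahead: it extracts an element from the stream when it is constructed and again every time it is incremented. So by the time copy_max() has copied 125 elements, the iterator has already pulled the 126th out of the stream, and a fresh istream_iterator on the same stream starts one byte too late. (Confirmed both for GCC and Borland C++Builder 2006. It looks like a bug at first, but it appears to be how the standard specifies the iterator, so presume other compilers work the same way.)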
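
The effect can be reproduced on a small string stream. This is only a sketch: first_of_second_block() is a made-up helper, and copy_max() is repeated from above so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Same algorithm as copy_max() above, repeated so the sketch stands alone.
template <typename InputIterator, typename SizeType, typename OutputIterator>
OutputIterator copy_max( InputIterator first, InputIterator end, SizeType count, OutputIterator result )
  {
  for (; (count > 0) && (first != end); --count)
    *result++ = *first++;
  return result;
  }

// first_of_second_block() is a hypothetical helper: it copies block_size
// characters from text, then starts a fresh istream_iterator on the same
// stream and returns the first character that new iterator delivers.
char first_of_second_block( const std::string& text, std::size_t block_size )
  {
  assert( text.size() > block_size + 1 );  // make sure a second block exists
  std::istringstream iss( text );
  iss >> std::noskipws;

  std::string block;
  copy_max(
    std::istream_iterator <char> ( iss ),
    std::istream_iterator <char> (),
    block_size,
    std::back_inserter( block )
    );

  // The first iterator already extracted one character past the block,
  // so this new iterator starts one position later than you might expect.
  std::istream_iterator <char> next( iss );
  return *next;
  }
```

With the text "abcdef" and a block size of 2, the first block is "ab", but the next iterator delivers 'd' rather than 'c'. The 'c' was extracted by the first iterator and is gone once that iterator is destroyed.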

Fortunately, since we are playing with a file, we've got nice random-access powers that we might as well put to work and fix it. Our loop:

// While there's input, get sections.
// The last section may not be fully populated.
for (unsigned section_number = 0; inf; section_number++)
  {
  // Work around the istream_iterator read-ahead: go to the next block to read.
  inf.seekg( section_number *125 );

  // Get the block of input
  string s;
  copy_max(
    istream_iterator <unsigned char> ( inf ),
    istream_iterator <unsigned char> (),
    125,
    back_insert_iterator <string> ( s )
    );

  // Play with it
  outf << "Section "
       << (section_number +1)
       << " contains the value 65h (ASCII 'e') "
       << std::count(
            s.begin(),
            s.end(),
            (char)0x65
            )
       << " times.\n";

  // And write our pretty rows
  istringstream iss( s );
  iss >> noskipws;
  copy(
    istream_iterator <unsigned char> ( iss ),
    istream_iterator <unsigned char> (),
    ostream_iterator <hex_type> ( outf )
    );
  outf << endl;
  }
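
If the seekg() workaround feels awkward, one alternative (not what the loop above does) is to skip the iterators for input and fetch each block with the stream's own read() function. The sketch below assumes a helper named read_block(), which is invented here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// read_block() is a hypothetical helper, not part of the code above.
// It reads up to max_bytes raw bytes and returns however many were
// actually available, so the final block may be shorter (or empty).
std::string read_block( std::istream& in, std::size_t max_bytes )
  {
  assert( max_bytes > 0 );
  std::string block( max_bytes, '\0' );
  in.read( &block[ 0 ], static_cast <std::streamsize> ( max_bytes ) );
  // gcount() reports how many bytes the last read() really extracted
  block.resize( static_cast <std::size_t> ( in.gcount() ) );
  return block;
  }
```

A loop can then call read_block( inf, 125 ) until it returns an empty string. read() is unformatted input, so it never skips whitespace, but the file still needs to be opened with ios::binary.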

Whew. Hope this helps. :-)

marlinde17 (16)

Wow, yeah it does. Thanks so much for your help. As much as I hate to say it, there is one last thing. It doesn't seem to recognize a blank space (or 20h). Is there a way to fix this or did I just type something in wrong? Thanks again.

Duthomhas (13196)

You forgot the 'iss >> noskipws;' line in the loop.

The iostream iterators use the << and >> operators to handle I/O, so you have to tell the stream you are operating on to stop doing all that fancy 'ignore whitespace' stuff.

The same goes for making sure your istreams are opened with ios::binary, otherwise anything that looks like a newline (CRLF) will be treated as such. Binary turns off that translation and gives you the input data exactly as-is.
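
Here is a small sketch of the whitespace effect, using a string stream in place of the file. The helper name read_with_iterators() is invented for this example.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// read_with_iterators() is a hypothetical helper for illustration.
// It reads every character of data through istream_iterator, with or
// without the stream's default whitespace skipping.
std::string read_with_iterators( const std::string& data, bool keep_whitespace )
  {
  std::istringstream iss( data );
  if (keep_whitespace) iss >> std::noskipws;
  // The extra parentheses keep this from being parsed as a function declaration
  std::string result(
    ( std::istream_iterator <char> ( iss ) ),
    std::istream_iterator <char> ()
    );
  assert( result.size() <= data.size() );  // bytes can be dropped, never added
  return result;
  }
```

Without noskipws the 20h bytes never reach the output iterator, which is why they are missing from the hex dump.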

BTW. The answer I've given you is often considered 'pretty advanced stuff', but it really isn't. I'm glad you've made sense of it so readily. Good job!

:-)

marlinde17 (16)

I had that one, but I discovered that "inf >> noskipws" is also necessary since apparently it was ignoring them when it first read from the file. Thanks for all your help.

marlinde17 (16)

This code has been incredibly helpful. One last question about it. I am trying to use it in multiple locations (class routines of more than one class). If I were to put it in its own file to reference it from, how would I go about that? All my attempts have resulted in either multiple definition errors or a binary operator '=' error that I have not been able to solve. Any help is appreciated.

Duthomhas (13196)

I'm not sure I understand. How is it placed in another file and how are you using it?

To put it in another file, the copy_max() function and the function prototype for the routine that does stuff should be in a header file (hpp) and the cpp file should include the header and define the routine's body.

Thereafter, other files in the project only need include the header, and compilation should work the same as ever:

g++ main.cpp routine.cpp other.cpp another.cpp ...

Let me know if that works.

marlinde17 (16)

Right, that is what I am attempting to do: put it into a header file and a cpp file. But I guess I am putting things in the wrong places, or too much in the header, or something.

Duthomhas (13196)

Did you forget the #include guards (the #ifndef/#define/#endif lines that protect a header against multiple inclusion)? Without seeing what you have done I can't speculate further...

marlinde17 (16)

I'm afraid I am not familiar with the term #include guards. I will try to get an example of what I have tried so far and post it soon.

marlinde17 (16)

I can put the hex_type class in another file, and also reset_hex, without problem. The biggest problem I am having is writing prototypes for the std::ostream& operator << and for the template ... OutputIterator copy_max. I am unfamiliar with both types of "functions" and am unsure what goes in a file and what in the prototype, or where to look it up. Of course, including everything in multiple places generates a multiple definition error. Simply taking the whole thing up to the { and using it as a prototype also fails, usually causing a binary operator error. Any guidance is appreciated and I can try to give more specific info if needed.

Duthomhas (13196)

Template functions go entirely into header files. There's no need to write prototypes... Just stick the whole thing in the same file that you use to #include the prototypes for your functions.

Here you go:

// hex_type.hpp

#ifndef HEX_TYPE_HPP
#define HEX_TYPE_HPP

#include <iostream>

struct hex_type
  {
  int  i;
  bool reset;
  hex_type( int i, bool r = false ): i( i ), reset( r ) { }
  };

inline hex_type reset_hex()
  {
  return hex_type( 0, true );
  }

std::ostream& operator << ( std::ostream& os, const hex_type& h );

template <typename InputIterator, typename SizeType, typename OutputIterator>
inline OutputIterator
copy_max(
  InputIterator  first,
  InputIterator  end,
  SizeType       count,
  OutputIterator result
  ) {
  for (; (count > 0) && (first != end); --count)
    *result++ = *first++;
  return result;
  }

#endif

// hex_type.cpp

#include <iomanip>
#include <iostream>

#include "hex_type.hpp"

std::ostream& operator << ( std::ostream& os, const hex_type& h )
  {
  static int count = 0;
  if (h.reset)
    {
    count = 0;
    return os;
    }
  os << std::setw( 2 ) << std::setfill( '0' ) << std::hex << std::uppercase << h.i;
  if (++count > 25)
    {
    count = 0;
    os << std::endl;
    }
  else os << ' ';
  return os;
  }

You can stick the copy_max() template function in its own header file if you like (I actually have a bunch of them like that that I wrote), but the way it is above is fine also.

Now to use hex_type and/or copy_max() all you have to do is #include "hex_type.hpp" in your file and use it normally. When you compile, make sure that "hex_type.cpp" gets compiled and linked into the executable.

Hope this helps.

marlinde17 (16)

That is again very helpful, and again brings up one more question. I added this to the operator:

std::ostream& operator << ( std::ostream& os, const hex_type& h )
  {
  static int total = 0;
  static int count = 0;
  if (h.reset)
    {
    count = 0;
    return os;
    }
  os << std::setw( 2 ) << std::setfill( '0' ) << std::hex << std::uppercase << h.i;
  if (++total == X) // different number depending on file
    {
    count = 0;
    total = 0;
    os << std::endl;
    }
  else if (++count > 7)
    {
    count = 0;
    os << std::endl;
    }
  else os << ' ';
  return os;
  }

The purpose of this was that I needed rows of eight, and in some instances the total was not a multiple of eight, so I added this in order to make it write a partial line in those instances. Now I need to change the number (which I denoted by X) in each file that uses that operator. The overloaded operator of course only takes two arguments, so I cannot pass it in there. Currently I am using renamed versions of copy_max in each different file in order to hard-code the X, but that can't possibly be the best way. Is there another solution?

Duthomhas (13196)

The original design wrote partial lines just fine. There is no need to add anything for that case.

The original assumption was that each line is always the same length. You can set it differently by modifying the reset_hex() function.

However, here I've completely re-written it so you can get a better idea of what is going on. I've also added standard ostream manipulators (instead of the hex_type manipulator) to reset to the beginning of a new line and to change the number of items that can be output per line.

// hex_type.hpp

#ifndef HEX_TYPE_HPP
#define HEX_TYPE_HPP

#include <iostream>

struct hex_type
  {
  // This is the value that is printed
  int value;

  // These control line formatting
  static int current_count;  // number of items output to the current line
  static int total_count;    // total number of items to output per line

  // Initialization Constructor for use by the insertion operator
  hex_type( int value ): value( value ) { }
  };

std::ostream& hex_newline(  std::ostream& outs );

struct hex_newlength { };
hex_newlength hex_linelength( int newlength );
inline std::ostream& operator << ( std::ostream& outs, const hex_newlength& foo ) { return outs; }

std::ostream& operator << ( std::ostream& outs, const hex_type& hex_value );

#endif

// hex_type.cpp

#include <iomanip>
#include <iostream>
#include "hex_type.hpp"

int hex_type::current_count = 0;   // current number of items per line (initialize at beginning of line)
int hex_type::total_count   = 25;  // maximum number of items per line (default)

std::ostream& hex_newline( std::ostream& outs )
  {
  // Reset ourselves to thinking we are at the beginning of the line
  hex_type::current_count = 0;
  return outs;
  }

hex_newlength hex_linelength( int newlength )
  {
  // Change the number of items per line
  // but only if the new length is valid
  if (newlength > 0)
    {
    hex_type::total_count = newlength;
    }
  return hex_newlength();
  }

std::ostream& operator << ( std::ostream& outs, const hex_type& hex_value )
  {
  // Do we need to start a new line?
  if (hex_value.current_count >= hex_value.total_count)
    {
    hex_value.current_count = 0;
    outs << std::endl;
    }

  // Output a properly formatted value
  outs << std::setw( 2 ) << std::setfill( '0' ) << std::hex << std::uppercase << hex_value.value;

  // Separate the output values with space
  if (++hex_value.current_count < hex_value.total_count) outs << ' ';

  // As always, return the ostream for proper operator chaining
  return outs;
  }

Hopefully this new code (which is more strictly professional than the original) gives you new insights into what is going on, and provides you with some cool capabilities.

There is no need to watch out for uneven lines. The design handles that. All you have to do is stop writing the current line before you hit the end.

Use it as normal:

  ...

  // start a new line, with a new line length
  outf << hex_newline << hex_linelength( 8 );

  // output all the items
  copy(
    istream_iterator <unsigned char> ( iss ),
    istream_iterator <unsigned char> (),
    ostream_iterator <hex_type> ( outf )
    );
  outf << endl;

  ...

Hope this helps.

[edit] urgh, wait a sec. I think I goofed. Give me a minute to fix it.
[edit2] done. sorry about that. it was a really stupid mistake. Anyway, now you can see how to implement I/O manipulators.
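
The manipulators above keep their settings in static members, so every stream shares one line length. If each output file needs its own value, the standard library offers per-stream storage through ios_base::xalloc() and iword(). The following is only a sketch of that idea; the names line_length_index, set_line_length and get_line_length are invented here and do not appear in hex_type.hpp.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>  // only needed for the usage described below

// All names in this sketch are hypothetical.

// Reserve one slot in every stream's private integer array (done once).
inline int line_length_index()
  {
  static const int index = std::ios_base::xalloc();
  return index;
  }

// A manipulator object that carries its argument until it is inserted.
struct set_line_length
  {
  long length;
  explicit set_line_length( long length ): length( length ) { }
  };

// Inserting the manipulator stores the length in that one stream only.
inline std::ostream& operator << ( std::ostream& outs, const set_line_length& manip )
  {
  assert( manip.length > 0 );
  outs.iword( line_length_index() ) = manip.length;
  return outs;
  }

// Read the stored length back; unset slots hold 0, so use a fallback.
inline long get_line_length( std::ostream& outs, long fallback = 25 )
  {
  long stored = outs.iword( line_length_index() );
  return (stored > 0) ? stored : fallback;
  }
```

After outf << set_line_length( 8 ), a call to get_line_length( outf ) returns 8, while any other stream still reports the fallback of 25. An insertion operator for hex_type could call get_line_length( outs ) instead of reading a static member.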

marlinde17 (16)

As soon as I read the first line, I realized that I had completely forgotten the purpose of the reset_hex function. Sorry I didn't think it through more before asking. The new way did look simpler. I guess it is still over my head a little though, since I do not understand how it knows what the input file is. In the original it was passed into copy_max(). Is copy_max still used? Or am I just missing the point yet again? It's not a big deal, the original gets the job done when I remember to use reset_hex. Thanks for all of your help.

[edit] Took a closer look

Duthomhas (13196)

Don't feel bad. This isn't exactly beginner stuff. But it is simple enough that you can follow it.

Yes, copy_max() (or whatever other copy function you want to use) is still OK.

The hex_type class doesn't know anything about anything. It just does two things:

1. Prints an int as a two-digit hexadecimal value on some (unknown) output stream

2. Keeps track of two numbers that it uses to decide when to output a newline. (It has no idea if it is correct or not. It expects that you are using the reset and linelength manipulators correctly.)

Essentially all it is is a transformer. The copy_max() function reads an unsigned char from the input iterator (your list of bytes), turns it into a hex_type, then writes the new hex_type value to the output iterator (your output file).

Since we overloaded the ostream << operator to know how to print a hex_type, what we get is an output stream where all the input bytes are turned into two-character hex strings.

You'll notice that the hex_type class only stores information: the integer it represents, and access to two global values.

The ostream << operator does all the work of turning a hex_type into a string of characters.

BTW, the new version is superior because it handles newlines more correctly. The old one printed a newline whenever it finished a line. The new one only prints a newline when it starts a new line.
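
To make the difference concrete, here is a simplified sketch of the "newline when a line starts" rule, written as a plain function rather than an insertion operator. format_hex_rows() is a made-up name, and unlike hex_type it also leaves no trailing space on a partial line.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// format_hex_rows() is a hypothetical helper for illustration only.
// It returns the bytes as two-digit hex values, per_line values per row.
std::string format_hex_rows( const std::string& bytes, std::size_t per_line )
  {
  assert( per_line > 0 );
  std::ostringstream os;
  os << std::hex << std::uppercase << std::setfill( '0' );
  for (std::size_t n = 0; n < bytes.size(); ++n)
    {
    // Decide on the separator before printing a value, not after it:
    // a newline if this value starts a new row, otherwise a space.
    if (n > 0) os << ((n % per_line == 0) ? '\n' : ' ');
    os << std::setw( 2 )
       << static_cast <int> ( static_cast <unsigned char> ( bytes[ n ] ) );
    }
  return os.str();
  }
```

Since the separator is only written once the next value is known to exist, a block that ends exactly at the end of a row is not followed by a stray blank line.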

Don't try to read too much into it all. C++ tends to make really simple stuff take a lot of typing... :-)
