Input Output of 32 bit sizes in chars

Aug 15, 2016 at 5:04pm

I want to know how, if there's any reasonable way, to take a 32 bit int and turn it into a character that will be counted as that number in a formatted file. I have done this before. I saved a long number as a char, and I then extracted that same number later. I want to know how I must've done it. I'm very very close to finishing a program and I need to know what it was I did. I tried to hire a programmer and he just gave me crap that it can't be done. I have seen it be done. I realize I'll be using char32_t. But if you can tell me how to input and output those, I'd be much appreciative. Thanks.

Aug 15, 2016 at 6:50pm

closed account (48bpfSEw)

void did_you_mean() { this ->

http://stackoverflow.com/questions/16593775/how-to-convert-u32string-to-int-in-c11

}

Aug 16, 2016 at 4:36am

mbozzi (3943)

Do you mean "convert a 32-bit integer to a string"?

#include <string>
// ...
std::cout << std::to_string(my_value) << "\n";

A file is a collection of arbitrary bytes with a known size. Formatted how? Counted how?

An arbitrary 32-bit value cannot fit in a single 8-bit value. That's common sense. I must be missing something.

Last edited on Aug 16, 2016 at 4:36am

Aug 16, 2016 at 1:25pm

thexiv (12)

	char d;

	for (int x = 0 ; x < H.size(); ++x) {
		m++;
		long int b = H[x].to_ulong();
		d = b;
		out << d;
	}

H is a vector of bitset<32>'s. I'm curious that I can do this, and I wanted some professional experience to validate my claim. This actually works as far as I can tell. I don't know if you knew that. But you can get maximum space relief with it in a compression scenario.

Last edited on Aug 16, 2016 at 1:41pm

Aug 16, 2016 at 9:20pm

mbozzi (3943)

If you're concerned about space then std::vector is the wrong approach.
std::vector allows (amortized) constant time insertion at the end by growing the container capacity geometrically.

Therefore, with the common growth factor of 2, up to about half of the allocated memory in the vector may hold no bitsets at any given time.

In your code, b should be unsigned. If H[x].to_ulong() evaluates to a value that's too big to fit in the signed integer, then the resulting value of that signed integer is implementation-defined.

You also can't rely on long int (unsigned or not) being 32 bits wide. If you require that, use std::uint32_t from <cstdint> instead.

The conversion from long int to char in d = b; discards information.

As for whether or not this actually does what you intend, I can't tell, since you never clarified anything. You are dropping the top bits of the value, which is exactly what a simple assignment like the one on line 6 (d = b;) does.

Last edited on Aug 16, 2016 at 9:20pm

Aug 16, 2016 at 9:31pm

thexiv (12)

Okay, you're right on this, you almost nailed what I was looking for with the things you said there, @mbozzi. An unsigned char, according to climits.h, allows for 256 or greater size for chars. Do I have to use climits in my code to stop the discarding of information?

Aug 16, 2016 at 10:03pm

mbozzi (3943)

I still don't know what you're trying to do!

By definition a C++ char (which is different than a "character") is a value that is 8 bits wide.
You cannot fit 32 bits of entropy into 8 bits.

If you want to write a 32 bit value to a file as a string and then read it back you can write code like this:

#include <cstdint>
#include <fstream>
#include <iostream>

int main (int, char **) {
  std::uint32_t write_val = 123456789;
  std::uint32_t read_val  = 0;

  char const * filename = "my-file";

  { /* Write `write_val' out to a file. */
    std::ofstream out_strm (filename);
    /* Throw std::ios_base::failure if opening or writing fails. */
    out_strm.exceptions (std::ios::badbit |
                         std::ios::failbit);

    out_strm << write_val; /* Write the value out as decimal text. */
  } /* Output stream is closed here. */

  { /* Now read it back. */
    std::ifstream in_strm (filename);
    /* eofbit is left out of the mask on purpose: the extraction below
       runs into the end of the file after the last digit, which sets
       eofbit even though the read itself succeeds. */
    in_strm.exceptions (std::ios::badbit |
                        std::ios::failbit);
    in_strm >> read_val; /* Now read_val contains whatever we wrote out.  */
  } /* Input stream is closed here. */

  /* Check the value of read_val. */
  std::cout << "Read the value " << read_val << " into read_val\n";
}

You can find much more information about this in the tutorial on this site and elsewhere. Nowhere is a `char' used to hold the value in this code.

Last edited on Aug 16, 2016 at 10:07pm

Aug 16, 2016 at 10:12pm

thexiv (12)

4294967192*0 4294967245*0 4294967192*0 12*0 4294967236*0 51*0 24*0 4294967177*0 102*0 123*0 4294967193*0 92*0 24*0 17*0 4294967168*0 2*0 3*0 4294967178*0 97*0 99*0 4294967232*0 4294967217*0 12*0 4294967276*0 4294967168*0 74*0 4294967217*0

The left side of the asterisk is the read in value of the 'H' vector after output to file. The left (which apparently wants to be a bugger right now) is on the right of the asterisk. I know, that if I come out with the left, I'm going to be able to get the information back. That's my goal. I'm making a compression utility. Just a neat wrapper I made, isn't it? According to climits.h a char is not represented only in 8 bits. But it can be made of any number of bits.

Aug 16, 2016 at 11:39pm

mbozzi (3943)

Okay, you're right.

A char is not necessarily 8 bits. It is necessarily one machine byte. The size of a char is a constant expression. The constant expression sizeof(char) evaluates to 1. The precise number of bits in a char can be obtained from the C macro constant CHAR_BIT.

Since your long int is 32 bits I feel like it is a reasonable assumption that CHAR_BIT is 8.

--

So where's the compression going on? What does the left side of the asterisk represent? You haven't really explained your problem.

Compression relies on the fact that most messages contain less entropy than their length suggests. This entropy is the information-theoretic kind, i.e., "Shannon entropy".

If you hope to compress data but not lose any information (called "lossless compression"), you must have (or obtain through analysis) information about the data you are compressing. The total entropy of a message represents the absolute lower limit on the size of the compressed message.

You can't fit 32 bits of message into 8 bits unless you know something about the 32 bits. Truncating the top 24 just won't work.

Aug 16, 2016 at 11:47pm

thexiv (12)

void Node::insert_leaf(bool HL, bool LR) {
	HiLo.push_back(HL);
	if (HiLo.size() >= outSize) {
		for (int x = 0; x < outSize; x++)
			tempB[x] = HiLo[x];
		temp = tempB.to_ulong();
		H.push_back(temp);
		HiLo.erase(HiLo.begin(),HiLo.begin()+outSize);
	}
	LftRgt.push_back(LR);

	if (LftRgt.size() >= outSize) {
		for (int x = 0; x < outSize; x++)
			tempB[x] = LftRgt[x];
		temp = tempB.to_ulong();
		R.push_back(temp);
		LftRgt.erase(LftRgt.begin(),LftRgt.begin()+outSize);
	}
}

while (counter < length) {
	int p = static_cast<int>(mybuffer[counter]);
	counter++;

	// Here we give the relative bit size
	// 8 can be changed to whatever.
	// you'll always end up with 1/2 + 1

	long long a = static_cast<long long>(pow(2, MAX_BITS));

	if (p < 0)
		p = p * (-1);
	totalCnt++;

	do {
		if (0 <= p - a/2 - a/4) {
			insert_leaf(Hi, RS);
			p = p - a/2 - a/4;
		}
		else if (0 <= p - a/2) {
			insert_leaf(Hi, LS);
			p = p - a/2;
		}
		else if (0 <= p - a/4) {
			insert_leaf(Lo, LS);
			p = p - a/4;
		}
		else
			insert_leaf(Lo, RS);

		a = a/4;
	} while (p != 0);
}

That's my compressor. It looks like I'm taking 4 bits, but I'm really recording 8. It's just a strange way to count them (CONFIDENTIAL TASK). I just want a software version that does the same. I showed you the wrapper to get 32bit chars, in which I save 75% of my file size. the top wrapper just saves them into sequential bytes. I know it sounds lame, but I save 75% (somewhere in the 32 bit chars, as stated).

Last edited on Aug 16, 2016 at 11:49pm
