I am working on a program that sends data via mmap from one process to another. A buffer is used which holds sequential pairs of uint16_t values. Each pair contains x and y coordinates of an array of h264 macroblocks in a frame and will have values from 0 to 1920 for x and 0 to 1080 for y.
My thought was to bit shift the uint16_t to the right by 4 in order to divide by 16. If the value is 16 or below, then change the sign. On the other end, decompress by bit shifting to the left by 4 and, if the value is negative, just multiply by -1. The values that are not multiples of 16 will obviously not decompress to the original value.
Is there a better algorithm or library that is fast and provides greater precision?
How do I decompress the last two values to their original and still have the type as uint16_t?
I'm probably missing something, but I don't understand what the compression is for. 1080 and 1920 are way below the upper limit of uint16_t (65535), so a pair of uint16_t should be able to store the x, y coordinates just fine.
Sorry, I should have been more clear. A buffer is used to transfer data between two processes via mmap. There could be up to 4 pairs of these processes communicating this way. Right now a pair of coordinates takes up 4 bytes (2 uint16_t). If I can use a pair of uint8_t instead, then I can halve the size of the buffer. I am working on the Raspberry Pi and don't have a lot of system RAM to work with.
Are you willing to lose the last 3 bits? Then you can simply shift it by 3 to the right for "compressing" and shift it to the left by 3 for "uncompressing". The low 3 bits of the uncompressed values will be zeroes.
The whole "negating" idea simply doesn't work.
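A minimal sketch of the shift-by-3 idea, assuming each coordinate is at most 1920 (the helper names are made up for illustration). Since 1920 >> 3 is 240, the shifted value fits in a uint8_t without any sign tricks:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; the thread only describes the shifts.
constexpr unsigned kShift = 3;  // drop the low 3 bits, so multiples of 8 survive

inline std::uint8_t compress_coord(std::uint16_t v) {
    assert(v <= 1920);  // 1920 >> 3 == 240, which still fits in 8 bits
    return static_cast<std::uint8_t>(v >> kShift);
}

inline std::uint16_t decompress_coord(std::uint8_t c) {
    // The low 3 bits come back as zero.
    return static_cast<std::uint16_t>(c << kShift);
}
```

With these, 1920 survives the round trip unchanged, while 1083 comes back as 1080.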
If you can live with saving 25% instead of 50%, that would be pretty easy. Just put each value into 12 bits. You can pack 8 of those into 12 bytes (instead of 16 bytes for uint16_t). The values can actually fit into 11 bits (which, if packed, would save about 31%), but that's quite a bit more work to compress/uncompress for only a little more savings.
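For illustration, here is one way the 12-bit packing could look, two values per 3 bytes (the same ratio as 8 values per 12 bytes). The byte layout and the function names are arbitrary choices for this sketch, not something specified in the thread:

```cpp
#include <cassert>
#include <cstdint>

// Pack two values (each below 4096) into 3 bytes. Assumed layout:
//   out[0] = low 8 bits of a
//   out[1] = high 4 bits of a (low nibble) and low 4 bits of b (high nibble)
//   out[2] = high 8 bits of b
inline void pack12(std::uint16_t a, std::uint16_t b, std::uint8_t out[3]) {
    assert(a < 4096 && b < 4096);
    out[0] = static_cast<std::uint8_t>(a & 0xFF);
    out[1] = static_cast<std::uint8_t>((a >> 8) | ((b & 0x0F) << 4));
    out[2] = static_cast<std::uint8_t>(b >> 4);
}

// Reverse of pack12: rebuild both 12-bit values from 3 bytes.
inline void unpack12(const std::uint8_t in[3], std::uint16_t& a, std::uint16_t& b) {
    a = static_cast<std::uint16_t>(in[0] | ((in[1] & 0x0F) << 8));
    b = static_cast<std::uint16_t>((in[1] >> 4) | (in[2] << 4));
}
```

Nothing is lost this way: packing (1920, 1080) and unpacking it again gives back exactly (1920, 1080).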
If all values (0 ≤ x ≤ 1920 ; 0 ≤ y ≤ 1080) are possible, and you don't want to lose precision, then you need to use at least 21 bits for each x y coordinate pair.
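To see where 21 bits comes from: there are 1921 × 1081 = 2,076,601 possible pairs, which is more than 2^20 but less than 2^21. A sketch of one way to map a pair to such a number, treating it as a two-digit mixed-radix value (the names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kXCount = 1921;  // x in [0, 1920]
constexpr std::uint32_t kYCount = 1081;  // y in [0, 1080]

// 1921 * 1081 = 2,076,601 distinct pairs, below 2^21 = 2,097,152.
static_assert(kXCount * kYCount <= (1u << 21), "a pair fits in 21 bits");
static_assert(kXCount * kYCount > (1u << 20), "but not in 20 bits");

inline std::uint32_t encode21(std::uint16_t x, std::uint16_t y) {
    assert(x < kXCount && y < kYCount);
    return y * kXCount + x;  // every (x, y) gets a unique number
}

inline void decode21(std::uint32_t n, std::uint16_t& x, std::uint16_t& y) {
    x = static_cast<std::uint16_t>(n % kXCount);
    y = static_cast<std::uint16_t>(n / kXCount);
}
```

Note that the result still sits in a 32-bit integer here, so this only saves space if the 21-bit values are then bit-packed back to back in the buffer.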
You can run a zip compression over a block of them and probably reduce it by 50 to 90% or so. The standard gz library, maybe? This costs time to pack and unpack, of course, and again to bundle a group, so it may not be ideal for a stream or real-time use. Compression will handle the extra bits, so you don't have to squeeze each pair into 21 bits if you go this route.
Are the numbers correlated at all? You may be able to exploit that (e.g. if you can compute one from the other). Do you know anything about the numbers (statistics, etc.)?
Using a memory map is really fast. Are you sure that this is the bottleneck? I would think that you could transfer the data faster than you could do anything with it.
I am working on the raspberry pi and don't have a lot of system RAM to work with.
If the data won't fit in the memory? This device probably uses a flash disk, not a rotating hard drive, right? It's probably almost as fast as RAM to write to a file, if that is the case... might be all you need is a disk file...?
@Peter87, How do you pack 1919,1079 into 21 bits? It seems it might be possible, but I'm not sure how. 10 bits can hold a max value of 1023 (56 less than we need for 1079), but 11 can hold 2047 (128 more than we need for 1919) so I suppose there's a little wiggle room there, but how to use it?
(I'm assuming the OP's mistaken that he needs to hold up to 1920 and 1080, but if that's the case it would obviously be the same problem.)
Yes, it kind of sucks.
OP has said losing the 4 least significant bits of precision of each dimension is acceptable, and both 1920 and 1080 are divisible by 8. 1920 * 1080 / 64 < 2^15. That's more easily packable.
encode(x, y) = x / 8 + y * 1920 / 8
decode(n) = (n % (1920 / 8) * 8, n / (1920 / 8) * 8)
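A possible C++ rendering of the formulas above, as a sketch that assumes x in [0, 1920) and y in [0, 1080). Note that with integer arithmetic y has to be divided by 8 before it is multiplied by the row width, otherwise the intermediate value no longer fits in 16 bits:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

constexpr std::uint16_t kStep = 8;             // precision given up per axis
constexpr std::uint16_t kCols = 1920 / kStep;  // 240 cells per row

// Assumes x in [0, 1920) and y in [0, 1080).
inline std::uint16_t encode(std::uint16_t x, std::uint16_t y) {
    assert(x < 1920 && y < 1080);
    // Divide y first: y * 1920 / 8 would exceed 16 bits for large y.
    return static_cast<std::uint16_t>(x / kStep + (y / kStep) * kCols);
}

// Returns (x, y), each rounded down to a multiple of 8.
inline std::pair<std::uint16_t, std::uint16_t> decode(std::uint16_t n) {
    return { static_cast<std::uint16_t>(n % kCols * kStep),
             static_cast<std::uint16_t>(n / kCols * kStep) };
}
```

The largest input, (1919, 1079), encodes to 32399 and decodes to (1912, 1072), so the whole pair fits in one uint16_t at the cost of the low 3 bits of each coordinate.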
Why does intarray contain an odd number of elements?
I would have expected your input to look like this:
Note 1: Keep in mind that the code I wrote earlier assumes that the input range is [0; 1920) for the x axis and [0; 1080) for the y axis. Using values outside these ranges will produce incorrect results, if you use the constants I used.
I tried this test code. It seems to work for smaller values but not if both the x and y coordinates are close to the maximum values. What's wrong with the code?
Decode is the reverse: pull the 16-bit value apart into 2 bytes and divide each by its conversion factor (that is, multiply by the inverse).
unsigned char * cp = (unsigned char *)result;
first = cp[0]/0.23611111111111111111111111111111;
second = cp[1]/0.1328125;
right?
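A minimal sketch of the scale-to-a-byte idea, assuming the factors above are 255/1080 (about 0.2361) and 255/1920 (exactly 0.1328125). It uses shifts instead of the pointer cast so it does not depend on byte order, and the function names are invented for this example:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr double kYScale = 255.0 / 1080.0;  // about 0.2361
constexpr double kXScale = 255.0 / 1920.0;  // exactly 0.1328125

// Scale each coordinate to 0..255 and pack both bytes into one uint16_t.
inline std::uint16_t encode_scaled(std::uint16_t x, std::uint16_t y) {
    assert(x <= 1920 && y <= 1080);
    const auto yb = static_cast<std::uint16_t>(std::lround(y * kYScale));
    const auto xb = static_cast<std::uint16_t>(std::lround(x * kXScale));
    return static_cast<std::uint16_t>(yb | (xb << 8));  // y low byte, x high byte
}

// Undo the scaling; the result is only approximately the original value.
inline void decode_scaled(std::uint16_t packed, std::uint16_t& x, std::uint16_t& y) {
    y = static_cast<std::uint16_t>(std::lround((packed & 0xFF) / kYScale));
    x = static_cast<std::uint16_t>(std::lround((packed >> 8) / kXScale));
}
```

Unlike the divide-by-8 approach, the error here is not a clean multiple: one step of the byte covers about 7.5 pixels in x and about 4.2 pixels in y, so decoded values can be off by a few pixels in either direction.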
edit
the pointer gibberish is a bit much here ... a working example of it
The main problem with this function is that multiplying by 1920 is too wasteful, so the result doesn't always fit in a uint16_t. You should actually scale this value too, by dividing by 8, like you do for the other values.
Another problem is that the code assumes 1919 is the highest value that you will be using. If you want to be able to use 1920 you'll have to make it one bigger.
1920 / 8 + 1 = 241
So, if you replace 1920 with 241 in these functions it should work.
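As a sketch of what that change amounts to, assuming the functions keep the divide-by-8 scheme and that both 1920 and 1080 must be valid inputs (the names here are illustrative, not the poster's code):

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kStep = 8;
constexpr std::uint32_t kCols = 1920 / kStep + 1;  // 241, so x == 1920 is valid
constexpr std::uint32_t kRows = 1080 / kStep + 1;  // 136, so y == 1080 is valid

// The largest code is 135 * 241 + 240 = 32775: more than 15 bits, but still
// within uint16_t.
static_assert((kRows - 1) * kCols + (kCols - 1) <= UINT16_MAX, "fits in uint16_t");
// Multiplying y by the unscaled width would not fit: 1080 * 1920 / 8 = 259200.
static_assert(1080u * 1920u / kStep > UINT16_MAX, "unscaled product overflows");

inline std::uint16_t encode_inclusive(std::uint16_t x, std::uint16_t y) {
    assert(x <= 1920 && y <= 1080);
    return static_cast<std::uint16_t>(x / kStep + (y / kStep) * kCols);
}

inline void decode_inclusive(std::uint16_t n, std::uint16_t& x, std::uint16_t& y) {
    x = static_cast<std::uint16_t>(n % kCols * kStep);
    y = static_cast<std::uint16_t>(n / kCols * kStep);
}
```

With the stride of 241, (1920, 1080) encodes to 32775 and decodes back to (1920, 1080).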
Good eye! I moved the division by 8 because I thought p.second * 1920 / 8 would not be properly reversible by dividing by (1920/8). I forgot the correct expression was p.second / 8 * (1920 / 8).
Thanks a lot, guys. I will be looking at this and evaluating some other strategies. Jonnin suggested exploiting how the numbers are correlated, and I think I will pursue that avenue as well, since I am dealing with two groups of data. The approach here works well for the first group: I haven't done any detailed statistics, but just looking at the numbers I don't see a specific pattern that I could exploit. The second group of data represents vectors whose coordinates arrive in order, representing 16x16 blocks of pixels left to right and top to bottom. I have switched my code for the second group of data for now and got it working, but I am wondering if my implementation is efficient. I will post the question in a different thread.
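For the second group, if every 16x16 block really is present and arrives in strict left-to-right, top-to-bottom order, the coordinates can be recomputed from the block's position in the buffer and need not be stored at all. A rough sketch under those assumptions, with a 1920x1080 frame assumed and all names hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

constexpr std::uint32_t kBlock = 16;
// Assumed frame size; partial blocks at the bottom edge count as a full row.
constexpr std::uint32_t kFrameWidth = 1920;
constexpr std::uint32_t kFrameHeight = 1080;
constexpr std::uint32_t kBlocksPerRow = (kFrameWidth + kBlock - 1) / kBlock;   // 120
constexpr std::uint32_t kBlocksPerCol = (kFrameHeight + kBlock - 1) / kBlock;  // 68

// Top-left pixel (x, y) of the i-th block in raster order.
inline std::pair<std::uint16_t, std::uint16_t> block_origin(std::uint32_t i) {
    assert(i < kBlocksPerRow * kBlocksPerCol);
    return { static_cast<std::uint16_t>(i % kBlocksPerRow * kBlock),
             static_cast<std::uint16_t>(i / kBlocksPerRow * kBlock) };
}
```

If some blocks can be skipped, this no longer holds as written, and an index per block would have to be sent instead.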