Writing to a file (file formats)

I'm trying to make a simple steganography-like program that will change every 50th byte of an image to the value 0xff,

but I'm having no success. I'm not sure where the data section starts and ends in a JPEG file. Right now I am just winging it, and all the program does is render the image unviewable.

I have an image named fighter.jpg in the same folder as the program.

I've tried reading https://en.wikipedia.org/wiki/JPEG_File_Interchange_Format and https://docs.fileformat.com/image/jpeg/

but I still can't work out where the data part starts and ends, and my loop is probably horribly wrong.


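#include <iostream>
#include <fstream>

using namespace std;

int main()
{
    // Open for reading AND writing: a plain ofstream would truncate the file to zero bytes.
    fstream pic("fighter.jpg", ios::in | ios::out | ios::binary);
    if (!pic) {
        cerr << "could not open fighter.jpg\n";
        return 1;
    }

    pic.seekg(0, ios::end); // get file size
    const streamoff sizeBytes = pic.tellg();

    // write() needs a pointer to the byte, not the value 0xff cast to a pointer.
    const char value = static_cast<char>(0xff);
    for (streamoff i = 2000; i < sizeBytes - 3000; i += 50) {
        pic.seekp(i); // move to every 50th byte before overwriting it
        pic.write(&value, 1);
    }
}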
Writing random bytes into any complex file type is just going to give you a corrupted file. JPEG in particular does compression, so even just writing to the compressed pixel data sections could corrupt it. According to the links, the compressed image data is between the "Start of Scan" (SOS) and "End of Image" (EOI) markers, so I would first try to locate those by parsing the file. Here are some tips from an SO post: https://stackoverflow.com/a/1602428
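To make the "locate the markers" idea concrete, here is a rough sketch of how the parsing could go. The names findScanData and ScanRange are made up for illustration. It walks the segments after the FF D8 start marker, assumes each one carries a two-byte big-endian length, and reports the byte range that follows the first SOS header. It only handles the first scan (progressive JPEGs have several) and does not cover every marker type, so treat it as a starting point rather than a full parser.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: [begin, end) offsets of the entropy-coded data.
struct ScanRange {
    bool found;
    std::size_t begin;
    std::size_t end;
};

ScanRange findScanData(const std::vector<std::uint8_t>& jpg)
{
    const ScanRange none = {false, 0, 0};

    // Every JPEG starts with the SOI marker FF D8.
    if (jpg.size() < 4 || jpg[0] != 0xFF || jpg[1] != 0xD8)
        return none;

    std::size_t pos = 2;
    while (pos + 4 <= jpg.size()) {
        if (jpg[pos] != 0xFF)
            return none; // lost sync: expected a marker here
        const std::uint8_t marker = jpg[pos + 1];
        if (marker == 0xFF) { // fill byte before a marker
            ++pos;
            continue;
        }
        // Segment length is big-endian and counts its own two bytes.
        const std::size_t len =
            (static_cast<std::size_t>(jpg[pos + 2]) << 8) | jpg[pos + 3];
        if (len < 2 || pos + 2 + len > jpg.size())
            return none;

        if (marker == 0xDA) { // SOS: compressed data starts after its header
            const std::size_t begin = pos + 2 + len;
            for (std::size_t i = begin; i + 1 < jpg.size(); ++i) {
                if (jpg[i] != 0xFF)
                    continue;
                const std::uint8_t next = jpg[i + 1];
                // FF 00 is a stuffed data byte, FF D0..D7 are restart markers,
                // FF FF is padding: none of these end the scan.
                if (next == 0x00 || next == 0xFF || (next >= 0xD0 && next <= 0xD7))
                    continue;
                const ScanRange range = {true, begin, i};
                return range;
            }
            return none; // no terminating marker found
        }
        pos += 2 + len; // skip to the next segment
    }
    return none;
}
```

Even with the range in hand, the bytes in it are Huffman-coded, so overwriting them still changes the image in unpredictable ways (and an FF you write can be mistaken for a marker).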

What exactly is your goal? To overwrite actual pixel values? Or to modify the compressed data?
Why not use a library to load the image as an array of pixel data, then modify that pixel data and write it back?

Of course, even if you went with the latter approach, JPEG is still tricky since it is lossy, so whatever data you're trying to weave in there will also be subject to loss; my first idea would be to apply some error-correcting codes (coding theory). PNG, on the other hand, is lossless.
The goal would be to overwrite the actual pixel values in increments of 50.

Why not use a library to load the image as an array of pixel data, then modify that pixel data and write it back?


What libraries would be effective for such a task? Also, could you not just do it manually, the way I'm trying to?

thanks
Depends on what you mean by 'manually'? Again, if you are writing to the compressed image data directly, you'll probably just corrupt the compressed data, or you'll modify multiple pixels at once in non-obvious ways, not every 50th.

If you mean 'manually parse and decompress/recompress the JPEG data': It might be fun to write your own JPEG decoder/encoder, but it sounds like a massive time sink if you're just looking for a practical solution.

If you're looking for some libraries, I would suggest the stb headers: stb_image can decode JPEG, and stb_image_write can encode it.
https://github.com/nothings/stb
https://github.com/aleksaro/gloom/wiki/Loading-images-with-stb
https://developpaper.com/introduction-of-simple-and-easy-to-use-image-decoding-library-stb_image/
Thanks Ganado,

I thought the JPEG file format would be a lot simpler than it is. I thought it would just have a header with all of its various metadata, and I assumed that after the header it would just contain the RGB values of each pixel. I couldn't have been more wrong.

Is there any image file format close to what I described above? Even BMP looks complicated from what I can tell, and that is supposed to be one of the easier ones.

to stress the above: jpeg does NOT store images as raw pixel values, and trying to hide a message in one is going to cause trouble. Likewise, hiding a message in a raw RGB array and then converting that to jpg will LOSE most of the hidden message unless you use lossless jpg encoding.

that is really the ONLY way to do this with jpg: convert the jpg to RGB, put your message into it, and save the result in a lossless format. Opening the lossless file and converting it to RGB will then give you back the message. The lossless file will be larger than the original jpeg. If you started from a lossless source such as png, the lossless output will likewise be larger than the usual lossy jpeg of the same image.

honestly it may be best to do it from the inside out.
make a RAW image format tool to do what you want to do, and you can apply the jpeg front and back ends later. RAW format is simply a binary file of RGB bytes (optionally store the dimensions in the first N locations, e.g. two 16-bit ints). Several free image programs can store and load images in this format. Get that working, and worry about the format later.
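a minimal sketch of that RAW idea, assuming the optional header is two little-endian 16-bit ints (width, then height) followed by tightly packed RGB bytes. The names RawImage, saveRaw, loadRaw and stampEveryNth are made up for this example, and the byte order of the header is my assumption; other tools may lay it out differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical container for the RAW layout: dimensions plus RGB bytes.
struct RawImage {
    std::uint16_t width = 0;
    std::uint16_t height = 0;
    std::vector<std::uint8_t> rgb; // width * height * 3 bytes, R G B per pixel
};

// Writes width and height as little-endian 16-bit ints, then the pixel bytes.
bool saveRaw(const std::string& path, const RawImage& img)
{
    if (img.rgb.size() != static_cast<std::size_t>(img.width) * img.height * 3)
        return false;
    std::ofstream out(path, std::ios::binary);
    if (!out)
        return false;
    const std::uint16_t dims[2] = {img.width, img.height};
    for (std::uint16_t d : dims) {
        out.put(static_cast<char>(d & 0xFF));
        out.put(static_cast<char>(d >> 8));
    }
    out.write(reinterpret_cast<const char*>(img.rgb.data()),
              static_cast<std::streamsize>(img.rgb.size()));
    return static_cast<bool>(out);
}

// Reads the 4-byte header, then exactly width * height * 3 pixel bytes.
bool loadRaw(const std::string& path, RawImage& img)
{
    std::ifstream in(path, std::ios::binary);
    unsigned char header[4];
    if (!in.read(reinterpret_cast<char*>(header), 4))
        return false;
    img.width = static_cast<std::uint16_t>(header[0] | (header[1] << 8));
    img.height = static_cast<std::uint16_t>(header[2] | (header[3] << 8));
    img.rgb.resize(static_cast<std::size_t>(img.width) * img.height * 3);
    in.read(reinterpret_cast<char*>(img.rgb.data()),
            static_cast<std::streamsize>(img.rgb.size()));
    return static_cast<std::size_t>(in.gcount()) == img.rgb.size();
}

// Sets bytes 0, step, 2*step, ... to 0xFF (the "every 50th byte" experiment).
void stampEveryNth(std::vector<std::uint8_t>& bytes, std::size_t step)
{
    if (step == 0)
        return;
    for (std::size_t i = 0; i < bytes.size(); i += step)
        bytes[i] = 0xFF;
}
```

usage would be along the lines of loadRaw("in.raw", img), stampEveryNth(img.rgb, 50), saveRaw("out.raw", img). Since nothing is compressed, every byte you change maps to exactly one colour component of one pixel.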
BMP is one of the easier ones. In its common 24-bit form it has uncompressed byte data in BGR order. One of the first C++ programming books I read actually had an example of loading a BMP image and modifying it, although I don't have it now. If you want a challenge that isn't too difficult, writing/reading a BMP image yourself would be reasonable.
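As a rough sketch of what that could look like for the original "every 50th byte" experiment: the 14-byte BMP file header starts with "BM" and stores the offset of the pixel array as a little-endian 32-bit value at bytes 10 to 13, so you can jump straight past the header. The function name stampBmp is made up, and the sketch assumes an uncompressed BMP whose pixel array runs to the end of the file. Note that rows are padded to a multiple of 4 bytes, so some of the bytes it hits may be padding rather than colour data.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Overwrites every `step`-th byte of the pixel array with 0xFF, in place.
// Returns the number of bytes written, or -1 on failure.
long stampBmp(const std::string& path, std::uint32_t step)
{
    // in | out keeps the existing contents; a plain ofstream would truncate.
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f || step == 0)
        return -1;

    unsigned char header[14];
    if (!f.read(reinterpret_cast<char*>(header), sizeof header))
        return -1;
    if (header[0] != 'B' || header[1] != 'M')
        return -1; // not a BMP file

    // Bytes 10..13: little-endian offset from file start to the pixel array.
    const std::uint32_t dataOffset =
        static_cast<std::uint32_t>(header[10]) |
        (static_cast<std::uint32_t>(header[11]) << 8) |
        (static_cast<std::uint32_t>(header[12]) << 16) |
        (static_cast<std::uint32_t>(header[13]) << 24);

    f.seekg(0, std::ios::end);
    const std::streamoff fileSize = f.tellg();

    const char value = static_cast<char>(0xFF);
    long written = 0;
    for (std::streamoff pos = dataOffset; pos < fileSize; pos += step) {
        f.seekp(pos);         // position the write head on the target byte
        f.write(&value, 1);   // overwrite exactly one byte
        ++written;
    }
    f.flush();
    return f ? written : -1;
}
```

Calling stampBmp("fighter.bmp", 50) on a copy of the image (the file name is just an example) should leave it viewable, with a sprinkling of bright colour components.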

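Even easier than that might be something like "PPM" (one of the Netpbm formats), where you can optionally write everything in simple plaintext ASCII. The example below is the plain black-and-white variant (P1) from the same family.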
https://en.wikipedia.org/wiki/Netpbm
P1
# This is an example bitmap of the letter "J"
6 10
0 0 0 0 1 0
0 0 0 0 1 0
0 0 0 0 1 0
0 0 0 0 1 0
0 0 0 0 1 0
0 0 0 0 1 0
1 0 0 0 1 0
0 1 1 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
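The colour version works the same way. As a small sketch (toPlainPpm is a made-up name), this builds the text of a plain PPM, which uses the magic number P3, then width and height, then the maximum colour value, then the R G B values as ASCII decimals. I write one pixel per line here just to keep lines short.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the text of a plain (P3) PPM, or an empty string if the buffer
// size does not match width * height * 3.
std::string toPlainPpm(std::size_t width, std::size_t height,
                       const std::vector<unsigned char>& rgb)
{
    if (rgb.size() != width * height * 3)
        return std::string();

    std::ostringstream out;
    out << "P3\n" << width << ' ' << height << "\n255\n";
    for (std::size_t i = 0; i < rgb.size(); i += 3) {
        // Cast to int so the values print as numbers, not characters.
        out << static_cast<int>(rgb[i]) << ' '
            << static_cast<int>(rgb[i + 1]) << ' '
            << static_cast<int>(rgb[i + 2]) << '\n';
    }
    return out.str();
}
```

Write the returned string to a .ppm file and most image viewers that understand Netpbm should open it; you can also edit the numbers by hand in a text editor.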

Thanks Ganado, that actually sounds like a fun experiment for today :) I will give it a shot and update you with any problems I encounter and the final result

Thanks Jonnin, that also sounds like a pretty fun yet complicated project but I'll give that a shot after I try to modify the easier BMP file
I always used uncompressed tga, which is similar to the above (note that tga usually stores pixels in BGR order, like bmp). Order doesn't matter, though, if you are doing every Nth byte. Trying to remember.. I think the eye is least sensitive to blue and most sensitive to green, so tampering with the low-order bits of the blue channel on every 10th pixel or so barely touches the image?
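a rough sketch of that low-order-nibble trick on a plain RGB byte array, with the channel and pixel spacing left as parameters so you can experiment. embedNibbles and extractNibbles are made-up names, and the layout (high nibble of each message byte first, two carrier pixels per byte) is just one choice I picked for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// True if 2 * length carrier pixels, spaced `stride` apart, fit in the buffer.
static bool fits(std::size_t rgbSize, std::size_t length,
                 std::size_t channel, std::size_t stride)
{
    const std::size_t pixels = rgbSize / 3;
    const std::size_t needed = length * 2;
    if (channel > 2 || stride == 0)
        return false;
    return needed == 0 || (needed - 1) * stride < pixels;
}

// Stores each message byte as two nibbles in the low 4 bits of `channel`
// (0 = R, 1 = G, 2 = B) of every `stride`-th pixel.
bool embedNibbles(std::vector<std::uint8_t>& rgb, const std::string& msg,
                  std::size_t channel, std::size_t stride)
{
    if (!fits(rgb.size(), msg.size(), channel, stride))
        return false;
    for (std::size_t k = 0; k < msg.size() * 2; ++k) {
        const std::uint8_t byte = static_cast<std::uint8_t>(msg[k / 2]);
        const std::uint8_t nibble = static_cast<std::uint8_t>(
            (k % 2 == 0) ? (byte >> 4) : (byte & 0x0F));
        std::uint8_t& target = rgb[k * stride * 3 + channel];
        target = static_cast<std::uint8_t>((target & 0xF0) | nibble);
    }
    return true;
}

// Reads `length` bytes back; returns an empty string if they cannot fit.
std::string extractNibbles(const std::vector<std::uint8_t>& rgb,
                           std::size_t length, std::size_t channel,
                           std::size_t stride)
{
    std::string msg;
    if (!fits(rgb.size(), length, channel, stride))
        return msg;
    for (std::size_t i = 0; i < length; ++i) {
        const int hi = rgb[(2 * i) * stride * 3 + channel] & 0x0F;
        const int lo = rgb[(2 * i + 1) * stride * 3 + channel] & 0x0F;
        msg.push_back(static_cast<char>((hi << 4) | lo));
    }
    return msg;
}
```

this only survives if the pixels are saved losslessly (raw, bmp, tga, png); a lossy jpeg save will scramble the low bits. Using fewer bits per pixel (1 or 2 instead of 4) makes the change harder to see at the cost of needing more carrier pixels.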


Thanks Jonnin, that also sounds like a pretty fun yet complicated project but I'll give that a shot after I try to modify the easier BMP file


there isn't any point; the raw idea is similar to the bmp idea, exactly the same thing except bmp has more fluff in its header before the data starts.