16

I was reading an article explaining How to Hide Files in JPEG Pictures.

I am wondering how it's possible for a file to contain both jpeg data and a rar file without any visible distortion either to the image or to the compressed file.

My guess is that it has something to do with how either the compressed file or the jpeg file is represented in binary form, but I have no idea how this works.

Can someone elaborate on that?

Luke
yasar

4 Answers

12

All that technique does is append the archive to the end of a JPEG stream. You then hope your JPEG decoder stops at the EOI marker instead of reading past it, finding the extra data, and reporting an error.

A JPEG image is a stream of bytes starting with an SOI marker (FF D8) and ending with an EOI marker (FF D9).

ZIP and RAR files are streams of bytes too. A ZIP stream starts with 50 4B. A RAR stream starts with 52 61 72 21 1A 07.

The method described in the link above makes a binary copy of a JPEG stream and appends a ZIP or RAR stream to it.

The RAR/ZIP decoders scan the stream until they find the signature for RAR or ZIP (ignoring the JPEG stream).
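The concatenation can be tried out in a few lines of Python. This is only a sketch under stated assumptions: `fake_jpeg` is a run of placeholder bytes between the two markers, not a decodable image, and `secret.txt` is a made-up member name. Python's `zipfile` module happens to locate the archive through the record at the end of the data rather than by scanning forward, but the effect is the same: the bytes in front of the archive are ignored.

```python
import io
import zipfile

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker

# Placeholder bytes standing in for a JPEG stream (assumption: not a real image).
fake_jpeg = SOI + b"\x00" * 32 + EOI

# Build a small ZIP archive in memory; "secret.txt" is a hypothetical name.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("secret.txt", "hidden message")
zip_bytes = zip_buffer.getvalue()

# Binary concatenation: image first, archive second.
combined = fake_jpeg + zip_bytes

# A decoder that stops at EOI only ever looks at this prefix.
image_part = combined[: combined.index(EOI) + len(EOI)]

# The ZIP signature (50 4B 03 04) sits right after the image bytes.
zip_offset = combined.find(b"PK\x03\x04")

# The archive reader still opens the combined bytes.
with zipfile.ZipFile(io.BytesIO(combined)) as archive:
    names = archive.namelist()
    recovered = archive.read("secret.txt")
```

Neither half is modified, which is why the image and the archive both stay intact; the only visible side effect is that the file is larger than the image alone.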

user3344003
  • 18,590
  • 3
  • 22
  • 52
  • And how does the archive program skip the JPEG data to get to the archive? How does it know where the JPEG ends and the archive data starts? – yasar Apr 14 '15 at 22:52
  • It must look for its own signature in the data stream. Presumably this trick would work with most binary file types. So it is probably not scanning the JPEG and looking for the EOI marker. – user3344003 Apr 15 '15 at 00:01
3

This answer does not address the exact case in the link you gave, but it provides another way of hiding data:

It would also be theoretically possible to hide a file within the JPEG picture itself, but you would need a complicated program to write the encoded data and then read it again.

Basically, a JPEG photograph contains a lot of information which, if it changed, would not be noticeable to the human eye. Imagine you have a photo of a person in a blue shirt. If you zoom in on that shirt you will see that it is not an even blue colour, but made up of a multitude of flecks of colour, most of which are a bluish tone (but some could be other colours as well). You could easily change some of those flecks to a slightly different tone and it would make no obvious visible difference to the picture.

A clever program could embed a code in the photo by subtly changing pixels to a pattern that represents data. A very simple example: if the "hue" (i.e. colour tone) is represented by a number between 0 and 255, pixels of an even hue could represent a "0" bit and pixels of odd hue a "1" bit. It would be hard for the human eye to detect such a difference in the picture.
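The even/odd idea can be sketched as follows. The function names and the sample `hues` list are made up for illustration, and the code works on plain numbers rather than on a real JPEG file; a lossy re-encode would generally not preserve these parities, so an actual JPEG tool would have to embed the bits at a different stage.

```python
def embed_bits(values, bits):
    """Return a copy of values whose parity spells out bits (even = 0, odd = 1)."""
    out = list(values)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & ~1) | bit
    return out


def read_bits(values, count):
    """Recover the first count payload bits from the parity of each value."""
    return [v & 1 for v in values[:count]]


# Hypothetical hue values in the range 0..255 and an 8-bit message.
hues = [200, 201, 199, 202, 198, 203, 200, 201]
message = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed_bits(hues, message)
largest_change = max(abs(a - b) for a, b in zip(hues, stego))
```

Each value moves by at most one step, which is the reason the change is hard to see.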

It is an old idea and this article discusses how much data could be hidden in this way: High capacity data hiding in JPEG-compressed images (2004)

rghome
  • 7,212
  • 8
  • 34
  • 53
1

In general, hiding a file within another file is a practice known as steganography. The method described in the link you provided simply concatenates the .rar to the end of the .jpg using the + operator, taking advantage of the different headers of each file type. @user3344003 does an excellent job of explaining why this works in their answer. This doesn't distort the image because the image data is left unaltered.

Another common method of hiding a file within an image is to use the least significant bit (LSB) of each byte. Every 8th bit in the image's bitstream is replaced with the next bit of the file you wish to hide. This works because the image's colors can be distorted slightly without being easily perceived by the human eye. With this approach, the image's size on disk does not grow as it does with the method from your link, which makes the hidden file much harder to detect. For a detailed look at this and other steganographic methods, see this paper by Bret Dunbar.
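A minimal sketch of LSB hiding and recovery is shown here. It assumes the carrier is a buffer of raw, uncompressed sample bytes, and it stores a 4-byte length prefix so the reader knows where the payload ends; that prefix, the function names `hide` and `reveal`, and the stand-in `pixels` buffer are choices made for this example, not part of any standard.

```python
def hide(carrier: bytes, payload: bytes) -> bytes:
    """Write payload (with a 4-byte length prefix) into the LSBs of carrier."""
    data = len(payload).to_bytes(4, "big") + payload
    # Most significant bit of each payload byte goes first.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for payload")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def reveal(carrier: bytes) -> bytes:
    """Read the length prefix, then that many payload bytes, from the LSBs."""
    def read_bytes(start: int, count: int) -> bytes:
        result = bytearray()
        for n in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (carrier[start + n * 8 + k] & 1)
            result.append(value)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)  # payload starts after the 32 prefix bits


pixels = bytes(range(256)) * 2  # 512 stand-in sample bytes
stego = hide(pixels, b"hi there")
```

The carrier keeps its exact length, and one carrier byte is needed for every hidden bit, so the capacity is roughly one eighth of the carrier size.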

Community
Luke
0

There is a simple algorithm that I implemented in MATLAB. If you split each 8-bit pixel of your image into its individual bits, the most significant bit carries the most valuable information, and you can discard bit 0 and bit 1 without any visible change to the original image. So you can store your file's bits in place of bits 0 and 1. I saw this algorithm in Anil K. Jain's book.
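The two-bit variant might look like this in Python (a sketch, not the MATLAB code mentioned above; the helper names and the sample `gray` values are hypothetical). One payload byte is split into four 2-bit chunks, and each chunk replaces bits 0 and 1 of one pixel.

```python
def embed_two_bits(pixel: int, two_bits: int) -> int:
    """Replace bits 0 and 1 of an 8-bit pixel with two payload bits."""
    return (pixel & 0b11111100) | (two_bits & 0b11)


def extract_two_bits(pixel: int) -> int:
    """Read back bits 0 and 1 of a pixel."""
    return pixel & 0b11


def hide_byte(pixels, byte):
    """Spread one payload byte over four pixels, highest bit pair first."""
    chunks = [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]
    return [embed_two_bits(p, c) for p, c in zip(pixels, chunks)]


def reveal_byte(pixels):
    """Reassemble one payload byte from the low bits of four pixels."""
    value = 0
    for p in pixels[:4]:
        value = (value << 2) | extract_two_bits(p)
    return value


gray = [120, 121, 119, 122]  # hypothetical grayscale values
stego = hide_byte(gray, ord("A"))
largest_change = max(abs(a - b) for a, b in zip(gray, stego))
```

Using two bits doubles the capacity compared with a single LSB, at the cost of a change of up to 3 levels per pixel instead of 1.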

Sara Santana
  • 931
  • 1
  • 10
  • 19