Wednesday, April 22, 2009

More than meets the eye

I recently encountered a JPG image that displays as a normal picture in image viewers, internet browsers, and the like. However, opening it in WinRar reveals that it also contains a data archive.


One of the files in the archive is a text file containing instructions on how to create such an image/archive combination, along with a batch script that will do everything for you. Reading the batch script reveals that the combination is performed with the "copy" command (I am a Windows user, so I will refer to DOS commands) given not one but two source files: the jpeg and the rar archive. When the copy command is given more than one source file, joined with plus signs, it concatenates the data of all the source files in the order specified and writes the result to the destination file; if no destination is specified, the result is stored under the name of the first source file. The '/b' option must also be used to tell the copy command to treat the data as binary rather than ASCII, since we are not dealing with text files.
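For anyone not on Windows, here is a minimal Python sketch of the same binary concatenation, roughly what a command like copy /b picture.jpg + archive.rar combined.jpg does. The function name and the file names are my own placeholders, not taken from the batch script.

```python
from pathlib import Path


def concat_binary(sources, destination):
    """Join the raw bytes of every source file, in order, into destination.

    All sources are read before anything is written, so the destination
    may safely be the same path as the first source.
    """
    data = b"".join(Path(src).read_bytes() for src in sources)
    Path(destination).write_bytes(data)
    return len(data)  # total number of bytes written
```

Calling concat_binary(["picture.jpg", "archive.rar"], "combined.jpg") would produce the combined file, with the jpeg bytes first and the rar bytes second.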

It is important to specify the jpeg as the first file and the rar archive as the second; otherwise the image will not render. The reason is that jpeg data is divided into a series of segments, each of which begins with a marker. A jpeg file starts with a Start of Image marker, and if that marker is not found at the beginning of the file (i.e. there is rar data before the jpeg data), the image is treated as corrupt and cannot be rendered. Likewise, the jpeg data ends with an End of Image marker, and image viewers disregard any data after it.
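To illustrate why the order matters, here is a small Python sketch that checks whether some data begins with the Start of Image marker (the two bytes FF D8). The byte strings are toy stand-ins I made up, not real image or archive data.

```python
JPEG_SOI = b"\xff\xd8"  # Start of Image marker
JPEG_EOI = b"\xff\xd9"  # End of Image marker


def starts_with_soi(data: bytes) -> bool:
    """True if the data begins with the jpeg Start of Image marker."""
    return data.startswith(JPEG_SOI)


# Toy stand-ins for a real jpeg and a real rar archive (assumed contents).
fake_jpeg = JPEG_SOI + b"...image segments..." + JPEG_EOI
fake_rar = b"Rar!\x1a\x07\x00" + b"...archive data..."

print(starts_with_soi(fake_jpeg + fake_rar))  # True: jpeg first
print(starts_with_soi(fake_rar + fake_jpeg))  # False: rar data comes first
```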

The rar file similarly begins with a header marker and ends with an ending marker. However, as I understand it, unlike image viewers reading a jpeg, WinRar searches the entire file for the rar header marker, so this marker does not need to be located at the beginning of the file, which allows us to put the jpeg data there instead. So, when the file is opened in an image viewer, the viewer sees the jpeg header at the beginning of the file and renders the jpeg image. When the file is opened in WinRar, WinRar searches for and finds the rar header, and then displays the rar data that follows it.
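The search described above can be sketched in Python as a scan for the rar signature bytes. This is only an illustration of the idea, assuming the whole file is searched; I have not checked how WinRar actually implements it. The 7-byte signature is the one used by rar archives of this era, and the 8-byte one belongs to the newer RAR 5 format.

```python
RAR4_SIGNATURE = b"Rar!\x1a\x07\x00"      # older rar archives (up to 4.x)
RAR5_SIGNATURE = b"Rar!\x1a\x07\x01\x00"  # RAR 5.0 and later


def find_rar_offset(data: bytes) -> int:
    """Return the offset of the first rar signature, or -1 if there is none."""
    hits = [data.find(sig) for sig in (RAR4_SIGNATURE, RAR5_SIGNATURE)]
    hits = [offset for offset in hits if offset != -1]
    return min(hits) if hits else -1


# Toy data (assumed): some image bytes followed by the start of an archive.
image_part = b"\xff\xd8...image segments...\xff\xd9"
combined = image_part + RAR4_SIGNATURE + b"...archive data..."

# The archive is found right where the image data ends.
print(find_rar_offset(combined) == len(image_part))  # True
```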

This method of combination should also imply that more than just a jpeg file can be combined with a rar file: any other file should work in place of the jpeg, except for another rar file. And indeed, I was able to successfully use a png image instead of a jpeg. Using a wav audio file instead of an image file was successful as well. When combining with an mp3, the rar archive could not be opened, and switching the order of the files made the mp3 unplayable. An avi file had the same problem as the mp3. When combining an mpg with a rar, the rar archive could not be opened, but switching the order and combining the rar with the mpg let both the archive and the video open.

Realizing that this means that both an mpg and a rar file can be the second source file in the combination made me consider attempting a triple combination. Would combining a jpeg with a rar and an mpg be successful? It sure is. Opening the file in the Windows Picture And Fax Viewer displayed the jpeg, opening it with WinRar displayed the rar archive, and opening it with Windows Media Player played the mpg.

I do not know why mp3s and avis don't work, or why for a rar/mpg combination the rar must be the first source file. My guess is that the mp3 and avi have the same starting marker as the rar file, so when their data comes before the rar, WinRar finds the incorrect starting marker and fails to open an archive, and vice versa when the rar data comes before the mp3/avi data. For the mpg, I would assume that the starting markers are distinct, but that the mpg data contains somewhere within it a marker that is the same as the rar's starting marker. In this way, when the mpg comes before the rar, WinRar finds the incorrect starting marker for the rar, yet when the rar comes before the mpg, the mpg starting marker is distinct and is not seen anywhere else in the file, so video players can open the mpg. Do keep in mind that these are mere speculations and not based on any research into the file structures.
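These guesses could be put to the test by looking at the raw bytes. The Python sketch below reports the first few bytes of some data and every position at which the rar marker occurs inside it. The helper names are hypothetical, and the 6-byte pattern is the part of the rar signature shared by old and new archive versions. I have not run it against real mp3, avi, or mpg files, so it says nothing yet about whether the speculation holds.

```python
RAR_MAGIC = b"Rar!\x1a\x07"  # prefix common to rar signatures


def all_offsets(data: bytes, pattern: bytes) -> list:
    """Return every offset at which pattern occurs in data."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def describe(data: bytes) -> dict:
    """Summarise the leading bytes and any rar markers found in data."""
    return {
        "leading_bytes": data[:8],
        "rar_magic_offsets": all_offsets(data, RAR_MAGIC),
    }
```

Running describe(open("song.mp3", "rb").read()) on a real file (the name is a placeholder) would show whether it starts with the same bytes as a rar archive, and an empty rar_magic_offsets list would mean the file contains no stray rar marker.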

Enjoy your newfound ability to hide files within files, and do keep in mind that a jpeg will look a little conspicuous if it is 30 MB in size.