FILE CORRUPTION FOR THE VISUALLY-ORIENTED, or

OFF-LABEL USES FOR NETPBM

Version G5L.1232

One thing that we can do with Netpbm that's a lot of fun is artful corruption of binary data. I used the following method to corrupt a video file, but it will work for any data.

Let's say you've got a small video file called "video_file.mkv" that is 5432101 bytes long. You can very easily turn it into an image that you can edit using GIMP or Krita or any other image editing software by simply grafting a Netpbm header onto the top of the file. In this example we will use the PPM format specifically, for a full-color image with 8-bit RGB pixels.

Before we do anything, we want to look at the hex values of the first few bytes of the file we'll be corrupting. Let's use hexdump to read the hex values for the first 16 bytes:

$ hexdump -Cn 16 video_file.mkv 
00000000  1a 45 df a3 01 00 00 00  00 00 00 23 42 86 81 01  |.E.........#B...|
00000010

The "00000000" is an index that tells us what part of the file we're looking at. Obviously, we're at the very beginning. The part that starts with "1a 45 df a3 ..." is the actual file data, and we'll want to remember those first few bytes later.
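If you'd rather not copy those bytes by hand, the shell can write them down for you in the \x form that sed will want later. This is only a sketch: the file demo.bin and the variable names are made up for illustration, and demo.bin is a tiny stand-in that starts with the same bytes as the hex dump above. It assumes your head accepts -c. The hex digits come out lowercase, which sed accepts just the same.

```shell
# Stand-in for the real video: a tiny file that begins with the same bytes
# as the hex dump above. For a real file, point "file" at video_file.mkv.
file=demo.bin
printf '\032\105\337\243\001\000\000\000' > "$file"

# Dump the first four bytes as hex, then prefix each one with \x.
# The inner command substitution is left unquoted on purpose so that each
# byte becomes its own argument to printf.
pattern=$(printf '\\x%s' $(head -c 4 "$file" | od -An -tx1))

printf '%s\n' "$pattern"
```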

Now, let's concern ourselves with creating a PPM image header to tack onto our video file. The PPM header format is very simple. It can just be an ASCII string starting with the characters "P6" followed by a newline, then the image width and height (in pixels) separated by whitespace and followed by a newline, then the maximum value of each color component, again followed by a newline, like this:

P6
[width] [height]
[pixel_depth]

To determine the width and height to use in your header, divide your file's size (in bytes) by 3 (each pixel will require three bytes, remember), then find the square root. For our example video_file.mkv: √(5432101 ÷ 3) = 1345.6226563689142. We can only use integer dimensions though, so discard the fractional part. Don't round up: a 1346 x 1346 image would need 5435148 bytes, which is more than the file has. Our image will be 1345 x 1345 pixels.
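The arithmetic above can be done in the shell, too. This is a minimal sketch that assumes awk is around for the square root; the variable names are just for illustration. It also works out how many bytes at the end of the file won't fit into the square image. An editor will most likely drop those when it saves, so expect the result to come out slightly shorter than the original.

```shell
# Size of the example file in bytes.
# For a real file, use: size=$(wc -c < video_file.mkv)
size=5432101

# Largest square of 3-byte pixels that fits: floor(sqrt(size / 3))
side=$(awk -v n="$size" 'BEGIN { print int(sqrt(n / 3)) }')

# Bytes past the end of the image data
leftover=$((size - side * side * 3))

echo "$side x $side pixels, $leftover bytes left over"
```

For the example file this prints "1345 x 1345 pixels, 5026 bytes left over".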

For the pixel_depth, or maximum value of each color component, just use 255 for the typical eight-bits-per-color RGB format.

So, our PPM header will look like this:

P6
1345 1345
255

Very simple. Now, write that to a file by issuing:

$ echo -e "P6\n1345 1345\n255" > header

It's important that the header ends with a newline character. The echo tool appends a newline by default, but if you decide to edit the header or create it using a different tool, make sure there's a newline after the "255".
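If your shell's echo doesn't understand -e, printf is one alternative, and it makes the trailing newline explicit. The sketch below writes the same header and then lets you check it byte by byte; the header_size variable is just for illustration.

```shell
# Same header as above, written with printf; the trailing \n is explicit.
printf 'P6\n%d %d\n255\n' 1345 1345 > header

# Show the header character by character; the last one listed should be \n.
od -c header

# "P6\n" + "1345 1345\n" + "255\n" = 3 + 10 + 4 = 17 bytes
header_size=$(($(wc -c < header)))
printf 'header is %d bytes\n' "$header_size"
```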

Next, stick the header onto your video file:

$ cat header video_file.mkv > video_file_image.ppm
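Before opening the result in an editor, it can be reassuring to confirm that everything after the header is still byte-for-byte the original data. Here is a small sketch of that check using tiny stand-in files; the demo_ file names and the variables are made up for illustration, and with real files you would use header, video_file.mkv and video_file_image.ppm instead.

```shell
# Tiny stand-ins for the header and the video data.
printf 'P6\n2 2\n255\n' > demo_header
printf 'ABCDEFGHIJKL' > demo_data
cat demo_header demo_data > demo_image.ppm

# Skip the header (tail -c +N starts output at byte N) and compare
# what is left against the original data.
header_len=$(($(wc -c < demo_header)))
if tail -c +"$((header_len + 1))" demo_image.ppm | cmp -s - demo_data; then
    status=intact
else
    status=differs
fi
echo "payload $status"
```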

Now you can open video_file_image.ppm in your favourite graphics editor and mangle it. One thing to keep in mind is that video files will produce interesting glitches just by changing the occasional byte or two, but changing a lot of bytes will make them unplayable.

After you've tweaked a few pixels, save the image (as a raw, binary PPM rather than an ASCII one, if your editor asks) and then remove the PPM header. How do we do that? Well, this is why we did that hex dump earlier. Your graphics editor will probably change the format of your PPM header (GIMP certainly does), so we don't know exactly how much of the top of the file we need to remove. We do know what the beginning of the video portion of the file looks like, though, so we can use sed to erase everything before a given sequence of bytes:

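$ LC_ALL=C sed -ze 's/^.*\(\x1A\x45\xDF\xA3\)/\1/' modified_video_file_image.ppm > corrupted_video_file.mkv

Did you catch that? Mind the part with "\x1A\x45\xDF\xA3". Those are the first four bytes of the video file: 1A 45 DF A3, remember? I used four bytes here; in some cases two would be enough, and in other cases you might need more than four, but in most cases four bytes should be enough of a pattern for sed to match. The LC_ALL=C in front tells sed to treat the file as raw bytes rather than as text in your locale's encoding, and -z (a GNU sed option) makes it split its input on NUL bytes instead of newlines, so the header and the start of the video end up in the same "line".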
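There is also a way to strip the header without matching any pattern at all, sketched below. It rests on an assumption: that your editor saved a raw (P6) 8-bit image with the same dimensions you gave it and wrote nothing after the pixel data. If so, the pixel data is exactly the last width x height x 3 bytes of the file, whatever the header looks like now. The demo_ file names and the variables are made up for illustration.

```shell
# Stand-in for an edited image: a 2 x 2 PPM whose header gained a comment
# line, the way an editor might rewrite it. The pixel data is 12 bytes.
printf 'P6\n# written by some editor\n2 2\n255\n' > demo_edited.ppm
printf 'ABCDEFGHIJKL' >> demo_edited.ppm

width=2    # for the example in the text: 1345
height=2   # for the example in the text: 1345

# Keep only the last width * height * 3 bytes, i.e. the pixel data.
tail -c "$((width * height * 3))" demo_edited.ppm > demo_recovered.bin

cat demo_recovered.bin; echo
```

With the real file, that would be tail -c "$((1345 * 1345 * 3))" modified_video_file_image.ppm > corrupted_video_file.mkv.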

And there you have it! If you haven't destroyed the bitstream too badly, you should be able to open that file up in your video player and enjoy its glitchy goodness. Before you can use your glitched file in any kind of nonlinear video editor, you'll probably want to render those glitches to a non-corrupt file using something like avconv, because the file that you corrupted might play okay in VLC or mplayer, but an NLVE will choke on an error-ridden file when you try to work with it.

—LÆMEUR <adam@laemeur.com>