March 4th, 2009

The Past Didn't Go Anywhere

Why back in my day we walked uphill both ways through slushing sleeting flaming hydrogenated bat oil without iPhones and we liked it. Youngsters today have no [INSERT NOUN HERE].

Okay, so sometimes the old ways are ridiculous. Entertaining to read about, but ridiculous. We no longer have to write video games in 2K of ROM (English translation: considerably less than a printed page's worth of information). Better yet, we're not stuck playing them.

But sometimes, insane constraints produce beautiful solutions.








These days most web sites employ dynamically generated graphics, even if it's something as simple as generating and displaying thumbnails of a photo. This is typically done with tools like the gd library, which I originated, or ImageMagick.

Both of these tools assume you have a lot of memory to play with. This is a reasonable assumption for two reasons:

1. It's 2009. You do have a lot of memory to play with.
2. Both tools let you draw things on the fly on top of an image (creating charts and graphs, for instance) and then save it again. The sensible way to do that is to load the entire image into memory as a nice, convenient two-dimensional grid of pixels.

Trouble is, that two-dimensional grid of pixels can be pretty darn big. Sometimes even by modern standards.

That ten-megapixel camera you just bought? It creates images that take up ten megapixels times four bytes equals roughly forty million bytes of memory when you load them up with the gd library (usually via PHP's imagecreatefromjpeg function).
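For the skeptical, here is that arithmetic as a quick shell sketch. The four-bytes-per-pixel figure comes from the paragraph above; the 32 MiB limit is a made-up number for illustration, not a PHP default.

```shell
# Back-of-the-envelope estimate of the memory a decoded image needs.
pixels=10000000            # ten megapixels
bytes_per_pixel=4          # figure used in the text above
bytes=$((pixels * bytes_per_pixel))
mib=$((bytes / 1048576))   # integer division, rounds down

limit_mib=32               # hypothetical per-request limit, illustration only

echo "decoded size: $bytes bytes (about $mib MiB)"
if [ "$mib" -gt "$limit_mib" ]; then
  echo "that is over a $limit_mib MiB limit"
fi
```

Forty million bytes works out to about 38 MiB, for a single image, before your script has done anything else.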

Is that a dealbreaker? Well, your server probably has at least 2GB of RAM in it, and virtual memory besides. But that's not the whole story. Most sites are deliberately set up with a much tighter limit (PHP's memory_limit setting) on the amount of memory that a single PHP request can use up. And when dozens of people are uploading huge photos at the same time... well, there are good reasons to have those limitations.

So we're back to the old days. That damn red dragon is going to eat us if we don't find the sword to kill the memory monster.

Enter the netpbm library: a collection of sweet, simple command line tools for playing with images. A collection of tools that dates back to 1988... an age when you might just barely have enough memory in your computer to hold one half-megabyte, low-resolution photo.

So these tools don't load the whole picture into memory at once. They load it one line at a time, or as many lines as they need to do the job in question, and they deal with those lines and pass them on to the next tool in the chain. You can read, resize, and re-save an image without using prodigious amounts of memory.
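To make the row-at-a-time idea concrete, here is a toy filter in plain shell. It is only a sketch of the principle, not how the netpbm tools are actually implemented. The function name invert_pbm_rows is made up, and it assumes the simplest possible input: a plain-format (P1) bitmap with one image row per text line and no comment lines.

```shell
# Invert a plain-format (P1) bitmap from stdin to stdout, one row at a time.
# Assumption: one image row per text line and no comment lines. Real netpbm
# files may be laid out more loosely than that.
invert_pbm_rows() {
  read -r magic || return 1
  [ "$magic" = "P1" ] || { echo "not a plain PBM" >&2; return 1; }
  read -r width height || return 1

  # Pass the header through unchanged.
  printf '%s\n%s %s\n' "$magic" "$width" "$height"

  # Only the current row is held in memory; each row is written out
  # before the next one is read.
  while IFS= read -r row || [ -n "$row" ]; do
    printf '%s\n' "$row" | tr '01' '10'
  done
}

# Usage: a 4x2 bitmap goes in, its inverse comes out.
printf 'P1\n4 2\n0 1 0 1\n1 1 0 0\n' | invert_pbm_rows
```

Starting a tr process for every row is slow, but the shape is the point: memory use depends on the width of a row, not on the height of the image.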

The netpbm tools are part of the Unix tradition, and they take advantage of the Unix concept of pipelines, which allow you to "pipe" the output of one program into another, and pipe that on to a third, and on and on. And that allows for some extremely elegant things:

anytopnm < uploadedfile.bmp | pnmscale -xsize 600 | pnmtojpeg > outputfile.jpg

That pipeline accepts a file in almost any format, scales it so that it is 600 pixels wide (preserving the aspect ratio), and converts it to a JPEG without ever loading the entire beast into memory at a single go. It loads just the rows it needs to see as it proceeds through the file. This isn't suitable for everything, but for scaling, cropping and conversion it's fan-freakin-tastic.
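If you would rather not hard-code the filenames and width, the pipeline above can be wrapped in a small shell function. This is a sketch under assumptions: the name scale_to_jpeg, the argument order, and the exit codes are all invented here and are not part of netpbm.

```shell
# scale_to_jpeg SRC DST WIDTH
# Hypothetical wrapper around the pipeline above. The function name and
# exit codes are illustrative only. POSIX sh has no local variables, so
# src, dst, width, tool and tmp are global.
scale_to_jpeg() {
  src=$1 dst=$2 width=$3

  [ -r "$src" ] || { echo "cannot read $src" >&2; return 2; }
  case $width in
    ''|*[!0-9]*) echo "width must be a whole number" >&2; return 2 ;;
  esac

  # Fail early if any stage of the pipeline is not installed.
  for tool in anytopnm pnmscale pnmtojpeg; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing $tool" >&2; return 127; }
  done

  # Write to a temporary file so a failed run leaves no half-written JPEG.
  # The status tested here is that of pnmtojpeg, the last stage.
  tmp="$dst.tmp.$$"
  if anytopnm < "$src" | pnmscale -xsize "$width" | pnmtojpeg > "$tmp"; then
    mv "$tmp" "$dst"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Calling scale_to_jpeg uploadedfile.bmp outputfile.jpg 600 does the same job as the one-liner, and a script that invokes it can check the exit status to tell whether the conversion worked.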

You can call that from PHP, using the system() function. But if you're working with the Symfony framework, you might not need to. I've wrapped up some common netpbm operations in pkImageConverterPlugin. In fact, you should be able to use that plugin with non-Symfony PHP code, though I haven't tried that myself.

You won't be able to conveniently use it on a Windows host, though. The netpbm utilities exist for Windows but the Windows command shell is one of my least favorite things, so you'd have to make some minor changes to get my code working with it.

Macs, on the other hand, are Unix-based and will work just fine with netpbm and pkImageConverterPlugin, once you install netpbm via MacPorts or Fink.

In fairness I should note that ImageMagick supports caching images on disk rather than representing every pixel in memory. That amounts to simulating virtual memory within your program, which might sneak you past the PHP memory limit. But this is not as elegant or as fast as a set of tools designed for row-by-row operation from the get-go.
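For comparison, here is roughly what that looks like from the shell, using ImageMagick's -limit option to cap the in-memory pixel cache so that larger images spill over to disk. Treat it as a sketch: the function name im_scale_capped is made up, the 32MiB and 64MiB figures are arbitrary examples, and it assumes the classic convert command is the one on your path.

```shell
# im_scale_capped SRC DST WIDTH
# Hypothetical helper: resize with ImageMagick while capping how much of
# the pixel cache may live in RAM. The limits below are arbitrary examples.
im_scale_capped() {
  src=$1 dst=$2 width=$3

  [ -r "$src" ] || { echo "cannot read $src" >&2; return 2; }
  command -v convert >/dev/null 2>&1 || { echo "missing convert" >&2; return 127; }

  # -limit memory caps heap use and -limit map caps memory-mapped use;
  # pixels beyond those limits are cached on disk.
  convert -limit memory 32MiB -limit map 64MiB \
    "$src" -resize "${width}x" "$dst"
}
```

It gets the job done within a memory budget, at the cost of disk traffic that the row-by-row tools never need.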

It's good to let it all hang out and explore the possibilities of excess. But sometimes, restrictions are good things. And long after restrictions have been lifted, they can return in new forms and new situations. At which point we're grateful to have the carefully optimized code of an earlier era. Y'know, before everybody had one of these:

