Photo Mosaics
Jan. 27th, 2013 01:37 pm![[personal profile]](https://www.dreamwidth.org/img/silk/identity/user.png)
Recently, a colleague of mine moved to a new office and he used the chance to put some posters on his wall. One was a photo mosaic he made back when the first of the new Star Wars movies (the one with Qui-Gon Jinn) came out. Based on a few hundred scaled-down images, those images where chosen and arranged in a grid to for a larger image from the same movie. Large image could be identified from a distance, but if you would step closer, you could clearly see each small image.
We talked about this and how he implemented the algorithm back then and I got inspired to write an algorithm on my own …
The Math behind Photo Mosaics
The math behind photo mosaics is not very difficult and any computer science major should be able to figure it out. The core mathematical problem behind photo mosaics is the assignment problem, which Wikipedia defines as follows:
There are a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring some cost that may vary depending on the agent-task assignment. It is required to perform all tasks by assigning exactly one agent to each task in such a way that the total cost of the assignment is minimized.
This text can be translated into the photo mosaic problem if you see agents as the set of small image tiles we have at our disposal, tasks the set of positions on the large image, and as the costs as the difference between a small image tile and the part of the large image corresponding to a position. The distance or cost is based on the colors in each of the two images. The largest possible distance would be between a completely black and a completely white image, whereas two images dominated by shades of yellow would be very close (low cost). A ‘low-cost’ mosaic means that for each position, the tile is chosen which has the lowest difference to the position in the original image.
The Implementation
As I wanted to skip to implementation part (to get quick results), I chose to use dlib, a C++ library with a lot of functions including an implementation of one problem variant (maximum cost assignment problem solved by the Hungarian algorithm) and basic image loading, saving, and handling functions. Eventually, the most time-consuming part was to deal with templates and namespaces (enforced by dlib), which took longer than the ‘real’ programming …
Anyhow, I ended up with the implementation as seen on Gitorious. It is a C++ program and its only dependency is dlib. To compile it, you need cmake.
Using the Program
Once compiled, the program needs a number of arguments to generate mosaics:
- -n n
- The number of tiles per axis. The final image will be composed of n×n small images.
- -s f
- The factor at which the tiles are scaled down from their original size. For example, if you would have 16 tiles per axis, the original images were 320×200, at a scale-down factor of 1 the final image would be 5120×3200, at a factor of 8 it would be 640×400.
- -i filename
- The original image you are basing the mosaic on.
- -o filename
- Filename of the final mosaic image.
- one or several directories
- Directories where .png files are read from and to be used as tiles.
Example 1
You can get the original images used to create the Big Buck Bunny short film from their webpage. Of course, you won't need all 14315 and you want to skip the end credits. A simple shell script as the following will download you a good selection of images:
for (( n = 50 ; n < 11750 ; n += 25 )) ; do wget http://media.xiph.org/BBB/BBB-360-png/big_buck_bunny_$(printf "%05d" $n).png done
Let those lines run in a subdirectory called raw. Have a look in this directory for images you want to create mosaics from. I recommend images with large, clear forms so that it is easily recognizable, but feel free to experiment. For the following example, I am using raw/big_buck_bunny_11350.png (see here).
Now, for the first example, we want to render a photo mosaic that consists of 8×8 tiles. The resulting image shall be as large as the original image, so we scale down the tiles by a factor of 8. To create the image, run the command as follows:
/path/to/photomosaic -n 8 -s 8 -i raw/big_buck_bunny_11350.png -o mosaic1.png raw/
Running photomosaic the first time takes a little bit longer, as it loads each image and creates a ‘fingerprint’ (6×6 large thumbnail). All fingerprints from all files in a directory are stored in a file fingerprints.txt in this directory. Next time you run photomosaic, it will recognize this file and use this instead of inspecting each file again.
The first image you got may look like this:
You can recognize the basic structure, but the result is not really convincing. What you can try to do is to increase the number of tiles and see if the results get better. For example, increasing the number of tiles per axis to 20 and, to keep the final image at the same size, increasing the scale-down factor to the same value, results in this image:
/path/to/photomosaic -n 20 -s 20 -i raw/big_buck_bunny_11350.png -o mosaic2.png raw/
This result is better, but it can be improved further by a technique I call ‘sublevel’ing.
Sublevels
Imagine a film sequence you want to build a mosaic from, where most shots have blue/white sky in their upper half and dark soil in the lower half. To build a good mosaic, you need a lot of images that are mostly blue/white for the upper half and images that are mostly dark for the lower half. Well, your film clip does not provide such images and the resulting mosaic would be filled with alternating horizontal dark and bright stripes.
The idea I came up with is to extract parts of each tile image and to use those those subregions as tiles themselves. Those subimages are scaled at half the scaling factor so they become as large as the original tile. In total, nine subimages are extract, each half the width and height of the original one: One from each of the four edges, one from each of the four corners, and one from the center. So, starting from this image:
The following subimages are created:
Effectively, you will have 10 times more tiles available than you have raw images in your directory. In addition, you may have just the ‘right’ images generated as motivated in my sky/soil example above.
However, this will dramatically increase the time to generate a mosaic: The Hungarian algorithm has a cubic complexity, so expect that the step ‘Finding best cost assignment’ takes about 1000 times longer.
Example 2
To use the sublevel feature, you have to remove the fingerprints.txt file first (if you ran photomosaic before without sublevels) and add the -S (capital S) switch:
/path/to/photomosaic -n 20 -s 20 -S -i raw/big_buck_bunny_11350.png -o mosaic2.png raw/
Image Extraction
To extract images from a film clip using mplayer, I can recommend the following shell script. It automatically detects the length of the clip and saves screenshots in a sequence of PNG files. You can tune the for-loop's parameters to select the clip's region and number of images.
clipname=/path/to/clip.mp4 maxn=$(mplayer -endpos 0:01 -identify -nosound "${clipname}" | awk -F '[=.]+' '/^ID_LENGTH=/ {print $2}') for (( n = 1 ; n < $maxn ; n += 5 )) ; do mplayer -frames 1 -nosound -ss $n -vo png:z=1 "${clipname}" && mv 000*01.png raw/$(basename ${clipname})-$(printf "%06d" $n).png done
Good luck making your own photo mosaics!
Copyright Disclaimer
The original images taken from the film ‘Big Buck Bunny’ are licensed under the Creative Commons Attribution 3.0 license.
Created by the photomosaic program and used in this blog posting are licensed under the Creative Commons Attribution 3.0 license, too.
The source code of photomosaic is released under the GNU General Public License 3.0.