597

If a picture's worth 1000 words, how much of a picture can you fit in 140 characters?

Note: That's it, folks! The bounty deadline is here, and after some tough deliberation, I have decided that Boojum's entry just barely edged out Sam Hocevar's. I will post more detailed notes once I've had a chance to write them up. Of course, everyone should feel free to continue to submit solutions and improve solutions for people to vote on. Thank you to everyone who submitted an entry; I enjoyed all of them. This has been a lot of fun for me to run, and I hope it's been fun for both the entrants and the spectators.

I came across this interesting post about trying to compress images into a Twitter comment, and lots of people in that thread (and a thread on Reddit) had suggestions about different ways you could do it. So, I figured it would make a good coding challenge: let people put their money where their mouth is, and show how their ideas about encoding can lead to more detail in the limited space available.

I challenge you to come up with a general purpose system for encoding images into 140 character Twitter messages, and decoding them into an image again. You can use Unicode characters, so you get more than 8 bits per character. Even allowing for Unicode characters, however, you will need to compress images into a very small amount of space; this will certainly be a lossy compression, and so there will have to be subjective judgements about how good each result looks.

Here is the result that the original author, Quasimondo, got from his encoding (image is licensed under a Creative Commons Attribution-Noncommercial license): Mona Lisa

Can you do better?

Rules

  1. Your program must have two modes: encoding and decoding.
  2. When encoding:
    1. Your program must take as input a graphic in any reasonable raster graphic format of your choice. We'll say that any raster format supported by ImageMagick counts as reasonable.
    2. Your program must output a message which can be represented in 140 or fewer Unicode code points; 140 code points in the range U+0000–U+10FFFF, excluding non-characters (U+FFFE, U+FFFF, U+nFFFE, U+nFFFF where n is 1–10 hexadecimal, and the range U+FDD0–U+FDEF) and surrogate code points (U+D800–U+DFFF). It may be output in any reasonable encoding of your choice; any encoding supported by GNU iconv will be considered reasonable, and your platform native encoding or locale encoding would likely be a good choice. See Unicode notes below for more details.
  3. When decoding:
    1. Your program should take as input the output of your encoding mode.
    2. Your program must output an image in any reasonable format of your choice, as defined above, though for output vector formats are OK as well.
    3. The image output should be an approximation of the input image; the closer you can get to the input image, the better.
    4. The decoding process may have no access to any other output of the encoding process other than the output specified above; that is, you can't upload the image somewhere and output the URL for the decoding process to download, or anything silly like that.
  4. For the sake of consistency in user interface, your program must behave as follows:

    1. Your program must be a script that can be set to executable on a platform with the appropriate interpreter, or a program that can be compiled into an executable.
    2. Your program must take as its first argument either encode or decode to set the mode.
    3. Your program must take input in one or more of the following ways (if you implement the one that takes file names, you may also read and write from stdin and stdout if file names are missing):

      1. Take input from standard in and produce output on standard out.

        my-program encode <input.png >output.txt
        my-program decode <output.txt >output.png
        
      2. Take input from a file named in the second argument, and produce output in the file named in the third.

        my-program encode input.png output.txt
        my-program decode output.txt output.png
        
  5. For your solution, please post:
    1. Your code, in full, and/or a link to it hosted elsewhere (if it's very long, or requires many files to compile, or something).
    2. An explanation of how it works, if it's not immediately obvious from the code or if the code is long and people will be interested in a summary.
    3. An example image, with the original image, the text it compresses down to, and the decoded image.
    4. If you are building on an idea that someone else had, please attribute them. It's OK to try to do a refinement of someone else's idea, but you must attribute them.

Guidelines

These are basically rules that may be broken, suggestions, or scoring criteria:

  1. Aesthetics are important. I'll be judging, and suggest that other people judge, based on:
    1. How good the output image looks, and how much it looks like the original.
    2. How nice the text looks. Completely random gobbledygook is OK if you have a really clever compression scheme, but I also want to see answers that turn images into multilingual poems, or something clever like that. Note that the author of the original solution decided to use only Chinese characters, since it looked nicer that way.
    3. Interesting code and clever algorithms are always good. I like short, to the point, and clear code, but really clever complicated algorithms are OK too as long as they produce good results.
  2. Speed is also important, though not as important as how good a job compressing the image you do. I'd rather have a program that can convert an image in a tenth of a second than something that will be running genetic algorithms for days on end.
  3. I will prefer shorter solutions to longer ones, as long as they are reasonably comparable in quality; conciseness is a virtue.
  4. Your program should be implemented in a language that has a freely-available implementation on Mac OS X, Linux, or Windows. I'd like to be able to run the programs, but if you have a great solution that only runs under MATLAB or something, that's fine.
  5. Your program should be as general as possible; it should work for as many different images as possible, though some may produce better results than others. In particular:
    1. Having a few images built into the program that it matches and writes a reference to, and then produces the matching image upon decoding, is fairly lame and will only cover a few images.
    2. A program that can take images of simple, flat, geometric shapes and decompose them into some vector primitive is pretty nifty, but if it fails on images beyond a certain complexity it is probably insufficiently general.
    3. A program that can only take images of a particular fixed aspect ratio but does a good job with them would also be OK, but not ideal.
    4. You may find that a black and white image can get more information into a smaller space than a color image. On the other hand, that may limit the types of image it's applicable to; faces come out fine in black and white, but abstract designs may not fare so well.
    5. It is perfectly fine if the output image is smaller than the input, while being roughly the same proportion. It's OK if you have to scale the image up to compare it to the original; what's important is how it looks.
  6. Your program should produce output that could actually go through Twitter and come out unscathed. This is only a guideline rather than a rule, since I couldn't find any documentation on the precise set of characters supported, but you should probably avoid control characters, funky invisible combining characters, private use characters, and the like.

Scoring rubric

As a general guide to how I will be ranking solutions when choosing my accepted solution, let's say that I'll probably be evaluating solutions on a 25 point scale (this is very rough, and I won't be scoring anything directly, just using this as a basic guideline):

  • 15 points for how well the encoding scheme reproduces a wide range of input images. This is a subjective, aesthetic judgement
    • 0 means that it doesn't work at all, it gives the same image back every time, or something
    • 5 means that it can encode a few images, though the decoded version looks ugly and it may not work at all on more complicated images
    • 10 means that it works on a wide range of images, and produces pleasant-looking images which may occasionally be recognizable
    • 15 means that it produces perfect replicas of some images, and even for larger and more complex images, gives something that is recognizable. Or, perhaps it does not make images that are quite recognizable, but produces beautiful images that are clearly derived from the original.
  • 3 points for clever use of the Unicode character set
    • 0 points for simply using the entire set of allowed characters
    • 1 point for using a limited set of characters that are safe for transfer over Twitter or in a wider variety of situations
    • 2 points for using a thematic subset of characters, such as only Han ideographs or only right-to-left characters
    • 3 points for doing something really neat, like generating readable text or using characters that look like the image in question
  • 3 points for clever algorithmic approaches and code style
    • 0 points for something that is 1000 lines of code only to scale the image down, treat it as 1 bit per pixel, and base64 encode that
    • 1 point for something that uses a standard encoding technique and is well written and brief
    • 2 points for something that introduces a relatively novel encoding technique, or that is surprisingly short and clean
    • 3 points for a one liner that actually produces good results, or something that breaks new ground in graphics encoding (if this seems like a low number of points for breaking new ground, remember that a result this good will likely have a high score for aesthetics as well)
  • 2 points for speed. All else being equal, faster is better, but the above criteria are all more important than speed
  • 1 point for running on free (open source) software, because I prefer free software (note that C# will still be eligible for this point as long as it runs on Mono, likewise MATLAB code would be eligible if it runs on GNU Octave)
  • 1 point for actually following all of the rules. These rules have gotten a bit big and complicated, so I'll probably accept otherwise good answers that get one small detail wrong, but I will give an extra point to any solution that does actually follow all of the rules

Reference images

Some folks have asked for some reference images. Here are a few reference images that you can try; smaller versions are embedded here, they all link to larger versions of the image if you need those:

Lena Mona Lisa Cornell Box StackOverflow Logo

Prize

I am offering a 500 rep bounty (plus the 50 that StackOverflow kicks in) for the solution that I like the best, based on the above criteria. Of course, I encourage everyone else to vote on their favorite solutions here as well.

Note on deadline

This contest will run until the bounty runs out, about 6 PM on Saturday, May 30. I can't say the precise time it will end; it may be anywhere from 5 to 7 PM. I will guarantee that I'll look at all entries submitted by 2 PM, and I will do my best to look at all entries submitted by 4 PM; if solutions are submitted after that, I may not have a chance to give them a fair look before I have to make my decision. Also, the earlier you submit, the more chance you will have for voting to be able to help me pick the best solution, so try and submit earlier rather than right at the deadline.

Unicode notes

There has also been some confusion on exactly what Unicode characters are allowed. The range of possible Unicode code points is U+0000 to U+10FFFF. There are some code points which are never valid to use as Unicode characters in any open interchange of data; these are the noncharacters and the surrogate code points. Noncharacters are defined in the Unicode Standard 5.1.0 section 16.7 as the values U+FFFE, U+FFFF, U+nFFFE, U+nFFFF where n is 1–10 hexadecimal, and the range U+FDD0–U+FDEF. These values are intended to be used for application-specific internal usage, and conforming applications may strip these characters out of text processed by them. Surrogate code points, defined in the Unicode Standard 5.1.0 section 3.8 as U+D800–U+DFFF, are used for encoding characters beyond the Basic Multilingual Plane in UTF-16; thus, it is impossible to represent these code points directly in the UTF-16 encoding, and it is invalid to encode them in any other encoding. Thus, for the purpose of this contest, I will allow any program which encodes images into a sequence of no more than 140 Unicode code points from the range U+0000–U+10FFFF, excluding all noncharacters and surrogate code points as defined above.
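To make that budget concrete, here is a small Python sketch (an illustration of the rule above, not part of the rules themselves) that counts the allowed code points and the raw information capacity of 140 of them:

# Sketch: count the code points the contest allows, and the raw
# capacity of a 140-code-point message.
import math

def is_allowed(cp):
    if 0xD800 <= cp <= 0xDFFF:        # surrogate code points
        return False
    if 0xFDD0 <= cp <= 0xFDEF:        # noncharacter range
        return False
    if cp & 0xFFFE == 0xFFFE:         # U+nFFFE / U+nFFFF in every plane
        return False
    return True

usable = sum(is_allowed(cp) for cp in range(0x110000))
print(usable)                          # 1111998 allowed code points
print(140 * math.log2(usable) / 8)     # ~351 bytes of raw capacity

So a maximal encoder has about 350 bytes of real information to play with, a figure that also comes up in the comments below.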

I will prefer solutions that use only assigned characters, and even better ones that use clever subsets of assigned characters or do something interesting with the character set they use. For a list of assigned characters, see the Unicode Character Database; note that some characters are listed directly, while some are listed only as the start and end of a range. Also note that surrogate code points are listed in the database, but forbidden as mentioned above. If you would like to take advantage of certain properties of characters for making the text you output more interesting, there are a variety of databases of character information available, such as a list of named code blocks and various character properties.

Since Twitter does not specify the exact character set they support, I will be lenient about solutions which do not actually work with Twitter because certain characters count extra or certain characters are stripped. It is preferred but not required that all encoded outputs should be able to be transferred unharmed via Twitter or another microblogging service such as identi.ca. I have seen some documentation stating that Twitter entity-encodes <, >, and &, and thus counts those as 4, 4, and 5 characters respectively, but I have not tested that out myself, and their JavaScript character counter doesn't seem to count them that way.

Tips & Links

  • The definition of valid Unicode characters in the rules is a bit complicated. Choosing a single block of characters, such as CJK Unified Ideographs (U+4E00–U+9FCF), may be easier. (A sketch of this idea follows this list.)
  • You may use existing image libraries, like ImageMagick or Python Imaging Library, for your image manipulation.
  • If you need some help understanding the Unicode character set and its various encodings, see this quick guide or this detailed FAQ on UTF-8 in Linux and Unix.
  • The earlier you get your solution in, the more time I (and other people voting) will have to look at it. You can edit your solution if you improve it; I'll base my bounty on the most recent version when I take my last look through the solutions.
  • If you want an easy image format to parse and write (and don't want to just use an existing format), I'd suggest using the PPM format. It's a text based format that's very easy to work with, and you can use ImageMagick to convert to and from it.
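As a sketch of the single-block tip above (my own illustration, not any entrant's code), you can treat the whole payload as one big number and base-convert it into the CJK Unified Ideographs block; each character then carries about 14.3 bits, or roughly 250 bytes per full tweet:

# Sketch: base-convert raw bytes into CJK Unified Ideographs
# (U+4E00..U+9FCF, 20944 code points, ~14.3 bits per character).
FIRST, BASE = 0x4E00, 0x9FCF - 0x4E00 + 1

def bytes_to_cjk(data):
    n = int.from_bytes(data, 'big')
    chars = []
    while n:
        n, digit = divmod(n, BASE)
        chars.append(chr(FIRST + digit))
    return ''.join(reversed(chars)) or chr(FIRST)

def cjk_to_bytes(text, length):
    n = 0
    for ch in text:
        n = n * BASE + (ord(ch) - FIRST)
    return n.to_bytes(length, 'big')

msg = bytes_to_cjk(b'hello')            # a short CJK string
assert cjk_to_bytes(msg, 5) == b'hello'

(The decoder needs the byte length, here 5, to restore any leading zero bytes.)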
Brian Campbell
  • Feel free to offer suggestions on the rules I wrote up in the comments; I'm certainly willing to tweak them if people feel like they need clarification or are too over-specified. – Brian Campbell May 21 '09 at 06:58
  • 6
    You probably should say that uploading the image to a server and posting the url to it is not valid. – Shay Erlichmen May 21 '09 at 07:26
  • 2
    @Shay Didn't I already say that? "The decoding process may have no access to any other output of the encoding process other than the output specified above; that is, you can't upload the image somewhere and output the URL for the decoding process to download, or anything silly like that." – Brian Campbell May 21 '09 at 13:26
  • @Brian I think it would be a good idea to supply a reference image or images that all entries can be judged on; this would level the field and make it easier to judge. – Richard Stelling May 22 '09 at 12:30
  • Nice challenge – but I object to your use of the word “silly.” In fact, that solution (using a URI/DOI or similar) is an extremely *good* idea for solving this kind of problem, because it uses (very basic) *semantic* information to encode the image. This is in the spirit of the Semantic Web project. It's of course fair to exclude such solutions because they're simply impossible to beat – but still, they are anything but silly, and they are definitely a kind of compression algorithm (using a very large dictionary). – Konrad Rudolph May 22 '09 at 12:58
  • 1
    @Konrad Rudolph I agree; I did not mean "silly" from a practical point of view (clearly, this whole contest is silly from a practical point of view), I meant "silly" in the context of this contest. Using a URI is not really a compression algorithm, in the information theory sense, as it does not allow you to transfer any more information without simply using an alternate channel. You could give the encoder and decoder a large database of images, and call it compression that works only on a limited set of images, but I specified that you need to be able to handle an arbitrary image. – Brian Campbell May 22 '09 at 20:07
  • 2
    Here are a couple of links I've run across that may help folks out: http://www.azillionmonkeys.com/qed/unicode.html for an explanation of the valid range of Unicode characters. Note that the UTF encodings are the ones that can encode the entire Unicode range; UCS-4 is a superset of Unicode, and UCS-2 & ASCII are subsets. And on the compression front, here's a similar technique as the original post, though he's allowing himself 1k rather than 350 bytes: http://www.screamingduck.com/Article.php?ArticleID=46&Show=ABCE – Brian Campbell May 22 '09 at 20:38
  • It's written anywhere if the resulting image should have the same size of the original? – Gabriele D'Antona May 24 '09 at 08:44
  • @friol From the guidelines: "It is perfectly fine if the output image is smaller than the input, while being roughly the same proportion. It's OK if you have to scale the image up to compare it to the original; what's important is how it looks." You should maintain the original aspect ratio (approximately is OK), so if you have a 200x400 input image, it's OK if the output image is 20x40 (or if the output is some form of vector graphic, even). You should not turn all images into a square shape, though; aspect ratio is one of the pieces you need to transmit somehow. – Brian Campbell May 25 '09 at 03:58
  • One rules question, but would converting over to a gray scale image count against us, assuming that the image still looked good? – rjzii May 28 '09 at 13:40
  • @Rob It will be based on a subjective judgement. If converting to grayscale allows you more detail, that's great. If the detail is about the same as a color version, then the grayscale version probably won't compete very well. So yes, please submit solutions using greyscale if you think that helps. Part of the point of the contest is just to see how different approaches fare; I'd love for people to experiment and try out different solutions. – Brian Campbell May 28 '09 at 14:21
  • In my opinion, I would set it up so that you could use multiple tweets to send the image... with some TCP-type code in the front to allow for 1/3, 2/3 type notes for the decrypter. – Ape-inago May 30 '09 at 18:06
  • @Joey Sorry you didn't see it soon enough! I tried to publicize it well, but it's hard to make sure everyone sees it. @ape-inago Yes, you could fit any image into multiple tweets if you split it up and recombined it. But then there wouldn't be much of a challenge, would there? The point of this was supposed to be to see what kind of compression people could do with brutally tight constraints. – Brian Campbell May 30 '09 at 18:15
  • This wouldn't count as an entry but would uploading data to the user's "avatar" slot a possible solution? – chakrit Jun 19 '09 at 21:14
  • I like this challenge. Here's an alternative thought though, since we can encode the data of an image into text, what if we do the opposite. Take a random selection of say, 1000 tweets, and decode them into images and see what we get. – Nick Radford Jul 19 '11 at 22:36

15 Answers

288

image files and python source (version 1 and 2)

Version 1

Here is my first attempt. I will update as I go.

I have got the SO logo down to 300 characters, almost losslessly. My technique uses conversion to SVG vector art, so it works best on line art. It is actually an SVG compressor; it still requires the original art to go through a vectorisation stage.

For my first attempt I used an online service for the PNG trace; however, there are MANY free and non-free tools that can handle this part, including potrace (open-source).

Here are the results

Original SO logo: http://www.warriorhut.org/graphics/svg_to_unicode/so-logo.png
After encoding and decoding: http://www.warriorhut.org/graphics/svg_to_unicode/so-logo-decoded.png

Characters: 300

Time: Not measured but practically instant (not including vectorisation/rasterisation steps)

The next stage will be to embed 4 symbols (SVG path points and commands) per Unicode character. At the moment my Python build does not have wide character (UCS-4) support, which limits my resolution per character. I've also limited the maximum range to the lower end of the Unicode reserved range, 0xD800, but once I build a list of allowed characters and a filter to avoid them I can theoretically push the required number of characters as low as 70-100 for the logo above.

A limitation of this method at present is that the output size is not fixed: it depends on the number of vector nodes/points after vectorisation. Automating this limit will require either pixelating the image (which removes the main benefit of vectors) or repeatedly running the paths through a simplification stage until the desired node count is reached (which I'm currently doing manually in Inkscape).

Version 2

UPDATE: v2 is now qualified to compete. Changes:

  • Command-line control input/output and debugging
  • Uses XML parser (lxml) to handle SVG instead of regex
  • Packs 2 path segments per unicode symbol
  • Documentation and cleanup
  • Support style="fill:color" and fill="color"
  • Document width/height packed into single character
  • Path color packed into single character
  • Color compression is achieved by throwing away 4 bits of color data per color, then packing it into a character via hex conversion.

Characters: 133

Time: A few seconds

After encoding and decoding (version 2): http://www.warriorhut.org/graphics/svg_to_unicode/so-logo-decoded-v2.png

As you can see there are some artifacts this time. It isn't a limitation of the method but a mistake somewhere in my conversions. The artifacts happen when the points go outside the range 0.0 - 127.0, and my attempts to constrain them have had mixed success. The solution is simply to scale the image down; however, I had trouble scaling the actual points rather than the artboard or group matrix, and I'm too tired now to care. In short, if your points are in the supported range, it generally works.
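As a rough illustration of why that 0.0 - 127.0 range matters: two 7-bit values fit exactly into one character's worth of a large Unicode block. The following sketch is my guess at such a packing, not necessarily the exact scheme used here:

# Sketch: pack two 7-bit path values (0..127) into one character from
# the CJK block (the 0x4E00 offset is an assumption for illustration).
def pack_pair(a, b, base=0x4E00):
    assert 0 <= a < 128 and 0 <= b < 128
    return chr(base + (a << 7) + b)

def unpack_pair(ch, base=0x4E00):
    v = ord(ch) - base
    return v >> 7, v & 0x7F

assert unpack_pair(pack_pair(42, 101)) == (42, 101)

Any coordinate above 127 simply has no slot in the character, which is consistent with the artifacts described above.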

I believe the kink in the middle is due to a handle moving to the other side of a handle it's linked to; basically the points are too close together in the first place. Running a simplify filter over the source image in advance of compressing it should fix this and shave off some unnecessary characters.

UPDATE: This method is fine for simple objects, so I needed a way to simplify complex paths and reduce noise. I used Inkscape for this task. I've had some luck grooming out unnecessary paths using Inkscape, but haven't had time to try automating it. I've made some sample SVGs using the Inkscape 'Simplify' function to reduce the number of paths.

Simplify works ok but it can be slow with this many paths.

Autotrace example: http://www.warriorhut.org/graphics/svg_to_unicode/autotrace_16_color_manual_reduction.png
Cornell Box: http://www.warriorhut.com/graphics/svg_to_unicode/cornell_box_simplified.png
Lena: http://www.warriorhut.com/graphics/svg_to_unicode/lena_std_washed_autotrace.png

Traced thumbnails: http://www.warriorhut.org/graphics/svg_to_unicode/competition_thumbnails_autotrace.png

Here are some ultra low-res shots. These would be closer to the 140 character limit, though some clever path compression may be needed as well.

Groomed (simplified and despeckled): http://www.warriorhut.org/graphics/svg_to_unicode/competition_thumbnails_groomed.png

Triangulated (simplified, despeckled and triangulated): http://www.warriorhut.org/graphics/svg_to_unicode/competition_thumbnails_triangulated.png

autotrace --output-format svg --output-file cornell_box.svg --despeckle-level 20 --color-count 64 cornell_box.png

ABOVE: Simplified paths using autotrace.

Unfortunately my parser doesn't handle the autotrace output, so I don't know how many points are in use or how far to simplify, and sadly there's little time for writing it before the deadline. It's much easier to parse than the Inkscape output, though.

SpliFF
  • 2
    Excellent! At first I wanted to create a hybrid vector solution with both sharp edges and smooth areas, but it proved far too complex without using a tracing library (which I didn't want to use). I'm looking forward to seeing how far you can get with your method! – sam hocevar May 28 '09 at 10:16
  • Nice! I was hoping we'd see some attempts at near-lossless approaches by vectorization. It means it has lower generality, but higher quality for the images it does cover. It's fine to use an online service for vectorization. Good luck on getting the size down further! – Brian Campbell May 28 '09 at 13:00
  • I would consider image compression and character encoding as two different steps - Sam's technique seems to be optimal for the encoding, and could easily be built into a stand-alone program. You'll get more bang for your buck by concentrating on the unique part of your solution (i.e. the compression part) and just outputting a string of bits. – Mark Ransom May 28 '09 at 21:12
  • 70
    Wow. These images look really stylish. – Rinat Abdullin Jun 05 '09 at 04:24
244

Alright, here's mine: nanocrunch.cpp and the CMakeLists.txt file to build it using CMake. It relies on the Magick++ ImageMagick API for most of its image handling. It also requires the GMP library for bignum arithmetic for its string encoding.

I based my solution off of fractal image compression, with a few unique twists. The basic idea is to take the image, scale down a copy to 50% and look for pieces in various orientations that look similar to non-overlapping blocks in the original image. It takes a very brute force approach to this search, but that just makes it easier to introduce my modifications.

The first modification is that instead of just looking at ninety degree rotations and flips, my program also considers 45 degree orientations. It's one more bit per block, but it helps the image quality immensely.

The other thing is that storing a contrast/brightness adjustment for each color component of each block is way too expensive. Instead, I store a heavily quantized color (the palette has only 4 * 4 * 4 = 64 colors) that simply gets blended in in some proportion. Mathematically, this is equivalent to a variable brightness and constant contrast adjustment for each color: blending a block pixel p with palette color c in proportion a gives a*c + (1-a)*p, so the multiplier (1-a) is a fixed contrast while the offset a*c varies with the stored color. Unfortunately, it also means there's no negative contrast to flip the colors.

Once it's computed the position, orientation and color for each block, it encodes this into a UTF-8 string. First, it generates a very large bignum to represent the data in the block table and the image size. The approach to this is similar to Sam Hocevar's solution -- kind of a large number with a radix that varies by position.

Then it converts that into a base of whatever the size of the available character set is. By default, it makes full use of the assigned Unicode character set, minus the less-than, greater-than, ampersand, control, combining, surrogate, and private characters. It's not pretty but it works. You can also comment out the default table and select printable 7-bit ASCII (again excluding the <, >, and & characters) or CJK Unified Ideographs instead. The table of which character codes are available is stored run-length encoded, with alternating runs of invalid and valid characters.
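To make the run-length table concrete, here is a small sketch (mine, not nanocrunch's actual table or code) that expands alternating invalid/valid run lengths into the usable alphabet:

# Sketch: expand alternating runs of invalid and valid code points,
# starting at U+0000 with an invalid run.
def expand_runs(runs):
    valid, cp, in_valid_run = [], 0, False
    for run in runs:
        if in_valid_run:
            valid.extend(range(cp, cp + run))
        cp += run
        in_valid_run = not in_valid_run
    return valid

# Hypothetical table: 33 invalid (U+0000..U+0020), then 94 valid
# (printable ASCII '!'..'~'); the real tables also exclude <, > and &.
alphabet = [chr(cp) for cp in expand_runs([33, 94])]
assert alphabet[0] == '!' and alphabet[-1] == '~'

The bignum is then written out in base len(alphabet), emitting alphabet[digit] for each digit.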

Anyway, here are some images and times (as measured on my old 3.0GHz P4), compressed to 140 characters in the full assigned Unicode set described above. Overall, I'm fairly pleased with how they all turned out. If I had more time to work on this, I'd probably try to reduce the blockiness of the decompressed images. Still, I think the results are pretty good for the extreme compression ratio. The decompressed images are a bit impressionistic, but I find it relatively easy to see how they correspond to the original.

Stack Overflow Logo (8.6s to encode, 7.9s to decode, 485 bytes):
http://i44.tinypic.com/2w7lok1.png

Lena (32.8s to encode, 13.0s to decode, 477 bytes):
http://i42.tinypic.com/2rr49wg.png http://i40.tinypic.com/2rhxxyu.png

Mona Lisa (43.2s to encode, 14.5s to decode, 490 bytes):
http://i41.tinypic.com/ekgwp3.png http://i43.tinypic.com/ngsxep.png

Edit: CJK Unified Characters

Sam asked in the comments about using this with CJK. Here's a version of the Mona Lisa compressed to 139 characters from the CJK Unified character set:

http://i43.tinypic.com/2yxgdfk.png 咏璘驞凄脒鵚据蛥鸂拗朐朖辿韩瀦魷歪痫栘璯緍脲蕜抱揎頻蓼債鑡嗞靊寞柮嚛嚵籥聚隤慛絖銓馿渫櫰矍昀鰛掾撄粂敽牙稉擎蔍螎葙峬覧絀蹔抆惫冧笻哜搀澐芯譶辍澮垝黟偞媄童竽梀韠镰猳閺狌而羶喙伆杇婣唆鐤諽鷍鴞駫搶毤埙誖萜愿旖鞰萗勹鈱哳垬濅鬒秀瞛洆认気狋異闥籴珵仾氙熜謋繴茴晋髭杍嚖熥勳縿餅珝爸擸萿

The tuning parameters at the top of the program that I used for this were: 19, 19, 4, 4, 3, 10, 11, 1000, 1000. I also commented out the first definition of number_assigned and codes, and uncommented out the last definitions of them to select the CJK Unified character set.

Boojum
  • Wow! Nice job. I was skeptical of fractal image compression for images this small, but it actually does produce pretty decent results. It was also pretty easy to compile and run. – Brian Campbell May 30 '09 at 15:33
  • 1
    Thanks guys! Sam, do you mean results with just 140 CJK characters? If so, then yes, you'll need to tune the numbers at the top. The final size in bits is around log2(steps_in_x*steps_in_y*steps_in_red*steps_in_green*steps_in_blue)*blocks_in_x*blocks_in_y+log2(maximum_width*maximum_height). – Boojum May 30 '09 at 18:04
  • Edit: There's a * 16 in the first log2() that I left out. That's for the possible orientations. – Boojum May 30 '09 at 18:13
  • 20
    Have anyone twitter'd an image using this yet? – dbr May 31 '09 at 16:16
199

My full solution can be found at http://caca.zoy.org/wiki/img2twit. It has the following features:

  • Reasonable compression time (around 1 minute for high quality)
  • Fast decompression (a fraction of a second)
  • Keeps the original image size (not just the aspect ratio)
  • Decent reconstruction quality (IMHO)
  • Message length and character set (ASCII, CJK, Symbols) can be chosen at runtime
  • Message length and character set are autodetected at decompression time
  • Very efficient information packing

http://caca.zoy.org/raw-attachment/wiki/img2twit/so-logo.png http://caca.zoy.org/raw-attachment/wiki/img2twit/twitter4.png

蜥秓鋖筷聝诿缰偺腶漷庯祩皙靊谪獜岨幻寤厎趆脘搇梄踥桻理戂溥欇渹裏軱骿苸髙骟市簶璨粭浧鱉捕弫潮衍蚙瀹岚玧霫鏓蓕戲債鼶襋躻弯袮足庭侅旍凼飙驅據嘛掔倾诗籂阉嶹婻椿糢墤渽緛赐更儅棫武婩縑逡荨璙杯翉珸齸陁颗鳣憫擲舥攩寉鈶兓庭璱篂鰀乾丕耓庁錸努樀肝亖弜喆蝞躐葌熲谎蛪曟暙刍镶媏嘝驌慸盂氤缰殾譑

Here is a rough overview of the encoding process:

  • The number of available bits is computed from desired message length and usable charset
  • The source image is segmented into as many square cells as the available bits permit
  • A fixed number of points (currently 2) is assigned to each cell, with initial coordinates and colour values
  • The following is repeated until a quality condition is met:
    • A point is chosen at random
    • An operation is performed at random on this point (moving it inside its cell, changing its colour)
    • If the resulting image (see the decoding process below) is closer to the source image, the operation is kept
  • The image size and list of points is encoded in UTF-8

And this is the decoding process:

  • The image size and points are read from the UTF-8 stream
  • For each pixel in the destination image:
    • The list of natural neighbours is computed
    • The pixel's final colour is set as a weighted average of its natural neighbours' colours (a simplified sketch follows)
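True natural-neighbour weights take a bit of computational geometry to compute; as a simplified stand-in, here is an inverse-distance-weighting version of the per-pixel step (my simplification for illustration, not img2twit's actual interpolation):

# Simplified sketch of the per-pixel reconstruction, using inverse-
# distance weighting in place of natural-neighbour interpolation.
def decode_pixel(x, y, points):
    # points: iterable of (px, py, (r, g, b)) tuples
    totals, weight_sum = [0.0, 0.0, 0.0], 0.0
    for px, py, colour in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        if d2 == 0:
            return colour              # pixel sits exactly on a point
        w = 1.0 / d2
        weight_sum += w
        totals = [t + w * c for t, c in zip(totals, colour)]
    return tuple(round(t / weight_sum) for t in totals)

assert decode_pixel(0, 0, [(0, 0, (10, 20, 30))]) == (10, 20, 30)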

What I believe is the most original part of the program is the bitstream. Instead of packing bit-aligned values (stream <<= shift; stream |= value), I pack arbitrary values that are not in power-of-two ranges (stream *= range; stream += value). This requires bignum computations and is of course a lot slower, but it gives me 2009.18 bits instead of 1960 when using the 20902 main CJK characters (that's three more points I can put in the data). And when using ASCII, it gives me 917.64 bits instead of 840.
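Here is a minimal sketch of that packing trick (my reconstruction of the idea with made-up ranges, not the actual img2twit bitstream):

# Sketch: mixed-radix packing -- stream *= range; stream += value.
def pack(values_and_ranges):
    stream = 0
    for value, radix in values_and_ranges:
        assert 0 <= value < radix
        stream = stream * radix + value
    return stream

def unpack(stream, ranges):
    values = []
    for radix in reversed(ranges):     # last value packed comes out first
        stream, value = divmod(stream, radix)
        values.append(value)
    return list(reversed(values))

# e.g. a point with x in [0,19), y in [0,19) and one of 64 colours:
n = pack([(7, 19), (12, 19), (40, 64)])
assert unpack(n, [19, 19, 64]) == [7, 12, 40]

Because each field multiplies in its exact range rather than the next power of two, no fractional bits are wasted between fields, which is where the extra ~49 bits (2009.18 vs. 1960) come from.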

I decided against a method for the initial image computation that would have required heavy weaponry (corner detection, feature extraction, colour quantisation...) because I wasn't sure at first it would really help. Now I realise convergence is slow (1 minute is acceptable but it's slow nonetheless) and I may try to improve on that.

The main fitting loop is loosely inspired by the Direct Binary Search dithering algorithm (where pixels are randomly swapped or flipped until a better halftone is obtained). The energy computation is a simple root-mean-square distance, but I perform a 5x5 median filter on the original image first. A Gaussian blur would probably better represent the human eye behaviour, but I didn't want to lose sharp edges. I also decided against simulated annealing or other difficult-to-tune methods because I don't have months to calibrate the process. Thus the "quality" flag just represents the number of iterations that are performed on each point before the encoder ends.

http://caca.zoy.org/raw-attachment/wiki/img2twit/Mona_Lisa_scaled.jpg http://caca.zoy.org/raw-attachment/wiki/img2twit/twitter2.png

苉憗揣嶕繠剳腏篮濕茝霮墧蒆棌杚蓳縳樟赒肴飗噹砃燋任朓峂釰靂陴貜犟掝喗讄荛砙矺敨鷾瓔亨髎芟氲簵鸬嫤鉸俇激躙憮鄴甮槺骳佛愚猪駪惾嫥綖珏矯坼堭颽箽赭飉訥偁箝窂蹻熛漧衆橼愀航玴毡裋頢羔恺墎嬔鑹楄瑥鶼呍蕖抲鸝秓苾绒酯嵞脔婺污囉酼俵菛琪棺则辩曚鸸職銛蒝礭鱚蟺稿纡醾陴鳣尥蟀惘鋁髚忩祤脤养趯沅况

Even though not all images compress well, I'm surprised by the results and I really wonder what other methods exist that can compress an image to 250 bytes.

I also have small movies of the encoder state's evolution from a random initial state and from a "good" initial state.

Edit: here is how the compression method compares with JPEG. On the left, the 536-byte picture from jamoes's answer above. On the right, Mona Lisa compressed down to 534 bytes using the method described here (the bytes mentioned here refer to data bytes, therefore ignoring bits wasted by using Unicode characters):

http://caca.zoy.org/raw-attachment/wiki/img2twit/minimona.jpg http://caca.zoy.org/raw-attachment/wiki/img2twit/minimona2.png

Edit: just replaced CJK text with the newest versions of the images.

sam hocevar
  • I don't actually need to be able to run the code (I put the part about running it in the guidelines, as a suggestion, not the rules); I'd prefer to be able to run it, but I'll be judging this more on the quality of the images you generate, the code, and any interesting tricks or algorithms. If I want to run it and it requires packages I don't have or don't want to install on my main system, I can just boot up an Amazon EC2 instance and install it. As long as you're working with libraries that are packaged for one of the major distros, I should be able to run it. Feel free to use CGAL. – Brian Campbell May 25 '09 at 03:45
  • 2
    Okay, here's my solution (source code): http://caca.zoy.org/browser/libpipi/trunk/examples/img2twit.cpp My explanation attempt and a few examples are at http://caca.zoy.org/wiki/img2twit – sam hocevar May 25 '09 at 14:17
  • Great! That's the first full solution. Do you suppose you could edit your answer (the one where you asked your first question) to include some or all of the explanation and one or two of the example images? It's a lot nicer to have it inline here than have it linked to from a comment. – Brian Campbell May 25 '09 at 21:44
  • 2
    I really like your solution. You should try reducing the number of values assigned to the blue channel as the human eye can't resolve blue very well: http://nfggames.com/games/ntsc/visual.shtm; this will allow you to have more detail at the expense of some color information being lost. Or perhaps assign it to green? – rpetrich May 26 '09 at 00:54
  • 5
    Good point. I did try a few variations of this idea (see the comments before the RANGE_X definition) but not very thoroughly. As you can see, using 5 blue values instead of 6 increased the error slightly less than using 7 values of green decreased it. I didn't try doing both out of laziness. Another problem I have is that I don't have a very good error function. I currently use ∑(∆r²+∆g²+∆b²)/3, which works OK. I tried ∑(0.299∆r²+0.587∆g²+0.114∆b²), based (with no physical justification) on YUV's Y component, but it was too tolerant with blue errors. I'll try to find papers about this issue. – sam hocevar May 26 '09 at 09:14
  • Actually, that's jamoes's 536 byte jpeg; I just edited his answer to improve the formatting. – Brian Campbell May 26 '09 at 13:03
  • This looks really nice, but I'm very disappointed. img2twit? Really? REALLY? That's all you could come up with? What happened to your literary, poetic genius? I was expecting something to perpetuate the great lineage of libcaca, libcucul, libpipi and toilet. (How about 'pcul,' picture compression utility for line-blogging?) – niXar May 26 '09 at 14:26
  • @Brian: oops, I'll reattribute accordingly. @niXar: you're right. I have a more appropriate name ready, but this is a family website. – sam hocevar May 26 '09 at 15:22
  • What application can I use to view your .ogm movies? – An̲̳̳drew May 26 '09 at 20:39
  • @Andrew: VLC should work, and so should MPlayer. If you are on Windows and wish to use your usual player, I believe FFdshow may help (it's a DirectShow codec wrapping a lot of opensource codecs). – sam hocevar May 26 '09 at 21:04
  • 2
    @rpetrich: I modified the program to make it increase r/g/b ranges dynamically as long as there are enough bits available. This makes sure that we never waste more than 13 bits in the whole bitstream (but in practice it's usually 1 or 2). And the images look slightly better. – sam hocevar May 27 '09 at 08:39
  • @Sam do you have before and after images for that last change? – Brian Campbell May 27 '09 at 21:36
  • @Brian: I'm afraid I seem to have broken something else in the process when putting all the pieces back together. I will fix it tonight after work and post new images (don't hold your breath though, the improvement is not groundbreaking). – sam hocevar May 28 '09 at 10:06
  • Sounds like your bitpacking method is a variant of arithmetic coding, without the probabilistic part. You could probably gain quality by actually using arithmetic coding (or at least reduce output size) – derobert May 29 '09 at 04:11
  • @derobert: trouble is, if arithmetic coding saves me bits, I will not know how many bits until after the compression is done, and I won't be able to use these bits unless I do another compression run, which might very well not save any bits... – sam hocevar May 29 '09 at 08:55
45

The following isn't a formal submission, since my software hasn't been tailored in any way for the indicated task. DLI can be described as an optimizing general-purpose lossy image codec. It's the PSNR and MS-SSIM record holder for image compression, and I thought it would be interesting to see how it performs for this particular task. I used the reference Mona Lisa image provided, scaled it down to 100x150, then used DLI to compress it to 344 bytes.

Mona Lisa DLI http://i40.tinypic.com/2md5q4m.png

For comparison with the JPEG and IMG2TWIT compressed samples, I used DLI to compress the image to 534 bytes as well. The JPEG is 536 bytes and IMG2TWIT is 534 bytes. Images have been scaled up to approximately the same size for easy comparison. JPEG is the left image, IMG2TWIT is center, and DLI is the right image.

Comparison http://i42.tinypic.com/302yjdg.png

The DLI image manages to preserve some of the facial features, most notably the famous smile :).

Brian Campbell
  • 6
    Oops. The above should be credited to Dennis Lee, who submitted it originally. I just edited it to embed the images inline & link to the reference I found by Googling. And I must say, wow, I'm impressed by the compression. I will have to check out DLI compression. – Brian Campbell May 29 '09 at 06:48
  • Encoding a DLI image to Unicode would definitely give the best results. Could you also show the results for 251 bytes of data? That's how many information bytes there are in 140 CJK characters. – sam hocevar May 29 '09 at 08:26
  • 1
    By the way, the DLI author mentions a "long processing time". As I am unable to run his software, could you give us rough compression time numbers? – sam hocevar May 29 '09 at 08:57
  • 1
    Using an AMD Athlon64 2.4Ghz, compression of the 100x150 Mona Lisa image takes 38sec and decompression 6sec. Compressing to a maximum of 251 bytes is tougher, the output quality is significantly reduced. Using the reference Mona Lisa image, I scaled it down to 60x91 then used DLI to compress it to 243 bytes (closest to 251 without going over). This is the output i43.tinypic.com/2196m4g.png The detail isn't near the 534 byte DLI even though bitrate has only been reduced by ~50%. The structure of the image has been maintained fairly well however. –  May 29 '09 at 20:54
  • 1
    Decided to make it easier to compare the 250 byte compressed samples. The 243 byte DLI was scaled up and placed beside the IMG2TWIT sample. IMG2TWIT on the left, DLI on the right. Here's the image i40.tinypic.com/30ndks6.png –  May 29 '09 at 21:25
  • That's very impressive. I wasn't aware of DLI, let's hope the guy releases some information about what it does. One last question: does DLI allow you to specify a target size, or do you have to try and guess if you want a given number of bytes? – sam hocevar May 29 '09 at 21:33
  • 1
    DLI uses a quality parameter like JPEG, so trial-and-error is needed if a target output size is desired. –  May 29 '09 at 23:03
  • @Dennis Do you have any source code available, or any writeup on the techniques used? I'm very impressed by the level of detail you get here, and I'd love to have some more information on how it works. – Brian Campbell May 30 '09 at 16:50
  • @Sam If you want to run dli, I've found that it works just fine under Wine. – Brian Campbell May 30 '09 at 23:24
  • Sorry, source code and a description of DLI's technology is currently not available. On another note, I was surprised you chose the fractal solution over IMG2TWIT. Personally, I prefer IMG2TWIT's solution and output quality so I'm looking forward to the details on your evaluation. –  May 31 '09 at 01:21
21

The general overview of my solution would be:

  1. I start by calculating the maximum amount of raw data that you can fit into 140 utf8 characters.
    • (I am assuming utf8, which is what the original website claimed Twitter stored its messages in. This differs from the problem statement above, which asks for utf16.)
    • Using this utf8 faq, I calculate that the maximum number of bits you can encode in a single utf8 character is 31 bits. In order to do this, I would use all characters that are in the U-04000000 – U-7FFFFFFF range. (1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx, there are 31 x's, therefore I could encode up to 31 bits).
    • 31 bits times 140 characters equals 4340 bits. Divide that by 8 to get 542.5, and round that down to 542 bytes.
    • (If we restrict ourselves to utf16, then we could only store 2 bytes per character, which would equal 280 bytes).
  2. Compress the image down using standard jpg compression.
    • Resize the image to be approximately 50x50px, then attempt to compress it at various compression levels until you have an image that is as close to 542 bytes as possible without going over.
    • This is an example of the mona lisa compressed down to 536 bytes.
  3. Encode the raw bits of the compressed image into utf-8 characters.
    • Replace each x in the following bytes: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx, with the bits from the image.
    • This part would probably be the part where the majority of the code would need to be written, because there isn't anything that currently exists that does this.

I know that you were asking for code, but I don't really want to spend the time to actually code this up. I figured that an efficient design might at least inspire someone else to code this up.

I think the major benefit of my proposed solution is that it is reusing as much existing technology as possible. It may be fun to try to write a good compression algorithm, but there is guaranteed to be a better algorithm out there, most likely written by people who have a degree in higher math.

One other important note though is that if it is decided that utf16 is the preferred encoding, then this solution falls apart. jpegs don't really work when compressed down to 280 bytes. Although, maybe there is a better compression algorithm than jpg for this specific problem statement.
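For what it's worth, step 3 of the proposal is easy to sketch (my own illustration; note from the comments below that such 6-byte sequences are not valid Unicode, so they would likely not survive Twitter):

# Sketch: spread a 31-bit value across a 6-byte UTF-8-style sequence
# 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx.
def pack31(bits31):
    assert 0 <= bits31 < (1 << 31)
    out = [0xFC | (bits31 >> 30)]                      # lead byte: 1 bit
    for shift in (24, 18, 12, 6, 0):
        out.append(0x80 | ((bits31 >> shift) & 0x3F))  # 6 bits per byte
    return bytes(out)

assert pack31(0x7FFFFFFF) == b'\xfd\xbf\xbf\xbf\xbf\xbf'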

Brian Campbell
  • I'm at work now, but I'll definitely implement this solution when I get home. – Paulo Santos May 21 '09 at 13:59
  • 2
    From my experimentation, it appears that UTF-16 is indeed how Twitter counts characters; BMP characters count as 1, and higher plane characters count as 2. It is not documented, but that's how their JavaScript character counter counts when you type into the input box. It's also mentioned in the comments in the original thread. I haven't tried submitting via the API to see if the counter is broken; if it is, I'll update the problem for the actual constraints. You're not likely to be able to use arbitrary UTF-8 however, since many of those longer sequences you can encode are not valid Unicode. – Brian Campbell May 21 '09 at 14:37
  • 4
    After testing with their API, it turns out that they do count by Unicode characters (code points), not UTF-16 code units (it's the JavaScript character counter that counts via UTF-16, since apparently that's what the JavaScript length method does). So you can get a bit more information in there; valid Unicode characters are in the range U+0000 to U+10FFFF (a bit more than 20 bits per character; 2^20 + 2^16 possible values per character). UTF-8 allows encoding of more values than are allowed in Unicode, so if you restrict yourself to Unicode, you can get about 350 bytes of space, not 542. – Brian Campbell May 21 '09 at 18:28
  • 3
    That 536-byte mona lisa looks surprisingly good, given the extreme compression! – Chris May 21 '09 at 22:57
  • Looks like a MonaCyclops though - some form of unibrow.... ;-) – Gineer May 22 '09 at 13:13
  • 3
    We can currently encode 129,775 different (assigned, non-control, non-private) Unicode characters. If we restrict ourselves to that subset, it's a total of 2377 bits, or 297 bytes. Code here: http://porg.es/blog/what-can-we-fit-in-140-characters – porges May 27 '09 at 07:13
  • How many BPP is the mona lisa you did? – Chris S May 29 '09 at 09:46
20

Okay, I'm late to the game, but nevertheless I made my project.

It's a toy genetic algorithm that uses translucent colorful circles to recreate the initial image.

Features:

  • pure Lua. Runs anywhere a Lua interpreter runs.
  • uses netpbm P3 format
  • comes with a comprehensive suite of unit tests
  • preserves original image size

Mis-features:

  • slow
  • at these space constraints it preserves only the basic color scheme of the initial image and a general outline of a few features thereof.

Here's an example twit that represents Lena: 犭楊谷杌蒝螦界匘玏扝匮俄归晃客猘摈硰划刀萕码摃斢嘁蜁嚎耂澹簜僨砠偑婊內團揕忈義倨襠凁梡岂掂戇耔攋斘眐奡萛狂昸箆亲嬎廙栃兡塅受橯恰应戞优猫僘瑩吱賾卣朸杈腠綍蝘猕屐稱悡詬來噩压罍尕熚帤厥虤嫐虲兙罨縨炘排叁抠堃從弅慌螎熰標宑簫柢橙拃丨蜊缩昔儻舭勵癳冂囤璟彔榕兠摈侑蒖孂埮槃姠璐哠眛嫡琠枀訜苄暬厇廩焛瀻严啘刱垫仔

(Images: original Lena and encoded Lena.)

The code is in a Mercurial repository at bitbucket.org. Check out http://bitbucket.org/tkadlubo/circles.lua
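For readers who want the flavour of the approach without reading the Lua, here is a minimal Python/PIL sketch of the same idea (mine, not Tadeusz's code; strictly speaking it is random-mutation hill climbing over translucent circles, and, as the mis-features list says, it is slow):

# Toy sketch: evolve translucent circles toward a target image.
import random
from PIL import Image, ImageDraw

def render(circles, size):
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img, "RGBA")   # "RGBA" enables translucent fills
    for x, y, r, colour in circles:
        draw.ellipse((x - r, y - r, x + r, y + r), fill=colour)
    return img

def error(a, b):
    # sum of squared per-channel differences
    return sum((pa - pb) ** 2
               for qa, qb in zip(a.getdata(), b.getdata())
               for pa, pb in zip(qa, qb))

def evolve(target, n_circles=20, steps=2000):
    w, h = target.size
    def rand_circle():
        return (random.randrange(w), random.randrange(h),
                random.randint(2, max(3, w // 3)),
                (random.randint(0, 255), random.randint(0, 255),
                 random.randint(0, 255), 128))
    best = [rand_circle() for _ in range(n_circles)]
    best_err = error(render(best, target.size), target)
    for _ in range(steps):
        cand = list(best)
        cand[random.randrange(n_circles)] = rand_circle()  # mutate one circle
        cand_err = error(render(cand, target.size), target)
        if cand_err < best_err:                            # keep improvements
            best, best_err = cand, cand_err
    return best

# usage (hypothetical file name): evolve(Image.open("lena.ppm").convert("RGB"))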

Tadeusz A. Kadłubowski
  • 2
    Awesome! Creates neat, artistic looking images. I'm glad people are still working on this; it's been loads of fun to see all of the different approaches. – Brian Campbell Aug 22 '10 at 19:21
  • 1
    I'd like to see this used as like a transparent overlay on the original, giving something like the bokeh effect. – Nick Radford Jul 19 '11 at 22:34
19

The following is my approach to the problem, and I must admit that this was quite an interesting project to work on; it is definitely outside of my normal realm of work and has given me something new to learn about.

The basic idea behind mine is as follows (a rough sketch of the first two steps follows the list):

  1. Down-sample the image to gray-scale, such that there are a total of 16 different shades
  2. Perform RLE on the image
  3. Pack the results into UTF-16 characters
  4. Perform RLE on the packed results to remove any duplication of characters
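As a rough sketch of the first two steps (my own Python illustration, not the actual C# code):

# Sketch: quantise 8-bit greys to 16 shades, then run-length encode.
def rle(values):
    runs, prev, count = [], None, 0
    for v in values:
        if v == prev:
            count += 1
        else:
            if prev is not None:
                runs.append((prev, count))
            prev, count = v, 1
    if prev is not None:
        runs.append((prev, count))
    return runs

pixels = [0, 8, 130, 130, 131, 255, 255, 255]  # hypothetical 8-bit greys
shades = [p // 16 for p in pixels]             # quantise to 16 shades, 0..15
assert rle(shades) == [(0, 2), (8, 3), (15, 3)]

Each (shade, count) pair is then what steps 3 and 4 pack into characters and compress further.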

It turns out that this does work, but only to a limited extent as you can see from the sample images below. In terms of output, what follows is a sample tweet, specifically for the Lena image shown in the samples.

乤乤万乐唂伂倂倁企儂2企倁3企倁2企伂8企伂3企伂5企倂倃伂倁3企儁企2伂倃5企倁3企倃4企倂企倁企伂2企伂5企倁企伂쥹皗鞹鐾륶䦽阹럆䧜椿籫릹靭욶옷뎷歩㰷歉䴗鑹㞳鞷㬼獴鏙돗鍴祳㭾뤶殞焻�乹Ꮛ靆䍼

As you can see, I did try and constrain the character set a bit; however, I ran into issues doing this when storing the image color data. Also, this encoding scheme tends to waste a bunch of bits of data that could be used for additional image information.

In terms of run times, for small images the code is extremely fast, about 55ms for the sample images provided, but the time does increase with larger images. For the 512x512 Lena reference image the running time was 1182ms. I should note that the odds are pretty good that the code itself isn't very optimized for performance (e.g. everything is worked with as a Bitmap) so the times could go down a bit after some refactoring.

Please feel free to offer me any suggestions on what I could have done better or what might be wrong with the code. The full listing of run times and sample output can be found at the following location: http://code-zen.info/twitterimage/

Update One

I've updated the RLE code used when compressing the tweet string to do a basic look-back and, where a match is found, use that for the output. This only works for the number-value pairs, but it does save a couple of characters of data. The running time is more or less the same, as is the image quality, but the tweets tend to be a bit smaller. I will update the chart on the website as I complete the testing. What follows is one of the example tweet strings, again for the small version of Lena:

乤乤万乐唂伂倂倁企儂2企倁3企倁ウ伂8企伂エ伂5企倂倃伂倁グ儁企2伂倃ガ倁ジ倃4企倂企倁企伂ツ伂ス倁企伂쥹皗鞹鐾륶䦽阹럆䧜椿籫릹靭욶옷뎷歩㰷歉䴗鑹㞳鞷㬼獴鏙돗鍴祳㭾뤶殞焻�乹Ꮛ靆䍼

Update Two

Another small update: I modified the code to pack the color shades into groups of three as opposed to four. This uses some more space, but unless I'm missing something it should mean that "odd" characters no longer appear where the color data is. Also, I updated the compression a bit more so that it can now act upon the entire string, as opposed to just the color count block. I'm still testing the run times, but they appear to be nominally improved; however, the image quality is still the same. What follows is the newest version of the Lena tweet:

2乤万乐唂伂倂倁企儂2企倁3企倁ウ伂8企伂エ伂5企倂倃伂倁グ儁企2伂倃ガ倁ジ倃4企倂企倁企伂ツ伂ス倁企伂坹坼坶坻刾啩容力吹婩媷劝圿咶坼妛啭奩嗆婣冷咛啫凃奉佶坍均喳女媗决兴宗喓夽兴唹屹冷圶埫奫唓坤喝奎似商嗉乃

StackOverflow logo: http://code-zen.info/twitterimage/images/stackoverflow-logo.bmp
Cornell Box: http://code-zen.info/twitterimage/images/cornell-box.bmp
Lena: http://code-zen.info/twitterimage/images/lena.bmp
Mona Lisa: http://code-zen.info/twitterimage/images/mona-lisa.bmp

rjzii
  • 1
    Great, thanks for the entry! Grayscale actually works fairly well for most of these, though Lena is a bit hard to make out. I was looking for your source but got a 404; could you make sure it's up there? – Brian Campbell May 30 '09 at 18:22
  • Double check it now, I was updating the site so you might have caught me between updates. – rjzii May 30 '09 at 18:35
  • Yep, I can download it now. Now of course I need to figure out if I can get Mono to compile it. – Brian Campbell May 30 '09 at 19:46
  • Yep! Works under Mono, I compiled with "gmcs -r System.Drawing TwitterImage.cs Program.cs" and run with "mono TwitterImage.exe encode lena.png lena.txt" – Brian Campbell May 30 '09 at 20:17
  • Cool! I did double check to make sure the libraries I was using were listed for Mono, but I haven't actually worked with Mono yet so I wasn't sure if it would work or not. – rjzii May 30 '09 at 21:23
  • Sample images are not visible – Jakub Narębski Jun 18 '09 at 07:13
15

This genetic algorithm that Roger Alsing wrote has a good compression ratio, at the expense of long compression times. The resulting vector of vertices could be further compressed using a lossy or lossless algorithm.

http://rogeralsing.com/2008/12/07/genetic-programming-evolution-of-mona-lisa/

Would be an interesting program to implement, but I'll give it a miss.

CiscoIPPhone
12

In the original challenge the size limit is defined as what Twitter still allows you to send if you paste your text in their textbox and press "update". As some people correctly noticed this is different from what you could send as a SMS text message from your mobile.

What is not explicitly mentioned (but was my personal rule) is that you should be able to select the tweeted message in your browser, copy it to the clipboard and paste it into a text input field of your decoder so it can display it. Of course you are also free to save the message as a text file and read it back in, or write a tool which accesses the Twitter API and filters out any message that looks like an image code (special markers anyone? wink wink). But the rule is that the message has to have gone through Twitter before you are allowed to decode it.

Good luck with the 350 bytes - I doubt that you will be able to make use of them.

Quasimondo
  • 1
    Yes, I've added a scoring rubric that indicates that tighter restrictions on the character set are preferred, but not required. I would like to have a rule that requires that messages pass through Twitter unscathed, but that would take a lot of trial and error to figure out the precise details of what works, and I wanted to leave some leeway to allow for creative uses of the code space. So, the only requirement in my challenge is 140 valid Unicode characters. By the way, thanks for stopping by! I really like your solution, and want to see if any of the kibitzers can actually improve on it. – Brian Campbell May 22 '09 at 21:01
12

Posting a monochrome or greyscale image should increase the amount of detail that can be encoded into that space, since you don't care about colour.

Possibly augmenting the challenge: upload three images which, when recombined, give you a full-colour image, while still maintaining a monochrome version in each separate image.

Add some compression to the above and it could start looking viable...

Nice!!! Now you guys have piqued my interest. No work will be done for the rest of the day...

Gineer
9

Regarding the encoding/decoding part of this challenge: base16b.org is my attempt to specify a standard method for safely and efficiently encoding binary data in the higher Unicode planes.

Some features :

  • Uses only Unicode's Private Use Areas
  • Encodes up to 17 bits per character; nearly three times more efficient than Base64
  • A reference JavaScript implementation of encode/decode is provided
  • Some sample encodings are included, including Twitter and Wordpress

Sorry, this answer comes way too late for the original competition. I started the project independently of this post, which I discovered half-way into it.

8

The compression here is good:

http://www.intuac.com/userport/john/apt/

http://img86.imageshack.us/img86/4169/imagey.jpg

I used the following batch file:

capt mona-lisa-large.pnm out.cc 20
dapt out.cc image.pnm
Pause

The resulting filesize is 559 bytes.

kay
8

The idea of storing a bunch of reference images is interesting. Would it be so wrong to store say 25Mb of sample images, and have the encoder try and compose an image using bits of those? With such a minuscule pipe, the machinery at either end is by necessity going to be much greater than the volume of data passing through, so what's the difference between 25Mb of code, and 1Mb of code and 24Mb of image data?

(note the original guidelines ruled out restricting the input to images already in the library - I'm not suggesting that).

  • 1
    That would be fine, as long as you have a fixed, finite amount of data at either endpoint. Of course, you would need to demonstrate that it works with images that are not in the training set, just like any statistical natural language process problem. I'd love to see something that takes a statistical approach to image encoding. – Brian Campbell May 27 '09 at 01:53
  • 16
    I, for one, would love to see Mona Lisa redone using only Boba Fett fan art as source. – Nosredna May 27 '09 at 02:20
  • I agree - the photomosaic approach seems to be within the rules & would be extremely interesting to see someone take a stab at. – An̲̳̳drew May 27 '09 at 03:33
8

Stupid idea, but sha1(my_image) would result in a "perfect" representation of any image (ignoring collisions). The obvious problem is that the decoding process requires inordinate amounts of brute-forcing.

1-bit monochrome would be a bit easier. Each pixel becomes a 1 or 0, so you would have 10,000 bits of data for a 100*100 pixel image. Since a SHA1 hash is 40 characters, we can fit three into one message, and only have to brute-force 2 sets of 3333 bits and one set of 3334 (although even that is probably still inordinate).

It's not exactly practical. Even with the fixed-length 1-bit 100*100px image there are, assuming I'm not miscalculating, 49995000 combinations, or 16661667 when split into three.

def fact(maxu):
    # factorial of maxu
    ttl = 1
    for i in range(1, maxu + 1):
        ttl = ttl * i
    return ttl

def combi(setsize, length):
    # number of ways to choose setsize items out of length
    return fact(length) / (fact(setsize) * fact(length - setsize))

print (combi(2, 3333) * 2) + combi(2, 3334)
# 16661667L
print combi(2, 10000)
# 49995000L
dbr
  • 10
    The issue with sha1(my_image) is that if you spent your time brute forcing it, you'd probably find man many collisions before you found the real image; and of course brute forcing sha1 is pretty much computationally infeasible. – Brian Campbell May 30 '09 at 23:26
  • 5
    Even better than SHA1 compression: my "flickr" compression algorithm! Step 1: upload image to flickr. Step 2: post a link to it on twitter. Tadda! Only 15 bytes uses! – niXar Jun 19 '09 at 16:29
  • 2
    niXar: Nope, rule 3.4: "The decoding process may have no access to any other output of the encoding process other than the output specified above; that is, you can't upload the image somewhere and output the URL for the decoding process to download, or anything silly like that." – dbr Jun 19 '09 at 22:41
  • 6
    I know, I was being sarcastic. – niXar Jun 26 '09 at 16:21
0

Idea: Could you use a font as a palette? Try to break an image into a series of vectors, describing them with a combination of vector sets (each character is essentially a set of vectors). This is using the font as a dictionary. I could for instance use an 'l' for a vertical line and a '-' for a horizontal line. Just an idea.