
What is the schema of an image file URL at Tumblr? (I don't mean HTTP.) So far I've only figured out that the domain of the servers where images are stored is <n>.media.tumblr.com, where n is a number between 1 and 31, and that the name of the image file is prefixed with "tumblr_".

I'm asking because I want to find URLs that refer to the same image.

EDIT: I'm also processing URLs from other sources, not only Tumblr.

TylerH
Jimmy T.

2 Answers


Overview

When you upload an image to Tumblr, multiple sizes (of the same image) are generated and stored across their network.

Once uploaded, you can use template tags to request this image at the following sizes: 75, 100, 250, 400, 500 and 1280.

It's worth mentioning the following:

  1. The value in the template tag is the maximum size the requested image will be. Example: A 400 version of an image could be anywhere between 251px and 400px wide / high.
  2. There may not be a high res or 1280 version of an image available. If the original image is 500px or less, a 1280 version isn't generated.
  3. Photosets don't produce a 100 version.
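As a rough illustration of the rules above, the sketch below lists which size suffixes one might expect to exist for an upload. The function name `expected_sizes` and its parameters are hypothetical, and treating "500px or less" as a limit on the longest side is an assumption; only the size list and the two exceptions come from the notes above.

```python
from typing import List

# Sizes that can be requested through template tags, per the answer.
TEMPLATE_SIZES = (75, 100, 250, 400, 500, 1280)


def expected_sizes(original_longest_side: int, in_photoset: bool = False) -> List[int]:
    """Return the size suffixes that are likely to exist for an upload.

    Assumption: "500px or less" refers to the longest side of the original.
    """
    sizes = []
    for size in TEMPLATE_SIZES:
        if size == 100 and in_photoset:
            continue  # photosets don't produce a 100 version
        if size == 1280 and original_longest_side <= 500:
            continue  # no 1280 version for small originals
        sizes.append(size)
    return sizes
```

This only encodes the stated rules; it says nothing about the actual pixel dimensions of each version, which can be smaller than the requested maximum.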

Image URL

The image URL will take one of the two forms below. The first form seems to be associated with images uploaded more than 6 months ago (this is a guess); the second seems to be used for newer images:

http://36.media.tumblr.com/tumblr_o4qxa0n2BP1r6ec7zo1_500.jpg

or

http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_500.jpg

URL Schema

This can be split into three parts: two that vary (1 and 3) and one that stays constant across sizes (2).

  1. http://36
  2. .media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1
  3. _500.jpg

  1. This is the server number; it can differ for each image size. AFAIK there is no guarantee that a given image size will be available on all servers. @Ally mentioned in the comments that you can remove this part from the URL and the image will still be found, although comments from April 2015 onward report that the server number is required again.
  2. This is the Tumblr domain, directory (if applicable) and partial file name. This will be the same for all sizes.
  3. This is the requested size (which matches the template tag) and file extension.
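The three-part split can be sketched in Python as below. This is a minimal sketch, not an official parser: the names `TumblrImage`, `parse_tumblr_image_url` and `same_image` are made up for illustration, the regular expressions are inferred only from the example URLs on this page, and the optional `_r1`/`_r2` revision suffix is taken from a comment further down. Comparing the partial file name to decide whether two URLs point at the same image is an assumption based on part 2 being the same for all sizes.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class TumblrImage(NamedTuple):
    server: Optional[str]     # part 1: server number, if present
    directory: Optional[str]  # part 2: hex directory, if present
    name: str                 # part 2: partial file name, e.g. tumblr_..._o1
    size: str                 # part 3: requested size
    extension: str            # part 3: file extension


# Assumed patterns, derived from the sample URLs only.
_HOST = re.compile(r"^(?:(\d+)\.)?media\.tumblr\.com$")
_PATH = re.compile(
    r"^/(?:([0-9a-f]+)/)?(tumblr_[A-Za-z0-9]+(?:_r\d+)?)_(\d+)\.(\w+)$"
)


def parse_tumblr_image_url(url: str) -> Optional[TumblrImage]:
    """Split a Tumblr image URL into its parts, or return None if it doesn't fit."""
    parts = urlparse(url)
    host = _HOST.match(parts.hostname or "")
    path = _PATH.match(parts.path)
    if host is None or path is None:
        return None
    directory, name, size, extension = path.groups()
    return TumblrImage(host.group(1), directory, name, size, extension)


def same_image(url_a: str, url_b: str) -> bool:
    """Heuristic: two URLs refer to the same image if their file names match."""
    a, b = parse_tumblr_image_url(url_a), parse_tumblr_image_url(url_b)
    return a is not None and b is not None and a.name == b.name
```

URLs from other sources simply fail to parse and return `None`, so they can be handled by whatever fallback comparison is already in place.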

Generating URLs for all sizes available using template tags.

The only foolproof method I have found is to use the corresponding template tags and assign them to a data- attribute.

Example Template Code:

<img src="{PhotoURL-100}" data-250u="{PhotoURL-250}" data-400u="{PhotoURL-400}" data-500u="{PhotoURL-500}" data-1280u="{block:HighRes}{PhotoURL-HighRes}{/block:HighRes}" />

Example Rendered Code:

<img src="http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_100.jpg" data-250u="http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_250.jpg" data-400u="http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_400.jpg" data-500u="http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_500.jpg" data-1280u="http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_1280.jpg" >

With this method, you can be certain you have the correct URLs and you know what sizes are available.

Hacking all size URLs based on just one URL.

Using this information the URL would become:

http://36.media.tumblr.com/83099a60d4e0cbeeb30d90394e222878/tumblr_o4qxa0n2BP1r6ec7zo1_500.jpg

Below is a test to confirm we access all the available sizes:

You still wouldn't know whether the 1280 size has been generated, but it's a step closer. With this method you can replace the size value (part 3) with a new size and you should be able to get the image.
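Swapping out part 3 can be sketched as follows. The helper name `size_variants` is hypothetical, and the resulting URLs are only candidates: nothing here checks that a given size actually exists on the server.

```python
import re

# Sizes listed in the overview of this answer.
KNOWN_SIZES = (75, 100, 250, 400, 500, 1280)

# Matches the trailing "_<size>.<extension>" (part 3 of the URL).
_SIZE_SUFFIX = re.compile(r"_(\d+)(\.\w+)$")


def size_variants(url, sizes=KNOWN_SIZES):
    """Map each size to a candidate URL built by replacing the size suffix."""
    if _SIZE_SUFFIX.search(url) is None:
        raise ValueError("URL does not end in _<size>.<extension>")
    return {
        size: _SIZE_SUFFIX.sub(lambda m, s=size: "_{}{}".format(s, m.group(2)), url)
        for size in sizes
    }
```

Each candidate still has to be requested to find out whether it exists, in particular the 1280 version.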

Community
mikedidthis
  • Thanks for the good explanation. The problem is that I only get the URL of the image. Is something like a reverse-lookup possible? – Jimmy T. May 30 '13 at 14:04
  • It is possible, but it is not fail safe. Originally I was taking the URL and replacing (part 3) with a different size. However it turned out that in some cases that image with the new size wasn't on the same server (part 1) as the previous size. You could probably do something like testing a URL for a response, if 404, increment the server number, rinse and repeat, but it is hacky. Can I ask why you can only get the img URL? – mikedidthis May 30 '13 at 14:30
  • I get the image URLs from different sources, not only Tumblr. Sometimes it's just a file on a web server. Do you think I should force users to give the post URL instead? – Jimmy T. May 30 '13 at 14:34
  • Generally speaking, you don't need the number for the server. `http://media.tumblr.com/...` should work. If you're dead set on using the server number though, I'm not sure how to grab that outside of Tumblr. I'll give it some thought. – Ally May 30 '13 at 14:57
  • Well, you learn something new every day! Thanks @Ally, I will update my answer. – mikedidthis May 30 '13 at 15:09
  • @JimmyT. I think Ally's comment may be what you need. Afaik, getting the post URL is of little help, as the images are detached from the post ID etc. If you can get one URL and strip the server part, you should be able to get all the sizes. – mikedidthis May 30 '13 at 15:18
  • What about that directory? I've never seen that before. – Jimmy T. May 30 '13 at 15:24
  • @JimmyT. sorry you lost me. – mikedidthis May 30 '13 at 15:32
  • Your URLs contain this part: `/c0d47ade54475ccb18a5e35a790f149d/`. All URLs I've processed so far are for images located directly on the server. Did Tumblr change something? A sample URL in the format I mean (a picture I googled just now): http://24.media.tumblr.com/tumblr_mcu2jq7ruq1r84p84o1_500.jpg – Jimmy T. May 30 '13 at 15:36
  • Now I follow. Images I pull directly from Tumblr seem to have a directory. One random Tumblr image: `http://media.tumblr.com/cd2dc02de75f51490ec84a954b73c3d4/tumblr_mniv5aSwOz1rd1n1oo1_250.jpg` (directly from the dashboard). If I remove the directory from the URL above, it fails. Again, as I mentioned, the schema is a mythical beast. Also the example image doesn't have a 1280 size (http://media.tumblr.com/tumblr_mcu2jq7ruq1r84p84o1_1280.jpg) something you would have to test for. Are your images all taken from Google Images? – mikedidthis May 30 '13 at 15:46
  • @mikedidthis No, most are taken from blogs; I just googled to get an example fast. But the pictures are often from reposts, maybe that's the cause of the missing directory? – Jimmy T. May 30 '13 at 15:55
  • @JimmyT. I dug back and I can find an image with no directory: http://24.media.tumblr.com/tumblr_m1c3k5kyCH1ro5vpyo1_1280.jpg (I posted this myself). This is from a photoset. It seems photoset images don't get a directory, whilst single images do? Can you test this theory? – mikedidthis May 30 '13 at 16:06
  • @mikedidthis I tested your theory, but it is not true. I got curious, though, so I went through a big blog and checked the URLs of the images. I came to the conclusion that Tumblr must have changed the system 5 or 6 months ago, because before that time no picture was located in a directory. – Jimmy T. May 30 '13 at 16:26
  • @JimmyT. Yeah, that was my gut feeling, that there was a change. I will update my answer to include the directory info, but I think the safest way is to generate the links via the template tag. If that isn't possible, Ally's solution should fit your needs. Again, thanks for the help on this. – mikedidthis May 30 '13 at 16:29
  • However, it seems that the directory name has no special meaning. – Jimmy T. May 30 '13 at 16:47
  • Attention! I just realized that Tumblr no longer allows part 1 to be used without the server number. This answer just became outdated. – David Mabodo Apr 20 '15 at 13:27
  • @DavidMabodo It seems you may be correct. I will leave it a few days (as Tumblr servers can do funny things). If it's still the case at the end of the week I will update the answer. – mikedidthis Apr 20 '15 at 13:45
  • @mikedidthis Just pinging you: two months later, it is still the same. – David Mabodo Jun 23 '15 at 13:42
  • @DavidMabodo Sorry, I haven't had time to test (I wanted to include the new sizes as well), but it seems the server numbers are now required. I will try and get a 2015 update in at some point. However, feel free to edit the answer if you want to. – mikedidthis Jun 23 '15 at 14:21
  • any way to get the original uploaded file? the resizes lose their EXIF data. – Tom Roggero Dec 29 '15 at 08:27
  • @TomRoggero afaik, no. Once the file is uploaded, the original is no longer available. However, Tumblr provide template tags to display the EXIF data. https://www.tumblr.com/docs/en/custom_themes#photo-posts – mikedidthis Dec 30 '15 at 13:55
  • FYI: as @DavidMabodo said, the server number is still mandatory. Could you refresh the text a bit? It's not critical though. All the rest of the information seems to be valid. – quetzalcoatl Apr 13 '16 at 13:55
  • @quetzalcoatl Done and done. Thanks to everyone for the comments / input, made maintaining this answer easier. – mikedidthis Apr 13 '16 at 14:48
  • Also, a link can have `_r1` or `_r2` before `_.jpg`; deleting this substring sometimes does nothing, sometimes changes the MD5 of the image, and sometimes gives an Nginx 404 error. – Nakilon Apr 09 '17 at 20:29
  • Does it mean that tumblr does not store images greater than 1200px? I would like to download the original sizes... – basZero Aug 04 '17 at 07:43
  • @basZero I don't believe Tumblr stores or provides a link to the original file. – mikedidthis Aug 06 '17 at 12:38
  • This answer is now outdated. New images are following a different schema. – pfdint Nov 03 '19 at 04:34
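The retry idea raised in the comments above (test a URL, and on a 404 try the next server number) could look roughly like the sketch below. Everything here is an assumption made for illustration: `find_on_any_server` is a made-up name, `exists` is a caller-supplied check (for example one that issues an HTTP HEAD request), and the default range of server numbers is arbitrary, since the question mentions 1 to 31 while the sample URLs use 36.

```python
import re
from typing import Callable, Iterable, Optional

# Captures the scheme and the media.tumblr.com host, with or without a server number.
_SERVER = re.compile(r"^(https?://)(?:\d+\.)?(media\.tumblr\.com/)")


def find_on_any_server(
    url: str,
    exists: Callable[[str], bool],
    servers: Iterable[int] = range(1, 101),  # arbitrary upper bound (assumption)
) -> Optional[str]:
    """Return the first candidate URL for which exists() is true, else None."""
    if _SERVER.match(url) is None:
        return None  # not a media.tumblr.com URL
    for number in servers:
        candidate = _SERVER.sub(
            lambda m, n=number: "{}{}.{}".format(m.group(1), n, m.group(2)),
            url,
            count=1,
        )
        if exists(candidate):
            return candidate
    return None
```

Passing the check in as a function keeps the network call out of the helper, so it can be rate limited or cached by the caller. As noted in the comments, this approach is hacky and may send many requests per image.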

Do keep in mind that original files (in their full resolution) are stored with the '_raw' suffix, instead of _1280, _500, _250, etc.

They are usually stored on data.tumblr.com currently (their CDN domain).

The path in the URL scheme is generated from the original (raw) file's SHA1 checksum.

Hernn0
  • Do you have an example of what you're saying? I'm unable to make it work... EDIT: I think it changed [just yesterday](https://greasyfork.org/en/forum/discussion/41099/x)... – mcont Aug 10 '18 at 10:40
  • Yes, unfortunately, since two days ago Tumblr is now denying access to _raw files. One more reason to never use this trash site. – Hernn0 Aug 11 '18 at 12:24