69

The common method to store images in a database is to convert the image to base64 data before storing the data. This process will increase the size by 33%. Alternatively it is possible to directly store the image as a BLOB; for example:

$image = new Imagick("image.jpg");
$data = $image->getImageBlob();
$data = $mysqli->real_escape_string($data);
$mysqli->query("INSERT INTO images (data) VALUES ('$data')");

and then display the image with

<img src="data:image/jpeg;base64,' .  base64_encode($data)  . '" />

With the latter method, we save 1/3 storage space. Why is it more common to store images as base64 in MySQL databases?

UPDATE: There are many debates about advantages and disadvantages of storing images in databases, and most people believe it is not a practical approach. Anyway, here I assume we store image in database, and discussing the best method to do so.

PdC
  • 1,438
  • 10
  • 25
Googlebot
  • 13,096
  • 38
  • 113
  • 210
  • 8
    Save the data to a file and store only the file location or url in the database – Fredrik Mar 15 '12 at 15:17
  • 1
    @Fredrik If deciding to store data into file, why as base64 data? We can simply save the original image file. – Googlebot Mar 15 '12 at 15:21
  • 1
    I figured you were sending it from device like iPhone or something else. Then you don't want to send raw data but base64 string in JSON or something instead. – Fredrik Mar 15 '12 at 15:30
  • 1
    Ask anyone, never store that kind of data directly in a database. Common sense. – Fredrik Mar 15 '12 at 15:31
  • @Fredrik No I am talking about a website. Even in that case, it would be better to convert the base64 data to image BLOB and save as an oridinary image file by ImageMagick. – Googlebot Mar 15 '12 at 15:33
  • Store it base64 encoded in a character field using UTF8 and see how many bytes are wasted. – Marcus Adams Mar 15 '12 at 16:39
  • if you encode the data and put it in db encoded, shouldn't you *decode* it when using it in ` – hummingBird Apr 13 '17 at 13:38

5 Answers5

97

I contend that images (files) are NOT usually stored in a database base64 encoded. Instead, they are stored in their raw binary form in a binary (blob) column (or file).

Base64 is only used as a transport mechanism, not for storage. For example, you can embed a base64 encoded image into an XML document or an email message.

Base64 is also stream friendly. You can encode and decode on the fly (without knowing the total size of the data).

While base64 is fine for transport, do not store your images base64 encoded.

Base64 provides no checksum or anything of any value for storage.

Base64 encoding increases the storage requirement by 33% over a raw binary format. It also increases the amount of data that must be read from persistent storage, which is still generally the largest bottleneck in computing. It's generally faster to read less bytes and encode them on the fly. Only if your system is CPU bound instead of IO bound, and you're regularly outputting the image in base64, then consider storing in base64.

Inline images (base64 encoded images embedded in HTML) are a bottleneck themselves--you're sending 33% more data over the wire, and doing it serially (the web browser has to wait on the inline images before it can finish downloading the page HTML).

If you still wish to store images base64 encoded, please, whatever you do, make sure you don't store base64 encoded data in a UTF8 column then index it.

Marcus Adams
  • 49,523
  • 8
  • 81
  • 132
  • 2
    good clarification; but if you search, you'll find many tutorial for storing as base64 and a few for binary storage. – Googlebot Mar 15 '12 at 16:22
  • 5
    Base64 provides no checksum or anything of any value for storage. If you provide a link with an argument for its use as storage, I'll debunk it for you. :) – Marcus Adams Mar 15 '12 at 16:28
  • 2
    I don't get how this is not the accepted answer, storing images on an external CDNs/services that are optimised for storing images like S3 also gives you other advantages like caching, etc.. instead of having to retrieve a large string for each document in your database every time you run a query – Khaled Osman Sep 05 '19 at 15:20
  • You don't always want to store files on CDNs, maybe they are private internally to a company... storing binary in databases do have advantages, such as convenience for the developer.. we store blobs for caching binary pdf reports temporarily (48 hours) so database storage space is not an issue.. it all depends on the use case. No right and wrong. – Spock Dec 05 '19 at 12:10
  • Files can be both on a CDN and private internal files to a company. CloudFront for instance (since S3 was mentioned), allows for private CDNs. – Jacrys May 01 '20 at 13:55
74
  • Pro base64: the encoded representation you handle is a pretty safe string. It contains neither control chars nor quotes. The latter point helps against SQL injection attempts. I wouldn't expect any problem to just add the value to a "hand coded" SQL query string.

  • Pro BLOB: the database manager software knows what type of data it has to expect. It can optimize for that. If you'd store base64 in a TEXT field it might try to build some index or other data structure for it, which would be really nice and useful for "real" text data but pointless and a waste of time and space for image data. And it is the smaller, as in number of bytes, representation.

Adrien Brunelat
  • 3,784
  • 4
  • 27
  • 41
user1252434
  • 1,963
  • 1
  • 14
  • 21
  • 2
    very useful comparison. My worry is mainly about security. I am not sure if saving binary can open any security hole for SQL injection. – Googlebot Mar 15 '12 at 16:24
  • That should depend on "proper" and "safe" handling of the data to the database. As I'm not familiar with PHP, what you seem to use, I can't give you tips on that. In java the tools I use (Hibernate / JPA) take care of that for me. :) – user1252434 Mar 15 '12 at 16:28
  • 3
    Escaping the input protects against SQL injection attacks, not the storage mechanism. I admit that I've never needed to enter an image by hand into a query. – Marcus Adams Mar 15 '12 at 16:43
  • Right, it doesn't matter what data lies in the database. The transport, especially storing data, needs to be protected against injections. With hand coded string I mean code like `db->query("INSERT INTO Images VALUES (".id.",'".data."');");`, which can be often found in PHP code. I strongly prefer to use placeholders and pass the data separately, knowing that the used library takes care of proper escaping. Usually in the form of prepared statements. Or use a higher level ORM toolkit that takes care for me. – user1252434 Mar 15 '12 at 17:01
  • 2
    The "safe string" argument might be a reason to use base64 for transport, but not for storage. – Marcus Adams Apr 13 '17 at 13:41
  • 1
    Images should never be stored directly in the database imo. better to use a database/service that is optimised for storing these kind of files like Amazon S3, that can provide their own CDN & caching. that way you dont have to worry about slowing your database performance by retrieving such long strings every time u do a query or worry about storage/memory of your db – Khaled Osman Sep 05 '19 at 15:23
5

Just want to give one example why we decided to store image in DB not files or CDN, it is storing images of signatures.

We have tried to do so via CDN, cloud storage, files, and finally decided to store in DB and happy about the decision as it was proven us right in our subsequent events when we moved, upgraded our scripts and migrated the sites serveral times.

For my case, we wanted the signatures to be with the records that belong to the author of documents.

Storing in files format risks missing them or deleted by accident.

We store it as a blob binary format in MySQL, and later as based64 encoded image in a text field. The decision to change to based64 was due to smaller size as result for some reason, and faster loading. Blob was slowing down the page load for some reason.

In our case, this solution to store signature images in DB, (whether as blob or based64), was driven by:

  1. Most signature images are very small.
  2. We don't need to index the signature images stored in DB.
  3. Index is done on the primary key.
  4. We may have to move or switch servers, moving physical images files to different servers, may cause the images not found due to links change.
  5. it is embarrassed to ask the author to re-sign their signatures.
  6. it is more secured saving in the DB as compared to exposing it as files which can be downloaded if security is compromised. Storing in DB allows us better control over its access.
  7. any future migrations, change of web design, hosting, servers, we have zero worries about reconcilating the signature file names against the physical files, it is all in the DB!

AC

ACPan
  • 51
  • 1
  • 2
1

I recommend looking at modern databases like NoSQL and also I agree with user1252434's post. For instance I am storing a few < 500kb PNGs as base64 on my Mongo db with binary set to true with no performance hit at all. Mongo can be used to store large files like 10MB videos and that can offer huge time saving advantages in metadata searches for those videos, see storing large objects and files in mongodb.

msmfsd
  • 641
  • 6
  • 19
  • From what I'm seing in the bson spec, MongoDb then stores byte arrays not base64-encoded, but as raw bytes, prefixed with their length. – Volker Aug 09 '16 at 13:22
0

I came across this really great talk by Facebook engineers about the Efficient Storage of Billions of Photos in a database

user3495363
  • 219
  • 1
  • 10