
Yesterday I read the question "Algorithm to create costum Template/Code from String". Because it was not formulated very well, it was downvoted almost instantly. However, in my opinion the question itself was not that bad, so I decided to ask a hopefully better version of it.

OK, I'm wondering how the string encryption of, e.g., the new Spotify Codes works. See the image below:

Spotify Codes

I would be very interested in the extent to which something like this pattern encryption can be implemented in JavaScript.

The Spotify Codes mentioned above are structured as a row that is divided into bars of different sizes.

So let's say there is a row divided into 24 bars, and each bar can have the size '3', '5', '7' or '9'.

 string = 'hello'   -->  pattern = '3,3,5,7,9,3,7,9,9,3,3,5,3,9,5,3,3,7,5,9,3,9,3,9'


What's a good method / easy way to translate a string (let's say 5 characters) into a unique pattern that can afterwards be converted back and read as a string?

This is the code I've developed so far. It uses a key array with 10 different possibilities (--> bar sizes), but I would like to use only 4 different sizes.


Explanation:

I'm converting each character of my string 'hello' to binary and splitting it into groups of at most 3 bits to get something like this: ['001', '110', '0'].

Afterwards I look up each group of the resulting array in my key array below and use the indexes (10 different indexes --> 10 different possibilities) as bar sizes.

But there MUST be a more efficient method to translate a string into a unique pattern. I hope somebody can help me improve my small algorithm. Thanks a million in advance.

var key = ['0', '1', '000','001','010','100','110','101','011','111']


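String.prototype.encode = function() {
  var code = this, result = [], encrypted_string = []
  // Convert each character to its binary char code and split it into groups of up to 3 bits.
  for (var i=0; i<code.length; i++) result.push(code[i].charCodeAt(0).toString(2).match(/.{1,3}/g));
  // Map every bit group to its index in the key array; that index is used as the bar size.
  for (var i=0; i<result.length; i++) for (var j=0; j<result[i].length; j++) encrypted_string.push(key.indexOf(result[i][j]))
  return encrypted_string
}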



var code = 'hello';
console.log(code.encode())
Jonas0000
  • 1. What is up with 10? You do not need the first two entries. 2. If you want to handle Unicode you need to handle all 8 bits of each byte and multiple bytes per character; UTF-8 is probably the best choice. Unicode is needed for most of the world and for emoji. – zaph Nov 13 '17 at 16:03

2 Answers


It appears that you're making the assumption that there is a direct mapping from the string "Coffee" to the graphic that's shown. That assumption is almost certainly incorrect.

First, consider what would happen if there are two different songs called "Coffee." Your proposed algorithm would assign them both the same code. That seems unreasonable. You want the code to uniquely identify the song.

Second, song names can be arbitrarily long. For example, there's a song by Pink Floyd called "Several Species of Small Furry Animals Gathered Together in a Cave and Grooving with a Pict." Your encoding algorithm probably won't be able to fit that into 24 bars. Even if it can, I can always find a longer song title.

Given the letters a-z, there are 26^5 = 11,881,376 possible 5-character strings. If you just want to uniquely encode all of them, you can do that with just 24 bits (2^23 = 8,388,608 is too few, 2^24 = 16,777,216 is enough). Just treat the string as a base-26 number and do the conversion.
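
As a rough illustration of the base-26 idea, here is a minimal Python sketch (the logic ports directly to JavaScript). It assumes the input is exactly 5 lowercase letters a-z and reuses the four bar sizes 3, 5, 7 and 9 from the question. Since 26^5 is below 2^24 and each bar with four sizes carries 2 bits, 12 bars are enough. The names `encode`, `decode`, `BAR_SIZES` and `NUM_BARS` are made up for this example, and this is not how Spotify builds its codes.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: only the letters a-z are allowed
BAR_SIZES = [3, 5, 7, 9]           # the four bar sizes from the question
NUM_BARS = 12                      # 4**12 == 2**24, which is >= 26**5


def encode(text):
    """Read text as a base-26 number and write it as NUM_BARS base-4 bars."""
    number = 0
    for ch in text:
        number = number * 26 + ALPHABET.index(ch)
    bars = []
    for _ in range(NUM_BARS):
        number, digit = divmod(number, 4)
        bars.append(BAR_SIZES[digit])
    return bars[::-1]  # most significant bar first


def decode(bars, length=5):
    """Invert encode(): bars -> base-4 number -> base-26 string of fixed length."""
    number = 0
    for size in bars:
        number = number * 4 + BAR_SIZES.index(size)
    chars = []
    for _ in range(length):
        number, digit = divmod(number, 26)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


pattern = encode("hello")
print(pattern)
print(decode(pattern))
```

Because the length is fixed at 5, leading 'a' characters (which act like leading zeros) survive the round trip. Strings of variable length would need the length encoded as well.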

Most likely, Spotify is assigning a unique number to each song, and then encoding that number. There is no direct mapping between the string "Coffee" and the graphical code you see on your screen.

Jim Mischel

Update: I asked a similar question and it was answered by someone linking the patent for this barcode. To summarize, they use an intermediate look-up-table to link the barcode to the unique Spotify ID.

I have been digging into Spotify Codes a bit to try to understand them.

Spotify has URIs for each song, album, artist, user, playlist, etc. They look something like this:

spotify:playlist:37i9dQZF1DXcBWIGoYBM5M

If you visit Spotify Codes you can generate a code from the URI. The code for the above URI looks like this:

Image of a spotify barcode

As you noted, they encode the information in the heights of each of the bars, in the same way that the United States Postal Service does in their barcodes (see Intelligent Mail barcode).

Each bar in a Spotify Code can have one of 8 different heights. The logo has the maximum height, and the first and the last bars always have the lowest height. In the image above, the maximum height is 96 pixels, and the bars fall into 8 different height bins: [96, 84, 74, 62, 52, 40, 28, 18].
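
To make the bin idea concrete, here is a small Python sketch that maps a measured bar height to a digit 0-7 by picking the nearest of the eight bin heights listed above. The helper name `height_to_digit` and the rescaling to a 96-pixel reference are assumptions for this example only; they are not taken from Spotify.

```python
# Bin heights from the 96-pixel example image, in ascending order,
# so that the list index is the digit (18 px -> 0, 96 px -> 7).
BIN_HEIGHTS = [18, 28, 40, 52, 62, 74, 84, 96]
REFERENCE_MAX = 96


def height_to_digit(height, max_height=REFERENCE_MAX):
    """Return the digit (0-7) whose bin height is closest to the measured height."""
    # Rescale so that images rendered at other sizes can reuse the same bins.
    scaled = height * REFERENCE_MAX / max_height
    return min(range(len(BIN_HEIGHTS)), key=lambda i: abs(BIN_HEIGHTS[i] - scaled))


print([height_to_digit(h) for h in (18, 40, 62, 96)])
```

Picking the nearest bin tolerates a few pixels of measurement error in either direction, which fixed cut-off ratios only do if the thresholds are placed midway between the bins.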

Using this (sort of messy Python) code I can grab the octal sequence from the barcode image:

from skimage import io
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def get_sequence(filename):
    # Load the barcode image and binarize it with Otsu's threshold.
    image = io.imread(filename)
    image = rgb2gray(image)
    b_and_w = image > threshold_otsu(image)
    # Label the connected regions (the logo and each bar) and take their bounding boxes.
    labeled = label(b_and_w)
    bar_dims = [r.bbox for r in regionprops(labeled)]
    # Each bbox is (min_row, min_col, max_row, max_col); sort left to right by min_col.
    bar_dims.sort(key=lambda x: x[1])
    # The leftmost region is the Spotify logo, which has the maximum height.
    spotify_logo = bar_dims[0]
    max_height = spotify_logo[2] - spotify_logo[0]
    sequence = []
    for bar in bar_dims[1:]:
        # Classify each bar by its height relative to the logo.
        height = bar[2] - bar[0]
        ratio = height / max_height
        if ratio < 0.25:
            sequence.append(0)
        elif ratio < 0.33:
            sequence.append(1)
        elif ratio < 0.46:
            sequence.append(2)
        elif ratio < 0.5625:
            sequence.append(3)
        elif ratio < 0.677:
            sequence.append(4)
        elif ratio < 0.8:
            sequence.append(5)
        elif ratio < 0.9:
            sequence.append(6)
        elif ratio < 1.1:
            sequence.append(7)
        else:
            raise ValueError('ratio is too high')
    return sequence

The sequence maps like this: 37i9dQZF1DXcBWIGoYBM5M -> [0, 6, 0, 2, 4, 5, 1, 4, 5, 2, 3, 7, 3, 7, 1, 5, 6, 2, 5, 7, 4, 3, 0]

The odd thing is that the amount of information in the URI and in the Spotify Code do not match. The URI is 22 characters long and uses the characters 0-9, a-z and A-Z. This means 62^22 potential URIs, or about 2.7e39. There are 23 bars in the Spotify Code, but the first and last are always 0, so there are only 21 usable bars. This means 8^21, or about 9.22e18, potential codes. The mapping from URI to code therefore cannot be a direct one-to-one encoding: there are far more possible URIs than codes.
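
The counting argument can be checked with a few lines of Python. The sketch also packs the 21 inner bars of the sequence above into a single integer, which shows that a code carries exactly 63 bits (8^21 = 2^63). The helper name `bars_to_int` is made up for this example.

```python
URI_ALPHABET_SIZE = 62  # 0-9, a-z, A-Z
URI_LENGTH = 22
BAR_LEVELS = 8
USABLE_BARS = 21        # 23 bars minus the fixed first and last bar

uri_space = URI_ALPHABET_SIZE ** URI_LENGTH
code_space = BAR_LEVELS ** USABLE_BARS


def bars_to_int(sequence):
    """Pack the inner bars (octal digits) into one integer below 8**21."""
    value = 0
    for digit in sequence[1:-1]:  # skip the fixed first and last bar
        value = value * 8 + digit
    return value


sequence = [0, 6, 0, 2, 4, 5, 1, 4, 5, 2, 3, 7, 3, 7, 1, 5, 6, 2, 5, 7, 4, 3, 0]
print(f"possible URIs:  {uri_space:.2e}")
print(f"possible codes: {code_space:.2e}")
print(f"URIs per code:  {uri_space / code_space:.2e}")
print(bars_to_int(sequence))
```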

I do not know how they map the URIs to the codes. My guess is that they have a separate database/lookup table that maps codes to URIs. When creating a code, they hash the URI to a code and store the pair for later lookup. When someone scans a code, they check that database and map it back to the URI. Since the vast majority of potential URIs are never assigned, the smaller code space is not a problem in practice.
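
A purely hypothetical Python sketch of that guess: hash the URI, keep 63 bits, write them as 21 octal digits between two fixed 0 bars, and remember the pair in a table. The use of SHA-256, the in-memory dict and the names `make_code` and `lookup` are all assumptions for illustration; nothing here reflects Spotify's actual scheme.

```python
import hashlib

code_to_uri = {}  # hypothetical stand-in for a server-side lookup table


def make_code(uri):
    """Derive a 23-bar code from a URI and record it in the lookup table."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big") >> 1  # keep 63 bits
    digits = []
    for _ in range(21):
        value, digit = divmod(value, 8)
        digits.append(digit)
    code = (0, *reversed(digits), 0)  # first and last bar are always 0
    # Two different URIs could hash to the same code; a real system must handle that.
    if code_to_uri.setdefault(code, uri) != uri:
        raise ValueError("code already assigned to a different URI")
    return list(code)


def lookup(code):
    """Map a scanned code back to its URI via the table."""
    return code_to_uri[tuple(code)]


uri = "spotify:playlist:37i9dQZF1DXcBWIGoYBM5M"
code = make_code(uri)
print(code)
print(lookup(code))
```

The hash cannot be reversed, so the table is what makes decoding possible. That is also why a code that was never generated simply resolves to nothing.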

Peter Boone