1

I am doing the following programming exercise: Return String As Sorted Blocks. The statement is:

Task

You will receive a string consisting of lowercase letters, uppercase letters and digits as input. Your task is to return this string as blocks separated by dashes ("-"). The elements of a block should be sorted with respect to the hierarchy listed below, and each block cannot contain multiple instances of the same character.

The hierarchy is:

lowercase letters (a - z), in alphabetic order
uppercase letters (A - Z), in alphabetic order
digits (0 - 9), in ascending order

Examples

"21AxBz" -> "xzAB12" - since input does not contain repeating characters, you only need 1 block
"abacad" -> "abcd-a-a" - character "a" repeats 3 times, thus 3 blocks are needed
"" -> "" - an empty input should result in an empty output

Good luck!

I have written the following code:

def blocks(s):
    print("s: "+s)
    lowers = []
    uppers = []
    digits = []
    for c in s:
        if c.islower():
            lowers.append(c)
        if c.isupper():
            uppers.append(c)
        if c.isdigit():
            digits.append(c)
    lowers.sort()
    uppers.sort()
    digits.sort()
    print("lowers: ")
    print(lowers)
    print("uppers: ")
    print(uppers)
    print("digits: ")
    print(digits)
    result = ""
    sorted = lowers+uppers+digits
    removedLetters = 0
    needsNextBlock = False
    nextBlock = "-"
    while len(sorted) > 0:
        for i, c in enumerate(sorted):
            print(i, c)
            print("result: ")
            print(result)
            if c not in result:
                result += c
                print("we want to delete: ")
                print(c)
                sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                removedLetters += 1
                print("new sorted: ")
                print(sorted)
            else:
                if c not in nextBlock:
                    needsNextBlock = True
                    nextBlock += c
                    sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                    removedLetters += 1

                    print("new sorted: ")
                    print(sorted)

        if needsNextBlock:
            result += nextBlock
        needsNextBlock = False
        nextBlock = "-"
    return result

And there is a bug, because of when we have the following test:

Test.assert_equals(blocks("abacad"), "abcd-a-a")

The trace is:

s: abacad
lowers: 
['a', 'a', 'a', 'b', 'c', 'd']
uppers: 
[]
digits: 
[]
0 a
result: 

we want to delete: 
a
new sorted: 
['a', 'a', 'b', 'c', 'd']
1 a
result: 
a
new sorted: 
['a', 'b', 'c', 'd']
2 a
result: 
a
3 b
result: 
a
we want to delete: 
b
new sorted: 
['a', 'c', 'd']
4 c
result: 
ab
we want to delete: 
c
new sorted: 
['a', 'd']
5 d
result: 
abc
we want to delete: 
d
new sorted: 
['a']
0 a
result: 
abcd-a
new sorted: 
['a']
0 a
result: 
abcd-a-a
new sorted: 
['a']
0 a
result: 
abcd-a-a-a
new sorted: 
['a']
0 a
result: 
abcd-a-a-a-a
new sorted: 
['a']
0 a
(infinite loop)

So as we see the difficulty is created when we execute:

sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
removedLetters += 1

Because we have previously passed over the repeated letter, in this case 'a', but we have not counted it, so the calculus for the new sorted substring keeps being the same.

I tried a naive approach:

def blocks(s):
    print("\n\n\ns: "+s)
    lowers = []
    uppers = []
    digits = []
    for c in s:
        if c.islower():
            lowers.append(c)
        if c.isupper():
            uppers.append(c)
        if c.isdigit():
            digits.append(c)
    lowers.sort()
    uppers.sort()
    digits.sort()
    print("lowers: ")
    print(lowers)
    print("uppers: ")
    print(uppers)
    print("digits: ")
    print(digits)
    result = ""
    sorted = lowers+uppers+digits
    removedLetters = 0
    needsNextBlock = False
    nextBlock = "-"
    while len(sorted) > 0:
        initialIterationLength = len(sorted)
        for i, c in enumerate(sorted):
            print(i, c)
            print("result: ")
            print(result)
            if c not in result:
                result += c
                print("we want to delete: ")
                print(c)
                sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                removedLetters += 1
                print("new sorted: ")
                print(sorted)
            else:
                if c not in nextBlock:
                    needsNextBlock = True
                    nextBlock += c
                    sorted = sorted[0:i-removedLetters] + sorted[i+1-removedLetters:]
                    removedLetters += 1
                    if initialIterationLength == len(sorted):
                        sorted = []
                    print("new sorted: ")
                    print(sorted)

        if needsNextBlock:
            result += nextBlock
        needsNextBlock = False
        nextBlock = "-"
    return result

As you see, I added when we start the while loop the sentence: initialIterationLength = len(sorted) and inside the loop, in the if condition:

if initialIterationLength == len(sorted):
    sorted = []

It does work for the test being discussed, however for larger inputs it won't work.

For example when input is:

ZrXx2VpVJMgPs54SwwxSophZEWvwKUxzqNxaxlgY0T

Our result is:

aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-wx

Expected is:

aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-wx-x-x

I think there should be a better algorithm.

I have read:

How could we sort non repeated characters and then repeated ones?

Zoe
  • 23,712
  • 16
  • 99
  • 132
Enoy
  • 1,724
  • 3
  • 19
  • 39

3 Answers3

1

You could use a Counter to keep track of the iterations you need according to the repeated digits.

import string
from collections import Counter

ORDER = {s:i for i, s in enumerate(string.ascii_letters + string.digits)}

def my_sorted(s):

    c = Counter(s)
    res = []
    it = 1

    to_sort = set(c)
    while len(to_sort) > 0:
        res.append(sorted(to_sort ,key=lambda x:ORDER[x]))
        to_sort = [k for k in c if c[k] > it]
        it+=1

    return "-".join(["".join(l) for l in res])

Example:

>>> s="ZrXx2VpVJMgPs54SwwxSophZEWvwKUxzqNxaxlgY0T"
>>> my_sorted(s)
aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-gpwxSVZ-wx-x-x
abc
  • 10,273
  • 2
  • 21
  • 42
0

Stealing from @abc's answer...

import string
from collections import Counter

ORDER = {s:i for i, s in enumerate(string.ascii_letters + string.digits)}

def my_sorted(s):

    c = Counter(s)
    res = []

    while c:
        res.append(''.join(sorted(c, key=ORDER.get)))
        c -= Counter(set(c))

    return "-".join(res)

Example:

>>> s = "ZrXx2VpVJMgPs54SwwxSophZEWvwKUxzqNxaxlgY0T"
>>> my_sorted(s)
'aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-wx-x-x'
Kelly Bundy
  • 5,629
  • 1
  • 5
  • 31
0

Still stealing @abc's setup, but completely different solution. I append the needed number of dashes and then sort everything based on 1) the how-many-eth occurrence a character is and 2) the aA0- order.

import string
from collections import Counter

ORDER = {s:i for i, s in enumerate(string.ascii_letters + string.digits + '-')}

def my_sorted(s):
    return ''.join(sorted(s + '-' * (max(Counter(s).values()) - 1),
                          key=lambda c, ctr=Counter(): (ctr.update(c) or ctr[c], ORDER[c])))

Example:

>>> s = "ZrXx2VpVJMgPs54SwwxSophZEWvwKUxzqNxaxlgY0T"
>>> my_sorted(s)
'aghlopqrsvwxzEJKMNPSTUVWXYZ0245-gpwxSVZ-wx-x-x'
Kelly Bundy
  • 5,629
  • 1
  • 5
  • 31