84

I'd like my dictionary to be case insensitive.

I have this example code:

text = "practice changing the color"

words = {'color': 'colour',
        'practice': 'practise'}

def replace(words, text):

    keys = words.keys()

    for i in keys:
        text = text.replace(i, words[i])
    return text

text = replace(words, text)

print(text)

Output = practise changing the colour

I'd like another string, "practice changing the Color", (where Color starts with a capital) to also give the same output.

I believe there is a general way to convert to lowercase using mydictionary[key.lower()], but I'm not sure how best to integrate this into my existing code (if that would even be a reasonable, simple approach).

martineau
Kim
  • 4
    See [PEP-455](https://www.python.org/dev/peps/pep-0455/): this is scheduled for standard library inclusion in Python 3.5 (as `collections.TransformDict`, provided the transform is `str.casefold` or similar) – Nick T Jan 12 '15 at 21:50
  • 6
    @NickT This PEP has been rejected. https://www.python.org/dev/peps/pep-0455/#rejection – user1556435 Apr 08 '16 at 15:17

10 Answers

71

The currently approved answer doesn't work for a lot of cases, so it cannot be used as a drop-in dict replacement. Some tricky points in getting a proper dict replacement:

  • overloading all of the methods that involve keys
  • properly handling non-string keys
  • properly handling the constructor of the class

The following should work much better:

class CaseInsensitiveDict(dict):
    @classmethod
    def _k(cls, key):
        return key.lower() if isinstance(key, basestring) else key

    def __init__(self, *args, **kwargs):
        super(CaseInsensitiveDict, self).__init__(*args, **kwargs)
        self._convert_keys()
    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(self.__class__._k(key))
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(self.__class__._k(key), value)
    def __delitem__(self, key):
        return super(CaseInsensitiveDict, self).__delitem__(self.__class__._k(key))
    def __contains__(self, key):
        return super(CaseInsensitiveDict, self).__contains__(self.__class__._k(key))
    def has_key(self, key):
        return super(CaseInsensitiveDict, self).has_key(self.__class__._k(key))
    def pop(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).pop(self.__class__._k(key), *args, **kwargs)
    def get(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).get(self.__class__._k(key), *args, **kwargs)
    def setdefault(self, key, *args, **kwargs):
        return super(CaseInsensitiveDict, self).setdefault(self.__class__._k(key), *args, **kwargs)
    def update(self, E={}, **F):
        super(CaseInsensitiveDict, self).update(self.__class__(E))
        super(CaseInsensitiveDict, self).update(self.__class__(**F))
    def _convert_keys(self):
        for k in list(self.keys()):
            v = super(CaseInsensitiveDict, self).pop(k)
            self.__setitem__(k, v)
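
For Python 3, where `basestring` and `has_key` no longer exist, the same idea might be sketched as follows. This is a minimal port under stated assumptions, not the author's code: it assumes that NFKC normalisation followed by `casefold()` is the folding you want, and it does not cover `copy()` or `fromkeys()`.

```python
import unicodedata


class CaseInsensitiveDict(dict):
    """Python 3 sketch: string keys are stored in folded form."""

    @staticmethod
    def _k(key):
        # Assumption: NFKC + casefold() is the desired notion of "same key".
        if isinstance(key, str):
            return unicodedata.normalize("NFKC", key).casefold()
        return key  # non-string keys are left untouched

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return super().__getitem__(self._k(key))

    def __setitem__(self, key, value):
        super().__setitem__(self._k(key), value)

    def __delitem__(self, key):
        super().__delitem__(self._k(key))

    def __contains__(self, key):
        return super().__contains__(self._k(key))

    def get(self, key, default=None):
        return super().get(self._k(key), default)

    def pop(self, key, *args):
        return super().pop(self._k(key), *args)

    def setdefault(self, key, default=None):
        return super().setdefault(self._k(key), default)

    def update(self, *args, **kwargs):
        # dict(...) accepts a mapping, an iterable of pairs, and keywords.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = CaseInsensitiveDict({'Color': 'colour'}, Practice='practise')
print(d['COLOR'], d.get('practice'), 'PRACTICE' in d)
```

Routing the constructor and `update` through `__setitem__` keeps the folding logic in one place, so changing `_k` changes the behaviour everywhere.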
m000
  • 3
    This is great, but there is one minor problem. The super definition of `update` is `update(self, E=None, **F)`, meaning `E` is optional. You've re-defined it to make `E` required. Add in the `=None` and this will be perfect. – Nick Williams Nov 16 '15 at 14:44
  • Nice catch @NickWilliams. Thanks! – m000 Nov 16 '15 at 22:58
  • 23
    Python is easy, they said. Python is fun, they said. – rr- Nov 29 '15 at 20:59
  • 2
    @rr-. To be totally fair, imagine doing this in say C. – Mad Physicist Jul 05 '16 at 14:38
  • 1
    Nitpick, but this does not have proper support for unicode normalization. – Mad Physicist Jul 05 '16 at 14:39
  • 1
    @MadPhysicist Not sure, but it should be straightforward to add. Just modify `_k()` to also normalise as desired. – m000 Jul 05 '16 at 18:48
  • @m000. Agreed. I upvoted because your answer is not only the most complete, but also the most modular exactly for the reason you stated. – Mad Physicist Jul 05 '16 at 18:50
  • return super(CaseInsensitiveDict, self).setdefault(self.__class__._k(key), *args, **kwargs) TypeError: super(type, obj): obj must be an instance or subtype of type – Denny Weinberg Sep 27 '16 at 09:05
  • 6
    In python 3 the abstract type `basestring` was removed. `str` can be used as a replacement. – Jan Schatz Jan 19 '18 at 09:47
  • Great answer, and it still works in Python 3, as long as you change basestring to str. I have modified it slightly to support tuples. If a tuple is used as a key, it will match the tuple regardless of case of the items in the tuple (even if nested). It's too long to post as a comment. Basically, if it's a tuple, it calls a method that converts all strings in the tuple to lowercase recursively. – Troy Hoffman Dec 26 '18 at 20:08
  • for those who need python2/3 compatibility with respect to basestring - you can see this answer as a basis for modifying the `_k` method: https://stackoverflow.com/a/22679982 – ara.hayrabedian May 22 '19 at 12:43
  • what about encoding every key to its base32 representation? Will that ensure consistency across any encoding the string is in? – Itamar Jun 20 '19 at 18:01
  • Very similar to what I did here: https://stackoverflow.com/a/43457369/281545 - main (important) difference is that I retain case info, but this needs a dedicated string class – Mr_and_Mrs_D Oct 02 '20 at 13:12
  • ideally, a true case-insensitive dictionary would also be case-preserving of the last setter. ie: this is collation only, not data-loss – Erik Aronesty Mar 03 '21 at 22:32
60

Just for the record, I found an awesome implementation in Requests:

https://github.com/kennethreitz/requests/blob/v1.2.3/requests/structures.py#L37

santiagobasulto
44

If I understand you correctly, you want to look up dictionary keys case-insensitively. One way would be to subclass dict and override the setter and getter:

class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(key.lower())
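
To see where this two-method version stops working, here is a small sketch that exercises the class above. The variable names `d` and `d2` are only for illustration; the behaviour shown is that of CPython's built-in `dict`, whose other methods do not route through the overridden `__getitem__` and `__setitem__`.

```python
class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(key.lower())


d = CaseInsensitiveDict()
d['Color'] = 'colour'
print(d['COLOR'])      # colour  (item access uses the overrides)
print('COLOR' in d)    # False   (__contains__ is not overridden)
print(d.get('COLOR'))  # None    (get() does not call __getitem__)

# The constructor does not call __setitem__, so the key keeps its case
# and can no longer be reached through the lowercasing __getitem__.
d2 = CaseInsensitiveDict({'Color': 'colour'})
print(list(d2))        # ['Color']
```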
jkp
  • 1
    Isn't there a special builtin that is called for 'in' as well? – Omnifarious Jan 17 '10 at 19:00
  • 26
    Here is a complete list of methods that may need overloading: __setitem__, __getitem__, __contains__, get, has_key, pop, setdefault, and update. __init__ and fromkeys should also possibly be overloaded to make sure the dictionary is initialized properly. Maybe I'm wrong and somewhere Python promises that get, has_key, pop, setdefault, update and __init__ will be implemented in terms of __getitem__, __setitem__ and __contains__ if they've been overloaded, but I don't think so. – Omnifarious Jan 17 '10 at 19:08
  • 4
    added `__contains__, get, and has_key` to the answer since I ended up coding them :) – Michael Merchant Apr 08 '11 at 23:29
  • 7
    This solution is very limited as it doesn't work for a **lot** of common uses of `dict`. **Don't use it in your code - it will break all but the simplest uses.** Apparently @MichaelMerchant attempted to add the missing stuff, but moderation disapproved the changes (same thing happened to me). I added a new answer which should be usable as a drop-in `dict` replacement [here](http://stackoverflow.com/a/32888599/277172). – m000 Oct 01 '15 at 13:26
  • Like the others said, setdefault as example is broken! "descriptor 'setdefault' requires a 'dict' object but received a 'str'" – Denny Weinberg Sep 27 '16 at 09:01
  • 2
    Better off subclassing `UserDict` than `dict` https://docs.python.org/3.5/library/collections.html#userdict-objects – rite2hhh Aug 30 '19 at 19:37
18

In my particular case, I needed a case-insensitive lookup, but I did not want to modify the original case of the key. The behaviour I wanted looks like this:

>>> d = {}
>>> d['MyConfig'] = 'value'
>>> d['myconfig'] = 'new_value'
>>> d
{'MyConfig': 'new_value'}

You can see that the dictionary still has the original key, however it is accessible case-insensitively. Here's a simple solution:

class CaseInsensitiveKey(object):
    def __init__(self, key):
        self.key = key
    def __hash__(self):
        return hash(self.key.lower())
    def __eq__(self, other):
        return self.key.lower() == other.key.lower()
    def __str__(self):
        return self.key

The __hash__ and __eq__ overrides are required for both getting and setting entries in the dictionary. This is creating keys that hash to the same position in the dictionary if they are case-insensitively equal.

Now either create a custom dictionary that initializes a CaseInsensitiveKey using the provided key:

class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        key = CaseInsensitiveKey(key)
        super(CaseInsensitiveDict, self).__setitem__(key, value)
    def __getitem__(self, key):
        key = CaseInsensitiveKey(key)
        return super(CaseInsensitiveDict, self).__getitem__(key)

or simply make sure to always pass an instance of CaseInsensitiveKey as the key when using the dictionary.
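
As an alternative sketch of the same case-preserving behaviour (not the code above, and with a hypothetical class name), the folded key can be mapped to a pair of original key and value on top of `collections.abc.MutableMapping`. The assumptions here are that all keys are strings and that the spelling of the first insertion is the one to keep.

```python
from collections.abc import MutableMapping


class CasePreservingDict(MutableMapping):
    """Hypothetical sketch: case-insensitive lookup, original key spelling kept."""

    def __init__(self, *args, **kwargs):
        self._store = {}  # folded key -> (original key, value)
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        folded = key.casefold()
        # Keep the spelling used the first time this key was set.
        original = self._store.get(folded, (key, None))[0]
        self._store[folded] = (original, value)

    def __getitem__(self, key):
        return self._store[key.casefold()][1]

    def __delitem__(self, key):
        del self._store[key.casefold()]

    def __iter__(self):
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)

    def __repr__(self):
        return repr(dict(self.items()))


d = CasePreservingDict()
d['MyConfig'] = 'value'
d['myconfig'] = 'new_value'
print(d)  # {'MyConfig': 'new_value'}
```

Because `MutableMapping` derives `get`, `pop`, `setdefault`, `update`, `__contains__` and the views from the five methods defined here, none of them need to be overridden separately.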

yobiscus
  • Nice, thanks! :) (Note that this class doesn't implement the case-insensitive "dict(iterable)" constructor so if you need it you have to add it) – Joril Mar 27 '18 at 09:15
  • 3
    You should use `.casefold()` instead of `.lower()` for comparisons, `self.key.casefold() == other.key.casefold()`, to allow `"ß"` and `"ss"` to equate as true, among others. – AJNeufeld Sep 24 '19 at 20:00
12

Would you consider using string.lower() on your inputs and using a fully lowercase dictionary? It's a bit of a hacky solution, but it works.
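
A minimal sketch of that idea applied to the question's example might look like this. It assumes every key in `words` is already lowercase and that the words in `text` are separated by single spaces; a word with punctuation attached (such as "Color,") would not match.

```python
words = {'color': 'colour', 'practice': 'practise'}


def replace(words, text):
    # Look each word up by its lowercase form; unknown words pass through unchanged.
    return ' '.join(words.get(word.lower(), word) for word in text.split(' '))


print(replace(words, "practice changing the Color"))  # practise changing the colour
```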

inspectorG4dget
5

I've modified the simple yet good solution by pleasemorebacon (thanks!), making it slightly more compact and self-contained, with minor updates to allow construction from {'a':1, 'B':2} and to support the __contains__ protocol. Finally, since CaseInsensitiveDict.Key is expected to be a string (what else can be case-sensitive or not?), it is a good idea to derive the Key class from str; then it is possible, for instance, to dump a CaseInsensitiveDict with json.dumps out of the box.

# caseinsensitivedict.py
class CaseInsensitiveDict(dict):

    class Key(str):
        def __init__(self, key):
            str.__init__(key)
        def __hash__(self):
            return hash(self.lower())
        def __eq__(self, other):
            return self.lower() == other.lower()

    def __init__(self, data=None):
        super(CaseInsensitiveDict, self).__init__()
        if data is None:
            data = {}
        for key, val in data.items():
            self[key] = val
    def __contains__(self, key):
        key = self.Key(key)
        return super(CaseInsensitiveDict, self).__contains__(key)
    def __setitem__(self, key, value):
        key = self.Key(key)
        super(CaseInsensitiveDict, self).__setitem__(key, value)
    def __getitem__(self, key):
        key = self.Key(key)
        return super(CaseInsensitiveDict, self).__getitem__(key)

Here is a basic test script for those who like to check things in action:

# test_CaseInsensitiveDict.py
import json
import unittest
from caseinsensitivedict import *

class Key(unittest.TestCase):
    def setUp(self):
        self.Key = CaseInsensitiveDict.Key
        self.lower = self.Key('a')
        self.upper = self.Key('A')

    def test_eq(self):
        self.assertEqual(self.lower, self.upper)

    def test_hash(self):
        self.assertEqual(hash(self.lower), hash(self.upper))

    def test_str(self):
        self.assertEqual(str(self.lower), 'a')
        self.assertEqual(str(self.upper), 'A')

class Dict(unittest.TestCase):
    def setUp(self):
        self.Dict = CaseInsensitiveDict
        self.d1 = self.Dict()
        self.d2 = self.Dict()
        self.d1['a'] = 1
        self.d1['B'] = 2
        self.d2['A'] = 1
        self.d2['b'] = 2

    def test_contains(self):
        self.assertIn('B', self.d1)
        d = self.Dict({'a':1, 'B':2})
        self.assertIn('b', d)

    def test_init(self):
        d = self.Dict()
        self.assertFalse(d)
        d = self.Dict({'a':1, 'B':2})
        self.assertTrue(d)

    def test_items(self):
        self.assertDictEqual(self.d1, self.d2)
        self.assertEqual(
            [v for v in self.d1.items()],
            [v for v in self.d2.items()])

    def test_json_dumps(self):
        s = json.dumps(self.d1)
        self.assertIn('a', s)
        self.assertIn('B', s)

    def test_keys(self):
        self.assertEqual(self.d1.keys(), self.d2.keys())

    def test_values(self):
        self.assertEqual(
            [v for v in self.d1.values()],
            [v for v in self.d2.values()])
mloskot
  • 1
    You should use `.casefold()` instead of `.lower()` for comparisons, `self.casefold() == other.key.casefold()` and `hash(self.casefold())`, to allow "ß" and "ss" to equate as true, among others. – AJNeufeld Sep 24 '19 at 20:11
3

While a case insensitive dictionary is a solution, and there are answers to how to achieve that, there is a possibly easier way in this case. A case insensitive search is sufficient:

import re

text = "Practice changing the Color"
words = {'color': 'colour', 'practice': 'practise'}

def replace(words, text):
    keys = words.keys()
    for i in keys:
        # re.escape guards against keys that contain regex metacharacters
        exp = re.compile(re.escape(i), re.I)
        text = re.sub(exp, words[i], text)
    return text

text = replace(words, text)
print(text)
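
If the dictionary grows, the loop above scans the text once per key. A possible single-pass variant, offered only as a sketch, joins the keys into one alternation and looks each match up by its lowercase form. The function name `replace_all` is hypothetical, and the code assumes the keys of `words` are lowercase; the `\b` anchors are an added choice so that "color" inside "colorful" is left alone.

```python
import re


def replace_all(words, text):
    if not words:
        return text  # nothing to replace; avoids building an empty pattern
    # One pattern for all keys, escaped so they are matched literally.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in words) + r")\b",
        re.IGNORECASE,
    )
    # Each match is lowercased before the lookup, so 'Color' finds 'color'.
    return pattern.sub(lambda match: words[match.group(0).lower()], text)


words = {'color': 'colour', 'practice': 'practise'}
print(replace_all(words, "Practice changing the Color"))  # practise changing the colour
```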
Jakob Borg
  • 3
    It's far better to use the built-in string methods than the regular expression module when the built-ins can easily handle it, which they can in this case. – John Y Jan 17 '10 at 19:44
  • thanks calmh. I'm short on time right now, so your quick and simple solution suits me nicely. thanks – Kim Jan 17 '10 at 19:54
  • @John Y: What would be the regexp-less solution to this? I don't see it. – Jakob Borg Jan 17 '10 at 19:57
  • Kim already mentioned it: use the string.lower() method. Other answers also mentioned it. Comments are no good for posting code, so I guess I will post my own answer. – John Y Jan 18 '10 at 05:33
  • +1 This solution worked best for me, since in my case, the case of the dictionary key matters, and simply lowercasing the key on set is not sufficient. – yobiscus May 13 '15 at 14:22
1

You can do a dict key case insensitive search with a one liner:

>>> input_dict = {'aBc':1, 'xyZ':2}
>>> search_string = 'ABC'
>>> next((value for key, value in input_dict.items() if key.lower()==search_string.lower()), None)
1
>>> search_string = 'EFG'
>>> next((value for key, value in input_dict.items() if key.lower()==search_string.lower()), None)
>>>

You can place that into a function:


def get_case_insensitive_key_value(input_dict, key):
    return next((value for dict_key, value in input_dict.items() if dict_key.lower() == key.lower()), None)


Note that only the first match is returned.

Fred
0

If you only need to do this once in your code (hence, no point to a function), the most straightforward way to deal with the problem is this:

lowercase_dict = {key.lower(): value for (key, value) in original_dict.items()}

I'm assuming here that the dict in question isn't all that large--it might be inelegant to duplicate it, but if it's not large, it isn't going to hurt anything.

The advantage of this over @Fred's answer (though that also works) is that it produces the same result as a dict when the key isn't present: a KeyError.
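
A short sketch of that difference, using a made-up `original_dict` and a hypothetical helper named `lookup`:

```python
original_dict = {'aBc': 1, 'xyZ': 2}

# Build the lowercase copy once; .items() yields the (key, value) pairs.
lowercase_dict = {key.lower(): value for key, value in original_dict.items()}


def lookup(key):
    # Lowercase the search key too, so both sides are compared in the same form.
    return lowercase_dict[key.lower()]


print(lookup('ABC'))  # 1

try:
    lookup('EFG')
except KeyError as error:
    print('missing key:', error)  # a missing key raises KeyError, as with a plain dict
```

Note that if two original keys differ only by case, the later one silently overwrites the earlier one in the lowercase copy.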

MTKnife
-1

I just set up a function to handle this:

def setLCdict(d, k, v):
    k = k.lower()
    d[k] = v
    return d

myDict = {}

So instead of

myDict['A'] = 1
myDict['B'] = 2

You can:

myDict = setLCdict(myDict, 'A', 1)
myDict = setLCdict(myDict, 'B', 2)

You can then either lowercase the key before looking it up or write a function to do so.

def lookupLCdict(d, k):
    k = k.lower()
    return d[k]

myVal = lookupLCdict(myDict, 'a')

Probably not ideal if you want to do this globally, but it works well if it's just a subset you wish to use it for.

SFox