244

Is there a way to set up a global variable inside of a module? When I tried to do it the most obvious way as appears below, the Python interpreter said the variable __DBNAME__ did not exist.

...
__DBNAME__ = None

def initDB(name):
    if not __DBNAME__:
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")
...

And after importing the module in a different file

...
import mymodule
mymodule.initDB('mydb.sqlite')
...

And the traceback was:

... UnboundLocalError: local variable '__DBNAME__' referenced before assignment ...

Any ideas? I'm trying to set up a singleton by using a module, as per this fellow's recommendation.

themefield
daveslab

5 Answers

279

Here is what is going on.

First, the only global variables Python really has are module-scoped variables. You cannot make a variable that is truly global; all you can do is make a variable in a particular scope. (If you make a variable inside the Python interpreter, and then import other modules, your variable is in the outermost scope and thus global within your Python session.)

All you have to do to make a module-global variable is just assign to a name.

Imagine a file called foo.py, containing this single line:

X = 1

Now imagine you import it.

import foo
print(foo.X)  # prints 1

However, let's suppose you want to use one of your module-scope variables as a global inside a function, as in your example. Python's default is to assume that function variables are local. You simply add a global declaration in your function, before you try to use the global.

def initDB(name):
    global __DBNAME__  # add this line!
    if __DBNAME__ is None: # see notes below; explicit test for None
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")

By the way, for this example, the simple if not __DBNAME__ test is adequate, because any string value other than an empty string will evaluate true, so any actual database name will evaluate true. But for variables that might contain a number value that might be 0, you can't just say if not variablename; in that case, you should explicitly test for None using the is operator. I modified the example to add an explicit None test. The explicit test for None is never wrong, so I default to using it.
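
A small sketch of why the two tests differ when 0 is a legitimate value. The variable and function names here are hypothetical, chosen only for the illustration:

```python
_timeout = 0  # hypothetical setting where 0 is a real, intended value

def is_unset_by_truthiness(value):
    # Treats 0, "", [] and None all as "unset".
    return not value

def is_unset_explicit(value):
    # True only when the value really is None.
    return value is None

print(is_unset_by_truthiness(_timeout))  # True: 0 is mistaken for "unset"
print(is_unset_explicit(_timeout))       # False: 0 counts as a real value
```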

Finally, as others have noted on this page, a leading underscore marks a variable as "private" to the module (one underscore is enough; two also work). If you ever do from mymodule import *, Python will not import names that begin with an underscore into your namespace, unless the module lists them in __all__. But if you just do a simple import mymodule and then say dir(mymodule), you will see the "private" variables in the list, and if you explicitly refer to mymodule.__DBNAME__, Python won't care; it will just let you refer to it. The leading underscores are a major clue to users of your module that you don't want them rebinding that name to some value of their own.

It is considered best practice in Python not to do import *, but to minimize the coupling and maximize explicitness by either using mymodule.something or by explicitly doing an import like from mymodule import something.
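
The star-import behaviour can be checked from a single file. This is only a sketch: it builds a throwaway in-memory module (the name `demo_mod` and its attributes are invented for the demonstration) and runs the star import through `exec` so the imported names land in an inspectable dict:

```python
import sys
import types

# Build a throwaway in-memory module (hypothetical name "demo_mod").
demo_mod = types.ModuleType("demo_mod")
demo_mod.public_name = 1
demo_mod._single = 2
demo_mod.__double = 3
sys.modules["demo_mod"] = demo_mod

# Run the star import into a fresh namespace so we can inspect the result.
namespace = {}
exec("from demo_mod import *", namespace)

print("public_name" in namespace)  # True
print("_single" in namespace)      # False: skipped by the star import
print("__double" in namespace)     # False: skipped by the star import
print(demo_mod._single)            # 2: explicit access still works
```

Nothing is hidden from code that asks for the name explicitly; the underscore only affects what the star import copies.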

EDIT: If, for some reason, you need to do something like this in a very old version of Python that doesn't have the global keyword, there is an easy workaround. Instead of setting a module global variable directly, use a mutable type at the module global level, and store your values inside it.

In your functions, the global variable name will be read-only; you won't be able to rebind the actual global variable name. (If you assign to that variable name inside your function it will only affect the local variable name inside the function.) But you can use that local variable name to access the actual global object, and store data inside it.

You can use a list but your code will be ugly:

__DBNAME__ = [None] # use length-1 list as a mutable

# later, in code:  
if __DBNAME__[0] is None:
    __DBNAME__[0] = name

A dict is better. But the most convenient is a class instance, and you can just use a trivial class:

class Box:
    pass

__m = Box()  # __m will contain all module-level values
__m.dbname = None  # database name global in module

# later, in code:
if __m.dbname is None:
    __m.dbname = name

(You don't really need to capitalize the database name variable.)

I like the syntactic sugar of just using __m.dbname rather than __m["DBNAME"]; it seems the most convenient solution in my opinion. But the dict solution works fine also.

With a dict you can use any hashable value as a key, but when you are happy with names that are valid identifiers, you can use a trivial class like Box in the above.

steveha
  • 7
    Two leading underscores would lead to name mangling. Usually a single underscore is sufficient to indicate that a variable should be considered private. https://stackoverflow.com/questions/6930144/underscore-vs-double-underscore-with-variables-and-methods#6930223 – H.Rabiee Jul 30 '17 at 11:45
  • Concerning the Box class, wouldn't it be better to define dbname = None in an __init__ function than to do this outside, as in the example? – SuperGeo Oct 01 '17 at 10:47
  • 1
    Python doesn't care how the variables get set up. There are recipes for the `Box` class or similar that define an `__init__()` function that grabs all the values from `kwargs` and sets them up in the class dictionary. Then you could just do `_m = Box(dbname="whatever")` and it's tidy. Since Python 3.3, there is now `types.SimpleNamespace`, which is a full-featured implementation of the `Box` class; see: https://docs.python.org/3/library/types.html#additional-utility-classes-and-functions – steveha Oct 16 '17 at 09:07
95

Explicit access to module-level variables by accessing them explicitly on the module


In short: the technique described here is the same as in steveha's answer, except that no artificial helper object is created to explicitly scope variables. Instead, the module object itself is bound to a variable, which provides explicit scoping on every access, including assignments inside a function's local scope.

Think of it like self for the current module instead of the current instance!

# db.py
import sys

# this is a pointer to the module object instance itself.
this = sys.modules[__name__]

# we can explicitly make assignments on it 
this.db_name = None

def initialize_db(name):
    if (this.db_name is None):
        # also in local function scope. no scope specifier like global is needed
        this.db_name = name
        # also the name remains free for local use
        db_name = "Locally scoped db_name variable. Doesn't do anything here."
    else:
        msg = "Database is already initialized to {0}."
        raise RuntimeError(msg.format(this.db_name))

Because modules are cached and therefore imported only once, you can import db.py from as many clients as you want, and they all manipulate the same shared state:

# client_a.py
import db

db.initialize_db('mongo')

# client_b.py
import db

if (db.db_name == 'mongo'):
    # This is the preferred usage: it updates the value for all clients,
    # because they all access the same attribute on the same module object.
    db.db_name = None

# client_c.py
from db import db_name
# be careful when importing like this, as a new reference "db_name" will
# be created in the module namespace of client_c, which points to the value 
# that "db.db_name" has at import time of "client_c".

if (db_name == 'mongo'):  # checking is fine if "db.db_name" doesn't change
    # Be careful: this only rebinds client_c.db_name to a new value
    # and leaves db.db_name pointing to its current value.
    db_name = None
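
The difference between the two import styles can be reproduced in one file. This is a sketch under the assumption that an in-memory stand-in module is acceptable for the demonstration; the module name `demo_db` is hypothetical:

```python
import sys
import types

# Build a throwaway in-memory module (hypothetical name "demo_db")
# so the example runs from a single file.
demo_db = types.ModuleType("demo_db")
demo_db.db_name = None
sys.modules["demo_db"] = demo_db

import demo_db as db         # like "import db"
from demo_db import db_name  # like "from db import db_name": copies the current binding

db.db_name = "mongo"         # rebind the attribute on the module object

print(db.db_name)  # mongo: attribute access always sees the current value
print(db_name)     # None: the imported name still refers to the old value
```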

As an additional bonus, I find it quite Pythonic overall, as it nicely fits Python's principle that "explicit is better than implicit."

timmwagener
  • 1
    I like that you can use the more precise "from db import" in the second module, even though you have to do the larger "import db" in the main. This seems to be true if you skip the 'sys' magic and use 'global' in initialize_db. Can you comment on the pros/cons of global vs. your answer, since they both seem to work the same? – Alain Collins Dec 12 '16 at 17:57
  • The **pro** to me is that you don't need scope manipulation anymore. You explicitly give the scope by accessing the variable db_name from an object, which happens to be the module. You don't have to declare where the object that you want to work with lives, before using it anytime. Also you can have local variables named db_name in handler functions, next to this.db_name as well. – timmwagener Dec 12 '16 at 18:44
  • 3
    To me, this seems to be the cleanest way to do this, but my linters are balking at it. Am I doing something wrong or do you/others have this issue as well? Thanks a ton, Chris – ThePosey Sep 09 '17 at 17:25
  • @beeb There is a slight catch with my example for `client_b.py`. On import time, it will create a new variable in the module scope of `client_b` that gets assigned the current value of `db_name` from `client_a`. You can check against it like in the example, but if the value changes via assignment in `client_a` for example by calling `initialize_db()`, that means the reference `client_a.db_name` points to a new value, other references, for example `client_b.db_name` still point to the old value assigned on import, as we didn't reassign those. That's a bit misleading, I will update the answer. – timmwagener Dec 12 '18 at 10:23
  • Binding module global variables right on the module itself looks super-cool, but now if clients want to change the module's global, they're limited to `import db` only and cannot use the more explicit `from db import something` anymore. Not so cool from the usability point, isn't it? – Alex Che Jan 20 '21 at 14:50
  • Been using this successfully in production code for a while. Recently started using mypy, which really does not like this pattern. Since the variables are declared and set to None when the module is first initialized and don't get their values until after the "create" function gets called, it complains that the attribute doesn't exist and that types can't be declared on "non-self attributes". Taking out `this` gets rid of the second error, at least. – pyansharp Mar 25 '21 at 22:15
  • Agree, i'd also like a kind of guideline of how this pattern is used properly with mypy/static analysis. Is it not possible (yet)? Could it be possible once? Is it possible but unreasonable effort? Is it just a harsh mismatch of valid usage of dynamic features and static type checking here? – timmwagener Mar 26 '21 at 17:20
31

Steveha's answer was helpful to me, but omits an important point (one that I think wisty was getting at). The global keyword is not necessary if you only access but do not assign the variable in the function.

If you assign the variable without the global keyword then Python creates a new local var -- the module variable's value will now be hidden inside the function. Use the global keyword to assign the module var inside a function.

Pylint 1.3.1 under Python 2.7 enforces NOT using global if you don't assign the var.

module_var = '/dev/hello'

def readonly_access():
    connect(module_var)

def readwrite_access():
    global module_var
    module_var = '/dev/hello2'
    connect(module_var)
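
A runnable variant of the same point, offered as a sketch: the function names are invented, and the `connect` call from the snippet above is left out so the example has no external dependency. It shows that assigning without global leaves the module variable untouched:

```python
module_var = "/dev/hello"

def read_only():
    return module_var           # reading needs no declaration

def assign_without_global():
    module_var = "/dev/local"   # creates a new local; the module variable is untouched
    return module_var

def assign_with_global():
    global module_var
    module_var = "/dev/hello2"  # rebinds the module-level name
    return module_var

print(read_only())              # /dev/hello
print(assign_without_global())  # /dev/local
print(module_var)               # /dev/hello (unchanged)
print(assign_with_global())     # /dev/hello2
print(module_var)               # /dev/hello2
```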
AsheKetchum
Brad Dre
8

For this, you need to declare the variable as global inside the function that assigns to it; a global statement at module level has no effect. Note that a module-level variable is also accessible from outside the module by using module_name.var_name. Add this as the first line of the function:

global __DBNAME__
Chinmay Kanchi
  • is there any way to make it accessible to the whole module, but not available to being called by module_name.__DBNAME__? – daveslab Dec 29 '09 at 23:01
  • Yes... you can put the global statement inside your function to make it "global" within the module (within that function... you'd have to repeat the global declaration in every function that uses this global). For example (forgive the code in comments): `def initDB(name):\n global __DBNAME__` – Jarret Hardie Dec 29 '09 at 23:11
  • Thanks, Jarret. Unfortunately, when I try that, and run dir(mymodule) on the console, it shows the variables as available and I can access them. Am I misunderstanding you? – daveslab Dec 29 '09 at 23:36
  • Remember, in Python `_DBNAME` (single underscore) is considered to be a private variable by convention. This is only semi-enforced for classes and not at all for "naked" code, but most decent programmers will treat `_var` as private. – Chinmay Kanchi Dec 29 '09 at 23:50
  • That's true, @cgkanchi, but I'd like to see if it's strictly enforceable. – daveslab Dec 30 '09 at 00:15
  • 1
    Put the whole thing in a class. That way, at least someone who wants to access the private variable has to do some work. – Chinmay Kanchi Dec 30 '09 at 00:37
  • 3
    It's not enforceable, daveslab. The idea in Python is that we're all adults and that private and protected variables are best accomplished by contract and convention rather than any strict compiler-enforced mechanism. – Jarret Hardie Dec 30 '09 at 00:40
-9

You are falling for a subtle quirk. Without a global declaration, you cannot re-assign module-level variables inside a Python function. I think this is there to stop people re-assigning stuff inside a function by accident.

You can access the module namespace; you just shouldn't try to re-assign. If your function assigns to a name, that name automatically becomes local to the function, and Python won't look in the module namespace.

You can do:

__DB_NAME__ = None

def func():
    if __DB_NAME__:
        connect(__DB_NAME__)
    else:
        connect(Default_value)

but you cannot re-assign __DB_NAME__ inside a function unless you declare it global there.

One workaround:

__DB_NAME__ = [None]

def func():
    if __DB_NAME__[0]:
        connect(__DB_NAME__[0])
    else:
        __DB_NAME__[0] = Default_value

Note, I'm not re-assigning __DB_NAME__, I'm just modifying its contents.

Ry-
wisty