72

I discovered this pattern (or anti-pattern) and I am very happy with it.

I feel it is very agile:

def example():
    age = ...
    name = ...
    print "hello %(name)s you are %(age)s years old" % locals()

Sometimes I use its cousin:

def example2(obj):
    print "The file at %(path)s has %(length)s bytes" % obj.__dict__

I don't need to create an artificial tuple and count parameters and keep the %s matching positions inside the tuple.

Do you like it? Do/Would you use it? Yes/No, please explain.

flybywire

7 Answers

90

It's OK for small applications and allegedly "one-off" scripts, especially with the vars enhancement mentioned by @kaizer.se and the .format version mentioned by @RedGlyph.

However, for large applications with a long maintenance life and many maintainers this practice can lead to maintenance headaches, and I think that's where @S.Lott's answer is coming from. Let me explain some of the issues involved, as they may not be obvious to anybody who doesn't have the scars from developing and maintaining large applications (or reusable components for such beasts).

In a "serious" application, you would not have your format string hard-coded -- or, if you had, it would be in some form such as _('Hello {name}.'), where the _ comes from gettext or similar i18n / L10n frameworks. The point is that such an application (or reusable modules that can happen to be used in such applications) must support internationalization (AKA i18n) and localization (AKA L10n): you want your application to be able to emit "Hello Paul" in certain countries and cultures, "Hola Paul" in some others, "Ciao Paul" in others yet, and so forth. So, the format string gets more or less automatically substituted with another at runtime, depending on the current localization settings; instead of being hardcoded, it lives in some sort of database. For all intents and purposes, imagine that format string always being a variable, not a string literal.

So, what you have is essentially

formatstring.format(**locals())

and you can't trivially check exactly what local names the formatting is going to be using. You'd have to open and peruse the L10N database, identify the format strings that are going to be used here in different settings, verify all of them.

So in practice you don't know what local names are going to get used -- which horribly crimps the maintenance of the function. You dare not rename or remove any local variable, as it might horribly break the user experience for users with some (to you) obscure combination of language, locale and preferences.
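A minimal sketch of that failure mode, assuming a hypothetical in-memory CATALOG dict standing in for a gettext-style message database (CATALOG, greet_v1 and greet_v2 are made-up names for illustration, not part of any framework):

```python
# Hypothetical stand-in for an L10n message database; a real application
# would look these strings up through gettext or a similar framework.
CATALOG = {
    "en": "Hello {name}.",
    "it": "Ciao {name}, hai {age} anni.",  # this translation also uses {age}
}


def greet_v1(lang):
    name = "Paul"
    age = 42
    # Which locals get used depends on the catalog entry, not on this code.
    return CATALOG[lang].format(**locals())


def greet_v2(lang):
    name = "Paul"
    # 'age' was removed because the English string never mentions it.
    return CATALOG[lang].format(**locals())


print(greet_v1("en"))
print(greet_v1("it"))
print(greet_v2("en"))
try:
    greet_v2("it")
except KeyError as missing:
    print("broken only for the 'it' catalog entry:", missing)
```

Nothing in greet_v2 itself hints that removing the local was unsafe; the breakage only shows up for the one catalog entry that happens to reference it.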

If you have superb integration / regression testing, the breakage will be caught before the beta release -- but QA will scream at you and the release will be delayed... and, let's be honest, while aiming for 100% coverage with unit tests is reasonable, it really isn't with integration tests, once you consider the combinatorial explosion of settings [[for L10N and for many more reasons]] and supported versions of all dependencies. So, you just don't blithely go around risking breakages because "they'll be caught in QA" (if you do, you may not last long in an environment that develops large apps or reusable components;-).

So, in practice, you'll never remove the "name" local variable even though the User Experience folks have long switched that greeting to a more appropriate "Welcome, Dread Overlord!" (and suitably L10n'ed versions thereof). All because you went for locals()...

So you're accumulating cruft because of the way you've crimped your ability to maintain and edit your code -- and maybe that "name" local variable only exists because it's been fetched from a DB or the like, so keeping it (or some other local) around is not just cruft, it's reducing your performance too. Is the surface convenience of locals() worth that?-)

But wait, there's worse! Among the many useful services a lint-like program (like, for example, pylint) can do for you, is to warn you about unused local variables (wish it could do it for unused globals as well, but, for reusable components, that's just a tad too hard;-). This way you'll catch most occasional misspellings such as if ...: nmae = ... very rapidly and cheaply, rather than by seeing a unit-test break and doing sleuth work to find out why it broke (you do have obsessive, pervasive unit tests that would catch this eventually, right?-) -- lint will tell you about an unused local variable nmae and you will immediately fix it.

But if you have in your code a blah.format(**locals()), or equivalently a blah % locals()... you're SOL, pal!-) How is poor lint going to know whether nmae is in fact an unused variable, or actually it does get used by whatever external function or method you're passing locals() to? It can't -- either it's going to warn anyway (causing a "cry wolf" effect that eventually leads you to ignore or disable such warnings), or it's never going to warn (with the same final effect: no warnings;-).
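To make the nmae case concrete, here is a small sketch (the function describe and its parameters are hypothetical). It only shows how the misspelled local goes unnoticed at runtime; how any particular lint tool reacts to it is not something this example demonstrates.

```python
def describe(user, shout):
    name = user
    if shout:
        nmae = user.upper()  # typo: the author meant to rebind 'name'
    # Every local, including the misspelled one, is handed to format(),
    # so nothing inside this function marks 'nmae' as clearly unused.
    return "Hello {name}.".format(**locals())


# The upper-cased value is silently lost; no exception is raised.
print(describe("paul", shout=True))
```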

Compare this to the "explicit is better than implicit" alternative...:

blah.format(name=name)

There -- none of the maintenance, performance, and am-I-hampering-lint worries, applies any more; bliss! You make it immediately clear to everybody concerned (lint included;-) exactly what local variables are being used, and exactly for what purposes.
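A rough sketch of the explicit style, assuming a hypothetical greet function whose template might come from an external source. Only the names passed as keyword arguments are visible to the format string, so a template asking for anything else fails loudly instead of quietly reading another local:

```python
def greet(template, name, password):
    # Only 'name' is offered to the template; 'password' stays private
    # even though it is a local variable of this function.
    return template.format(name=name)


print(greet("Hello {name}.", "Paul", "s3cret"))

try:
    greet("Hello {name}, your password is {password}.", "Paul", "s3cret")
except KeyError as missing:
    print("template asked for a name that was not exposed:", missing)
```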

I could go on, but I think this post is already pretty long;-).

So, summarizing: "γνῶθι σεαυτόν!" Hmm, I mean, "know thyself!". And by "thyself" I actually mean "the purpose and scope of your code". If it's a 1-off-or-thereabouts thingy, never going to be i18n'd and L10n'd, will hardly need future maintenance, will never be reused in a broader context, etc, etc, then go ahead and use locals() for its small but neat convenience; if you know otherwise, or even if you're not entirely certain, err on the side of caution, and make things more explicit -- suffer the small inconvenience of spelling out exactly what you're doing, and enjoy all the resulting advantages.

BTW, this is just one of the examples where Python is striving to support both "small, one-off, exploratory, maybe interactive" programming (by allowing and supporting risky conveniences that extend well beyond locals() -- think of import *, eval, exec, and several other ways you can mush up namespaces and risk maintenance impacts for the sake of convenience), as well as "large, reusable, enterprise-y" apps and components. It can do a pretty good job at both, but only if you "know thyself" and avoid using the "convenience" parts except when you're absolutely certain you can in fact afford them. More often than not, the key consideration is, "what does this do to my namespaces, and awareness of their formation and use by the compiler, lint &c, human readers and maintainers, and so on?".

Remember, "Namespaces are one honking great idea -- let's do more of those!" is how the Zen of Python concludes... but Python, as a "language for consenting adults", lets you define the boundaries of what that implies, as a consequence of your development environment, targets, and practices. Use this power responsibly!-)

Alex Martelli
  • 1
    An excellent answer. I think most programs are not internationalized, and so this is not a problem in a very large number of cases. But in those cases, yes, string interpolation is bad. – Paul Biggar Oct 11 '09 at 17:25
  • 5
    @Paul, I beg to differ: string interpolation is _excellent_, _especially_ for i18n/L10n support -- it just needs to happen on explicitly named variables! The problem is not with interpolation, it's with passing `locals()` to external functions or methods. BTW, Python's growing support for Unicode (now the default text string in `3.*`) is exactly an attempt to help change the fact that "most programs are not i18n'd" -- many more _should_ be, than currently _are_; as the 'net (via smartphones, netbooks, &c) booms in China &c, anglocentric assumptions become weirder and weirder;-). – Alex Martelli Oct 11 '09 at 17:39
  • 2
    I think there is the possibility for fiddling with locales/gettext to insert `{self}`,`{password}` or other objects you don't expect shown into the format string. This may be a security risk. Best to be explicit for real code – John La Rooy Oct 11 '09 at 21:19
  • @Alex: I would define `"hello %(name)s you are %(age)s years old" % locals()` to be string interpolation (in the style of Perl or PHP anyway). Have you a different definition? – Paul Biggar Oct 14 '09 at 01:18
  • 2
    @Paul, I call this "string formatting" -- the names are totally arbitrary (it's obviously the same statement whether you're using a specially made dict or an existing one), and there is no interpolation involved -- see the definition of interpolation at http://en.wikipedia.org/wiki/Interpolation . Merging an 'a' and a 'c' to get a 'b', now **that** would be an interpolation of strings. – Alex Martelli Oct 14 '09 at 03:15
  • 4
    @Alex: In PHP or Perl, `"hello {$name}s you are $age years old"` is string interpolation. This is the same pattern in Python. I presume you're kidding about the interpolation definition--"string interpolation" has long been a term to describe this type of string formatting. It's even the title of PEP 215. – Paul Biggar Oct 14 '09 at 14:04
  • Consider why PEP 215 was rejected: by looking up identifiers directly, and requiring a literal string, it heavily stood in the way of i18n/L10N: today's format method OTOH is quite i18n/L10n-friendly, as I explain above. But the underlying functionality is the same whether you call it formatting or interpolation, e.g. http://search.cpan.org/~nobull/String-Interpolate-0.3/lib/String/Interpolate.pm has so-called "interpolation" that's close to the `.format` (more rich and complex, of course, but the key point, explicit hashes [& ties, as it's Perl], is shared). – Alex Martelli Oct 14 '09 at 15:12
  • In principle lint programs could recognize the idiom, and count usages in format strings. – Mechanical snail Mar 22 '12 at 07:13
  • but wait. does `_('Hello {name}.')` perform the same function as `.format(**locals)` while *also* doing internationalization? Because that's a lot easier to type and it would be nice if that were the standard way to do this. See http://stackoverflow.com/q/19549980/125507 – endolith Dec 08 '13 at 23:16
  • Agreed with Alex. Also, using `**locals()` only accesses local identifiers. Depending on the identifiers referenced in your format string, you would have to use `**locals()` or `**globals()` or both in a combined dict. You would usually want any identifier you reference to follow normal Python scoping rules. If you move an object from global to local scope or vice versa, a format string with explicit identifiers would continue to work, but one using `**locals()` or `**globals()` would break. The `**locals()` approach forces you to be aware of scoping in a way you usually don't need to be. – Chris Johnson Feb 20 '14 at 02:15
  • Regarding i18n and `gettext`: Seems like one could easily just call the underscore function on any string literals that were going to be interpolated though a format string. – martineau Mar 15 '14 at 18:51
10

I think it is a great pattern because you are leveraging built-in functionality to reduce the code you need to write. I personally find it quite Pythonic.

I never write code that I don't need to write - less code is better than more code and this practice of using locals() for example allows me to write less code and is also very easy to read and understand.

Andrew Hare
  • I like using it at the top of a function, when I need a dict built out of the input params. It's quite effective and I feel it's pythonic as well. It can be misused at times, I can understand that. – radtek Jul 22 '16 at 14:44
10

Never in a million years. It's unclear what the context for formatting is: locals could include almost any variable. self.__dict__ is not as vague. Perfectly awful to leave future developers scratching their heads over what's local and what's not local.

It's an intentional mystery. Why saddle your organization with future maintenance headaches like that?

S.Lott
  • The context is the function in which locals() appears. If it's a nice short function, you can just look at the vars. If it's a very long function, it should be refactored. How is self.__dict__ any clearer? – foosion Oct 11 '09 at 16:40
  • 6
    I don't understand why referencing a local variable name in a format string is any more unclear than referencing a local variable name in code. – Robert Rossney Oct 11 '09 at 17:10
  • `self.__dict__` is based on the class definition -- which is often tied up with the `__init__()` method and carefully documented in the doc string. `locals()` is often a fairly random collection of names. – S.Lott Oct 11 '09 at 17:22
  • I don't particularly like it either, but not because it's unclear. Python's scoping is simpler than many other languages, but a design pattern like this still seems to be asking for trouble, because you WILL find yourself getting lazy and using outside of a specifically defined function, and then you WILL run into scoping problems. – Paul McMillan Oct 11 '09 at 17:24
  • @Robert Rossney: I never said a variable _name_ was unclear. I said `locals()` is unclear because the variable names are harder to find. They're buried in the format string. – S.Lott Oct 12 '09 at 22:32
  • @PaulMcMillan: What kind of problems would you run into? – endolith Mar 06 '12 at 22:41
  • @endolith: you run into problems where you change the scope later, or rename a variable to match the new semantic meaning, or someone else injects something into your scope, or... it's just not explicit enough to be maintainable in a real project. – Paul McMillan Mar 19 '12 at 18:50
10

Regarding the "cousin", instead of obj.__dict__, it looks a lot better with new string formatting:

def example2(obj):
    print "The file at {o.path} has {o.length} bytes".format(o=obj)

I use this a lot for repr methods, e.g.

def __repr__(self):
    return "{s.time}/{s.place}/{s.warning}".format(s=self)
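As a self-contained illustration of the attribute-access fields used above, here is a sketch with a hypothetical FileInfo class (the class name, its attributes and the sample values are assumptions made for this example):

```python
class FileInfo(object):
    """Hypothetical object carrying the attributes the templates refer to."""

    def __init__(self, path, length):
        self.path = path
        self.length = length

    def __repr__(self):
        # '{s.path!r}' looks up the attribute, then applies repr() to it.
        return "FileInfo({s.path!r}, {s.length!r})".format(s=self)


def example2(obj):
    # Only 'o' is exposed; the template reaches attributes through it.
    return "The file at {o.path} has {o.length} bytes".format(o=obj)


info = FileInfo("/tmp/data.bin", 1024)
print(example2(info))
print(repr(info))
```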
pfctdayelise
8

The "%(name)s" % <dictionary> or even better, the "{name}".format(<parameters>) have the merit of

  • being more readable than "%0s"
  • being independent from the argument order
  • not compelling you to use all the arguments in the string

I would tend to favour the str.format(), since it should be the way to do that in Python 3 (as per PEP 3101), and is already available from 2.6. With locals() though, you would have to do this:

print("hello {name} you are {age} years old".format(**locals()))
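The merits listed above can be checked with a short sketch that uses an explicit dict rather than locals() (the dict values and its contents are made up for illustration):

```python
# 'city' is never referenced by either template, and key order is irrelevant.
values = {"age": 30, "city": "Oslo", "name": "Ann"}

by_percent = "hello %(name)s you are %(age)s years old" % values
by_format = "hello {name} you are {age} years old".format(**values)
print(by_percent)
print(by_format)

# Positional %-formatting with a tuple, by contrast, rejects a surplus argument.
try:
    "hello %s" % ("Ann", 30)
except TypeError as err:
    print("positional mismatch:", err)
```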
RedGlyph
6

Using the built-in vars([object]) (documentation) might make the second look better to you:

def example2(obj):
    print "The file at %(path)s has %(length)s bytes" % vars(obj)

The effect is of course just the same.
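A quick way to see the equivalence, sketched with a hypothetical FileInfo class (not part of the original question):

```python
class FileInfo(object):
    """Hypothetical object with the two attributes the template expects."""

    def __init__(self, path, length):
        self.path = path
        self.length = length


info = FileInfo("/tmp/data.bin", 1024)
template = "The file at %(path)s has %(length)s bytes"

# For an ordinary instance, vars(obj) returns the object's __dict__ itself.
print(vars(info) is info.__dict__)
print(template % vars(info))
```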

u0b34a0f6ae
1

There is now an official way to do this, as of Python 3.6.0: formatted string literals.

It works like this:

f'normal string text {local_variable_name}'

E.g. instead of these:

"hello %(name)s you are %(age)s years old" % locals()
"hello {name} you are {age} years old".format(**locals())
"hello {} you are {} years old".format(name, age)

just do this:

f"hello {name} you are {age} years old"

Here's the official example:

>>> name = "Fred"
>>> f"He said his name is {name}."
'He said his name is Fred.'
>>> width = 10
>>> precision = 4
>>> import decimal
>>> value = decimal.Decimal("12.34567")
>>> f"result: {value:{width}.{precision}}"  # nested fields
'result:      12.35'
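One practical difference from the locals() idiom, shown as a sketch (GREETING, with_fstring and with_locals are hypothetical names; Python 3.6 or later is assumed): an f-string resolves names through normal scoping and accepts expressions, whereas a format string fed from locals() only sees local names. Note also that an f-string has to be a literal in the source, so it cannot be swapped for a string loaded at runtime.

```python
GREETING = "hello"  # module-level name, not a local of either function


def with_fstring(name, age):
    # Globals and arbitrary expressions are resolved like ordinary code.
    return f"{GREETING} {name} you are {age} years old, {age + 1} next year"


def with_locals(name, age):
    # locals() holds only 'name' and 'age', so {GREETING} cannot be found.
    return "{GREETING} {name}".format(**locals())


print(with_fstring("Ann", 30))
try:
    with_locals("Ann", 30)
except KeyError as missing:
    print("not a local name:", missing)
```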

Reference:

Jacktose
  • 1
    this just made my day. i'm going to use this for all of my python code that is not restrained by supporting 2.x, or in this case <3.6! – svenevs Apr 10 '18 at 07:57