3

I've been loving the tuple comprehensions added in Python 3.5:

In [128]: *(x for x in range(5)),
Out[128]: (0, 1, 2, 3, 4)

However, when I try to return a tuple comprehension directly I get an error:

In [133]: def testFunc():
     ...:     return *(x for x in range(5)),
     ...: 
  File "<ipython-input-133-e6dd0ba638b7>", line 2
    return *(x for x in range(5)),
           ^
SyntaxError: invalid syntax    

This is just a slight inconvenience, since I can simply assign the tuple comprehension to a variable and return the variable. However, if I try to put a tuple comprehension inside a dictionary comprehension, I get the same error:

In [130]: {idx: *(x for x in range(5)), for idx in range(5)}
  File "<ipython-input-130-3e9a3eee879c>", line 1
    {idx: *(x for x in range(5)), for idx in range(5)}
          ^
SyntaxError: invalid syntax

I feel like this is a bit more of a problem, since comprehensions can be important for performance in some situations.

I have no problem using dictionary and list comprehensions in these situations. In how many other situations will the tuple comprehension fail where the others work? Or perhaps I'm using it wrong?

It makes me wonder what the point was if its use is so limited. If I'm not doing something wrong, what is the fastest/most Pythonic way to create a tuple that is versatile enough to be used in the same way as list and dictionary comprehensions?

ojunk
  • 13
    That’s not a tuple comprehension, it is a generator expression. – miradulo Aug 26 '18 at 12:52
  • 6
    The ``*`` is not part of a generator expression! It is the symbol for unpacking. – MisterMiyagi Aug 26 '18 at 12:58
  • 1
    You _could_ add parentheses: `return (*(...),)`, but instead you should rather use `tuple(...)`. – tobias_k Aug 26 '18 at 13:21
  • 1
    star-unpacking inside comprehensions was explicitly considered and rejected: see [PEP 448](https://www.python.org/dev/peps/pep-0448/) for details. The invalid syntax in the `return` surprises me, though; it's possible that this is simply an oversight. – Mark Dickinson Aug 26 '18 at 13:38

2 Answers

9

TLDR: If you want a tuple, pass a generator expression to tuple:

{idx: tuple(x for x in range(5)) for idx in range(5)}

There are no "tuple comprehensions" in Python. This:

x for x in range(5)

is a generator expression. The parentheses around it merely separate it from other elements. This is the same as in (a + b) * c, which does not involve a tuple either.

The * symbol is for iterable packing/unpacking. A generator expression happens to be an iterable, so it can be unpacked. However, there must be something to unpack the iterable into. For example, one can also unpack a list into the elements of an assignment:

*[1, 2]                         # illegal - nothing to unpack into
a, b, c, d = *[1, 2], 3, 4      # legal - unpack into assignment tuple

Now, writing *<iterable>, combines * unpacking with a , tuple literal. This is not usable in all situations, though - separating elements may take precedence over creating a tuple. For example, the last , in [*(1, 2), 3] separates, whereas in [(*(1, 2), 3)] it creates a tuple.

In a dictionary the , is ambiguous since it is used to separate elements. Compare {1: 1, 2: 2} and note that {1: 2,3} is illegal. For a return statement, it might be possible in the future.
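The two roles of the comma described above can be checked directly. This is only a small sketch; the variable names are made up for illustration and should run on Python 3.5 or later:

```python
# Inside a list display, the comma only separates elements.
flat = [*(1, 2), 3]

# Extra parentheses turn the same tokens into a single tuple element.
nested = [(*(1, 2), 3)]

# In an assignment, a starred expression plus a trailing comma builds a tuple.
packed = *(x for x in range(5)),

# A dict value needs parentheses, because a bare comma would start the next item.
mapping = {1: (2, 3)}

print(flat)     # [1, 2, 3]
print(nested)   # [(1, 2, 3)]
print(packed)   # (0, 1, 2, 3, 4)
print(mapping)  # {1: (2, 3)}
```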

If you want a tuple, you should use () whenever there might be ambiguity - even if Python can handle it, it is difficult to parse for humans otherwise.

When your source is a large statement such as a generator expression, I suggest converting it to a tuple explicitly. Compare the following two valid versions of your code for readability:

{idx: tuple(x for x in range(5)) for idx in range(5)}
{idx: (*(x for x in range(5)),) for idx in range(5)}
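The same two spellings also cover the return case from the question. The function names below are hypothetical, chosen only for this sketch. To my knowledge, Python 3.8 and later also accept the unparenthesized `return *(...),`, but the parenthesized form is the one that works on every version from 3.5 on:

```python
def via_unpacking():
    # The outer parentheses make the starred expression a plain tuple display,
    # which is valid anywhere an ordinary expression is allowed.
    return (*(x for x in range(5)),)


def via_constructor():
    # Same result, and arguably easier for a human to parse.
    return tuple(x for x in range(5))


print(via_unpacking())                       # (0, 1, 2, 3, 4)
print(via_unpacking() == via_constructor())  # True
```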

Note that list, set and dict comprehensions work similarly - they are practically like passing a generator expression to list, set or dict. They mostly serve to avoid looking up list, set or dict in the global namespace.


I feel like this is a bit more of a problem, since comprehensions can be important for performance in some situations.

Under the covers, both generator expressions and list/dict/set comprehensions create a short-lived function. You should not rely on comprehensions for performance optimisation unless you have profiled and tested them. By default, use whatever is most readable for your use case.

import dis
dis.dis("""[a for a in (1, 2, 3)]""")
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x10f730ed0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_CONST               5 ((1, 2, 3))
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE
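If performance really matters, measuring is more reliable than guessing. Here is a minimal sketch using the standard `timeit` module; the helper functions, the input size of 1000 and the repeat count of 200 are arbitrary choices for illustration, and the numbers will differ between machines and Python versions:

```python
import timeit


def with_comprehension():
    return [x for x in range(1000)]


def with_generator():
    return list(x for x in range(1000))


def with_tuple():
    return tuple(x for x in range(1000))


# Time each candidate the same number of times and collect the totals.
timings = {
    func.__name__: timeit.timeit(func, number=200)
    for func in (with_comprehension, with_generator, with_tuple)
}

# Print the candidates from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f}s")
```

Run it against your real workload before deciding that one spelling is worth a loss in readability.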
MisterMiyagi
  • "Neither a return statement nor dictionary value are appropriate for this." – Mark Dickinson Aug 26 '18 at 14:06
  • @jpp Yes, the disassembly shows the ``MAKE_FUNCTION``. A list comprehension consumes the intermediate generator, so you never get to access it. – MisterMiyagi Aug 26 '18 at 15:16
  • 1
    @jpp You are correct that it is not a generator function for comprehensions, but a function. The abstraction is leaky, and the function can be turned into a generator function: https://stackoverflow.com/a/50214024/5349916 Note that I do not mean to say list comprehensions are slower than generator expressions + list. They may be slower than other means, though. Using ``a = [];a.extend(map(func, range(10000)))`` is faster than ``[func(i) for i in range(10000)]`` in my tests. – MisterMiyagi Aug 26 '18 at 15:40
  • @MarkDickinson There is some ambiguity in Python whether ``,`` creates a tuple or separates elements. Compare ``[*(1, 2), 3]`` versus ``[(*(1, 2), 3)]``. Likewise, it is illegal to say ``{1: 2,3}``. Note that ``return (*(1, 2),)`` works. – MisterMiyagi Aug 26 '18 at 15:54
  • 1
    @MisterMiyagi: Yep, but there's no good reason that I can see for making `return *(1, 2), 3` illegal that wouldn't apply equally well to `x = *(1, 2), 3` (which _is_ legal). There's no ambiguity in the `return`, and it seems at least plausible that the implementors of PEP 448 simply didn't consider this case. – Mark Dickinson Aug 26 '18 at 15:57
  • 1
    @MisterMiyagi: FWIW, it turns out to be easy to remove the restriction: https://github.com/python/cpython/pull/8941 – Mark Dickinson Aug 26 '18 at 16:08
  • @MarkDickinson That is great! I have updated the post to indicate this may become possible in the future. Indeed, I did not find a scenario that conflicts with this. – MisterMiyagi Aug 26 '18 at 16:24
1

Pass a generator expression into the tuple() constructor, since there are no tuple-comprehensions:

{idx: tuple(x for x in range(5)) for idx in range(5)}

Tuple comprehensions don't exist. List comprehensions do ([... for ... in ...]), but even they are similar to* list(... for ... in ...).


*List comprehensions are actually faster than a generator expression passed to a constructor function, since function calls are expensive in Python.

Joe Iddon
  • @jpp I concede I wasn't sure they were so similar, but I read [this](https://stackoverflow.com/questions/16940293/why-is-there-no-tuple-comprehension-in-python#comment24458510_16940351) by Martijn Pieters which claimed their differences were negligible. What do you mean by the generator expression using "`__next__` calls internally"? It is the `list()` constructor which will call `iter()` on its argument to convert the iterable to an iterator and then call `next()` until it receives a `StopIteration`. In contrast, a list-comp is built-in syntax, so it will be optimised. – Joe Iddon Aug 26 '18 at 14:16
  • [Syntactic sugar](https://en.wikipedia.org/wiki/Syntactic_sugar) implies the same behaviour except for a readability improvement. Martijn's comment was in 2013, I'm not sure it stands in 3.6+. In fact, this is the kind of reason I don't like these old comments. They can't be qualified/edited/updated as things change. Certainly, they aren't the same, as can be seen by the performance differential. See [Understanding generators in Python](https://stackoverflow.com/questions/1756096/understanding-generators-in-python) for what I mean by `next`. – jpp Aug 26 '18 at 14:23
  • Of course, performance can be different by O(1) for `d.__getitem__` and `d[]` (trivial example). But I call this syntactic sugar because it's an O(1) difference. `[x for i in y]` vs `list(x for i in y)` isn't in the same category. – jpp Aug 26 '18 at 14:31
  • 2
    @jpp Yes, I was wrong to say it was syntactic sugar, but it is rare that it will behave noticeably different to if it were. – Joe Iddon Aug 26 '18 at 14:52