Python 3.8 assignment expression in a list comprehension

Question

I'm trying to use the new assignment expression for the first time and could use some help.

Given three lines of log output:

sin = """Writing 93 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/Russia.seirdc.March6-900.12.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/India.seirdc.March6-900.6.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/US.seirdc.March6-900.15.csv ..
"""

The intent is to extract just the state (Russia, India and US) and the record count (93, 100, 100). So the desired result is:

[['Russia',93],['India',100],['US',100]]

This requires the following steps to be translated into Python:

Convert each line into a list element.
Split each line by space, e.g. ['Writing', '93', 'records', 'to', '/data/newstates-900.03-07_07/top100.newstates-900.03-07_07/Russia.seirdc.March6-900.12.csv', '..']
Split the fifth token (index 4) by '/' and retain the last element, e.g. Russia.seirdc.March6-900.12.csv
Split that element by '.' and retain the first (0th) element, e.g. Russia
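The steps above can be sketched as a plain helper function, without any assignment expression. This is only an illustration: `parse_line` is a hypothetical name, and the `int()` conversion is added so that the result matches the desired output.

```python
sin = """Writing 93 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/Russia.seirdc.March6-900.12.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/India.seirdc.March6-900.6.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/US.seirdc.March6-900.15.csv ..
"""


def parse_line(line):
    """Hypothetical helper: turn one log line into [state, count]."""
    tokens = line.split(' ')             # split the line by space
    filename = tokens[4].split('/')[-1]  # last path component of the fifth token
    state = filename.split('.')[0]       # text before the first '.'
    return [state, int(tokens[1])]       # the count is the second token


# One list element per log line.
result = [parse_line(line) for line in sin.splitlines()]
print(result)
```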

Here is my incorrect attempt:

import fileinput
y = [[ z[4].split('/')[-1].split('.')[0],z[1]] 
     for (z:=x.split(' ')) in 
     (x:=sin if sin else fileinput.input()).splitlines())]

Jab · Answer 1 · 2020-03-09T18:34:25.020

3

For what it's worth, you can also do this with a regex, which would probably be preferred and more efficient.

import re
[list(reversed(l)) for l in re.findall(r'Writing (\d+).+/([A-Za-z]+)\.', sin)]

Or, more accurately (converting the count to an int) and more readably (as per @chepner in the comments):

[[country, int(count)] for count, country in re.findall(r'Writing (\d+).+/([A-Za-z]+)\.', sin)]
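Since the question is about the assignment expression, here is a sketch that combines the two ideas: the walrus captures the match object in the comprehension's `if` clause so that it can be reused in the element expression. The pattern below is a slightly stricter variant written for this example (it assumes the path contains no spaces), not the one from the answer above.

```python
import re

sin = """Writing 93 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/Russia.seirdc.March6-900.12.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/India.seirdc.March6-900.6.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/US.seirdc.March6-900.15.csv ..
"""

# Group 1: the record count. Group 2: the letters between the last '/'
# and the following '.' (assumes the path has no spaces in it).
pattern = re.compile(r'Writing (\d+) records to \S*/([A-Za-z]+)\.')

# The walrus binds the match object once per line; lines that do not
# match yield None and are filtered out by the `if`.
rows = [[m.group(2), int(m.group(1))]
        for line in sin.splitlines()
        if (m := pattern.search(line))]
print(rows)
```

Filtering on the match object also means that unrelated log lines are skipped instead of raising an error.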

edited Mar 09 '20 at 18:34

answered Mar 09 '20 at 18:22

Jab


`[ [country, int(count)] for count, country in ... ]` would be more readable (and match the requested output better). – chepner Mar 09 '20 at 18:24
Useful approach. I do want to use the `walrus` for many other data munging tasks that do not lend themselves to clever heuristics, but specifically for text parsing your way makes a lot of sense. The addition by @chepner is also helpful. – StephenBoesch Mar 09 '20 at 18:27
Oh, you just removed the `list(reversed(l))`. I think that is also helpful to mention (rather than losing it completely). – StephenBoesch Mar 09 '20 at 18:27
I removed the reverse as it's more readable and it makes converting the count to int easier as well – Jab Mar 09 '20 at 18:31
Yes, I "got" that, but the trick of using reversed() is an additional one to keep in the toolkit. I have absorbed it already, but future readers will see only the end product and not that (interesting) intermediate solution. – StephenBoesch Mar 09 '20 at 18:32

score 2 · Accepted Answer · answered Mar 09 '20 at 18:12

2

Is this good enough?

[[(wrds := line.split())[4].split("/")[-1].split('.')[0], wrds[1]] for line in sin.splitlines()]

I find the assignment expression redundant here. You can also do this:

[[line.split('/')[-1].split('.')[0], line.split()[1]] for line in sin.splitlines()]
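As a hedged variation on the first form, the walrus can also sit in the comprehension's `if` clause. Each line is then split only once, short or blank lines are skipped, and the count is converted to an int to match the output requested in the question. The name `words` and the length check are choices made for this sketch.

```python
sin = """Writing 93 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/Russia.seirdc.March6-900.12.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/India.seirdc.March6-900.6.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/US.seirdc.March6-900.15.csv ..
"""

# `words` is bound once per line in the `if` clause and reused twice in
# the element expression, so split() runs a single time per line.
rows = [[words[4].split('/')[-1].split('.')[0], int(words[1])]
        for line in sin.splitlines()
        if len(words := line.split()) > 4]
print(rows)

# Blank lines produce an empty token list and are filtered out.
rows_with_blanks = [[words[4].split('/')[-1].split('.')[0], int(words[1])]
                    for line in (sin + "\n\n").splitlines()
                    if len(words := line.split()) > 4]
```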

answered Mar 09 '20 at 18:12

ori6151


The assignment is not redundant: your second one does the `split()` twice. Imagine if that were an expensive operation. I'm awarding because this is a good answer. Actually, the second one is kind of clever but also cheating: it relies on '/' appearing only in the path token. – StephenBoesch Mar 09 '20 at 18:14
@javadba I agree with you, but what I meant was in this specific problem. Also, if my answer is what you were looking for, you can check the checkmark near my answer. Thanks :) – ori6151 Mar 09 '20 at 18:16
If you have a chance: I would actually like to see the doubly nested structure. Consider not using a trick to collapse this into a single level, i.e. the OP is a toy example, but the intent is to understand how to do multiple levels of nesting (where the multiple levels are truly needed) with the assignment expression. – StephenBoesch Mar 09 '20 at 18:52
Re: doubly nested structure. You can wait on this: I'm going to create a separate question in which it will _not_ be possible to squash the levels. It will involve grouping and aggregation operations on numerical data. – StephenBoesch Mar 09 '20 at 19:22
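On the doubly nested structure requested in the comments, a minimal sketch: an assignment expression cannot be used as the target of a `for` clause (that is a SyntaxError), but an inner `for` over a one-element list gives the same two-level shape. The variable names follow the question; the `int()` conversion and the `compile()` check are additions for illustration.

```python
sin = """Writing 93 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/Russia.seirdc.March6-900.12.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/India.seirdc.March6-900.6.csv ..
Writing 100 records to /data/newstates-900.03-07_07/top100.newstates-900.03-07_07/US.seirdc.March6-900.15.csv ..
"""

# Two-level comprehension: the outer loop walks the lines, the inner
# loop binds z to the split tokens via a one-element list.
nested = [[z[4].split('/')[-1].split('.')[0], int(z[1])]
          for x in sin.splitlines()
          for z in [x.split(' ')]]
print(nested)

# Using a walrus as the loop target is rejected at compile time.
try:
    compile("[z for (z := x.split(' ')) in lines]", "<sketch>", "eval")
    walrus_target_ok = True
except SyntaxError:
    walrus_target_ok = False
print(walrus_target_ok)
```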

David Diamond · Answer 3 · 2020-03-09T18:24:11.540

0

Here's one way:

results = []
for line in sin.split('..'):
    if len(z := line.split(' ')) > 1:
        results.append([line.split('/')[-1].split('.')[0], z[1]])

edited Mar 09 '20 at 18:24

answered Mar 09 '20 at 18:11

David Diamond


Please see above: the accepted answer uses it in a `for` comprehension. You can also refer to the linked PEP docs. – StephenBoesch Mar 09 '20 at 18:20
