0

I'm looking to create successive word sequence combinations from a list.

news = ['Brendan', 'Rodgers', 'has', 'wasted', 'no', 'time', 'in', 'playing',
        'mind', 'games', 'with', 'Louis', 'van', 'Gaal', 'by', 'warning', 'the',
        'new', 'Manchester', 'United', 'manager', 'that', 'the', 'competitive',
        'nature', 'of', 'the', 'Premier', 'League', 'will', 'make', 'it',
        'extremely', 'difficult', 'for', 'the', 'Dutchman', 'to', 'win', 'the',
        'title', 'in', 'his', 'first', 'season.']

I suspect the following code is not efficient. Is there a one-liner or a more Pythonic way of achieving this?

wordseq = []
for i,j in enumerate(news):
    if len(news)-1 != i:
        wordseq.append((j, news[i+1]))

The result I want is this:

[('Brendan', 'Rodgers'), ('Rodgers', 'has'), ('has', 'wasted'),
 ('wasted', 'no'), ('no', 'time'), ('time', 'in'), ('in', 'playing'),
 ('playing', 'mind'), ('mind', 'games'), ('games', 'with'), ('with', 'Louis'),
 ('Louis', 'van'), ('van', 'Gaal'), ('Gaal', 'by'), ('by', 'warning'),
 ('warning', 'the'), ('the', 'new'), ('new', 'Manchester'),
 ('Manchester', 'United'), ('United', 'manager'), ('manager', 'that'),
 ('that', 'the'), ('the', 'competitive'), ('competitive', 'nature'),
 ('nature', 'of'), ('of', 'the'), ('the', 'Premier'), ('Premier', 'League'),
 ('League', 'will'), ('will', 'make'), ('make', 'it'), ('it', 'extremely'),
 ('extremely', 'difficult'), ('difficult', 'for'), ('for', 'the'),
 ('the', 'Dutchman'), ('Dutchman', 'to'), ('to', 'win'), ('win', 'the'),
 ('the', 'title'), ('title', 'in'), ('in', 'his'), ('his', 'first'),
 ('first', 'season.')]
richie
  • possible duplicate of [Rolling or sliding window iterator in Python](http://stackoverflow.com/questions/6822725/rolling-or-sliding-window-iterator-in-python) – vaultah Aug 06 '14 at 05:48

3 Answers

3

Using `zip`:

wordseq = zip(news, news[1:])
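
As a minimal sketch of how this behaves on Python 3, where `zip` returns an iterator rather than a list, the result can be wrapped in `list()` to get the list of tuples the question asks for. The shortened `news` list here is only a stand-in for the full one.

```python
# Shortened sample list (hypothetical; stands in for the full `news` list).
news = ['Brendan', 'Rodgers', 'has', 'wasted']

# zip pairs each word with its successor; list() materialises the
# iterator that zip returns on Python 3.
wordseq = list(zip(news, news[1:]))

print(wordseq)
# [('Brendan', 'Rodgers'), ('Rodgers', 'has'), ('has', 'wasted')]
```

On Python 2, `zip` already returns a list, so the `list()` call is harmless but unnecessary there.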
WeaselFox
  • 1
    Just to make it better, I would have used `itertools.izip` and `itertools.islice`. That should improve speed & memory usage. – bgusach Aug 06 '14 at 07:21
  • 1
    `zip` itself is iterable in python 3. `islice` is still around though. – Jason S Aug 06 '14 at 08:48
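
The suggestion in these comments could look roughly like the following on Python 3, where the built-in `zip` is already lazy and only `islice` is needed to avoid copying the list. The helper name `successive_pairs` is made up for illustration, and the sketch assumes the input is a re-iterable sequence such as a list (a one-shot iterator would be consumed by both arguments and give wrong pairs).

```python
from itertools import islice

# Shortened sample list (hypothetical; stands in for the full `news` list).
news = ['Brendan', 'Rodgers', 'has', 'wasted']

def successive_pairs(words):
    """Lazily pair each word with its successor without slicing a copy.

    Assumes `words` is a sequence (e.g. a list), not a one-shot iterator.
    """
    # islice(words, 1, None) iterates from the second element onward.
    return zip(words, islice(words, 1, None))

pairs = list(successive_pairs(news))
print(pairs)
# [('Brendan', 'Rodgers'), ('Rodgers', 'has'), ('has', 'wasted')]
```

For a list of a few dozen words the difference from plain slicing is negligible; the lazy form mainly matters for very long inputs.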
3

You can use `zip(news[:-1], news[1:])`.
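
A small check, using a shortened stand-in for `news`, suggests that trimming the last element with `news[:-1]` is optional: `zip` stops at the shorter input, so both forms should give the same pairs.

```python
# Shortened sample list (hypothetical; stands in for the full `news` list).
news = ['Brendan', 'Rodgers', 'has', 'wasted']

# Explicitly drop the last word from the first argument...
with_trim = list(zip(news[:-1], news[1:]))
# ...or rely on zip stopping when the shorter input runs out.
without_trim = list(zip(news, news[1:]))

print(with_trim == without_trim)
# True
```

The explicit `[:-1]` costs one extra list copy but makes the equal lengths visible to the reader.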

BlackMamba
1
wordseq = [(news[i-1], news[i]) for i in range(1, len(news))]
dexter