
I have a problem with Python. I'm trying to understand what information is stored in an object that I discovered to be a generator. I don't know Python, but I have to understand how this code works in order to convert it to Java. The code is the following:

def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text: return []
    candidates = ([first]+segment(rem) for first,rem in splits(text))
    return max(candidates, key=Pwords)

def splits(text, L=20):
    "Return a list of all possible (first, rem) pairs, len(first)<=L."
    pairs = [(text[:i+1], text[i+1:]) for i in range(min(len(text), L))]
    return pairs

def Pwords(words):
    "The Naive Bayes probability of a sequence of words."
    productw = 1
    for w in words:
        productw = productw * Pw(w)
    return productw

While I understand how the functions Pwords and splits work (the function Pw(w) simply gets a value from a matrix), I'm still trying to understand how the "candidates" object in the "segment" function is built and what it contains, as well as how the "max()" function analyzes this object.

I hope someone can help me, because I couldn't find any workable way here to print this object. Thanks a lot to everybody. Mauro.

Mauro
  • possible duplicate of [Understanding Generators in Python?](http://stackoverflow.com/questions/1756096/understanding-generators-in-python) Related question: [The Python yield keyword explained](http://stackoverflow.com/questions/231767/the-python-yield-keyword-explained) – Bakuriu May 16 '13 at 11:40

1 Answer


A generator is a fairly simple abstraction. It behaves like a single-use custom iterator.

gen = (f(x) for x in data)

means that gen is an iterator whose successive values are f(x), where x is the corresponding value from data.
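A minimal sketch of that behaviour, where `f` and `data` are made-up placeholders (they are not part of the question's code):

```python
def f(x):
    # Hypothetical stand-in for any per-item computation.
    return x * x

data = [1, 2, 3]

gen = (f(x) for x in data)   # nothing has been computed yet

first = next(gen)            # runs f(1)
second = next(gen)           # runs f(2)
rest = list(gen)             # consumes whatever is left, here only f(3)

print(first, second, rest)   # prints: 1 4 [9]
```

Each call to `next()` pulls exactly one value, which is roughly what a Java `Iterator` does with `next()`.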

A generator expression is similar to a list comprehension, with a few differences:

  • it is single use: once it has been iterated, it is exhausted
  • it doesn't build the whole sequence in memory
  • its code runs only when values are requested during iteration
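These differences can be observed with a small sketch. The helper `traced` and the list `calls` are invented for this illustration only; they just record when the code actually runs:

```python
calls = []

def traced(x):
    # Hypothetical helper: records each call so we can see when code runs.
    calls.append(x)
    return x * 10

gen = (traced(x) for x in [1, 2, 3])
calls_before = list(calls)      # still empty: creating the generator ran nothing

first_pass = list(gen)          # iterating runs traced() for every item
second_pass = list(gen)         # the generator is exhausted, so this is empty

as_list = [traced(x) for x in [1, 2, 3]]  # a list comprehension runs immediately

print(calls_before, first_pass, second_pass, as_list)
# prints: [] [10, 20, 30] [] [10, 20, 30]
```

Unlike `gen`, the list `as_list` can be printed and iterated as many times as needed.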

For easier debugging, you can replace the generator expression with a list comprehension, which can be printed and inspected:

def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text: return []
    candidates = [[first]+segment(rem) for first,rem in splits(text)]
    return max(candidates, key=Pwords)
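To see what candidates holds and how max() chooses from it, here is a self-contained sketch. The question does not show Pw, so the `TOY_PROBS` table and the `Pw` fallback value below are assumptions invented for this example. max(candidates, key=Pwords) walks through the candidates once, calls Pwords on each, and returns the candidate with the largest result.

```python
# Toy word probabilities, invented for this example only.
TOY_PROBS = {"a": 0.1, "cat": 0.2, "at": 0.1, "c": 0.01}

def Pw(word):
    # Assumed stand-in for the question's Pw: unknown words get a tiny probability.
    return TOY_PROBS.get(word, 1e-6)

def Pwords(words):
    "The Naive Bayes probability of a sequence of words."
    product = 1
    for w in words:
        product *= Pw(w)
    return product

def splits(text, L=20):
    "Return a list of all possible (first, rem) pairs, len(first)<=L."
    return [(text[:i+1], text[i+1:]) for i in range(min(len(text), L))]

def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text:
        return []
    candidates = [[first] + segment(rem) for first, rem in splits(text)]
    return max(candidates, key=Pwords)

# The candidates built at the top level for "acat": one list of words per
# possible first word, each completed by the best segmentation of the rest.
top_level = [[first] + segment(rem) for first, rem in splits("acat")]
for candidate in top_level:
    print(candidate, Pwords(candidate))

best = segment("acat")
print(best)   # prints: ['a', 'cat']
```

With these toy numbers the four top-level candidates are ['a', 'cat'], ['ac', 'at'], ['aca', 't'] and ['acat'], and ['a', 'cat'] has the highest score. In Java this would roughly correspond to building a list of word lists and keeping the one with the highest score in a loop.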
oleg
  • Thank you very much!!! I changed the code and I'm starting to understand how this list is built... ;) – Mauro May 16 '13 at 12:27