This morning I learned how to use pickle to dump lists to a text file, because .write does not accept a list. I am following a YouTube video, Natural Language Processing With Python and NLTK p.4, where you can see what the full output should be. He does not write the data to a txt file, but I wanted to take it further to learn more.
Sample Terminal Output: [('PRESIDENT', 'NNP'), ('GEORGE', 'NNP'), ('W.', 'NNP'), ('BUSH', 'NNP'), ("'S", 'POS')
Note: This is supposed to go on for the whole speech, and it does in the terminal.
Full File Output: €]q (X (qh†qX ApplauseqX NNPq†qX .qh†qX )qh†q e.
My Code:
import nltk
from nltk.corpus import state_union
from nltk.tokenize import PunktSentenceTokenizer
import pickle

output = open('stoutput.txt', 'wb')

train_text = state_union.raw('2005-GWBush.txt')
sample_text = state_union.raw('2006-GWBush.txt')

custom_sent_tokenizer = PunktSentenceTokenizer(train_text)
tokenized = custom_sent_tokenizer.tokenize(sample_text)

def process_content():
    try:
        for i in tokenized:
            words = nltk.word_tokenize(i)
            tagged = nltk.pos_tag(words)
            print(tagged)
            pickle.dump(tagged, open('stoutput.txt', 'wb'))
    except Exception as e:
        pickle.dump(e, open('stoutput.txt', 'wb'))
        print(str(e))

process_content()
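For comparison, here is a minimal sketch of two ways the tagged sentences could be saved. Two things about the code above are worth keeping in mind: pickle writes a binary format, so the file is not meant to be readable in a text editor, and every open('stoutput.txt', 'wb') call truncates the file, so only the last sentence dumped survives. The helper names (write_tagged_sentences, pickle_tagged_sentences, load_tagged_sentences) and the stand-in data are assumptions made for illustration. They are not part of NLTK or the video.

```python
import os
import pickle
import tempfile


def write_tagged_sentences(tagged_sentences, path):
    """Hypothetical helper: write one tagged sentence per line as plain text."""
    # Open the file once, in text mode, so later sentences do not
    # overwrite earlier ones.
    with open(path, 'w', encoding='utf-8') as out:
        for tagged in tagged_sentences:
            # str() gives the same text that print(tagged) shows.
            out.write(str(tagged) + '\n')


def pickle_tagged_sentences(tagged_sentences, path):
    """Hypothetical helper: store all sentences in a single binary pickle."""
    with open(path, 'wb') as out:
        pickle.dump(list(tagged_sentences), out)


def load_tagged_sentences(path):
    """Hypothetical helper: read the pickled list back into Python objects."""
    with open(path, 'rb') as src:
        return pickle.load(src)


# Stand-in data; in the script above each item would come from
# nltk.pos_tag(words).
sample = [
    [('PRESIDENT', 'NNP'), ('GEORGE', 'NNP')],
    [('(', '('), ('Applause', 'NNP'), ('.', '.'), (')', ')')],
]

# A temporary directory keeps the demonstration from touching real files.
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, 'stoutput.txt')
    pickle_path = os.path.join(tmp, 'stoutput.pkl')

    write_tagged_sentences(sample, text_path)
    with open(text_path, encoding='utf-8') as src:
        text_lines = src.read().splitlines()

    pickle_tagged_sentences(sample, pickle_path)
    restored = load_tagged_sentences(pickle_path)
```

The text version is the one to use if the goal is a file you can open and read. The pickle version only makes sense if another Python program will load the list back later.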
Any help is greatly appreciated as I know it takes time. Thanks for reading.