I've tried change wordcount example using mrjob. My structure project is:
├── input_text.txt
├── store_xml_dir
│ ├── xml_file.xml
│ └── xml_parse.py
└── wordcount.py
and content of wordcount.py is:
import os
import sys
cwdir = os.path.dirname(__file__)
sys.path.append(cwdir)
sys.path.append(os.path.join(cwdir, "store_xml_dir"))
import xml_parse
# print dir(xml_parse) <- it works here if i'd comment the rest code
from mrjob.job import MRJob
class MRWordFrequencyCount(MRJob):
def mapper(self, _, line):
getxml = xml_parse.GetXML()
print '>>>', getxml.get_strings()
yield "chars", len(line)
yield "words", len(line.split())
yield "lines", 1
def reducer(self, key, values):
yield key, sum(values)
if __name__ == '__main__':
MRWordFrequencyCount.run()
When i run, i've got error: ImportError: No module named xml_parse
. Why python can not import xml_parse
in this case?