10

I am new to TensorFlow and neural networks. I started a project on detecting errors in Persian text. I used the code at this address and developed my own code here. Please check the code there, because I cannot put all of it in this post.

What I want to do is train the model on several Persian sentences and then see whether it can detect incorrect sentences. The model works fine with English data, but when I use Persian data I run into this issue.

The code is too long to post here, so I will point to the part I think might be causing the issue. I used these lines in train.py, which work fine and store the vocabulary:

x_text, y = data_helpers.load_data_labels(datasets)
# Build vocabulary
max_document_length = max([len(x.split(" ")) for x in x_text])
vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
x = np.array(list(vocab_processor.fit_transform(x_text)))

However, after training, when I run this code in eval.py:

vocab_path = os.path.join(FLAGS.checkpoint_dir, "..", "vocab")
vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
x_test = np.array(list(vocab_processor.transform(x_raw)))

this error occurs:

    vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\contrib\learn\python\learn\preprocessing\text.py", line 226, in restore
    return pickle.loads(f.read())
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 118, in read
    self._preread_check()
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 78, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: ..\vocab : The system cannot find the file specified.

I think the problem is that it cannot read the vocabulary stored after training, since the data is Unicode text rather than English. Can anyone help me, please?

Hamed Temsah

2 Answers

5

This problem happens because the vocab path is not correct. In train.py, after line 144, where out_dir is set, I added this:

# Record the run's output directory so that eval.py can find the vocabulary later.
with open('model_dir.txt', 'w') as f:
    f.write(out_dir)

After training, the model's output directory is saved in a file named model_dir.txt in the current working directory.

Then in eval.py I added this:

# Read back the output directory recorded by train.py.
with open('model_dir.txt') as f:
    model_dir = f.readline().strip()
vocab_path = os.path.join(model_dir, "vocab")

Now the path is set correctly and the code works without any problem.
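For anyone who wants the same idea in reusable form, here is a minimal sketch of it, not the exact code from my project. The helper names save_model_dir and load_vocab_path are made up for illustration; only model_dir.txt and the "vocab" file name come from the steps above. It stores an absolute path, so eval.py does not depend on the directory it is started from, and it checks that the vocab file exists before anything tries to restore it. As a side note, the "..\vocab" in the error message is what os.path.join("", "..", "vocab") produces on Windows, which would be consistent with FLAGS.checkpoint_dir being empty, but that is only a guess.

```python
import os

# File name used in the answer above; the helper names below are hypothetical.
MODEL_DIR_FILE = "model_dir.txt"


def save_model_dir(out_dir, record_path=MODEL_DIR_FILE):
    """Write the absolute path of the training output directory to a text file."""
    with open(record_path, "w", encoding="utf-8") as f:
        f.write(os.path.abspath(out_dir))


def load_vocab_path(record_path=MODEL_DIR_FILE):
    """Return the path of the 'vocab' file inside the recorded directory.

    Raises FileNotFoundError with the resolved path if the file is missing,
    so a wrong location is reported before the vocabulary is restored.
    """
    with open(record_path, encoding="utf-8") as f:
        model_dir = f.readline().strip()
    vocab_path = os.path.join(model_dir, "vocab")
    if not os.path.isfile(vocab_path):
        raise FileNotFoundError("vocab file not found at: " + vocab_path)
    return vocab_path
```

In train.py you would call save_model_dir(out_dir) once out_dir is set, and in eval.py pass the result of load_vocab_path() to VocabularyProcessor.restore.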

3

Have you tried adding this at the top of your file?

# -*- coding: utf-8 -*-
Myles Hollowed