I've been getting some irregular behavior from an LDA topic model program, and right now it seems that my script won't save the LDA model it creates. I'm really not sure why.
Here's a code snippet. It will take me more time to write something fully reproducible, since I'm really just loading files I created beforehand.
    import gensim
    from gensim import corpora

    # getCorpus and convert_cleaned_tokens_entries are my own helpers (not shown).
    def naive_LDA_implementation(name_of_lda, create_dict=False, remove_low_freq=False):
        # For some reason this location doesn't work entirely... and yes, I have
        # made a directory with this name in the folder.
        LDA_MODEL_PATH = "lda_dir/" + str(name_of_lda) + "/model_dir/"
        # This ends up saving the .state, .id2word, and .expElogbeta.npy files...
        # But normally, when saving an LDA model actually works, a fourth file is
        # included that is, to my understanding, the model itself.
        # LDA_MODEL_PATH = "models/"  # What I originally had. I was using a directory
        # called models for multiple LDA models. This no longer works.

        # Returns a dataframe with one row per text record, plus an extra column
        # holding the tokenized version of the text's post/string of words.
        doc_df = getCorpus(name_of_lda, cleaned=True)
        dict_path = "lda_dir/" + str(name_of_lda) + "/dict_of_tokens.dict"
        docs_of_tokens = convert_cleaned_tokens_entries(doc_df['cleaned_tokens'])

        if create_dict:
            doc_dict = corpora.Dictionary(docs_of_tokens)
            if remove_low_freq:
                doc_dict.filter_extremes(no_below=5, no_above=0.6)
            doc_dict.save(dict_path)
            print("Finished saving")
        else:
            doc_dict = corpora.Dictionary.load(dict_path)

        # Bag-of-words: each document becomes a list of (token_id, count) pairs.
        doc_term_matrix = [doc_dict.doc2bow(doc) for doc in docs_of_tokens]
        Lda = gensim.models.ldamodel.LdaModel
        ldamodel = Lda(doc_term_matrix, num_topics=15, id2word=doc_dict, passes=20, chunksize=10000)
        ldamodel.save(LDA_MODEL_PATH)
To put it straightforwardly: I have no clue why permission is being denied when I try to save my LDA model to a particular location. Right now even the original models/ directory is giving me "permission denied" with this error message. It seems like any and all directories I could use just won't work. This is odd behavior, and I can't find questions that discuss this error in the same context. I have found posts from people who got this error message when they tried to save to locations that did not exist, but that isn't my situation: the directories exist.
When I first got this error, I started to wonder if it was because I had another LDA topic model that I had named topic_model_1. It was stored in the models/ subdirectory. I wondered if the name was a potential cause and changed it to lda_model_topic_1 to see if that would change the result, but nothing is working.
Even if you can't figure out which solution applies to my situation (especially since I don't have reproducible code right now, just my own work), can someone tell me what this error message means? When and why does it come up? Maybe that's a start.
    Traceback (most recent call last):
      File "C:\Users\biney\Miniconda3\lib\site-packages\gensim\utils.py", line 679, in save
        _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
    TypeError: file must have a 'write' attribute

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "text_mining.py", line 461, in <module>
        main()
      File "text_mining.py", line 453, in main
        naive_LDA_implementation(name_of_lda="lda_model_topic_1", create_dict=True, remove_low_freq=True)
      File "text_mining.py", line 411, in naive_LDA_implementation
        ldamodel.save(LDA_MODEL_PATH)
      File "C:\Users\biney\Miniconda3\lib\site-packages\gensim\models\ldamodel.py", line 1583, in save
        super(LdaModel, self).save(fname, ignore=ignore, separately=separately, *args, **kwargs)
      File "C:\Users\biney\Miniconda3\lib\site-packages\gensim\utils.py", line 682, in save
        self._smart_save(fname_or_handle, separately, sep_limit, ignore, pickle_protocol=pickle_protocol)
      File "C:\Users\biney\Miniconda3\lib\site-packages\gensim\utils.py", line 538, in _smart_save
        pickle(self, fname, protocol=pickle_protocol)
      File "C:\Users\biney\Miniconda3\lib\site-packages\gensim\utils.py", line 1337, in pickle
        with smart_open(fname, 'wb') as fout:  # 'b' for binary, needed on Windows
      File "C:\Users\biney\Miniconda3\lib\site-packages\smart_open\smart_open_lib.py", line 181, in smart_open
        fobj = _shortcut_open(uri, mode, **kw)
      File "C:\Users\biney\Miniconda3\lib\site-packages\smart_open\smart_open_lib.py", line 287, in _shortcut_open
        return io.open(parsed_uri.uri_path, mode, **open_kwargs)
    PermissionError: [Errno 13] Permission denied: 'lda_dir/lda_model_topic_1/model_dir/'
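
One thing I can show without gensim: the path in the last line of the traceback ends in a slash, so it names a directory rather than a file. Below is a minimal sketch, under the assumption (not confirmed) that this is what triggers the error. It uses only the standard library; `try_open_for_write` and the file name `lda.model` are hypothetical names I made up for the illustration. Opening an existing directory for binary writing fails with an `OSError` subclass (`PermissionError` on Windows, `IsADirectoryError` on Linux/macOS), while a path made of the directory plus a suffix such as `.state` opens fine, which might explain why the three side files get written but the main one does not.

```python
import os
import tempfile

def try_open_for_write(path):
    """Return "ok" if path can be opened for binary writing,
    otherwise the name of the OSError subclass that was raised."""
    try:
        with open(path, "wb") as fout:
            fout.write(b"x")
        return "ok"
    except OSError as err:
        return type(err).__name__

# Build an existing directory whose path ends in a separator,
# mirroring "lda_dir/<name>/model_dir/" from the question.
base = tempfile.mkdtemp()
model_dir = os.path.join(base, "model_dir") + os.sep
os.makedirs(model_dir)

# The directory itself cannot be opened as a file.
dir_result = try_open_for_write(model_dir)

# A file name inside the directory can (hypothetical name).
file_result = try_open_for_write(model_dir + "lda.model")

# Directory path plus a suffix is also a valid file inside the directory.
suffix_result = try_open_for_write(model_dir + ".state")

print(dir_result, file_result, suffix_result)
```

If that assumption holds, passing something like `LDA_MODEL_PATH + "lda.model"` to `ldamodel.save` instead of the bare directory path may be worth trying, though I haven't verified it against my actual data.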