So I'm rather new to R, and I'm learning how to mine text from this handy website: https://eight2late.wordpress.com/2015/05/27/a-gentle-introduction-to-text-mining-using-r/
I have my own set of .doc, .docx, and .xlsx files that I'm trying to mine. They're located in a folder called 'files' in my working directory, but I hit an error after writing only a few lines of code.
The code I have so far is:
library(tm)
library(readtext)
data = readtext('files')
At this point, after waiting for 25 seconds or so, I get the error:
Error: System call to 'antiword' failed (1): The Big Block Depot is damaged
and the code stops running there.
I have tried searching online for solutions, but it seems to be a fairly rare error. I found only one possible solution, at https://github.com/ropensci/antiword/issues/1, and it did not work for me.
That solution suggested that one of my files was corrupt, and recommended using the code
fixInNamespace(antiword, pos="package:antiword")
to change the error to a warning, so that reading the files is not interrupted. I tried that, and at first it raised the error
Error in as.environment(pos):
no item called "package:antiword" on the search list
After that, I loaded the antiword package with library(antiword) and changed the stop( call to warning(. However, when I ran the data = readtext('files') line again, it immediately raised the error:
Error in is_windows() : could not find function "is_windows"
I'm at a loss here! Any help would be appreciated. Should I be using another package in this case?
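In case it helps narrow things down, here is a rough sketch of how I think I could find the offending file: read each file on its own and catch the error, rather than patching antiword. The helper name read_each is my own invention (it is not part of readtext or antiword), and I'm assuming that readtext() accepts a single file path the same way it accepts a folder.

```r
# Hypothetical helper (not from any package): apply a reader to each path
# separately, so one damaged file cannot abort the whole run.
read_each <- function(paths, reader) {
  results <- lapply(paths, function(p) {
    tryCatch(
      reader(p),
      error = function(e) {
        # Report which file failed and why, then carry on with NULL.
        message("Failed to read ", p, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- paths
  results
}

# Assumes the documents live in the 'files' folder of the working directory.
files <- list.files("files", full.names = TRUE)
docs <- read_each(files, readtext::readtext)

# Paths whose read failed (these would be the candidates for the corrupt file).
failed <- names(docs)[vapply(docs, is.null, logical(1))]
print(failed)
```

If that works as I expect, the files listed in failed are the ones antiword cannot parse, and the rest could be combined and mined as usual.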