3

I have a Django app that calls unoconv using subprocess. It works fine when I run it in my development environment, but it errors out in the production environment.

It gives this error:

"unoconv: Cannot find a suitable pyuno library and python binary combination in /usr/lib/libreoffice ERROR: No module named uno

unoconv: Cannot find a suitable office installation on your system. ERROR: Please locate your office installation and send your feedback to: http://github.com/dagwieers/unoconv/issues"

But unoconv runs just fine from the command line.

Since I run the django app as a uwsgi vassal, it might have something to do with privileges, though for the life of me I can't figure out how to fix it.

PS - The Django app does not start a unoconv listener; there is one already running.

EDIT - It wasn't a permissions issue but a path issue: unoconv was being called from the wrong Python interpreter (the virtualenv one).
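
One way to sidestep the interpreter mix-up described in the edit is to run unoconv through the system Python explicitly, with the virtualenv entries removed from the environment. The following is only a sketch and not part of the original post: the paths `/usr/bin/python3` and `/usr/bin/unoconv` and the helper names `system_env` and `unoconv_command` are assumptions to adapt to your system.

```python
import os

# Assumed locations; check yours with `which python3` and `which unoconv`
# from a shell where the virtualenv is NOT active.
SYSTEM_PYTHON = "/usr/bin/python3"
UNOCONV_SCRIPT = "/usr/bin/unoconv"


def system_env(environ=None):
    """Return a copy of the environment without the virtualenv entries."""
    env = dict(os.environ if environ is None else environ)
    venv = env.pop("VIRTUAL_ENV", None)
    env.pop("PYTHONHOME", None)
    if venv:
        # Drop the virtualenv's bin directory so `python` resolves to the
        # system interpreter again.
        venv_bin = os.path.join(venv, "bin")
        parts = env.get("PATH", "").split(os.pathsep)
        env["PATH"] = os.pathsep.join(p for p in parts if p != venv_bin)
    return env


def unoconv_command(source_path, fmt="pdf"):
    """Build the argument list that runs unoconv under the system Python."""
    return [SYSTEM_PYTHON, UNOCONV_SCRIPT, "-f", fmt, source_path]
```

From the Django code this could then be called as `subprocess.run(unoconv_command(path), env=system_env(), check=True)`.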

elssar

3 Answers

5

Since the app is running in a virtualenv, unoconv is being called with the virtualenv's Python interpreter instead of the system one.

The fix is pretty simple if you have virtualenvwrapper: call the add2virtualenv command with the path to the directory containing uno.py and unohelper.py as the argument (/usr/share/pyshared in my case).
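
For readers without virtualenvwrapper, roughly the same effect can be had by hand. As far as I understand, add2virtualenv records the directory in a `.pth` file inside the virtualenv's site-packages; the sketch below writes such a file directly. The function name `add_path_to_site_packages` and the file name `uno_bridge.pth` are made up for illustration.

```python
import os
import sysconfig


def add_path_to_site_packages(extra_dir, site_packages=None, name="uno_bridge.pth"):
    """Write a .pth file so that `extra_dir` is added to sys.path at startup."""
    if site_packages is None:
        # site-packages of whichever interpreter runs this function
        site_packages = sysconfig.get_paths()["purelib"]
    pth_file = os.path.join(site_packages, name)
    with open(pth_file, "w") as handle:
        handle.write(extra_dir + "\n")
    return pth_file
```

Run it once with the virtualenv's own interpreter, for example `add_path_to_site_packages("/usr/share/pyshared")`, then try `import uno` from that interpreter. This presumably only helps when the virtualenv's Python version matches the one the system pyuno bindings were built for.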

elssar
  • Yes ! Thank you. In my case, `uno.py` and `unohelper.py` are in "the system site-packages directory" (`/usr/lib/python3/dist-packages`), like described in the virtualenvwrapper doc. If it can help anyone :) – Jeb Apr 16 '18 at 09:27
  • Hi elssar, I am experiencing the exact same issue. Can you describe the fix a little bit more specifically. I have trouble understanding the virtualenvwrapper part. – Iftieaq May 06 '19 at 17:41
3

Are you sure that you absolutely need unoconv for your use case? It is powerful, but since it needs a full-fledged LibreOffice to run, it is 1) somewhat slow to convert files; 2) slow to start; 3) memory-hungry; and 4) not very scalable.

Why don't you try Apache Tika (which uses Apache POI to parse Microsoft Office formats)? It is somewhat more lightweight and more than good enough for most day-to-day tasks.

You can let Tika process PDF files too, or use python-magic to detect the file type and hand PDFs to a separate utility such as pdftotext. Here's a simplified version of what you can use to convert office files to, let's say, text:

import subprocess

import magic  # https://github.com/ahupp/python-magic
from django.db import models

PDFTOTEXT_COMMAND = '/usr/bin/pdftotext'
JAVA_COMMAND = '/usr/bin/java'
TIKA_PATH = '/path/to/tika.jar'
PDFTOTEXT_OPTIONS = ['-']  # '-' makes pdftotext write to stdout
JAVA_OPTIONS = ['-jar', TIKA_PATH, '--text']

mime = magic.Magic(mime=True)


class UploadedFileModel(models.Model):
    file = models.FileField(upload_to='files/')

    def get_txt(self):
        """Return the extracted text as bytes, or None if the converter failed."""
        if 'application/pdf' in mime.from_file(self.file.path):
            # PDFs go through pdftotext.
            option_list = [PDFTOTEXT_COMMAND, self.file.path] + PDFTOTEXT_OPTIONS
        else:
            # Everything else goes through Tika.
            option_list = [JAVA_COMMAND] + JAVA_OPTIONS + [self.file.path]

        pipe = subprocess.Popen(option_list, stdout=subprocess.PIPE)
        txt = pipe.communicate()[0]
        # A non-zero exit code means the conversion failed.
        if pipe.returncode:
            return None
        return txt

P.S. The error "unoconv: Cannot find a suitable pyuno library and python binary combination" can have many different causes. It is impossible to tell for sure without more information from you. For example, it could be a problem with paths.

Be sure to check out the relevant unoconv troubleshooting guides:

Ivan Kharlamov
  • Yes, we do need unoconv. Performance has been a non-issue so far in development, and I don't think it will be a problem in production. Also, I figured out what was causing the error: it was a problem with the virtualenv. Thanks for the suggestions though – elssar Oct 07 '13 at 06:35
  • @elssar, thanks! IMO, you should create your own answer and accept it, also you should probably edit the title of the question and exclude *running as uwsgi vassal* to help out other people if they stumble upon the same issue to actually find your answer. When dealing with `unoconv` in production, be sure to incorporate multiple retries, since libreoffice does occasionally crash and it takes time to warm the instance up. BTW, it would be interesting to know, why you need `unoconv` in particular. – Ivan Kharlamov Oct 07 '13 at 07:46
  • Also, according to your statement, it looks like it was a problem with the `python` path and the fact that `unoconv` was called with the wrong `python` interpreter in mind (the one from `virtualenv` instead of the system one). A path issue, indeed. – Ivan Kharlamov Oct 07 '13 at 08:02
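
The "multiple retries" advice in the comments above could look something like the following sketch. It is an illustration only: the helper name `run_with_retries` and its defaults are assumptions, and the right number of attempts and delay depend on how long your LibreOffice instance takes to warm up.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay=2.0, runner=subprocess.run):
    """Run `command`, retrying when it exits non-zero or cannot be started."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # check=True turns a non-zero exit code into CalledProcessError
            return runner(command, check=True, capture_output=True)
        except (subprocess.CalledProcessError, OSError) as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # give LibreOffice time to recover
    raise last_error
```

A call such as `run_with_retries(["unoconv", "-f", "pdf", path])` would then retry a crashed conversion up to three times before giving up.
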
0

Just try setting these in your Linux terminal (after activating the environment):

URE_BOOTSTRAP=vnd.sun.star.pathname:/usr/lib64/libreoffice/program/fundamentalrc
UNO_PATH=/usr/lib64/libreoffice/program
PATH=/usr/lib64/libreoffice/program:/home/graaff/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.3:/opt/android-sdk-update-manager/tools:/opt/android-sdk-update-manager/platform-tools:/usr/games/bin

Or at least try UNO_PATH and PATH.
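
Since the question calls unoconv through subprocess, the same variables can also be passed from Python instead of a terminal. This is a rough sketch, not a tested fix: the helper name `libreoffice_env` is made up, and the install directory is the one from this answer (the error message in the question points at /usr/lib/libreoffice instead, so adjust accordingly).

```python
import os

# Assumed install location, taken from the answer above.
LIBREOFFICE_PROGRAM_DIR = "/usr/lib64/libreoffice/program"


def libreoffice_env(program_dir=LIBREOFFICE_PROGRAM_DIR, environ=None):
    """Return a copy of the environment with the LibreOffice variables set."""
    env = dict(os.environ if environ is None else environ)
    env["UNO_PATH"] = program_dir
    env["URE_BOOTSTRAP"] = (
        "vnd.sun.star.pathname:" + os.path.join(program_dir, "fundamentalrc")
    )
    # Put the LibreOffice program directory first on PATH.
    existing = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    env["PATH"] = os.pathsep.join([program_dir] + existing)
    return env
```

The result can be handed to the subprocess call, for example `subprocess.Popen(["unoconv", "-f", "pdf", path], env=libreoffice_env())`.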

yunus