321

I'm thinking about putting the virtualenv for a Django web app I am making inside my git repository for the app. It seems like an easy way to keep deploys simple and easy. Is there any reason why I shouldn't do this?

allyourcode
Lyle Pratt

9 Answers

349

I use pip freeze to get the packages I need into a requirements.txt file and add that to my repository. I tried to think of a reason why you would want to store the entire virtualenv, but I could not.
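
For reference, the round trip is just two commands (a minimal sketch; requirements.txt is only a naming convention, and any file name works with -r):

# Capture the exact versions of everything installed in the current env.
pip freeze > requirements.txt

# Later, inside a fresh virtualenv, reinstall that exact set.
pip install -r requirements.txt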

jojo
RyanBrady
  • Great suggestion! I understand that I can use a requirements.txt file to keep track of packages. However, the reason why I would like to keep the virtualenv in the repository is to make deploying new servers easier. If the virtualenv is contained inside the application, I don't have to create a new virtualenv in new instances I bring up; all I have to do is pull the code. I know I could use something like fabric to set up the env for me, but it just seems like an unnecessary step if I can just keep the env in my repo. – Lyle Pratt Jul 06 '11 at 02:16
  • 88
    You can save the unnecessary space in your repo and still deploy to a new server in a single command: virtualenv --no-site-packages --distribute .env && source .env/bin/activate && pip install -r requirements.txt – RyanBrady Jul 06 '11 at 02:39
  • Awesome, I'll keep that example in mind. But, it sounds like, beyond the extra space requirements, there's no reason why I can't just include the env in the repo. – Lyle Pratt Jul 06 '11 at 02:44
  • 2
    I'm giving you the answer to this question, since it is probably the "best practice" and you offered it first. I have definitely encountered some of the problems that everyone has mentioned. I'm estimating I give myself another day messing with it before I just do what you guys have been suggesting all along and use pip and a requirements file. Thanks for your help! – Lyle Pratt Jul 06 '11 at 15:19
  • Take a look at the buildout approach. It's like virtualenv but declarative. You create one file, buildout.cfg (requirements.txt on steroids), that defines required packages, versions, and package sources (including git repos), and then just build the environment. [buildout.org](http://www.buildout.org/), [my example usage](https://github.com/munhitsu/cr3template-django-minimal) – Munhitsu Jul 06 '11 at 17:07
  • 12
    If you, say, `pip install mysql-python` on a 64-bit machine, and then someone with a 32-bit machine tries to use it, it will not work. It uses a C module, like many Python modules do, to increase performance. I imagine Windows->Linux would also not work. – Matt Williamson Jul 25 '12 at 15:41
  • 7
    just a remark: we got bitten in the past because libraries sometimes become unavailable from pip (the pinned version gets too old), forcing an upgrade while the site was down. So... I will now never rely on `pip freeze` alone to do this. The issue is that no one pays for the forced-upgrade redeploy, and no one pays for intermediate upgrades ("best practice" maintenance) either. – CONTRACT SAYS I'M RIGHT Jan 10 '15 at 10:09
  • 5
    Note on the @RyanBrady comment: The `--distribute` and `--setuptools` options are now no-ops. (distribute, which was a fork of setuptools, was merged back long ago.) `--no-site-packages` is DEPRECATED; it is now the default behavior. – JackNova Jul 16 '16 at 05:13
  • 1
    @CONTRACTSAYSI'MRIGHT Yes, I had the same situation. I spent hours matching the new environment's packages with the old one's when I tried to set up the new environment from the requirements file. The project was 2 years old and most of the packages were outdated. However, I do not have any idea how a virtual environment behaves on different architectures. – Onur Demir Mar 02 '17 at 07:11
  • 1
    @aod Nowadays hard drive space is SO cheap that, for production apps, we resorted to just setting up another repo in these cases and storing the whole virtualenv thing (I also do that for node apps and node_modules, where tbh the problem isn't as common). Sure, it is cool to not have to do it, but if the app constitutes your customer's livelihood, it's just not worth the space savings. Worst case, you *need* to be able to restore it ASAP, build artifacts and everything. – CONTRACT SAYS I'M RIGHT Mar 02 '17 at 12:17
  • I do the same; however, I was trying to find a way to export my environment to a file so that I can create my env, with the same name, from that file, instead of only getting my dependencies from my requirements.txt. That is to avoid letting git automatically stage the env folder once it is created. – 3nomis Oct 01 '19 at 16:18
62

Storing the virtualenv directory inside git will, as you noted, allow you to deploy the whole app by just doing a git clone (plus installing and configuring Apache/mod_wsgi). One potentially significant issue with this approach is that on Linux the full path gets hard-coded in the venv's activate, django-admin.py, easy_install, and pip scripts. This means your virtualenv won't entirely work if you want to use a different path, perhaps to run multiple virtual hosts on the same server. I think the website may actually work with the paths wrong in those files, but you would have problems the next time you tried to run pip.

The solution, already given, is to store enough information in git so that during the deploy you can create the virtualenv and do the necessary pip installs. Typically people run pip freeze to get the list, then store it in a file named requirements.txt. It can be loaded with pip install -r requirements.txt. RyanBrady already showed how you can string the deploy statements into a single line:

# before 15.1.0
virtualenv --no-site-packages --distribute .env &&\
    source .env/bin/activate &&\
    pip install -r requirements.txt

# after deprecation of some arguments in 15.1.0
virtualenv .env && source .env/bin/activate && pip install -r requirements.txt

Personally, I just put these in a shell script that I run after doing the git clone or git pull.
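
For example, such a wrapper might look like this (a hypothetical sketch; the deploy.sh name and the .env location are my assumptions, not part of the answer):

#!/bin/sh
# deploy.sh -- hypothetical helper, run after git clone or git pull.
set -e                            # abort on the first failed command
cd "$(dirname "$0")"              # work relative to the repo root
[ -d .env ] || virtualenv .env    # create the env only on first deploy
. .env/bin/activate
pip install -r requirements.txt   # sync packages with the pinned list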

Storing the virtualenv directory also makes it a bit trickier to handle pip upgrades, as you'll have to manually add/remove and commit the files resulting from the upgrade. With a requirements.txt file, you just change the appropriate lines in requirements.txt and re-run pip install -r requirements.txt. As already noted, this also reduces "commit spam".

icedwater
  • 4
    Note that --distribute is now deprecated (at least in 15.1.0): ```--distribute DEPRECATED. Retained only for backward compatibility. This option has no effect.``` – AnthonyC Aug 07 '17 at 14:13
  • 1
    `--no-site-packages` is deprecated in 15.1.0 as well, as that's now the default. – cjs Jan 17 '18 at 05:29
37

I used to do the same until I started using libraries that are compiled differently depending on the environment, such as PyCrypto. My PyCrypto build from my Mac wouldn't work on Cygwin, and the Cygwin build wouldn't work on Ubuntu.

It becomes an utter nightmare to manage the repository.

Either way, I found it easier to manage the pip freeze & requirements file combo than having it all in git. It's cleaner, too, since you get to avoid the commit spam for thousands of files as those libraries get updated...

Yuji 'Tomita' Tomita
  • Hmm. I definitely won't have problems with stuff being compiled differently in different environments. I guess it's probably worth not doing it just to avoid the commit spam. – Lyle Pratt Jul 06 '11 at 03:23
  • @LylePratt: I think the opposite: better not include whole virtualenv in the repository just to avoid issues with having such great tools as PyCrypto or PIL. – Tadeck Dec 18 '12 at 00:21
20

I think one of the main problems is that the virtualenv might not be usable by other people. The reason is that it always uses absolute paths. So if your virtualenv was, for example, in /home/lyle/myenv/, it will assume the same for all other people using this repository (it must be exactly the same absolute path). You can't presume that people use the same directory structure as you.
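
You can see the problem by peeking into the generated scripts (a rough sketch of typical contents; the details vary by virtualenv version):

# myenv/bin/activate bakes in the absolute path:
VIRTUAL_ENV="/home/lyle/myenv"
export VIRTUAL_ENV

# myenv/bin/pip begins with an absolute shebang:
#!/home/lyle/myenv/bin/python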

The better practice is for everybody to set up their own environment (be it with or without virtualenv) and install the libraries there. That also makes your code more usable across different platforms (Linux/Windows/Mac), not least because virtualenv is installed differently on each of them.

isherwood
Torsten Engelbrecht
  • This is right on as to why it's a bad idea to keep a virtualenv in SCM, but it's worth considering either something like @RyanBrady's suggestion or even [a bootstrap.py script](http://www.virtualenv.org/en/latest/), as having some means of recreating the same environment across machines is a serious need when working with other people. – ig0774 Jul 06 '11 at 02:05
  • I'm not really sure the problem you mentioned would be a problem in my situation exactly. My Django app contains a .wsgi file that defines where the virtualenv is relative to its location (2 directories up, '../../env'). So, in my scenario, the absolute path problem should not negatively affect me...right? – Lyle Pratt Jul 06 '11 at 02:13
  • If you run your app always with WSGI then you might get away with it. If you use the development server (via `manage.py`) you will run into problems for sure. – Torsten Engelbrecht Jul 06 '11 at 03:44
  • Spot on; all legit reasons, and it highly increases the flexibility of the code (making it more viable, specifically on Windows, due to its differences in architecture). – malik bagwala Jul 16 '20 at 04:47
9

It's not a good idea to include any environment-dependent component or setting in your repos, as one of the key aspects of using a repo is sharing it with other developers. Here is how I would set up my development environment on a Windows PC (say, Win10).

  1. Open PyCharm and, on the first page, choose to check out the project from your source control system (in my case, I am using GitHub).

  2. In PyCharm, navigate to Settings, choose "Project Interpreter", and choose the option to add a new virtual environment; you can call it "venv".

  3. Choose the base Python interpreter, which is located at C:\Users\{user}\AppData\Local\Programs\Python\Python36 (make sure you choose the appropriate version of Python based on what you have installed).

  4. Note that PyCharm will create the new virtual environment and copy the Python binaries and required libraries under your venv folder inside your project folder.

  5. Let PyCharm complete its scanning, as it needs to rebuild/refresh your project skeleton.

  6. Exclude the venv folder from your git interactions (add venv/ to the .gitignore file in your project folder; see the snippet below).
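
For step 6, the ignore entry is a single line (a sketch; adjust the name if you called the environment something else, and note that .gitignore uses forward slashes even on Windows):

# .gitignore
venv/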

Bonus: If you want people to easily (well, almost easily) install all the libraries your software needs, you can use

pip freeze > requirements.txt

and put the instructions in your repo so people can use the following command to install all required libraries at once.

pip install -r requirements.txt 
3

I use what is basically David Sickmiller's answer with a little more automation. I create a (non-executable) file at the top level of my project named activate with the following contents:

[ -n "$BASH_SOURCE" ] \
    || { echo 1>&2 "source (.) this with Bash."; exit 2; }
(
    cd "$(dirname "$BASH_SOURCE")"
    [ -d .build/virtualenv ] || {
        virtualenv .build/virtualenv
        . .build/virtualenv/bin/activate
        pip install -r requirements.txt
    }
)
. "$(dirname "$BASH_SOURCE")/.build/virtualenv/bin/activate"

(As per David's answer, this assumes you're doing a pip freeze > requirements.txt to keep your list of requirements up to date.)

The above gives the general idea; the actual activate script (documentation) that I normally use is a bit more sophisticated, offering a -q (quiet) option, using python when python3 isn't available, etc.

This can then be sourced from any current working directory and will properly activate, first setting up the virtual environment if necessary. My top-level test script usually has code along these lines so that it can be run without the developer having to activate first:

cd "$(dirname "$0")"
[[ $VIRTUAL_ENV = $(pwd -P) ]] || . ./activate

Sourcing ./activate, not activate, is important here because the latter will find any other activate in your path before it will find the one in the current directory.

cjs
  • 21,799
  • 6
  • 79
  • 93
  • Loving this approach! Sounds very reasonable, thank you for sharing. – Esolitos Feb 20 '19 at 13:10
  • I had to change the first line to `[[ $_ != $0 ]] || { echo 1>&2 "source (.) this script with Bash."; exit 2; }` to detect if the script was being executed as opposed to sourced – Chris Snow Nov 17 '19 at 02:50
2

If you know which operating systems your application will be running on, I would create one virtualenv for each system and include it in my repository. Then I would make my application detect which system it is running on and use the corresponding virtualenv.

The system could e.g. be identified using the platform module.
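
For illustration, the selection could be done at activation time along these lines (a sketch; the env-<system> directory naming is my assumption, not from this answer):

# Activate the committed virtualenv that matches the current platform.
SYSTEM="$(python -c 'import platform; print(platform.system())')"   # e.g. Linux, Darwin
. "env-${SYSTEM}/bin/activate"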

In fact, this is what I do with an in-house application I have written, and I can quickly add a new system's virtualenv in case it is needed. This way, I do not have to rely on pip being able to successfully download the software my application requires. I also will not have to worry about compilation of, e.g., psycopg2, which I use.

If you do not know which operating system your application may run on, you are probably better off using pip freeze as suggested in other answers here.

fredrik
0

I think the best approach is to install the virtual environment in a path inside the repository folder, and it may be even better to use a subdirectory dedicated to the environment (I accidentally deleted my entire project when force-installing a virtual environment in the repository root folder; fortunately, I had the project saved in its latest version on GitHub).

Either the automated installer or the documentation should indicate the virtualenv path as a relative path; this way you won't run into problems when sharing the project with other people. As for the packages, the packages used should be saved with pip freeze > requirements.txt.
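
A minimal sketch of that workflow (the .venv subdirectory name is just an example):

# Create the environment in a dedicated subdirectory of the repo,
# referring to it only by its relative path.
virtualenv .venv
. .venv/bin/activate
pip freeze > requirements.txt    # save the packages in use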

Obsidian
Lucioric2000
-1

If you are just setting up a development env, then use a pip freeze file, because that keeps the git repo clean.

Then, if you are doing a production deployment, check in the whole venv folder. That will make your deployment more reproducible: no need for those libxxx-dev packages, and it avoids internet issues.

So there are two repos: one for your main source code, which includes a requirements.txt, and an env repo, which contains the whole venv folder.
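
A sketch of the resulting deploy, with made-up repository URLs (both names are hypothetical):

# Pull the source and the prebuilt environment side by side.
git clone https://example.com/myapp.git
git clone https://example.com/myapp-venv.git myapp/venv
. myapp/venv/bin/activate    # no compilers or network access needed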

Shuo