45

I want something like sys.builtin_module_names except for the standard library. Other things that didn't work:

  • sys.modules - only shows modules that have already been loaded
  • sys.prefix - a path that would include non-standard library modules EDIT: and doesn't seem to work inside a virtualenv.

The reason I want this list is so that I can pass it to the --ignore-module or --ignore-dir command line options of trace http://docs.python.org/library/trace.html

So ultimately, I want to know how to ignore all the standard library modules when using trace or sys.settrace.

EDIT: I want it to work inside a virtualenv. http://pypi.python.org/pypi/virtualenv

EDIT2: I want it to work for all environments (i.e. across operating systems, inside and outside of a virtualenv.)

saltycrane
  • This is a rough duplicate of http://stackoverflow.com/questions/5632980/list-of-all-imports-in-python-3, which unfortunately did not attract any particularly useful answers so far. – Adam Spiers Jan 24 '12 at 19:05

8 Answers

33

If anyone's still reading this in 2015: I came across the same issue and didn't like any of the existing solutions. So I brute-forced it by writing some code to scrape the TOC of the Standard Library page in the official Python docs. I also built a simple API for getting a list of standard library modules (for Python versions 2.6, 2.7, 3.2, 3.3, and 3.4).

The package is here, and its usage is fairly simple:

>>> from stdlib_list import stdlib_list
>>> libraries = stdlib_list("2.7")
>>> libraries[:10]
['AL', 'BaseHTTPServer', 'Bastion', 'CGIHTTPServer', 'ColorPicker', 'ConfigParser', 'Cookie', 'DEVICE', 'DocXMLRPCServer', 'EasyDialogs']
17

Why not work out what's part of the standard library yourself?

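# Note: distutils is deprecated since Python 3.10 and was removed in 3.12
import distutils.sysconfig as sysconfig
import os

# Directory that holds the pure-Python standard library
std_lib = sysconfig.get_python_lib(standard_lib=True)
for top, dirs, files in os.walk(std_lib):
    for nm in files:
        # Report every .py file except package __init__ files, as a dotted module name
        if nm != '__init__.py' and nm[-3:] == '.py':
            print(os.path.join(top, nm)[len(std_lib)+1:-3].replace(os.sep, '.'))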

gives

abc
aifc
antigravity
--- a bunch of other files ----
xml.parsers.expat
xml.sax.expatreader
xml.sax.handler
xml.sax.saxutils
xml.sax.xmlreader
xml.sax._exceptions

Edit: You'll probably want to add a check to avoid site-packages if you need to avoid non-standard library modules.

zahypeti
Caspar
  • `sysconfig.get_python_lib(standard_lib=True)` also gives me the path of my virtualenv which doesn't have all the standard library modules. – saltycrane Jun 24 '11 at 06:20
  • In the case of virtualenv, you can reduce the problem to finding the location of the virtualenv, which [this thread](http://groups.google.com/group/python-virtualenv/browse_thread/thread/e30029b2e50ae17a) suggests can be done using `sys.real_prefix` (although I don't have a virtualenv handy to test on) – Caspar Jun 24 '11 at 06:53
  • Using `sys.real_prefix` with virtualenv is also mentioned in [this SO answer](http://stackoverflow.com/questions/1871549/python-determine-if-running-inside-virtualenv/1883251#1883251) – Caspar Jun 24 '11 at 06:56
  • Using `sys.real_prefix` works inside a virtualenv. But I do want it to work both inside and outside a virtualenv. I guess I just need some logic to decide which prefix to use. – saltycrane Jun 24 '11 at 16:24
  • I don't know what I was smoking last night, but I tried it again, and `sysconfig.get_python_lib(standard_lib=True)` works for me even when using virtualenv. – saltycrane Jun 24 '11 at 19:23
  • 2
    It turns out that if my virtualenv is activated and my current working directory is `/usr/lib/python2.6` when I invoke the Python interpreter, `sysconfig.get_python_lib(standard_lib=True)` returns the path for my virtualenv (e.g. `~/.virtualenvs/myenv/lib/python2.6`). However, if my current working directory is something other than `/usr/lib/python2.6`, it returns the correct path, `/usr/lib/python2.6`. So I was not smoking drugs last night. – saltycrane Jun 25 '11 at 06:20
  • 2
    This is a pretty good solution, but it doesn't include core libraries such as `sys`. – Adam Spiers Jan 24 '12 at 19:00
  • This doesn't print modules like `datetime`, `itertools` or `math`, which are in the `lib-dynload/` directory and have a `.so` extension (on my machine anyway). Neither is `importlib` printed, which is in `importlib/__init__.py` – zahypeti Nov 23 '19 at 20:02
13

Take a look at this, https://docs.python.org/3/py-modindex.html They made an index page for the standard modules.

Edmund
6

Here's a 2014 answer to a 2011 question -

The author of isort, a tool which cleans up imports, had to grapple with this same problem in order to satisfy the PEP 8 guideline that standard library imports should be ordered before third-party imports.

I have been using this tool and it seems to be working well. You can use the method place_module in the file isort.py. Since it's open source, I hope the author would not mind me reproducing the logic here:

def place_module(self, moduleName):
    """Tries to determine if a module is a python std import, third party import, or project code:

    if it can't determine - it assumes it is project code

    """
    if moduleName.startswith("."):
        return SECTIONS.LOCALFOLDER

    index = moduleName.find('.')
    if index != -1:  # str.find returns -1 (which is truthy) when there is no dot
        firstPart = moduleName[:index]
    else:
        firstPart = None

    for forced_separate in self.config['forced_separate']:
        if moduleName.startswith(forced_separate):
            return forced_separate

    if moduleName == "__future__" or (firstPart == "__future__"):
        return SECTIONS.FUTURE
    elif moduleName in self.config['known_standard_library'] or \
            (firstPart in self.config['known_standard_library']):
        return SECTIONS.STDLIB
    elif moduleName in self.config['known_third_party'] or (firstPart in self.config['known_third_party']):
        return SECTIONS.THIRDPARTY
    elif moduleName in self.config['known_first_party'] or (firstPart in self.config['known_first_party']):
        return SECTIONS.FIRSTPARTY

    for prefix in PYTHONPATH:
        module_path = "/".join((prefix, moduleName.replace(".", "/")))
        package_path = "/".join((prefix, moduleName.split(".")[0]))
        if (os.path.exists(module_path + ".py") or os.path.exists(module_path + ".so") or
           (os.path.exists(package_path) and os.path.isdir(package_path))):
            if "site-packages" in prefix or "dist-packages" in prefix:
                return SECTIONS.THIRDPARTY
            elif "python2" in prefix.lower() or "python3" in prefix.lower():
                return SECTIONS.STDLIB
            else:
                return SECTIONS.FIRSTPARTY

    return SECTION_NAMES.index(self.config['default_section'])

Obviously you need to use this method in the context of the class and the settings file. The settings are basically a fallback: a static list of known core library imports.

# Note that none of these lists must be complete as they are simply fallbacks for when included auto-detection fails.
default = {'force_to_top': [],
           'skip': ['__init__.py', ],
           'line_length': 80,
           'known_standard_library': ["abc", "anydbm", "argparse", "array", "asynchat", "asyncore", "atexit", "base64",
                                      "BaseHTTPServer", "bisect", "bz2", "calendar", "cgitb", "cmd", "codecs",
                                      "collections", "commands", "compileall", "ConfigParser", "contextlib", "Cookie",
                                      "copy", "cPickle", "cProfile", "cStringIO", "csv", "datetime", "dbhash", "dbm",
                                      "decimal", "difflib", "dircache", "dis", "doctest", "dumbdbm", "EasyDialogs",
                                      "errno", "exceptions", "filecmp", "fileinput", "fnmatch", "fractions",
                                      "functools", "gc", "gdbm", "getopt", "getpass", "gettext", "glob", "grp", "gzip",
                                      "hashlib", "heapq", "hmac", "imaplib", "imp", "inspect", "itertools", "json",
                                      "linecache", "locale", "logging", "mailbox", "math", "mhlib", "mmap",
                                      "multiprocessing", "operator", "optparse", "os", "pdb", "pickle", "pipes",
                                      "pkgutil", "platform", "plistlib", "pprint", "profile", "pstats", "pwd", "pyclbr",
                                      "pydoc", "Queue", "random", "re", "readline", "resource", "rlcompleter",
                                      "robotparser", "sched", "select", "shelve", "shlex", "shutil", "signal",
                                      "SimpleXMLRPCServer", "site", "sitecustomize", "smtpd", "smtplib", "socket",
                                      "SocketServer", "sqlite3", "string", "StringIO", "struct", "subprocess", "sys",
                                      "sysconfig", "tabnanny", "tarfile", "tempfile", "textwrap", "threading", "time",
                                      "timeit", "trace", "traceback", "unittest", "urllib", "urllib2", "urlparse",
                                      "usercustomize", "uuid", "warnings", "weakref", "webbrowser", "whichdb", "xml",
                                      "xmlrpclib", "zipfile", "zipimport", "zlib", 'builtins', '__builtin__'],
           'known_third_party': ['google.appengine.api'],
           'known_first_party': [],

---snip---

I was already an hour into writing this tool for myself before I stumbled upon the isort module, so I hope this can also help somebody else avoid reinventing the wheel!

wim
6

Here's an improvement on Caspar's answer, which is not cross-platform, and misses out top-level modules (e.g. email), dynamically loaded modules (e.g. array), and core built-in modules (e.g. sys):

import distutils.sysconfig as sysconfig
import os
import sys

std_lib = sysconfig.get_python_lib(standard_lib=True)

for top, dirs, files in os.walk(std_lib):
    for nm in files:
        # Path of the current directory relative to the stdlib root
        prefix = top[len(std_lib)+1:]
        if prefix[:13] == 'site-packages':
            continue  # skip third-party packages
        if nm == '__init__.py':
            # A package: report the directory itself
            print(prefix.replace(os.path.sep, '.'))
        elif nm[-3:] == '.py':
            print(os.path.join(prefix, nm)[:-3].replace(os.path.sep, '.'))
        elif nm[-3:] == '.so' and top[-11:] == 'lib-dynload':
            # Dynamically loaded extension module
            print(nm[0:-3])

# Modules compiled into the interpreter (e.g. sys)
for builtin in sys.builtin_module_names:
    print(builtin)

This is still not perfect: it will miss things like os.path, which is defined from within os.py in a platform-dependent manner via code such as import posixpath as path. But it's probably as good as you'll get, bearing in mind that Python is a dynamic language and you can't ever really know which modules are defined until they're actually defined at runtime.
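
For Python 3, the same idea can be sketched with sysconfig and pkgutil instead of walking the tree by hand. This is only a rough sketch under stated assumptions: the function name stdlib_top_level_names is made up for illustration, it assumes the standard library lives in ordinary directories on disk (not a zip file), it only looks for extension modules in a lib-dynload subdirectory (so it may miss them on platforms that store them elsewhere), and it collects top-level names only.

```python
import os
import pkgutil
import sys
import sysconfig


def stdlib_top_level_names():
    """Return a set of top-level standard library module names (best effort)."""
    paths = sysconfig.get_paths()
    # Pure-Python and platform-specific stdlib directories (often the same).
    search_dirs = {paths["stdlib"], paths["platstdlib"]}
    # On many Unix builds, extension modules live in a lib-dynload subdirectory.
    search_dirs |= {os.path.join(d, "lib-dynload") for d in list(search_dirs)}
    existing = [d for d in sorted(search_dirs) if os.path.isdir(d)]

    # iter_modules is not recursive and only reports importable modules and
    # packages, so a site-packages directory without __init__.py is skipped.
    names = {info.name for info in pkgutil.iter_modules(existing)}
    # Modules compiled into the interpreter, such as sys.
    names.update(sys.builtin_module_names)
    return names


if __name__ == "__main__":
    for name in sorted(stdlib_top_level_names()):
        print(name)
```

Like the loop above, this reflects what is installed for the running interpreter, so the result can differ between platforms and builds.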

Adam Spiers
3

Since Python 3.10, there is sys.stdlib_module_names.
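
A minimal sketch of how that attribute might be used to classify a dotted module name. The helper name is_stdlib_module is hypothetical, and the sketch assumes Python 3.10 or later; on older interpreters it raises an error instead of guessing.

```python
import sys


def is_stdlib_module(module_name):
    """Return True if the top-level package of module_name is in the stdlib."""
    try:
        stdlib_names = sys.stdlib_module_names  # frozenset, Python 3.10+
    except AttributeError:
        raise RuntimeError("sys.stdlib_module_names requires Python 3.10+")
    # Only top-level names are listed, so compare the part before the first dot.
    top_level = module_name.partition(".")[0]
    return top_level in stdlib_names


if __name__ == "__main__":
    for candidate in ("os.path", "xml.sax.handler", "sys"):
        print(candidate, is_stdlib_module(candidate))
```

The set is fixed per Python version and does not depend on what is installed, which makes it work the same inside and outside a virtualenv.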

CCCC_David
2

This will get you close:

import sys; import glob
glob.glob(sys.prefix + "/lib/python%d.%d" % (sys.version_info[0:2]) + "/*.py")

Another possibility for the ignore-dir option:

import os, sys
os.pathsep.join(sys.path)
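
To tie this back to the original goal, here is a hedged sketch of handing directories to trace programmatically rather than on the command line. The helper names make_tracer and demo are made up for illustration; trace.Trace and its ignoredirs parameter come from the standard trace module. As an assumption that differs from the sys.path suggestion above, the directories here are taken from sysconfig.get_paths() so that only the standard library locations are ignored.

```python
import sysconfig
import trace


def make_tracer():
    """Build a trace.Trace that skips code under the stdlib directories."""
    paths = sysconfig.get_paths()
    # "stdlib" and "platstdlib" are frequently the same directory.
    ignoredirs = sorted({paths["stdlib"], paths["platstdlib"]})
    # count=True records line counts; trace=False avoids printing each line.
    return trace.Trace(count=True, trace=False, ignoredirs=ignoredirs)


def demo():
    """Toy workload that calls into a standard library module."""
    import json
    return json.dumps({"answer": 42})


if __name__ == "__main__":
    tracer = make_tracer()
    tracer.runfunc(demo)
    # Files that were actually traced; stdlib files should be absent.
    for filename in sorted({f for f, _ in tracer.results().counts}):
        print(filename)
```

One caveat: trace ignores everything below the given directories, so where site-packages sits inside the standard library directory (common outside a virtualenv), third-party packages installed there are ignored as well.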
Keith
  • 1
    I just realized that `sys.prefix` returns a path that doesn't include most of the standard library modules when I'm running inside a virtualenv. I edited my question above. – saltycrane Jun 24 '11 at 06:17
1

I would consult the standard library reference in the official documentation, which goes through the whole library with a section for each module. :)

Karl Knechtel