18

Is the following code snippet from a Python WSGI app safe from directory traversal? It reads a file name passed as parameter and returns the named file.

file_name = request.path_params["file"]
file = open(file_name, "rb")
mime_type = mimetypes.guess_type(file_name)[0]
start_response(status.OK, [('Content-Type', mime_type)])
return file

I mounted the app under http://localhost:8000/file/{file} and sent requests with the URLs http://localhost:8000/file/../alarm.gif and http://localhost:8000/file/%2e%2e%2falarm.gif. But none of my attempts delivered the (existing) file. So is my code already safe from directory traversal?

New approach

It seems like the following code prevents directory traversal:

file_name = request.path_params["file"]
absolute_path = os.path.join(self.base_directory, file_name)
normalized_path = os.path.normpath(absolute_path)

# security check to prevent directory traversal
if not normalized_path.startswith(self.base_directory):
    raise IOError()

file = open(normalized_path, "rb")
mime_type = mimetypes.guess_type(normalized_path)[0]
start_response(status.OK, [('Content-Type', mime_type)])
return file
deamon
  • 78,414
  • 98
  • 279
  • 415
  • 1
    What happens if you give it an absolute path? – Katriel Jul 23 '11 at 21:55
  • Good idea! Your suggestion led me to the answer: It is insecure! Directory traversal was "accidentally" prevented by another part of the used framework. – deamon Jul 24 '11 at 09:24

3 Answers3

20

Your code does not prevent directory traversal. You can guard against this with the os.path module.

>>> import os.path
>>> os.curdir
'.'
>>> startdir = os.path.abspath(os.curdir)
>>> startdir
'/home/jterrace'

startdir is now an absolute path where you don't want to allow the path to go outside of. Now let's say we get a filename from the user and they give us the malicious /etc/passwd.

>>> filename = "/etc/passwd"
>>> requested_path = os.path.relpath(filename, startdir)
>>> requested_path
'../../etc/passwd'
>>> requested_path = os.path.abspath(requested_path)
>>> requested_path
'/etc/passwd'

We have now converted their path into an absolute path relative to our starting path. Since this wasn't in the starting path, it doesn't have the prefix of our starting path.

>>> os.path.commonprefix([requested_path, startdir])
'/'

You can check for this in your code. If the commonprefix function returns a path that doesn't start with startdir, then the path is invalid and you should not return the contents.


The above can be wrapped to a static method like so:

import os 

def is_directory_traversal(file_name):
    current_directory = os.path.abspath(os.curdir)
    requested_path = os.path.relpath(file_name, start=current_directory)
    requested_path = os.path.abspath(requested_path)
    common_prefix = os.path.commonprefix([requested_path, current_directory])
    return common_prefix != current_directory
Jossef Harush
  • 24,765
  • 7
  • 95
  • 103
jterrace
  • 57,555
  • 21
  • 140
  • 184
  • 2
    Just don't rely on doing things relative to the current working directory as that can be anything in a web application. Should always base it off an absolute path as starting point, be that hard wired or calculated from \_\_file\_\_. – Graham Dumpleton Jul 23 '11 at 23:49
  • @graham-dumpleton This has nothing to do with relative or absolute. A path can always break out unless you do this type of sanity check. – jterrace Jul 24 '11 at 00:32
  • You don't seem to quite get what I am referring to. You gave in your example 'startdir = os.path.abspath(os.curdir)'. That will set 'startdir' to the current working directory. In a Python web application there are no guarantees what the current working directory will be. So, it is a poor example to use because people will blindly cut and paste the code without understanding that they should be anchoring it at an absolute path which has meaning for their application rather than relying on whatever os.getcwd() is going to return when os.path.abspath(os.curdir) is called. – Graham Dumpleton Jul 24 '11 at 04:46
  • You can test this code snippet out against a directory traversal attack fuzzer like dotdotpwn @ https://github.com/wireghoul/dotdotpwn, eg. './dotdotpwn.pl -m http-url -u "http://google.com/?q=TRAVERSAL" -O -k "root:"' – evandrix Mar 27 '15 at 02:53
  • 1
    Note this depends on `startdir` ending with a forward slash, which it doesn't in this case! This code will incorrectly accept the input `../jterrace-attack/foobar`. – Flimm Feb 02 '16 at 17:26
4

Use only the base name of the user inputed file:

file_name = request.path_params["file"]
file_name = os.path.basename(file_name)
file = open(os.path.join("/path", file_name), "rb")

os.path.basename strips ../ from the path:

>>> os.path.basename('../../filename')
'filename'
Clodoaldo Neto
  • 98,807
  • 21
  • 191
  • 235
  • 1
    This does not prevent directory traversal since `file_name` can contain `../`! But your code was helpful anyway. – deamon Jul 24 '11 at 07:42
  • 1
    Sorry, your solution seems to work too. But it would be limited to a single directory level (without sub directories). I'll correct my down vote if you modify your answer slightly so that I can vote again). – deamon Jul 24 '11 at 09:20
  • @daemon `os.path.basename` strips `../` from the path. Check the update answer. – Clodoaldo Neto Mar 27 '15 at 22:07
  • 2
    `os.path.basename` does strip `../`, but it also strips all path related information. `os.path.basename("a/b/c")` returns `"c"` – Flimm Feb 02 '16 at 17:25
  • 1
    @Flimm: Yes, that is intended. – Clodoaldo Neto Feb 07 '16 at 09:19
  • Like this method because it's simple and does the right thing when you want to work in a fixed directory (without sub-directories) and receive just the filename from the user. – Augusto Destrero Dec 28 '16 at 14:25
2

There's a much simpler solution here:

relative_path = os.path.relpath(path, start=self.test_directory)
has_dir_traversal = relative_path.startswith(os.pardir)

relpath takes care of normalising path for us. And if the relative path starts with .., then you don't allow it.

Tom Viner
  • 5,975
  • 7
  • 35
  • 39