I am looking for a PHP function to sanitize strings into safe and valid file names with no directory separators (slashes).
Ideally it should be reversible, and it should not scramble the name more than necessary.
Of course I want to prevent intentional directory traversal attacks. But I also want to prevent subfolders being created.
I figured that urlencode() would work, but I wonder if this is sufficient, and/or if there is something better or more popular.
It would also be good if the solution worked equally well on Windows (backslash as directory separator), so that it is portable.
Use case / scenario:
As part of a data import, I want to download files from remote URLs into the local filesystem. The URLs come from a CSV file. Most of them are fine, but some may contain more slashes than expected.
E.g. most of them are like this:
https://files.example.com/pdf/12345.pdf
But then individual files might be like this:
https://files.example.com/pdf/1/2345.pdf
The files should all go into the same directory, e.g.
https://files.example.com/pdf/12345.pdf
-> /destination/dir/12345.pdf
A file like 1/2345.pdf should not result in a subdirectory. Instead, the / should be escaped in some (reversible) way. E.g. with urlencode() this would be 1%2F2345.pdf.
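To make the idea concrete, here is a minimal sketch of the percent-encoding approach as I currently imagine it. The function name url_to_local_filename() and the $prefix parameter are made up for illustration. I use rawurlencode() instead of urlencode() only so that spaces become %20 rather than +. The sketch does not deal with Windows reserved names (CON, NUL, ...), trailing dots or maximum file name length.

```php
<?php
// Hypothetical helper: the name and the $prefix parameter are illustrative only.
// Turns the part of $url after $prefix into a single file name without
// directory separators.
function url_to_local_filename(string $url, string $prefix): string
{
    // Only accept URLs that start with the expected base URL.
    if (strncmp($url, $prefix, strlen($prefix)) !== 0) {
        throw new InvalidArgumentException("Unexpected URL: $url");
    }

    $remainder = substr($url, strlen($prefix));

    // rawurlencode() escapes everything except letters, digits and "-_.~",
    // so "/" becomes %2F and "\" becomes %5C.
    $name = rawurlencode($remainder);

    // "." and ".." pass through the encoding unchanged, so reject them
    // (and the empty string) explicitly.
    if ($name === '' || $name === '.' || $name === '..') {
        throw new InvalidArgumentException("Unusable file name in URL: $url");
    }

    return $name;
}

$prefix = 'https://files.example.com/pdf/';
$name = url_to_local_filename('https://files.example.com/pdf/1/2345.pdf', $prefix);
$original = rawurldecode($name); // reverses the encoding
```

With the example URLs above, $name would be 1%2F2345.pdf and rawurldecode() gives back 1/2345.pdf, while 12345.pdf stays unchanged.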