372

I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).

Once the user is logged in I create a variable that creates a path on their local machine.

c:\tempfolder\date\username

The problem is that some usernames throw an "Illegal chars" exception. For example, if my username were mas|fenix, it would throw an exception.

Path.Combine( _      
  Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData), _
  DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username)

I don't want to remove it from the string, but a folder with their username is created through FTP on a server. And this leads to my second question. If I am creating a folder on the server can I leave the "illegal chars" in? I only ask this because the server is Linux based, and I am not sure if Linux accepts it or not.

EDIT: It seems that URL encoding is NOT what I want. Here's what I want to do:

old username = mas|fenix
new username = mas%xxfenix

Where %xx is the ASCII value or any other value that would easily identify the character.

Stephen Kennedy
masfenix
  • Incorporate this to make file system safe folder names: [http://stackoverflow.com/questions/333175/is-there-a-way-of-making-strings-file-path-safe-in-c](http://stackoverflow.com/questions/333175/is-there-a-way-of-making-strings-file-path-safe-in-c) – missaghi Feb 22 '09 at 21:03

13 Answers

569

I've been experimenting with the various methods .NET provides for URL encoding. Perhaps the following table will be useful (it is the output from a test app I wrote):

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded HexEscaped
A         A          A                 A              A                 A                A           A                    %41
B         B          B                 B              B                 B                B           B                    %42

a         a          a                 a              a                 a                a           a                    %61
b         b          b                 b              b                 b                b           b                    %62

0         0          0                 0              0                 0                0           0                    %30
1         1          1                 1              1                 1                1           1                    %31

[space]   +          +                 %20            %20               %20              [space]     [space]              %20
!         !          !                 !              !                 !                !           !                    %21
"         %22        %22               "              %22               %22              &quot;      &quot;               %22
#         %23        %23               #              %23               #                #           #                    %23
$         %24        %24               $              %24               $                $           $                    %24
%         %25        %25               %              %25               %25              %           %                    %25
&         %26        %26               &              %26               &                &amp;       &amp;                %26
'         %27        %27               '              '                 '                &#39;       &#39;                %27
(         (          (                 (              (                 (                (           (                    %28
)         )          )                 )              )                 )                )           )                    %29
*         *          *                 *              %2A               *                *           *                    %2A
+         %2b        %2b               +              %2B               +                +           +                    %2B
,         %2c        %2c               ,              %2C               ,                ,           ,                    %2C
-         -          -                 -              -                 -                -           -                    %2D
.         .          .                 .              .                 .                .           .                    %2E
/         %2f        %2f               /              %2F               /                /           /                    %2F
:         %3a        %3a               :              %3A               :                :           :                    %3A
;         %3b        %3b               ;              %3B               ;                ;           ;                    %3B
<         %3c        %3c               <              %3C               %3C              &lt;        &lt;                 %3C
=         %3d        %3d               =              %3D               =                =           =                    %3D
>         %3e        %3e               >              %3E               %3E              &gt;        >                    %3E
?         %3f        %3f               ?              %3F               ?                ?           ?                    %3F
@         %40        %40               @              %40               @                @           @                    %40
[         %5b        %5b               [              %5B               %5B              [           [                    %5B
\         %5c        %5c               \              %5C               %5C              \           \                    %5C
]         %5d        %5d               ]              %5D               %5D              ]           ]                    %5D
^         %5e        %5e               ^              %5E               %5E              ^           ^                    %5E
_         _          _                 _              _                 _                _           _                    %5F
`         %60        %60               `              %60               %60              `           `                    %60
{         %7b        %7b               {              %7B               %7B              {           {                    %7B
|         %7c        %7c               |              %7C               %7C              |           |                    %7C
}         %7d        %7d               }              %7D               %7D              }           }                    %7D
~         %7e        %7e               ~              ~                 ~                ~           ~                    %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80            %C4%80           Ā           Ā                    [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81            %C4%81           ā           ā                    [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92            %C4%92           Ē           Ē                    [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93            %C4%93           ē           ē                    [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA            %C4%AA           Ī           Ī                    [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB            %C4%AB           ī           ī                    [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C            %C5%8C           Ō           Ō                    [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D            %C5%8D           ō           ō                    [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA            %C5%AA           Ū           Ū                    [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB            %C5%AB           ū           ū                    [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode

  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode

  • UrlPathEncoded: HttpUtility.UrlPathEncode

  • EscapedDataString: Uri.EscapeDataString

  • EscapedUriString: Uri.EscapeUriString

  • HtmlEncoded: HttpUtility.HtmlEncode

  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode

  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle characters up to code point 255. It therefore throws an ArgumentOutOfRangeException for the Latin Extended-A characters (e.g. Ā).

  2. This table was generated in .NET 4.0 (see Levi Botelho's comment below that says the encoding in .NET 4.5 is slightly different).
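Note 1 is easy to check in isolation. The following is a minimal sketch; the class and method names (HexEscapeDemo, TryHexEscape) are made up for illustration and are not part of the test app. It wraps Uri.HexEscape and returns the same [OoR] marker used in the table when the character is out of range:

```csharp
using System;

public static class HexEscapeDemo
{
    // Returns the %XX form of a character, or "[OoR]" when Uri.HexEscape
    // rejects it (it throws for any character above 255).
    public static string TryHexEscape(char c)
    {
        try
        {
            return Uri.HexEscape(c);
        }
        catch (ArgumentOutOfRangeException)
        {
            return "[OoR]";
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryHexEscape('|'));      // %7C
        Console.WriteLine(TryHexEscape('\u0100')); // [OoR]  (U+0100 is Ā)
    }
}
```

Catching only ArgumentOutOfRangeException, rather than everything, keeps any other unexpected failure visible.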

EDIT:

I've added a second table with the encodings for .NET 4.5. See this answer: https://stackoverflow.com/a/21771206/216440

EDIT 2:

Since people seem to appreciate these tables, I thought you might like the source code that generates the table, so you can play around yourselves. It's a simple C# console application, which can target either .NET 4.0 or 4.5:

using System;
using System.Collections.Generic;
using System.Text;
// Need to add a Reference to the System.Web assembly.
using System.Web;

namespace UriEncodingDEMO2
{
    class Program
    {
        static void Main(string[] args)
        {
            EncodeStrings();

            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.Read();
        }

        public static void EncodeStrings()
        {
            string stringToEncode = "ABCD" + "abcd"
            + "0123" + " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + "ĀāĒēĪīŌōŪū";

            // Need to set the console encoding to display non-ASCII characters correctly (eg the 
            //  Latin A-Extended characters such as ĀāĒē...).
            Console.OutputEncoding = Encoding.UTF8;

            // Will also need to set the console font (in the console Properties dialog) to a font 
            //  that displays the extended character set correctly.
            // The following fonts all display the extended characters correctly:
            //  Consolas
            //  DejaVu Sans Mono
            //  Lucida Console

            // Also, in the console Properties, set the Screen Buffer Size and the Window Size 
            //  Width properties to at least 140 characters, to display the full width of the 
            //  table that is generated.

            Dictionary<string, Func<string, string>> columnDetails =
                new Dictionary<string, Func<string, string>>();
            columnDetails.Add("Unencoded", (unencodedString => unencodedString));
            columnDetails.Add("UrlEncoded",
                (unencodedString => HttpUtility.UrlEncode(unencodedString)));
            columnDetails.Add("UrlEncodedUnicode",
                (unencodedString => HttpUtility.UrlEncodeUnicode(unencodedString)));
            columnDetails.Add("UrlPathEncoded",
                (unencodedString => HttpUtility.UrlPathEncode(unencodedString)));
            columnDetails.Add("EscapedDataString",
                (unencodedString => Uri.EscapeDataString(unencodedString)));
            columnDetails.Add("EscapedUriString",
                (unencodedString => Uri.EscapeUriString(unencodedString)));
            columnDetails.Add("HtmlEncoded",
                (unencodedString => HttpUtility.HtmlEncode(unencodedString)));
            columnDetails.Add("HtmlAttributeEncoded",
                (unencodedString => HttpUtility.HtmlAttributeEncode(unencodedString)));
            columnDetails.Add("HexEscaped",
                (unencodedString
                    =>
                    {
                        // Uri.HexEscape can only handle characters up to code point 255, so for
                        //  the Latin Extended-A characters, such as Ā, it will throw an
                        //  ArgumentOutOfRangeException.
                        try
                        {
                            return Uri.HexEscape(unencodedString.ToCharArray()[0]);
                        }
                        catch
                        {
                            return "[OoR]";
                        }
                    }));

            char[] charactersToEncode = stringToEncode.ToCharArray();
            string[] stringCharactersToEncode = Array.ConvertAll<char, string>(charactersToEncode,
                (character => character.ToString()));
            DisplayCharacterTable<string>(stringCharactersToEncode, columnDetails);
        }

        private static void DisplayCharacterTable<TUnencoded>(TUnencoded[] unencodedArray,
            Dictionary<string, Func<TUnencoded, string>> mappings)
        {
            foreach (string key in mappings.Keys)
            {
                Console.Write(key.Replace(" ", "[space]") + " ");
            }
            Console.WriteLine();

            foreach (TUnencoded unencodedObject in unencodedArray)
            {
                string stringCharToEncode = unencodedObject.ToString();
                foreach (string columnHeader in mappings.Keys)
                {
                    int columnWidth = columnHeader.Length + 1;
                    Func<TUnencoded, string> encoder = mappings[columnHeader];
                    string encodedString = encoder(unencodedObject);

                    // ASSUMPTION: Column header will always be wider than encoded string.
                    Console.Write(encodedString.Replace(" ", "[space]").PadRight(columnWidth));
                }
                Console.WriteLine();
            }
        }
    }
}


Rodolpho Brock
Simon Tewsi
  • 2
    This is a fantastic answer. Turns out I wanted to use Uri.EscapeDataString and not include System.Web. Thanks for this table. – Seravy Dec 10 '12 at 07:49
  • 7
    Note that this is no longer 100% accurate. Certain functions have changed slightly between .NET 4 and .NET 4.5. See http://stackoverflow.com/q/20003106/1068266. – Levi Botelho Jan 09 '14 at 12:22
  • 2
    @Levi: Thanks for the heads up. I've added a second answer with the table for .NET 4.5. I've edited the original answer to link to the second table. – Simon Tewsi Feb 19 '14 at 00:03
  • Note that the .NET documentation says *Do not use; intended only for browser compatibility. Use UrlEncode.*, but that method encodes a lot of other undesired characters. The closest one is `Uri.EscapeUriString`, but beware it doesn't support a `null` argument. – Andrew Jan 24 '18 at 15:16
  • 1
    I forgot to mention, my comment above is for `UrlPathEncode`. So basically replace `UrlPathEncode` with `Uri.EscapeUriString`. – Andrew Mar 21 '18 at 20:01
  • Best answer I came across! – Zaki Mohammed Apr 17 '20 at 09:06
  • Watch out: this answer is misleading. Some of these escape methods escape differing chars depending on their context within the string. Some are quite dangerous if you don't fully understand their limitations. If you're sticking stuff into uri's stick to `Uri.EscapeDataString` (not EscapeUriString!) unless you're very sure you know what you're doing. – Eamon Nerbonne Jun 10 '20 at 14:45
  • Do .NET Core 3+ and .NET 5 change any of these? – Joke Huang May 19 '21 at 17:39
285

You should encode only the user name or other part of the URL that could be invalid. URL encoding a URL can lead to problems since something like this:

string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example");

Will yield

http%3a%2f%2fwww.google.com%2fsearch%3fq%3dExample

This is obviously not going to work well. Instead, you should encode ONLY the value of the key/value pair in the query string, like this:

string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example");

Hopefully that helps. Also, as teedyay mentioned, you'll still need to make sure illegal file-name characters are removed or else the file system won't like the path.
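As a rough illustration of encoding only the value, here is a small sketch. It swaps in Uri.EscapeDataString for HttpUtility.UrlEncode so that no reference to System.Web is needed; the class and method names (QueryUrlDemo, BuildSearchUrl) are hypothetical, and the Ex&ple input is borrowed from the comments on this answer:

```csharp
using System;

public static class QueryUrlDemo
{
    // Builds a search URL, escaping only the query value, never the whole URL.
    public static string BuildSearchUrl(string query)
    {
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }

    public static void Main()
    {
        Console.WriteLine(BuildSearchUrl("Ex&ple"));
        // http://www.google.com/search?q=Ex%26ple
    }
}
```

The scheme, host, path and the ?q= delimiter stay readable, while the ampersand inside the value can no longer be mistaken for a parameter separator.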

Community
Dan Herbert
  • 34
    Using the HttpUtility.UrlPathEncode method should prevent the problem you're describing here. – vipirtti Mar 09 '09 at 10:08
  • 12
    @DJ Pirtu: It's true that UrlPathEncode won't make those undesired changes in the path, however it also won't encode anything after the `?` (since it assumes the query string is already encoded). In Dan Herbert's example it looks like he's pretending `Example` is the text that requires encoding, so `HttpUtility.UrlPathEncode("http://www.google.com/search?q=Example");` won't work. Try it with `?q=Ex&ple` (where the desired result is `?q=Ex%26ple`). It won't work because (1) UrlPathEncode doesn't touch anything after `?`, and (2) UrlPathEncode doesn't encode `&` anyway. – Tim Goodman Nov 29 '10 at 18:21
  • 1
    See here: http://connect.microsoft.com/VisualStudio/feedback/details/551839/error-in-documentation-for-httpserverutility-urlpathencode I should add that of course it's good that UrlPathEncode doesn't encode `&`, because you need that to delimit your query string parameters. But there are times when you want encoded ampersands as well. – Tim Goodman Nov 29 '10 at 18:23
  • 10
    HttpUtility is succeeded by WebUtility in latest versions, save yourself some time :) – Wiseman Apr 11 '14 at 07:25
212

A better way is to use

Uri.EscapeUriString

which avoids having to reference the Full Profile of .NET 4.

Erik Kalkoken
Siarhei Kuchuk
  • 1
    Totally agree since often the "Client Profile" is enough for apps using System.Net but not using System.Web ;-) – hfrmobile Sep 07 '12 at 17:08
  • 7
    OP is talking about checking it for file system compatibility, so this won't work. Windows disallowed character set is '["/", "\\", "", ":", "\"", "|", "?", "*"]' but many of these don't get encoded using EscapedUriString (see table below - thanks for that table @Simon Tewsi) ..."creates a path on their local machine" -OP UrlEncoded takes care of almost all of the problems, but doesn't solve the problem with "%" or "%3f" being in original input, as a "decode" will now be different than original. – m1m1k Feb 07 '13 at 00:32
  • 7
    just to make it clear: THIS answer WONT WORK for file systems – m1m1k Feb 07 '13 at 00:41
  • 1
    In addition, starting with the .NET Framework 4.5, the Client Profile has been discontinued and only the full redistributable package is available. – twomm Feb 19 '13 at 13:46
  • 37
    http://stackoverflow.com/a/34189188/3436164 Use `Uri.EscapeDataString` NOT `Uri.EscapeUriString` Read this comment, it helped me out. – ykadaru Mar 13 '17 at 15:40
205

Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix

UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility, as mentioned.

You can also Regex the illegal characters and then replace, but this gets far more complex, as you will have to have some form of state machine (switch ... case, for example) to replace with the correct characters. Since UrlEncode does this up front, it is rather easy.

As for Linux versus Windows, there are some characters that are acceptable in Linux that are not in Windows, but I would not worry about that, as the folder name can be returned by decoding the Url string, using UrlDecode, so you can round trip the changes.
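The round trip can be sketched as follows. This example uses WebUtility (System.Net) rather than HttpUtility, purely so that it compiles without a System.Web reference; the RoundTripDemo class and its method names are illustrative only. It also shows one caveat: an asterisk passes through unencoded, so URL encoding alone may not be enough for Windows folder names.

```csharp
using System;
using System.Net;

public static class RoundTripDemo
{
    // Thin wrappers so the encode/decode pair is easy to swap out later.
    public static string Encode(string name)
    {
        return WebUtility.UrlEncode(name);
    }

    public static string Decode(string encoded)
    {
        return WebUtility.UrlDecode(encoded);
    }

    public static void Main()
    {
        string encoded = Encode("mas|fenix");
        Console.WriteLine(encoded);          // mas%7Cfenix
        Console.WriteLine(Decode(encoded));  // mas|fenix

        // Caveat: '*' is treated as URL-safe and is left as-is,
        // although Windows does not allow it in a folder name.
        Console.WriteLine(Encode("a*b"));    // a*b
    }
}
```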

Community
Gregory A Beamer
  • 5
    this answer is out of date now. read a few answers below - as of .net45 this might be the correct solution: http://msdn.microsoft.com/en-us/library/system.net.webutility.urlencode.aspx – blueberryfields Jan 07 '15 at 17:20
  • 1
    For FTP each Uri part (folder or file name) may be constructed using Uri.EscapeDataString(fileOrFolderName) allowing all non Uri compatible character (spaces, unicode ...). For example to allow any character in filename, use: req =(FtpWebRequest)WebRequest.Create(new Uri(path + "/" + Uri.EscapeDataString(filename))); Using HttpUtility.UrlEncode() replace spaces by plus signs (+). A correct behavior for search engines but incorrect for file/folder names. – Renaud Bancel Feb 17 '15 at 10:51
  • asp.net blocks majority of xss in url as you get warning when ever you try to add js script `A potentially dangerous Request.Path value was detected from the client`. – Learning Sep 02 '18 at 04:39
194

Since .NET Framework 4.5 and .NET Standard 1.0 you should use WebUtility.UrlEncode. Advantages over alternatives:

  1. It is part of .NET Framework 4.5+, .NET Core 1.0+, .NET Standard 1.0+, UWP 10.0+ and all Xamarin platforms as well. HttpUtility, while available in .NET Framework from earlier versions (.NET Framework 1.1+), became available on other platforms much later (.NET Core 2.0+, .NET Standard 2.0+) and is still unavailable in UWP (see related question).

  2. In .NET Framework, it resides in System.dll, so it does not require any additional references, unlike HttpUtility.

  3. It properly escapes characters for URLs, unlike Uri.EscapeUriString (see comments to drweb86's answer).

  4. It does not have any limits on the length of the string, unlike Uri.EscapeDataString (see related question), so it can be used for POST requests, for example.
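For a quick feel of how WebUtility.UrlEncode differs from Uri.EscapeDataString in practice, here is a small sketch (the class name SpaceEncodingDemo and the sample value are made up for illustration). The visible difference for ordinary text is the treatment of spaces:

```csharp
using System;
using System.Net;

public static class SpaceEncodingDemo
{
    public static void Main()
    {
        string value = "a b&c";

        // Form-style encoding: a space becomes '+'.
        Console.WriteLine(WebUtility.UrlEncode(value));  // a+b%26c

        // RFC 3986 style encoding: a space becomes %20.
        Console.WriteLine(Uri.EscapeDataString(value));  // a%20b%26c
    }
}
```

The '+' form suits query strings and form posts; for path segments, or for file and folder names, the %20 form is usually the safer choice.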

Athari
  • I like the way it encodes using "+" instead of %20 for spaces.. but this one still does not remove " from the URL and gives me invalid URL... oh well.. just gonna have to do a a replace(""""","") – Piotr Kula May 15 '17 at 12:55
93

Levi Botelho commented that the table of encodings that was previously generated is no longer accurate for .NET 4.5, since the encodings changed slightly between .NET 4.0 and 4.5. So I've regenerated the table for .NET 4.5:

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded WebUtilityUrlEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded WebUtilityHtmlEncoded HexEscaped
A         A          A                 A              A                    A                 A                A           A                    A                     %41
B         B          B                 B              B                    B                 B                B           B                    B                     %42

a         a          a                 a              a                    a                 a                a           a                    a                     %61
b         b          b                 b              b                    b                 b                b           b                    b                     %62

0         0          0                 0              0                    0                 0                0           0                    0                     %30
1         1          1                 1              1                    1                 1                1           1                    1                     %31

[space]   +          +                 %20            +                    %20               %20              [space]     [space]              [space]               %20
!         !          !                 !              !                    %21               !                !           !                    !                     %21
"         %22        %22               "              %22                  %22               %22              &quot;      &quot;               &quot;                %22
#         %23        %23               #              %23                  %23               #                #           #                    #                     %23
$         %24        %24               $              %24                  %24               $                $           $                    $                     %24
%         %25        %25               %              %25                  %25               %25              %           %                    %                     %25
&         %26        %26               &              %26                  %26               &                &amp;       &amp;                &amp;                 %26
'         %27        %27               '              %27                  %27               '                &#39;       &#39;                &#39;                 %27
(         (          (                 (              (                    %28               (                (           (                    (                     %28
)         )          )                 )              )                    %29               )                )           )                    )                     %29
*         *          *                 *              *                    %2A               *                *           *                    *                     %2A
+         %2b        %2b               +              %2B                  %2B               +                +           +                    +                     %2B
,         %2c        %2c               ,              %2C                  %2C               ,                ,           ,                    ,                     %2C
-         -          -                 -              -                    -                 -                -           -                    -                     %2D
.         .          .                 .              .                    .                 .                .           .                    .                     %2E
/         %2f        %2f               /              %2F                  %2F               /                /           /                    /                     %2F
:         %3a        %3a               :              %3A                  %3A               :                :           :                    :                     %3A
;         %3b        %3b               ;              %3B                  %3B               ;                ;           ;                    ;                     %3B
<         %3c        %3c               <              %3C                  %3C               %3C              &lt;        &lt;                 &lt;                  %3C
=         %3d        %3d               =              %3D                  %3D               =                =           =                    =                     %3D
>         %3e        %3e               >              %3E                  %3E               %3E              &gt;        >                    &gt;                  %3E
?         %3f        %3f               ?              %3F                  %3F               ?                ?           ?                    ?                     %3F
@         %40        %40               @              %40                  %40               @                @           @                    @                     %40
[         %5b        %5b               [              %5B                  %5B               [                [           [                    [                     %5B
\         %5c        %5c               \              %5C                  %5C               %5C              \           \                    \                     %5C
]         %5d        %5d               ]              %5D                  %5D               ]                ]           ]                    ]                     %5D
^         %5e        %5e               ^              %5E                  %5E               %5E              ^           ^                    ^                     %5E
_         _          _                 _              _                    _                 _                _           _                    _                     %5F
`         %60        %60               `              %60                  %60               %60              `           `                    `                     %60
{         %7b        %7b               {              %7B                  %7B               %7B              {           {                    {                     %7B
|         %7c        %7c               |              %7C                  %7C               %7C              |           |                    |                     %7C
}         %7d        %7d               }              %7D                  %7D               %7D              }           }                    }                     %7D
~         %7e        %7e               ~              %7E                  ~                 ~                ~           ~                    ~                     %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80               %C4%80            %C4%80           Ā           Ā                    Ā                     [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81               %C4%81            %C4%81           ā           ā                    ā                     [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92               %C4%92            %C4%92           Ē           Ē                    Ē                     [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93               %C4%93            %C4%93           ē           ē                    ē                     [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA               %C4%AA            %C4%AA           Ī           Ī                    Ī                     [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB               %C4%AB            %C4%AB           ī           ī                    ī                     [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C               %C5%8C            %C5%8C           Ō           Ō                    Ō                     [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D               %C5%8D            %C5%8D           ō           ō                    ō                     [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA               %C5%AA            %C5%AA           Ū           Ū                    Ū                     [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB               %C5%AB            %C5%AB           ū           ū                    ū                     [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode
  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode
  • UrlPathEncoded: HttpUtility.UrlPathEncode
  • WebUtilityUrlEncoded: WebUtility.UrlEncode
  • EscapedDataString: Uri.EscapeDataString
  • EscapedUriString: Uri.EscapeUriString
  • HtmlEncoded: HttpUtility.HtmlEncode
  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode
  • WebUtilityHtmlEncoded: WebUtility.HtmlEncode
  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle characters up to code point 255. It therefore throws an ArgumentOutOfRangeException for the Latin Extended-A characters (e.g. Ā).

  2. This table was generated in .NET 4.5 (see answer https://stackoverflow.com/a/11236038/216440 for the encodings relevant to .NET 4.0 and below).
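One difference between the two tables is easy to reproduce: from .NET 4.5 on, Uri.EscapeDataString also escapes the characters ! ' ( ) that the 4.0 table shows as left alone. A minimal sketch follows, assuming a 4.5-or-later runtime (EscapeDataStringDemo is just an illustrative name):

```csharp
using System;

public static class EscapeDataStringDemo
{
    public static void Main()
    {
        // On .NET 4.5 and later these are all escaped.
        Console.WriteLine(Uri.EscapeDataString("!'()*"));  // %21%27%28%29%2A

        // The unreserved punctuation characters are left alone.
        Console.WriteLine(Uri.EscapeDataString("-._~"));   // -._~
    }
}
```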

EDIT:

  1. As a result of Discord's answer I added the new WebUtility UrlEncode and HtmlEncode methods, which were introduced in .NET 4.5.
T.Todua
Simon Tewsi
  • 2
    Do not use UrlPathEncode - even the MSDN says it is not to be used. It was built to fix an issue with Netscape 2 http://msdn.microsoft.com/en-us/library/system.web.httpserverutility.urlpathencode(v=vs.110).aspx – Jeff Mar 20 '14 at 20:03
  • Is Server.URLEncode yet another variation on this theme? Does it generate any different output? – ALEXintlsos Nov 18 '15 at 17:21
  • 2
    @ALEX: In ASP.NET the Server object is an instance of HttpServerUtility. Using the dotPeek decompiler I had a look at HttpServerUtility.UrlEncode. It just calls HttpUtility.UrlEncode so the output of the two methods would be identical. – Simon Tewsi Nov 19 '15 at 02:10
  • It seems like, even with this overabundance of encoding methods, they all still fail pretty spectacularly for anything above Latin-1, such as → or ☠. (UrlEncodedUnicode seems like it at least tries to support Unicode, but is deprecated/missing.) – brianary Dec 15 '15 at 16:46
  • Simon, can you just integrate this answer in the accepted answer? it will be nice to have it in one answer. you could integrate it and make a h1 heading in the bottom of that answer, or integrate in one table, and marked different lines, like: `(Net4.0) ? %3f................................` `(Net4.5) ? %3f ..................................` – T.Todua Sep 18 '17 at 14:17
  • Sad, url i need `'`->`'`, `=` -> `%3D`, `[space]` -> `%20`. but i dont want move to 4.0 – Trương Quốc Khánh Sep 14 '20 at 02:55
65

Url Encoding is easy in .NET. Use:

System.Web.HttpUtility.UrlEncode(string url)

If that'll be decoded to get the folder name, you'll still need to exclude characters that can't be used in folder names (*, ?, /, etc.)

teedyay
  • Does it encode every character thats not part of the alphabet? – masfenix Feb 22 '09 at 19:02
  • 1
    URL encoding converts characters that are not allowed in a URL into character-entity equivalents. List of unsafe characters: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm – Ian Robinson Feb 22 '09 at 19:05
  • MSDN Link on HttpUtility.UrlEncode: http://msdn.microsoft.com/en-us/library/4fkewx0t.aspx – Ian Robinson Feb 22 '09 at 19:06
  • 11
    It is good practice to put the full System.Web... part in your answer, it saves a lot of people a little time :) thanks – Liam Apr 24 '09 at 12:09
  • 3
    This is dangerous: not all characters of the URL have to be encoded, only the values of the query string parameters. The way you suggest will also encode the & that is needed to separate multiple parameters in the query string. The solution is to encode each parameter value if needed – Marco Staffoli Jan 21 '13 at 09:29
12

If you can't see System.Web, change your project settings. The target framework should be ".NET Framework 4" instead of ".NET Framework 4 Client Profile"

useful
  • 1
    In my opinion developers should know about ".NET Profiles" and they should use the **correct** one for their purposes! Just adding the full profile in order to get (e.g System.Web) without really knowing why they add the full profile, isn't very smart. Use "Client Profile" for your **client** apps and the full profile *only when needed* (e.g. a WinForms or WPF client should use client profile and not full profile)! e.g. I don't see a reason using the HttpServerUtility in a client app ^^ ... if this is needed then there is something wrong with the design of the app! – hfrmobile Oct 26 '12 at 11:28
  • 4
    Really? You don't ever see a need for a client app to construct a URL? What do you do for a living - janitorial duties? – sproketboy Mar 26 '13 at 20:33
  • @hfrmobile: no. It's all wrong with the profile model (which lived just once and was abandoned in next version). And it was obvious from the beginning. Is it obvious for you now? Think first, don't accept everything 'as is' what msft tries to sell you ;P – abatishchev Jan 19 '14 at 17:58
  • Sorry, but I never said that a client never has to build/use an URL. As long as .NET 4.0 is in use, user should care about it. To put it short: Developers should think twice before adding HttpServerUtility to a client. There are other/better ways, just see the answer with 139 votes or "Since .NET Framework 4.5 you can use WebUtility.UrlEncode. First, it resides in System.dll, so it does not require any additional references.". – hfrmobile Jan 20 '14 at 05:45
9

The .NET implementation of UrlEncode does not comply with RFC 3986.

  1. Some characters are not encoded but should be. The !()* characters are listed in the RFC's section 2.2 as reserved characters that must be encoded, yet .NET fails to encode these characters.

  2. Some characters are encoded but should not be. The .-_ characters are not listed in the RFC's section 2.2 as reserved characters, so they should not be encoded, yet .NET erroneously encodes these characters.

  3. The RFC specifies that to be consistent, implementations should use upper-case HEXDIG, where .NET produces lower-case HEXDIG.
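
If upper-case escapes are needed from output that uses lower-case hex digits (such as the %c5%ab shown in the UrlEncoded column of the table above), one possible post-processing step is sketched below. The class and method names are hypothetical; this only normalises the hex case and does not address points 1 and 2.

```csharp
using System.Text.RegularExpressions;

public static class PercentEncodingCase
{
    // Hypothetical helper: upper-cases the hex digits of every %xx escape,
    // leaving all other characters untouched.
    public static string UpperCaseEscapes(string encoded)
    {
        return Regex.Replace(encoded, "%[0-9a-fA-F]{2}",
            m => m.Value.ToUpperInvariant());
    }
}
```

For example, UpperCaseEscapes("mas%7cfenix") returns mas%7Cfenix.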

Charlie
5

I think people here got sidetracked by the UrlEncode message. URLEncoding is not what you want -- you want to encode stuff that won't work as a filename on the target system.

Assuming that you want some generality -- feel free to find the illegal characters on several systems (MacOS, Windows, Linux and Unix), union them to form a set of characters to escape.

As for the escape, a HexEscape should be fine (replacing the characters with %XX). Convert each character to UTF-8 bytes and encode every byte >= 128 if you want to support systems that don't do Unicode. But there are other ways, such as using backslashes "\" or HTML encoding "&quot;". You can create your own. All any system has to do is 'encode' the incompatible character away. The above systems allow you to recreate the original name, but something like replacing the bad chars with spaces also works.

On the same tangent as above, the only one to use is

Uri.EscapeDataString

-- It encodes everything that is needed for OAuth, it doesn't encode the things that OAuth forbids encoding, and it encodes the space as %20 and not + (also in the OAuth spec). See: RFC 3986. AFAIK, this is the latest URI spec.
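
The file-name escaping idea described in this answer could be sketched roughly as follows. The character set is an assumed union of characters that are illegal on at least one common system (plus % itself, since it is the escape marker); it is not an authoritative list, and the names FolderNameEscaper, Escape and Unescape are hypothetical. Non-ASCII characters are left alone here, on the assumption that the target file system handles Unicode.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FolderNameEscaper
{
    // Assumed set of characters to escape; '%' is included so that
    // escaped names can be decoded unambiguously.
    private static readonly HashSet<char> Unsafe =
        new HashSet<char>("<>:\"/\\|?*%");

    public static string Escape(string name)
    {
        var sb = new StringBuilder();
        foreach (char c in name)
        {
            // Escape control characters and anything in the unsafe set as %XX.
            if (c < 0x20 || c == 0x7F || Unsafe.Contains(c))
                sb.Append(Uri.HexEscape(c));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Reverses Escape by turning every %XX sequence back into its character.
    public static string Unescape(string escaped)
    {
        return Uri.UnescapeDataString(escaped);
    }
}
```

With this sketch, Escape("mas|fenix") returns mas%7Cfenix, which has the mas%xxfenix shape asked for in the question, and Unescape restores the original name.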

Gerard ONeill
3

I have written a C# method that url-encodes ALL symbols:

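    /// <summary>
    /// Percent-encodes every character, e.g. !#$345Hf} → %21%23%24%33%34%35%48%66%7D
    /// Requires: using System.Text;
    /// </summary>
    public static string UrlEncodeExtended( string value )
    {
        StringBuilder encodedValue = new StringBuilder();
        // Encode the UTF-8 bytes rather than the UTF-16 chars, so that
        // non-ASCII characters still produce two-digit %XX escapes.
        foreach (byte b in Encoding.UTF8.GetBytes( value ))
        {
            encodedValue.Append( '%' ).Append( b.ToString( "X2" ) );
        }
        return encodedValue.ToString();
    }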
Sergey
1

Ideally these would go in a class called "FileNaming", or maybe just rename Encode to "FileNameEncode". Note: these are not designed to handle full paths, just the folder and/or file names. Ideally you would Split("/") your full path first and then check the pieces. And obviously, instead of a union, you could just add the "%" character to the list of chars not allowed in Windows, but I think it's more helpful/readable/factual this way. Decode() is exactly the same but swaps the Replace arguments, Replace(Uri.HexEscape(s[0]), s), so that the escaped form is replaced with the character.

// Requires: using System; using System.Collections.Generic; using System.Linq;
public static List<string> urlEncodedCharacters = new List<string>
{
  "/", "\\", "<", ">", ":", "\"", "|", "?", "%" //and others, but not *
};
//Since this set contains "*", which UrlEncode() leaves alone, we won't be able to only use UrlEncode() - instead we'll use HexEscape
public static List<string> specialCharactersNotAllowedInWindows = new List<string>
{
  "/", "\\", "<", ">", ":", "\"", "|", "?", "*" //windows disallowed character set
};

    public static string Encode(string fileName)
    {
        //CheckForFullPath(fileName); // optional: make sure it's not a path?
        // Start with the url-encoded characters that are not already in the Windows list (just "%").
        // "%" has to be escaped first, otherwise the "%XX" sequences produced below would be escaped again.
        List<string> charactersToChange = urlEncodedCharacters
            .Where(x => !specialCharactersNotAllowedInWindows.Contains(x))
            .ToList();
        charactersToChange.AddRange(specialCharactersNotAllowedInWindows);

        charactersToChange.ForEach(s => fileName = fileName.Replace(s, Uri.HexEscape(s[0])));   // "?" => "%3F"

        return fileName;
    }
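
The Decode() counterpart is only described in words above, so here is a self-contained sketch of how an Encode/Decode pair might look. It uses the simplification mentioned in the answer (a single list with "%" added to the Windows characters); the class name FileNaming comes from the suggestion above, while the exact list order and loop structure are assumptions. "%" is escaped first when encoding and restored last when decoding, so that round trips stay unambiguous.

```csharp
using System;
using System.Collections.Generic;

public static class FileNaming
{
    // "%" comes first so Encode escapes it before other replacements add new "%" signs.
    private static readonly List<string> charactersToChange = new List<string>
    {
        "%", "/", "\\", "<", ">", ":", "\"", "|", "?", "*"
    };

    public static string Encode(string fileName)
    {
        foreach (string s in charactersToChange)
            fileName = fileName.Replace(s, Uri.HexEscape(s[0]));
        return fileName;
    }

    public static string Decode(string fileName)
    {
        // Walk the list backwards so that "%25" is turned back into "%" last.
        for (int i = charactersToChange.Count - 1; i >= 0; i--)
        {
            string s = charactersToChange[i];
            fileName = fileName.Replace(Uri.HexEscape(s[0]), s);
        }
        return fileName;
    }
}
```

For example, Encode("mas|fenix") returns mas%7Cfenix and Decode("mas%7Cfenix") returns mas|fenix.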

Thanks @simon-tewsi for the very useful table above!

m1m1k
  • also useful: `Path.GetInvalidFileNameChars()` – m1m1k Feb 08 '13 at 22:06
  • yes. Here's one way of doing it: foreach (char c in System.IO.Path.GetInvalidFileNameChars()) { filename = filename.Replace(c, '_'); } – netfed Jun 25 '13 at 02:02
0

In addition to @Dan Herbert's answer, we should generally encode just the values.

Split takes a params char[] argument, so Split('&','=') splits the string on both & and =; the odd-numbered elements are then the values to be encoded, as shown below.

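// Requires a reference to System.Web.
public static void EncodeQueryString(ref string queryString)
{
    // Splitting on both '&' and '=' yields key, value, key, value, ...
    // (assumes every pair contains exactly one '=').
    var array = queryString.Split('&', '=');
    var result = new System.Text.StringBuilder();
    for (int i = 0; i < array.Length; i++)
    {
        if (i % 2 == 1)
        {
            // Odd elements are values: encode them and restore the '='.
            result.Append('=').Append(System.Web.HttpUtility.UrlEncode(array[i]));
        }
        else
        {
            // Even elements are keys: keep them as-is, separated by '&'.
            if (i > 0) result.Append('&');
            result.Append(array[i]);
        }
    }
    // Rebuild the string instead of using Replace, which could also alter matching keys.
    queryString = result.ToString();
}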
Davut Gürbüz