
I'm trying to implement a basic HTTP server that renders a web form and processes the posted data when the user hits submit. The code is hosted here: http://www.codeproject.com/Articles/137979/Simple-HTTP-Server-in-C

However, I'm getting the POST data in quoted-printable format, which I don't want to mess with. How can I get everything wrapped in UTF-8 encoding instead?

Maybe you can shed some light on my particular problem, or suggest a different approach to presenting a web form and handling the submitted data. Thank you anyway. Sample web form and post data

Serhan BAKIR

1 Answer


You are mixing apples and oranges. The content transfer encoding encapsulates character data in a robust wrapper which escapes any characters with significance on the protocol level. This is regardless of the character set in the data. Inside the quoted-printable data, you could find character data in any character set (or binary data, though then base64 is usually the content transfer encoding of choice).

So to reiterate, you don't get an either/or choice: the character data you receive will have both a content transfer encoding and a character encoding. The content transfer encoding could be transparent (MIME calls this "binary", though the "8bit" and "7bit" content transfer encodings are also transparent, albeit only suitable for certain constrained types of data). The character encoding could be "US-ASCII", aka plain old 7-bit character data with a very restricted character repertoire; "ISO-8859-1", which is 8-bit and thus has a slightly larger but still restricted repertoire; or one of the Unicode encodings, such as "UTF-8".
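To make the two layers concrete, here is a minimal sketch in Python (chosen only for brevity, not because the server in question uses it). It relies on the standard library `quopri` module; the helper name `decode_body` and the sample byte strings are made up for illustration. The quoted-printable wrapper is removed first, and only then are the resulting bytes interpreted in a character set.

```python
import quopri


def decode_body(raw: bytes, charset: str) -> str:
    """Hypothetical helper: undo the transfer encoding, then the charset."""
    # Layer 1: content transfer encoding (quoted-printable -> raw bytes).
    unwrapped = quopri.decodestring(raw)
    # Layer 2: character encoding (raw bytes -> text).
    return unwrapped.decode(charset)


# The same text, wrapped in the same transfer encoding,
# but carried in two different character sets.
utf8_text = decode_body(b"caf=C3=A9", "utf-8")
latin1_text = decode_body(b"caf=E9", "iso-8859-1")
```

Both calls should yield the same string even though the escaped bytes differ, which is the point: quoted-printable says nothing about which character set is inside it.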

There really is no way for you to handle HTTP without also handling this aspect of MIME.

tripleee
  • I knew that at some point I had misunderstood something about the encapsulation. Thank you for making it clear! However, there seems to be no proper tool to unwrap that character data. Can you suggest how to convert quoted-printable POST content? – Serhan BAKIR Jan 18 '16 at 08:52
  • The quoted-printable encoding is extremely simple, and is documented in full in section 6.7 of [RFC2045](https://www.ietf.org/rfc/rfc2045.txt); in brief, any byte may be encoded as an ASCII sequence consisting of an equals sign followed by two hexadecimal digits. Also, the sequence of an equals sign followed by a line break is simply removed on decoding, which allows transparent line wrapping to a limited width. I'm sure it can't be hard to find a C# library which implements this, but it's not a language I am familiar with. – tripleee Jan 18 '16 at 09:32
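
The two rules described in the last comment can be sketched by hand as follows. This is an illustrative Python version rather than C#, and not a complete RFC 2045 decoder: the function name `decode_quoted_printable` is hypothetical, trailing-whitespace handling is omitted, and malformed sequences (an equals sign not followed by two hex digits or a line break) are simply passed through unchanged. A port to C# should be mostly mechanical.

```python
import re

# "=" followed by a line break (soft line break), or "=" followed by
# two hexadecimal digits (an escaped byte). Matching both in a single
# pass avoids misreading a decoded "=" as the start of a new escape.
_TOKEN = re.compile(rb"=(?:\r?\n|([0-9A-Fa-f]{2}))")


def _replace(match):
    hex_digits = match.group(1)
    if hex_digits is None:
        # Soft line break: remove it entirely.
        return b""
    # Escaped byte: "=C3" becomes the single byte 0xC3.
    return bytes([int(hex_digits, 16)])


def decode_quoted_printable(data: bytes) -> bytes:
    """Return the raw bytes carried inside quoted-printable data."""
    return _TOKEN.sub(_replace, data)


# The result is still bytes; decoding to text is a separate step
# that depends on the declared character set.
text = decode_quoted_printable(b"na=C3=AFve =\r\ntext").decode("utf-8")
```

The function deliberately returns bytes rather than a string, so the character set can be applied afterwards as its own step.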