Zen of Unicode

I attended David Goodger's Unicode talk at PyCon earlier this year and thought I was well on my way to Unicode enlightenment. It turns out I still need to chop a lot of wood and carry a lot of water before I attain this particular Zen... In the hope that other people will find it useful, here's a mini-tutorial on Unicode in the form of an email message from David, who responded in excruciating detail to some Unicode-related questions I sent him. I tried to copy and paste the text into the Blogger editor, only to get all sorts of markup-related errors, so I just put it on a Trac wiki. Hopefully David will soon publish his Unicode tutorial on the Web. Until then, happy Unicode hacking!


Florian said…
Now consider this:

- You send UTF-8 encoded links to a browser.
- The user clicks on a link.
- What do you get back in the request URL on your web server?

I'll enlighten you:
- In general the URL gets URL-quoted (percent-encoded), so you have to unquote it first.
- IE will return the URL in the encoding you sent it in.
- Mozilla/Firefox will send you Latin-1 if the link contains no characters outside Latin-1. If it does contain such characters, it will send you UTF-8 instead...
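
One way a server might cope with this is sketched below in Python: unquote the path to raw bytes, try UTF-8 first, and fall back to Latin-1 if that fails. The function name `decode_request_path` is made up for illustration, and the UTF-8-then-Latin-1 order is an assumption based on the browser behaviour described above, not a guaranteed rule. Note that a Latin-1 byte sequence that happens to be valid UTF-8 would be decoded the wrong way by this heuristic.

```python
from urllib.parse import unquote_to_bytes


def decode_request_path(raw_path: str) -> str:
    """Hypothetical helper: unquote a request path and guess its encoding."""
    # Undo the percent-encoding first, keeping the result as raw bytes
    # so that no encoding is assumed yet.
    raw = unquote_to_bytes(raw_path)
    try:
        # Strict UTF-8 decoding rejects most Latin-1 text with high bytes.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback: Latin-1 maps every byte to a character,
        # so this decode never fails.
        return raw.decode("latin-1")


if __name__ == "__main__":
    print(decode_request_path("/caf%C3%A9"))  # path quoted from UTF-8 bytes
    print(decode_request_path("/caf%E9"))     # path quoted from Latin-1 bytes
```

Both calls in the usage example should give back the same text, whichever encoding the browser picked.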

