Monday, December 27, 2010

UTF-8 and UTF-16???

UTF-8 and UTF-16 are two different ways of representing Unicode characters. To briefly touch on the technical side: in UTF-8, anywhere from one to four 8-bit units are used to represent a particular Unicode character. In UTF-16, one or two 16-bit units (in either big-endian or little-endian byte order) are used to represent a particular Unicode character. [There is also the UTF-32 format, which uses a single 32-bit unit for every character. It is rarely used for files or web pages and mostly shows up inside programs, for example on Unix-like systems.]
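To make the unit counts concrete, here is a small Python sketch. The helper name `code_unit_counts` and the sample characters are my own choices for illustration. It encodes one character at a time and counts how many 8-bit, 16-bit, and 32-bit units each format needs.

```python
def code_unit_counts(ch: str) -> dict:
    """Count the code units needed to encode one character in each format.

    The "-le" codecs are used so that no byte order mark is added,
    which would otherwise inflate the counts.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit units (bytes)
        "utf-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit units
        "utf-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit units
    }


# Latin letter A, e with acute accent, Tamil letter A, and an emoji.
for ch in ["A", "\u00e9", "\u0b85", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", code_unit_counts(ch))
```

The Tamil letter needs three units in UTF-8 but only one in UTF-16, while the emoji needs four units in UTF-8 and two in UTF-16.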

UTF-8 is the most popular and standard format for web pages. So, if you are designing web pages, you can always opt for conversion to UTF-8 (which is the default setting in azhagi's 'unicode converter'). For other purposes, you may choose UTF-8 or UTF-16 according to your specifications and requirements.
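If you ever need to do such a conversion yourself, outside of any converter tool, a minimal Python sketch could look like the one below. The function name `utf16_to_utf8` and the sample word are assumptions made for illustration, and this is not how azhagi does it internally. It assumes the UTF-16 input begins with a byte order mark, which Python's "utf-16" codec uses to detect the byte order.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8.

    Assumes the input starts with a byte order mark (BOM); the "utf-16"
    codec reads it to determine the byte order and then drops it.
    """
    text = data.decode("utf-16")
    return text.encode("utf-8")


# The Tamil word for "Tamil", written with escapes (5 code points).
original = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"
sample = original.encode("utf-16")  # Python prepends a BOM here
converted = utf16_to_utf8(sample)

print(len(sample), "bytes as UTF-16 (with BOM)")
print(len(converted), "bytes as UTF-8")
```

Note that for Tamil text the UTF-8 result is actually larger than the UTF-16 input, since each Tamil character takes three bytes in UTF-8 and two in UTF-16. That is one of the things to weigh when the text is not meant for a web page.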