Archive for May 4th, 2009
Character entities
Posted by Jurgen in Character Encoding, Web standards on May 4th, 2009
As in real life characters that build written language differ from system to system. Ελληνικά characters differ from Русский, 汉语 and Latin characters. Fortunately these character sets have been standardized and called alphabets. The same goes for character sets in the digital world. As computers can only process binary data, all characters are mapped to a number. In the early days such a mapping of the Latin alphabet, along with some other graphical ‘characters’, digits and control characters (e.g. escape, tab, line feed, carriage return) was standardized. This standard is known as the American Standard Code for Information Interchange (ASCII) and was developed by the American Standards Association (currently: ANSI). This 7-bit encoding lacked digital representations for many characters of e.g. foreign characters (as respectively Greek, Russian and Chinese are mentioned above) but also accents like å, è, ï, ó and û were not represented in the set. But as you can see in this paragraph, improvements have been made to facilitate such ’special’ characters. Read the rest of this entry »