Unicode: now, later, or beyond?
I've known about Unicode and internationalization issues for many years, but I still haven't made up my mind about them. After reading this interesting post on the topic, the question returns to haunt me: is it finally time to bite the bullet and adopt full-bore 16-bit Unicode, or can we continue to defer the issue? And now a deeper question occurs to me: is there something significantly beyond Unicode that we should be considering?
Personally, I think the idea of fixed-width character strings is an archaic artifact of our computational "early years", and it's time to move on. Take a look at HTML, XML, or SGML. They have the concept of a "character entity", where an extended character code or even a name for a character can be encoded. I'm not suggesting that we adopt XML as our new character representation format, but it's at least worth considering, and it's already there as a high-level external representation format.
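To make the "character entity" idea concrete, here is a minimal sketch in Python (my choice of language for illustration, not something the formats themselves dictate). It uses the standard library's `html.unescape` and `html.entities.name2codepoint`. The helper name `decode_entities` and the sample strings are made up for this example.

```python
import html
from html.entities import name2codepoint


def decode_entities(text: str) -> str:
    """Replace named and numeric character references with the characters they denote.

    decode_entities is a hypothetical helper name; it simply wraps the
    standard library's html.unescape.
    """
    return html.unescape(text)


# Three spellings of the same word: a named entity, a decimal code, a hex code.
samples = ["caf&eacute;", "caf&#233;", "caf&#xE9;"]
for sample in samples:
    print(sample, "->", decode_entities(sample))

# The named entity maps to a numeric code point behind the scenes.
print("eacute is code point", name2codepoint["eacute"])
```

All three spellings decode to the same four-character string, which is the point: the external representation can name a character without committing to how many bits it takes to store.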
Oddly, people are still concerned about the storage space and performance of 16-bit characters. Geez, get over it already.
Think about it: 8-bit character codes, they fit in a "byte"... how quaint and useless for computing in the 21st Century.
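For anyone who wants to put a number on the storage worry, here is a rough sketch in Python that counts the bytes the same text takes in a few encodings. I'm assuming Latin-1 as a stand-in for "8-bit character codes" and UTF-16 little-endian (no byte-order mark) as a stand-in for "16-bit characters". The function name `encoded_sizes` and the sample strings are invented for the example.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes `text` occupies in each encoding.

    Assumption: "latin-1" models an 8-bit character code and "utf-16-le"
    models 16-bit storage. The text must fit in Latin-1 or encode() raises.
    """
    encodings = ("latin-1", "utf-8", "utf-16-le")
    return {name: len(text.encode(name)) for name in encodings}


for sample in ("Unicode", "na\u00efve caf\u00e9"):
    print(repr(sample), encoded_sizes(sample))
```

For text like this, the 16-bit form is exactly twice the size of the 8-bit form. Whether a factor of two matters is the judgment call.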
-- Jack Krupansky