The Power and Responsibility of Unicode Adoption

Communication is difficult. Whether it’s between humans or machines or a combination of the two, trying to translate meaningful information is a lossy process.

Converting programming languages to use the new Unicode standard is hard, but once it’s in place, you get this marvellous feature-add: Emoji compatibility. No longer do we have to make faces with symbols, or be forced to use platform-specific emoticons! Rejoice in the extended character set!

Emoji have a rich history as a way to communicate ideas in a reduced amount of data, dating back to a time when this was important: SMS communications in Japan. However, as social networks feverishly try to clamber onto this bandwagon, their implementations of the standard create more opportunities for miscommunication than were already possible with a 12×12 pictograph. 🤔

Covering both the technical and the social aspects, from mojibake to UTF-8, this talk explains why the extended character set provided by the Unicode standard needs to be treated with responsibility by users and platforms alike.

This talk is not just an excuse to see what parts of the conference stack can’t handle Unicode, I promise. 😇

Unicode covers the writing systems of nearly every language in the world.

In the late 1990s, Japanese mobile phone users had taken to using emoji, simple ideograms that conveyed specific concepts in a single 12×12 pixel picture character.  

Apple made emoji available in Japan to help build the early iPhone’s popularity in that market.

Emoji were not part of Unicode until version 6.0 in 2010, but mobile phone users were already using them.

In 2011, Apple made emoji available internationally in iOS 5 and they became popular worldwide.

By 2014, Unicode 7 had accepted many more emoji, and now most operating systems support them.

Unicode 9 was recently released, with more new emoji.

Different systems and platforms have their own ways of rendering emoji, which creates not only confusion but sometimes much hilarity.

The Unicode Consortium accepts proposals for new emoji and judges them on specific criteria, including backwards compatibility, completion of sets and popular demand.

RFC 3492, also known as Punycode, provides a way to use emoji in the domain part of URLs by encoding them as ASCII strings of letters, digits and hyphens.
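As a rough illustration of the Punycode step, here is a minimal sketch using the `punycode` codec that ships with Python. The helper names `to_ace_label` and `from_ace_label` are made up for this example, and the sketch only covers the RFC 3492 encoding plus the `xn--` prefix; real internationalised domain handling also involves normalisation and validation steps that are left out here.

```python
# Minimal sketch: turn one domain label containing emoji into its
# ASCII-compatible form and back. Helper names are hypothetical.

ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def to_ace_label(label: str) -> str:
    """Return an ASCII-only version of a single domain label."""
    if label.isascii():
        return label  # nothing to encode
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Reverse to_ace_label for labels that carry the xn-- prefix."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


pile_of_poo = "\U0001F4A9"
encoded = to_ace_label(pile_of_poo)
print(encoded)                                 # ASCII-only label
print(from_ace_label(encoded) == pile_of_poo)  # round trip check
```

The encoded label contains only characters that older DNS infrastructure already accepts, which is the whole point of the scheme.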

Include fallback text for emoji and use titles, alt text and ARIA correctly.

It’s possible to combine two emoji into a single glyph, e.g. a white flag plus a rainbow creates a pride flag.
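Under the hood, the combination is a sequence of code points glued together with a zero width joiner. The sketch below spells out the pride flag sequence and lists what it is made of; the constant names and the `describe` helper are invented for this example, and whether the result renders as one glyph depends entirely on the platform's font support.

```python
import unicodedata

# Building blocks of the pride flag sequence (names are illustrative).
WHITE_FLAG = "\U0001F3F3"  # waving white flag
VS16 = "\uFE0F"            # variation selector-16: ask for emoji presentation
ZWJ = "\u200D"             # zero width joiner: glue the neighbours together
RAINBOW = "\U0001F308"     # rainbow

pride_flag = WHITE_FLAG + VS16 + ZWJ + RAINBOW


def describe(text: str) -> list:
    """List each code point in text with its Unicode character name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


for code_point, name in describe(pride_flag):
    print(code_point, name)
```

A platform that doesn't know the sequence falls back to showing the white flag and the rainbow side by side, which is one more way the same message can look different to sender and receiver.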

Some of the more obscure and eccentric emoji are there to ensure backward compatibility with the Wingdings and Webdings fonts.

Early emoji had some issues, including the famous yellow heart that looked like a hairy pink heart in Android 4.4. The cause was a historical heraldic convention of using black and white patterns to denote colour.

Emoji are not supposed to represent brands or logos, but since platforms create their own versions, many emoji do clearly derive from certain brands.

When using Unicode, be sure to use functions in JavaScript that can handle it properly.
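The usual trap is that JavaScript strings are sequences of UTF-16 code units, so `length` and plain indexing count an emoji outside the Basic Multilingual Plane as two units, while `codePointAt`, `String.fromCodePoint` and string iteration work on whole code points. The sketch below models that difference in Python rather than JavaScript; the function names are invented for this example.

```python
# Sketch: compare what a UTF-16 based API counts with the number of
# code points. Function names are hypothetical.


def utf16_code_units(text: str) -> int:
    """Number of 16-bit units needed for text in UTF-16 (what JS length reports)."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Number of Unicode code points (Python strings are indexed this way)."""
    return len(text)


thinking_face = "\U0001F914"
print(utf16_code_units("a"), code_points("a"))
print(utf16_code_units(thinking_face), code_points(thinking_face))
```

Any code that slices or truncates by code unit can cut an emoji in half, so string handling that touches user-visible text should work on code points at the very least.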

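Declare your meta charset as UTF-8. Otherwise the browser has to guess and will often fall back to a legacy encoding such as Windows-1252, turning anything outside ASCII into mojibake.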
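To show what goes wrong when the encoding is not declared, here is a small sketch that reproduces mojibake by decoding UTF-8 bytes with the wrong codec. It assumes the reader falls back to Windows-1252 (`cp1252` in Python), which is a common default but not the only possible one; the `misread_as` helper is invented for this example.

```python
# Sketch: UTF-8 bytes read back with the wrong encoding become mojibake.


def misread_as(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode it as if it were wrong_encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


original = "caf\u00e9 \U0001F607"  # "café" followed by the smiling face with halo
garbled = misread_as(original)
print(garbled)  # accented letter and emoji are replaced by runs of odd symbols

# In this case the damage is reversible, because no bytes were lost.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

In HTML the fix is a `<meta charset="utf-8">` tag early in the document head, so the browser never has to guess.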

The new emoji made available with Unicode 9 are slated for adoption by Google, Apple and Facebook, but they currently only work on Twitter on the web.