The Power and Responsibility of Unicode Adoption

So, Katie McLaughlin is an operations engineer who's developed in numerous languages and used systems of every nature. But she has a particular passion for and interest in emoji and the relationship with Unicode. So there's kind of a serious side to this as well. So to tell us all about this, and later on come see her 'cause she's got some amazing, if you've got any left, little emoji stickers you might have seen people putting on their name badges. Please welcome, Katie McLaughlin. - All right. Hi! Hello! I love emoji. Do you love emoji? - Yes. - I love emoji. I love how broken it is. Now, emoji is full of bugs, but because I've only got 20 minutes, here's a TL;DR: because computers, we have Unicode. There you go. Unicode is extremely powerful. Nearly every language in the world can use Unicode to be able to communicate, which means that you can send your document that's in Chinese across the internet, and it can be read on the other side. There's no issues with the translation of characters. Now, I'm not talking about internationalization and localization as Greg did. I'm talking about the actual characters getting from one place to another without being garbled, like a really good teleporter without any issues like on Star Trek. However, Unicode didn't work for the Japanese, and they were very upset about this. Everything was fine. They had the Hiragana. They had the Katakana. They had the Kanji. All in this great big million-character slot table. Everything's in there, except they liked their mobile phones back in the '90s, and they had all these little tiny characters that they liked to send to each other. And this is part of the Japanese culture where they could send a horse head or a pig snout face, and they wanted these things in Unicode. So they asked the Unicode Consortium. Isn't that a great name? The Unicode Consortium. They asked them if they could add these wonderful little pictures in Unicode. And the Unicode Consortium said no. Bugger.
But this was back in the '90s. They tried again. In 2007, the Unicode Standard 6 added emoji. So you could have penguins and horses and apples and flowers and such, and nothing happened. No one noticed. And then Apple came along. Apple was trying to make the iPhone in the Japanese market. And, of course, Japanese people want their emoji. So Apple added in emoji, and the Japanese loved it. 2010 comes along, and they went, "Hmm, we have these emoji things. Let's just put it on the U.S. version of iOS 5. No one will notice." They noticed. The story goes that one day, somebody was playing around with their new download on the phone and went, "Oh, I can send a picture of poop to my friends," and then somebody else got the message and went, "How'd you do that? How do I get the poop?" And then it exploded. (laughter) So, Apple didn't create emoji, but Apple were the ones to popularize it in the western world. Since then, we've had a whole lot more emoji coming to Unicode. This is the base set. If your system supports emoji, it'll support the base set from 2010. There have been many more versions of Unicode that have come out since then. In version seven, we had chipmunks and we had Spock and we have men in business suit levitating. (laughter) The reason we have this is, who remembers Wingdings? Who remembers Webdings? Who remembers why there's a businessman in Webdings? No one. However, because backwards compatibility, they needed to have every single character in Wingdings and Webdings in Unicode so you can have consistent backward-compatible translations between all the different systems. So now we have men in business suit levitating. But it doesn't stop there because last year, we got even more emoji. We finally have a unicorn emoji. How awesome is that? And tacos, because tacos. But also, if you notice the last one here, we have a wonderful brown head. The Fitzpatrick color modifiers mean that now you don't have to have Simpsons yellow.
You can have a gradient scale of general human skin tones, and so you can have a rocking sign that looks like you, which is kind of cool. But adopting this throughout all the different platforms has been fraught with issues. And I posit that most of these is because people rushed it. How so? This is a yellow heart. Yellow. Say it with me, "This is yellow." Yellow. On the left here, we have what the standard defines as the yellow heart. Now, these speckles are similar to what you see on old English heraldry shields. This is supposed to mean golden. Different stripes in different directions mean green and red and all the rest of it. This is supposed to denote yellow in a format that can't otherwise represent color. This is yellow. Yellow. Android 4.4 decided to do this. (laughter) So you can see that we're in a bit of a hairy situation here. (laughter) There are so many examples of these kind of issues across different platforms. Here we have some of them. This is supposed to be a flushed face. The first three examples make it look like, "Oh, no, I've done something wrong." The last one looks like I could be taking credit, and I'm being embarrassed for something, going, "Na-ah." Depending on which platform you're using, you can have an entirely different interpretation of what this is supposed to mean. Or, such as this one. This is hugging. I look at that and I just see, "Micro-services." (laughter and applause) Thank you. I'll be here all week. (laughter) I'm a kid of the '90s. I'm used to seeing hugging like this, from MSN Messenger. (laughter) That's supposed to be hugging. That's supposed to be hugging. Not just hands. And, okay, so, we've been clapping a lot already. We've been doing half a day of talk so far. When you clap, you press your thumb and your other thumb together, and your palms face each other. The example on the right there, that's Windows. Have you ever tried clapping your hands with the thumbs facing the wrong way? 
(laughter) Just don't do that too much if you've got like rings on and stuff 'cause it starts to hurt. But (clears throat) do Microsoft people not have hands? (laughter) But the cross-platform issues, you only see this if you're using multiple systems. But when it gets really confusing is when it's like, "Oh, what's this character?" This is blonde. (laughter) There's a ginger there. There's a brunette. Yeah. So you can see how this is getting really, really confusing, especially because the emoji on the far right is a question mark. (laughter) According to Facebook, this is a question mark. They have fixed it in some places now, but, yeah. This all culminates into a scientific paper that was released a couple of months ago from the University of Minnesota where they used this as an example of the issues of misinterpretation. These are two completely separate emoji. The only difference here is the eyes. The left one is grinning. The right one is grimacing. (laughter) Unless you're familiar with Japanese culture, you wouldn't understand that the semicircles are smiling eyes. And so, if you send the wrong one to someone, it's like, "Oh, I just ran over my cat. Yi-hee!" What? (laughter) So how do we even get new emoji anyway? I mean, how did we end up with the Unicode? Well, the Unicode Consortium. No. Yes, the Unicode Consortium, I nearly said Emoji Consortium, they do stuff other than emoji, but this is fun enough to talk about. So the Unicode Consortium will take applications from the public for people that want to submit new candidates for emoji. Things that will get you accepted are backwards compatibility such as the cowboy from Yahoo Messenger, which is now available. Completing sets. In one version of Unicode, you didn't have all the zodiac symbols, which is kind of silly. And they will take suggestions from the public if they get bombarded enough, which is how you get unicorns and tacos now because tacos.
Things that won't get you accepted, though, if it's too specific. So we already have one martini and tropical drink, so we're not going to get Manhattans any time soon. You're never ever allowed to have logos or brands. Never ever ever. So there is absolutely no way that you can tell that these are Apple products at all, ever. (laughter) And you're also not supposed to have any fads or memes, which is how the Vulcan sign got in there, obviously. The thing is, though, with all these different versions of Unicode coming out, you're going to have an issue between new and old systems. This is also known as Mojibake or Mojibake because I'm a westerner and I absolutely trash the English language. Depending on which system you are using, you're not going to be able to see the new symbols. Normally, they will show up as a replacement character, glyph, or they might not show up at all. Places where emoji have to work, though, are in URLs. This is a real website. The way that we can get emoji in URLs is using RFC 3492, also known as Punycode, which allows you to use Latin characters in place of Unicode ones, which is how you're able to get domain names in Chinese and Japanese and such. So in this particular example, the spoon is this combination of letters. The RFC tells you exactly how to translate it all. In JavaScript, now, JavaScript is fun. JavaScript was originally created around the same time that Unicode was still getting standardized, so there are some limitations. In JavaScript, if we want to split a string, we can split a string of characters and we can go from ABC to an array of A and B and C. That's fine. This doesn't work with emoji by default. If we're trying to split the chicken and a chipmunk, we end up getting broken things. So if you're going to be using Unicode, make sure you're using functions in JavaScript that can handle Unicode properly. Yeah. Input. So, how do we actually get the emoji into our systems anyway if you're a user? 
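To make the JavaScript splitting problem described a moment ago concrete, here is a minimal sketch rather than the code from the slide, which the transcript does not preserve. It assumes a runtime with ES2015 support, and the variable names are illustrative. The chicken and chipmunk are written as code point escapes (U+1F414 and U+1F43F) so the snippet stays readable even where emoji fonts are missing.

```javascript
// Two emoji that live outside the Basic Multilingual Plane:
// U+1F414 CHICKEN followed by U+1F43F CHIPMUNK.
const animals = "\u{1F414}\u{1F43F}";

// split("") cuts between UTF-16 code units, so each emoji is torn into
// two lone surrogates, which display as broken characters.
const byCodeUnit = animals.split("");

// The string iterator (used by Array.from and the spread operator)
// walks code points instead, so each emoji stays in one piece.
const byCodePoint = Array.from(animals);

console.log(animals.length);     // 4 (code units, not characters)
console.log(byCodeUnit.length);  // 4 broken halves
console.log(byCodePoint.length); // 2 whole emoji
```

Code point iteration is still not the whole story: an emoji built from several code points, such as a skin-tone variant or a family group, will be split into its parts by `Array.from` as well. Newer runtimes offer `Intl.Segmenter` for splitting on what a user perceives as one character.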
If you've got a mobile phone, normally you have an Enter key with like a little smiley face on it. Depending on whether you're using Apple or Android, it will be slightly different. If you're using a Mac, you can have the combination key of Command, Option, and Space, and you can bring up a nice little searchable text box, and that's all cool. This doesn't work on every system, though. But one blokey had a really good idea and he made an emoji keyboard. 14 keyboards, all connected together, plugged into a laptop, with a little bit of AutoHotkey and a little bit of Lua script, and he could have every single emoji at a click of a button. You might think this is crazy. This is the same guy that made an entire social network based on emoji. Your username had to be emoji. Not your password, if I recall right, but your username and everything had to be emoji. So you can imagine all the fun that he had trying to get this to technically and culturally work. It doesn't exist anymore. (laughter) Now, on the web, because we all have these different systems that all go into the web, someone thought it would be a really good idea to have a way to input emoji that was universal, and they called it shortcodes. Shortcake, shortcodes. Do you know how hard it is to find emoji puns to everything in this talk? Anyway, if you're on Slack or if you're in GitHub, you can use colons and you can say colon cake colon, and you get a cake. That's kind of cool. If you're on HipChat, though, HipChat don't use colons. They use brackets. So you have to go, open bracket cake close bracket, and then you get a cake. But the cake is not equal to cake. They're different cakes. And, also, these aren't emoji. These are emoticons and/or shortcodes. In Slack, in HipChat, you have the ability to add custom shortcodes. These are not emoji. Your party parrot is not Unicode-compliant, people. I'm sorry.
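The difference between the colon and bracket styles can be sketched as a small converter. This is an illustration under stated assumptions, not how Slack, GitHub, or HipChat actually implement shortcodes: the lookup table, the function name `expandShortcodes`, and the `enabled` switch are all hypothetical.

```javascript
// Hypothetical lookup table. Real services ship much larger lists, and
// each maintains its own, so the same name can map to different pictures.
const SHORTCODES = new Map([
  ["cake", "\u{1F370}"],    // U+1F370 SHORTCAKE
  ["taco", "\u{1F32E}"],    // U+1F32E TACO
  ["unicorn", "\u{1F984}"], // U+1F984 UNICORN FACE
]);

// One pattern per delimiter style; the capture group is the shortcode name.
const STYLES = {
  colon: /:([a-z0-9_+-]+):/g,     // e.g. :cake:
  bracket: /\(([a-z0-9_+-]+)\)/g, // e.g. (cake)
};

// Replace known shortcodes with their Unicode character. Unknown names
// (a custom party parrot, say) are left as typed because they have no
// Unicode character to map to. Passing enabled = false skips conversion.
function expandShortcodes(text, style = "colon", enabled = true) {
  if (!enabled) return text;
  return text.replace(STYLES[style], (match, name) =>
    SHORTCODES.has(name) ? SHORTCODES.get(name) : match
  );
}

console.log(expandShortcodes("have some :cake:"));
console.log(expandShortcodes("have some (cake)", "bracket"));
console.log(expandShortcodes("no thanks :cake:", "colon", false));
```

Leaving unknown shortcodes untouched is a deliberate choice in this sketch: it keeps the user's original text intact instead of guessing.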
(laughter) And if you're going to be making systems that want to automatically convert these things, for the love of all things, please let me disable autocorrect because I can't tell you how many times I've been like, "Ah, this is fine," smiley, and it just takes it right to the extreme of the "I'm really happy with this situation" thing. Or, if I want to use a legacy emoticon for embarrassment, I don't want to be getting money mouth. I mean, seriously, let me disable it, and we can be friends again. So, reading things back. You get your emoji and now you want to read it back. On the web, we have absolutely complete control over what we're doing, and we can take advantage of this. If I can make one suggestion, one suggestion, just fall back. Do not rely on your browser automatically putting in the Unicode for you. It'll look different across different systems. People won't have the right emoji characters locally installed. Just do it for them. This is running LCAV. This is actual emoji. This is the only slide that actually has emoji. The rest have been pictures. We have a question mark and an exclamation mark and an arrow that don't look like emoji. They look like symbols. This is how Apple translates it. I've got emoji back here. And I'm on the most recent version of Mac. What the heck? If you want to try this out with your system, I've made a little wonderful website where you can test it out and so you can see what version of Unicode compatibility you have. If you're going to be doing things on the web and you want to go full interoperability, you need to use fallbacks. You also need to use highlights and mouseovers and think about your web accessibility. So if we combine all these things together, if you want to, say, have an emoticon, an emoji of a whiskey, which is now an emoji because of course it is, you're going to need to do something like this. We have image tags. We have the source tag for PNG. Use high resolution PNGs.
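The slide's markup is not captured in the transcript, so the following is only a sketch of the kind of image fallback being described. The helper names, the file path, and the choice of U+1F943 (Unicode name "tumbler glass") for the whiskey are assumptions. The idea is that `alt` carries the emoji character itself, while `title` and `aria-label` carry a human-readable description.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an <img> fallback for a single emoji.
//   char: the emoji itself, stored in alt so copy and paste keeps the character
//   name: a description, shown as a tooltip and exposed to screen readers
//   src:  path to a PNG rendering of the emoji (hypothetical path below)
function emojiImg({ char, name, src }) {
  return (
    `<img src="${escapeAttr(src)}" alt="${escapeAttr(char)}" ` +
    `title="${escapeAttr(name)}" aria-label="${escapeAttr(name)}">`
  );
}

const whiskey = emojiImg({
  char: "\u{1F943}", // U+1F943 TUMBLER GLASS
  name: "tumbler glass",
  src: "/img/emoji/1f943.png", // assumed file layout, not from the talk
});

console.log(whiskey);
```

Generating the tag from one function keeps the three text attributes in sync, so the tooltip, the screen reader label, and the copied character never drift apart.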
Don't use SVGs because they're not completely cross-browser compliant. Then what you have is you have your title which when you mouseover an image that has a title, it'll pop up in a little tooltip and that allows you to actually describe what the emoji is, which means I can tell the difference between grinning and grimacing even if I can't see it. Titles and alt text do have different properties. If you copy and paste and select an image, you are going to copy the alt tag into your clipboard, which means that if you have inline emoji, you can copy that and put it into another system, and you have loss-less data transfer. Use that. Extra bonus points if you use your accessibility aria-labels so that screen readers can actually read it out for you. And, of course, if you're not doing this already, you're limiting your application to only a small segment of the market. Make sure that you're declaring that your meta charset is UTF-8 because otherwise it'll be ASCII by default and you won't get emoji, you won't get anything apart from your main ASCII-character Latin text, and you're doing yourself a disservice if you don't automatically have this set. This has been your public service announcement. Use that. So, the future. Guess what we're getting. New ones, yes! A couple of weeks ago now, Unicode 9 dropped, and we get new emoji. We have bacon now. (laughter) And we have an egg. Chicken's been in since version six, so now you know which came first. (laughter) We also have such timeless classics as shrug and face palm, because, fads. These will work on Twitter. Google adoption is coming. Apple adoption is coming. Facebook adoption is coming. But Twitter will work on the web for all of these. Also, there are some sports thing on sometime soon, so now we have emoji for water polo and wrestling and stuff. That's cool. But they've also already put in some shortlist for what's going to be in next year's version, including the Stephen Colbert emoji.
(laughter) He's trying to take credit for that. But also a whole bunch of other things that are already in there. And you don't have to wait until the new version of Unicode comes out because people are updating their emoji all the time. Depending on what version of Android or Facebook, depending on what platform you're on, you're going to see different emoji. So this is all the flushed face from earlier in different versions of Android. I'm not sure about the last one. I think it's supposed to be a duck. (laughter) Yeah. Facebook have recently updated all their emoji. Depending on which platform you're using, you're going to see different ones. You can still see the top row if you go into the actual messaging part, at least on web, when I checked it earlier, on the mobile platforms and on the little popup dialogue. It's the new really weird ones. Everyone is just like looking for it and they're like, "We're going to go that way." (laughter) Anyway. Oh, and Windows. Windows are going to fix their emoji, which is great because they looked creepy. This is all the Windows 10 insider preview, so whenever the anniversary update of Windows comes out, you're going to get all the cool emoji. But when you update to that Windows, you're also going to get cats. Microsoft decided that they were going to add cats as emoji. (laughter) So that's a thing that happened. Cats riding dinosaurs, yeah. (laughter) This is a combination character. So you've previously been able to have family groups, so if you have man, woman, boy, girl, you can combine into a family group as a single glyph. There is the ability to do this already for family groups, and Microsoft want to do it for cats, apparently. However, there are some emoji that are missing but you can have vendor support if you nag them enough. You do have the ability to use this already, such as the pride flag. There is already a white flag. There's already a rainbow.
They just could, if they wanted to, say, "If I see these two together, I'm going to turn it into a pride flag." This is completely valid. And, also, Google have just gotten their gender-occupation proposal in. So if you were to have girl and cooking, you can have chef. There's an entire proposal about all the different gender equality emoji. And this is all able to be adopted by vendors, but the Unicode Consortium are making their suggestion. So, oh, and ECMAScript 6, this was a proposal that was just released for consideration for ECMAScript 6 last week. Being able to use, to be able to test what kind of Unicode characters you have, that's kind of cool. That one's by Mathias. I can't pronounce his name. There's a link to these slides that I'll be tweeting out later if you want to look up how to use regexes properly in JavaScript when it gets dropped, that will be cool. So, some takeaways. There's a lot of power in the human speech and there's a lot of things that we can do with all the crazy different characters that we have access to. But you need to implement responsibly, but don't forget to have just a little fun as well. Thank you very much. (applause) - Thanks so much, Katie. - Do we have time for questions? - We might have time for one while we set up Marcos. So who's got a question about Unicode, emoticons, emoticodes, emoji codes? So, and there's modules and how you can just use... - I have a question for myself. - Emoji as your code there. - I have a question for myself. Can I ask a question for myself? - Question without notice to myself. - Okay. Are there any JavaScript npm packages that will just automatically do this for you? Answer, no. I have checked, and I cannot find one that ticks every single box that I would like, mostly because a lot of the stuff you should be doing server-side, there is a lot of data included in actually getting all the descriptions of all the emoji there, and it's 22 megabytes of plain text.
You really don't want to be putting that client-side. If somebody knows of a server-side pre-render for whatever database widget server thing, do let me know, but I don't know of anything yet. And, no, I'm not making one. The last time I said I would make one onstage, I ended up making one. I'm not going to do that again. - All right, we've got one question. - Oh, question. - Apart from Emojli, do you know any service that allows people to use emojis as usernames? - Where was that voice coming from? There it is, hello! Emojis as usernames, okay. So, on Twitter, you can have some emoji as your display name. As usernames, it would depend whether they are Unicode-compliant because if you can have an accent in your name when you register, you should be able to use emoji, unless they have a whitelist. Oh, also, whoever's got the triple skull wi-fi hotspot around here, you're cool. (laughter) - Thank you. - You're welcome. (laughter) - It had to be Tony Caldwell. (laughter) All right. Once again, thanks, Katie. Fantastic and engaging and awesome. - Thanks.