Conversational AI Has a Massive, UX-Shaped Hole
Introduction
Peter, Senior Conversation Design Advocate at Voiceflow, introduces himself and the company. He highlights Voiceflow's evolution from an Alexa skill-building tool to an end-to-end conversational AI platform used by major brands like Home Depot and BMW. Peter emphasizes the importance of UX in conversational AI.
The Importance of User Experience
Peter uses the examples of Cartrivision and VCRs to illustrate the crucial role of user experience in technology adoption. He argues that despite technological advancements, poor UX can hinder even the most promising innovations. He connects this to the current state of conversational AI and generative AI.
Rethinking Content Consumption
Peter critiques the limitations of linear content consumption, arguing that it doesn't reflect the non-linearity of human thought. He proposes that generative AI and conversational AI can create more dynamic and intuitive experiences that adapt to individual users.
Micro UIs and Ephemeral Apps
Peter introduces the concepts of micro UIs and ephemeral apps. He describes micro UIs as dynamic UI fragments that adapt and change in response to user interactions. He explains ephemeral apps as transient applications created for specific tasks that disappear after use, reducing digital clutter.
Micro UI Example: Booking a Holiday
Peter presents a scenario of booking a holiday using micro UI. Instead of a traditional search, the user interacts with the UI conversationally, specifying preferences like "beach retreat" and "vegan." The UI dynamically adapts, presenting personalized recommendations for destinations, activities, and restaurants.
Ephemeral App Example: The Ultimate Book Club
Peter illustrates the concept of an ephemeral app by organizing a book club event. The app, created for this specific event, suggests books, generates thematic decor and costume ideas, curates recipes, and even designs trivia games. After the event, the app sends thank you notes and fades away.
Real-World Examples and the Importance of UX
Peter showcases existing generative AI tools like Google Duet for AppSheet, Dora AI, and Shopify's Sidekick Assistant that demonstrate the potential of this technology. He emphasizes that while generative AI is a powerful tool, it should not replace human insight. UX professionals are crucial in guiding the ethical and user-centered development of these technologies.
The Future of UX and the Need for Collaboration
Peter envisions a future where conversational AI powers highly personalized and intuitive interfaces. He stresses the need for collaboration between UX designers, conversation designers, researchers, developers, and product owners to create truly human-centered experiences.
Conclusion
Peter concludes by reiterating that conversational AI extends beyond chatbots; it's about replicating the richness of human communication in digital interactions. He calls for a collective effort to shape the future of UX, making technology more intuitive, responsive, and ultimately, more human.
I'm Pete, as you have all heard.
So I thought I'd just quickly explain a bit about Voiceflow and what we do.
So I'm the Senior Conversation Design Advocate there.
Voiceflow started as an Alexa skill-building tool.
We quickly moved into conversation design for enterprises, so like a Figma but for conversational apps.
And now we do the whole end-to-end sort of thing, so you can launch your assistants and all that sort of thing.
And we work with brands, some pretty big brands, so Home Depot, Woolworths, BMW, Mattel, US Bank, Alexa, Salesforce, and Spotify.
All doing various things as well.
And, yeah.
As I said before, we're like Figma, but for talking to things.
Today I want to talk to you about how conversational AI has this large gaping hole in it, which is UX.
There aren't enough UX practitioners in the field.
And I want to talk about why UX and UI designers need to start looking past chatbots and voice interfaces as the main modality for conversational AI.
And start talking about what the interfaces of tomorrow could look like, especially if they're powered by conversational AI.
But first, I want to take a trip back to 1972.
If any of you were around in 1972, first, it makes you a bit older than me.
But you might also remember a thing called Cartrivision.
Cartrivision was a system that promised to revolutionize the way that we consume TV shows, allowing you to record and do all that sort of stuff.
And when it came out, it was actually a huge technological leap.
However, it had a massive problem.
It was really hard to use.
To record a show, you actually had to use two hands, you had to have one pressing buttons, another turning knobs.
It only recorded every third frame, and overall it was very expensive, and it just failed, leaving a lot of its promises unfulfilled.
And that gave rise to the VCR.
Now the VCR had a solution to the two-handed recording problem: you just pressed play and record, and I've got a lot of fond memories of recording some of my favorite shows.
I used to tape my favorite basketball team, the Indiana Pacers, all the time, and I watched The Blues Brothers probably about a hundred times; from memory the recording was from Channel 10, but anyway, I digress. VCRs also had a massive problem of their own.
You had to be home to record and you couldn't skip ads.
And for those of you that did have the ability to pre-program a VCR, you needed to work out how to use the clock.
Which, from my memory of VCRs, required a PhD in astrophysics.
The result was that most VCRs just had that blinking 12:00 going on forever.
Funnily enough, actually, I noticed in my hotel today that the oven had a blinking zero.
So it's obviously still a problem.
So what these examples show is not just that these are technological artifacts, but that they hold crucial lessons about the importance of user experience.
They demonstrate that no matter how big your technological advancements are, they can encounter obstacles if you don't have great user experience.
So fast forward to today: I want to start looking at conversational AI and generative AI, and how the technology is really here to do some amazing stuff, but we have this massive UX-shaped hole that needs to be addressed.
So that generative AI and conversational AI don't fall victim to that same blinking-12 syndrome.
And to do this, I think we need to start rethinking how we consume content.
So the way we consume content is very linear.
Left to right, or right to left in some cultures, top to bottom.
We have our H1s, 2s, 3s, 4s, body copy, hyperlinks, buttons, images, captions, etc.
And this actually makes a lot of sense.
It's really familiar, and we're used to consuming information in this way.
When technology had a lot of limitations, that top-down linear approach made a lot of sense.
And it's grounded in what we have in the real world.
And to be fair, there is actually nothing wrong with this.
One of the things a UX designer should always strive for is to create familiar user interfaces, so that they're easy to use.
The only problem with this current paradigm is that the user isn't actually in control at all.
The brand determines the most important content.
Rightly or wrongly, and then displays it to you.
And then it's up to the user to take whatever visual cues they can get from the website and go down whatever rabbit holes they need to, to find out what they need to find out.
My belief is that Gen AI, along with conversational AI techniques, will give UX and UI designers the tools to actually escape this kind of way of thinking.
So I want to take a look at the nab.com website, and I want to start off by saying this isn't a takedown of nab.com.
I'm a NAB customer, and I used to write for NAB.
Their website's actually pretty good, and they care a lot about accessibility and all those sorts of things.
But the other day I went to nab.com.
And I went to check out my particular home loan.
What you're seeing on the screen at the moment is what I saw.
And had I known that I could just scroll down, I probably would have found what I was looking for a lot quicker.
But instead, I didn't see it straight away, so I jumped into the information architecture.
I dug through home loans and found my specific loan and then tried to find the answers to my questions that I had.
And, it wasn't really the most efficient way to do things.
It took me about 15 to 20 minutes to find the information that I needed so that I could make a decision about what I needed to do.
And the experience overall, it was fine.
It was shaped by the designer and this linear kind of way of going through a website.
And, it reflects this traditional approach that we've had to content.
But it doesn't really resonate with how we naturally think and explore.
Because ultimately humans aren't linear.
Our thoughts, our inquiries, our desires don't go in straight lines.
They branch, they loop, they connect back, and they do all that in very unexpected ways.
We ask for directions, we ask questions, we seek clarifications, and we often change direction as we go.
And this non-linearity in the way that we think is what makes us human, and yet when it comes to design, and to be fair, computers in general, it's often overlooked.
So wouldn't it be nice if we could start creating experiences that mirror this complexity and richness of human thought?
Imagine if I could have gone to the nab.com website and said, what's your cheapest home loan?
Or I could have asked the specific question about the home loan that I have, and it gives me back clarifying questions, and the UI morphs and changes as I start talking to it.
And so by powering our interfaces with generative AI and conversational AI techniques, we can start adapting user interfaces to this sort of non linear way of thinking.
We can move from what Jakob Nielsen described as command-based interaction design to intent-based outcome specification.
I've got to say, for someone who does UX, he comes up with really bad names for these things.
But, I want to pause quickly.
And I want to bring this back home. When I talk to UX and UI professionals, generally I say that I do conversational AI.
To be fair, not as much in the past year since ChatGPT came around, but I would say that I do conversation design, and they would get this glazed look on their face and go, oh, chatbots.
And I get that because most chatbots are shit.
And I also get it because, for a UI or UX designer, designing a chat widget isn't actually the most interesting thing.
And I too, if I was doing those things, would probably much prefer to design a beautiful web page that is visually appealing.
But if there's anything that any of you can take away from this, it's that conversational AI doesn't have to be a chat interface.
It doesn't have to be a voice interface.
It can power an interface in general.
It can be a supplementary thing that allows you to do so many more cool things.
I keep thinking about Star Trek.
How Picard would go, computer, and then say the thing, and then it happens.
That's where I'm looking for us to go, where we have these interfaces that are context sensitive, dynamic and honestly just far more intuitive than what we currently have.
So as I mentioned, Jakob Nielsen has been talking about this thing called intent-based outcome specification.
And essentially this is: a user says what they want, or the outcome they desire, and it happens.
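To make the contrast concrete, here is a minimal, self-contained sketch of intent-based outcome specification, with a simple keyword match standing in for the intent classifier an LLM would provide. Every intent name, outcome, and rate shown is hypothetical.

```python
# Toy sketch of intent-based outcome specification: the user states the
# outcome they want, and the system works out the steps. A keyword match
# stands in for an LLM classifier; all intents and outcomes are made up.

OUTCOMES = {
    "cheapest_home_loan": lambda: "Basic Variable Rate: 5.99% p.a.",
    "open_savings_account": lambda: "Savings account application started",
}

KEYWORDS = {
    "cheapest_home_loan": ["cheapest", "loan"],
    "open_savings_account": ["open", "savings"],
}

def classify(utterance):
    """Map a free-form request to an intent (stand-in for an LLM)."""
    words = utterance.lower()
    for intent, keys in KEYWORDS.items():
        if all(k in words for k in keys):
            return intent
    return None

def fulfil(utterance):
    """The user names the outcome; the system does the navigating."""
    intent = classify(utterance)
    if intent is None:
        # Like a human, the system asks rather than failing silently.
        return "Could you clarify what you're after?"
    return OUTCOMES[intent]()

print(fulfil("What's your cheapest home loan?"))
```

The point is the shape of the interaction: the caller never specifies which pages to visit or which buttons to press, only the outcome.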
So what could this look like in practice?
When I start thinking about generative AI powering interfaces, I start thinking of things called micro UIs and ephemeral apps. I'll go into these in a bit more detail over the next few slides, but basically it's a shift towards a more human-centric approach, where technology adapts to us rather than the other way around.
Micro UIs, what are they?
I did not coin the term.
I wish I could remember who did, because I'd give them some kudos.
But to me, a micro UI represents smaller fragments of a UI that allow us to have interactions with interfaces that feel far more like a one-on-one conversation than a transactional experience.
They move and shift and dynamically generate new content on the fly.
So what could this look like?
I want you to imagine that you're booking a holiday.
So instead of starting with the destination, you start with the mood.
You say, I'm in the mood for a beach retreat.
Immediately, it offers you coastal destinations based on your past preferences, maybe trending locations and that sort of thing.
You mention that you want to relax, but you also want to enjoy some water sports and that sort of thing.
So the UI morphs again, or parts of it morph, showing you beach resorts that are known for tranquility but also maybe have some action activities on the side.
You're vegan.
The website that you're visiting remembers this about you and starts highlighting restaurants in the area that are rated well for vegan options.
Maybe it even starts pushing hotels higher up the list that have superior vegan dining options.
And recollecting your last trip, you say, I loved that spa I went to in Bali, and without missing a beat, the UI morphs again, showing you resorts with similar spa experiences.
Finally, you go, yeah, while I'm there, I actually want to do something fun.
Again, the UI throws out a little wild card.
It shows you a few events that are happening while you're there, maybe a moonlit beach concert.
And the beauty of the micro UI in this scenario is its ability to weave this tapestry of different threads and evolve as the conversation continues, making your interaction feel far less like a transaction and far more like a conversation with a trusted friend.
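One way to picture this morphing behaviour in code is the interface as a piece of state that each conversational turn patches in place, rather than a page that reloads. This is a toy sketch under my own assumptions; the intents and UI fragments are hypothetical.

```python
# Toy sketch of a micro UI: the interface is a piece of state that each
# conversational turn patches in place, rather than a page that reloads.
# All intents and UI fragments here are hypothetical.

def new_ui():
    return {"destinations": [], "filters": [], "highlights": []}

def apply_turn(ui, intent, payload):
    """Morph only the fragments the new intent touches."""
    if intent == "set_mood":               # "I'm in the mood for a beach retreat"
        ui["destinations"] = payload       # coastal suggestions appear
    elif intent == "add_preference":       # "I'm vegan"
        ui["filters"].append(payload)      # later fragments respect it
    elif intent == "recall":               # "that spa I went to in Bali"
        ui["highlights"].append(payload)   # similar options surface
    return ui

ui = new_ui()
ui = apply_turn(ui, "set_mood", ["Bali", "Fiji", "Gold Coast"])
ui = apply_turn(ui, "add_preference", "vegan")
ui = apply_turn(ui, "recall", "spa resorts")
# Between turns the rest of the UI is untouched -- only fragments morph.
```

In a real system the payloads would come from recommendation services and the fragments would be rendered components, but the conversational-patch shape is the same.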
Now, ephemeral apps.
So if we're looking at parts of the UI being able to morph and change as they go, then what is an ephemeral app?
The first thing I want you to think about is the fact that the average person has 80 apps on their phone, yet they only use 9 of them regularly.
So if we look at this kind of potential paradigm shift, we've got to start looking at the applications themselves.
And the way that I like to think of an ephemeral application is as a transient application that is created and used for a specific task, then disappears.
What could this look like?
In this example, I'm going to talk you through, doing an event for a book club.
But this isn't your usual setup for a book club.
This is like a book club on steroids.
So first, rather than starting with a specific book, you go, I'd like to delve into the mysteries of the ancient world.
And the app suggests top novels about ancient civilizations, offering a blend of historical facts and thrilling plots.
And at this stage, the app could be orchestrating multiple microservices, like recommendation algorithms, maybe calling APIs for where you could get the books or book reviews, while also creating a graphical UI on the fly.
And perhaps you chose the Iliad, and you're thinking about decor and you go, I want it to feel like we're in ancient Greece when we're doing this book club.
The app immediately proposes decor, reminiscent of the era, maybe you get some playlists, and even suggests some costume ideas for your attendees.
Here we could see large language models helping you come up with ideas, image generation for mood boards, and APIs where a customer could maybe purchase these sorts of things.
Now you're looking at food for the gathering and you say some of my friends are gluten free.
And the app curates a list of gluten-free recipes and options that match the theme of the event you're hosting, and maybe even highlights local restaurants and catering options in the area.
Here there's, again, lots of microservices being orchestrated.
We could have integrations with databases for recipes.
We could have integrations with your Uber Eats, your DoorDashes, and those sorts of things to find all the potential catering options.
And to help drive discussion, you go, I want to discuss the book with maybe a related game or something like that.
And the app then goes about designing a set of trivia questions and games that you could play with the people attending the event.
Finally, after wrapping up, you've taken a heap of photos.
It's been a really great event.
You send the app some photos and ask it to send thank-you notes to everyone that attended.
And once done, the app just fades out of existence.
And there's no digital clutter.
Here you could see the app using image processing, gallery creation, email integrations, and automating the sending of notes using LLMs and all that sort of stuff.
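The lifecycle just described can be sketched as an object that assembles the services one task needs, runs them, and then tears everything down. All service names here are hypothetical placeholders for the recommendation, image-processing, and email integrations mentioned above.

```python
# Toy sketch of an ephemeral app lifecycle: assembled for one task,
# it orchestrates a few services, then removes itself, leaving no
# digital clutter. Every name here is a hypothetical placeholder.

class EphemeralApp:
    def __init__(self, task):
        self.task = task
        self.services = []      # e.g. recommendations, image gen, email
        self.alive = True

    def attach(self, name, fn):
        """Wire in a service (in reality, an API or microservice call)."""
        self.services.append((name, fn))

    def run(self):
        """Run every attached service against the task."""
        return {name: fn(self.task) for name, fn in self.services}

    def dissolve(self):
        """After the event: release everything the app assembled."""
        self.services.clear()
        self.alive = False

app = EphemeralApp("ancient-world book club")
app.attach("books", lambda t: ["The Iliad", "Circe"])
app.attach("thank_you_notes", lambda t: "sent to all attendees")
outputs = app.run()
app.dissolve()
# outputs keeps the task's results; the app itself is gone.
```

The interesting design choice is that the app, not the user, owns the wiring: it exists only as long as the task does.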
Now, a lot of this does sound like the stuff of science fiction.
And to be honest, we're not there yet, and in some ways it probably is.
However, we are starting to see this sort of thing come to fruition.
There are a few pretty interesting gen AI examples, and there's supposed to be a GIF playing at the moment, but whatever.
There's Google Duet for AppSheet, which actually allows you to create little apps from single prompts.
An example that I've seen is you could say I want to create an app to track my team's travel expenses.
And you go through a conversation and it builds an app for you.
Dora AI is allowing you to actually build entire websites with a single prompt.
How good they are, I'm not quite sure.
But the one I found really interesting is Shopify's Sidekick Assistant, which actually allows you to query data on your website.
Things like sales, and it helps you optimize your web store.
On the fly, with prompts.
Generative AI can do this because it can learn from us, and because it can actually code.
It can start to drive these innovative, user-centric experiences at scale.
And I honestly believe this will end up being an absolute boon for creativity, allowing us to create these highly personalized experiences that are based on customer behavior, but unique customer behavior.
It's not going to be like having customer segments, where we think all these people fit into these buckets; you'll be able to create experiences that are just for that customer.
And there is a massive but: we always have to remember that generative AI is a tool, and it does not replace human insight.
I will repeat that again.
It is a tool and it doesn't replace human insight.
And it's the UX practitioners that will need to define the guidelines, establish the best practices and ensure that the technology aligns with human needs and values.
So what is UX actually going to need to do?
They're going to need to ask questions like: how much information should actually be shown when we morph our micro UIs or use our ephemeral apps?
How do we prioritize content?
Based on everything that we know about the customer, what's the best way to feed this to the customer?
And how do we ultimately make this experience enjoyable?
What this partnership is really about is a partnership between humans and machines and allowing each other to do what each other does best.
AI brings efficiency and scalability, while UX designers bring understanding, empathy, and creativity.
And in this new paradigm, what's going to be happening is that human expertise guides AI, ensuring that our experiences are still human centered, inclusive, and ethical.
And making sure that ultimately it's a collaboration that serves us, is molded by us, and is done by those who understand us best, which is UX designers, and people in general.
And ultimately it's going to take an army of people to build these sorts of assistants of tomorrow, with so many people doing so many different cross-functional tasks, making it all work together.
So the types of skills that we're going to need: UX practitioners, UI designers, conversation designers, researchers, linguists, developers, product owners and more.
In terms of the kinds of things I can imagine all these people doing, UX practitioners will be researching how the customer is using these very personalized UI components.
UI designers will be setting up best practices for how everything should look when a UI does create something new.
Conversation designers will be coming in and working out the best turn structure, to work out when we should clarify things with a customer and that sort of thing.
Researchers will be doing research.
Linguists will be helping with the natural language processing.
Developers will still need to write code and all that sort of stuff.
And hopefully product owners have a vision and can really push that vision forward.
So ultimately, and this is the third time I'm saying this, conversational AI isn't just about chatbots.
It's about recreating the richness, the nuance, and the intuitive understanding that marks human communication.
And together UX practitioners, conversation designers, product owners, everyone needs to come together to start thinking about how this new frontier is going to look.
And making sure that we can build something together that really aligns with what we want as humans.
Making sure that it's intuitive, responsive, and ultimately more human.
Thanks.