The Internet of Things has been a trend since its emergence in the early 2010s.
Everything from cars to lights to houses to refrigerators to even toasters has now been connected to the internet. On top of that, companies such as Amazon have come up with products like the Echo, a voice-enabled wireless speaker that is your home automation hub as much as it is your personal assistant. It’s represented by its voice-only persona, Alexa, whom you can teach new skills to assist you with all sorts of tasks.

But, despite all of the progress, the IoT Revolution presents a unique question: how do we design for an IoT product like Alexa that has no user interface? The Capital One team needed to tackle that exact challenge when they designed an Alexa skill to allow users to manage finances just through talking with the Echo device.
In this interview with Stephanie Hay (Head of Content Strategy at Capital One), we explore the unique challenge of designing UX without a traditional UI.

What was your role on the Capital One Alexa team?
At Capital One, I work with more than 200 folks. I was the lead designer on the initial release of our Alexa skill.
We have a human-centered culture that enables everybody on the design team, including content strategy, design thinking, and user research, which are separate specialized practices within the larger design organization, to do great integrated work across a company the size of ours.
I joined two years ago for that reason. Scott Zimmer, the head of global design for Capital One, is my boss. In the past, there was no content strategy team on the design team and he said, “We’ve got this opportunity to infuse humanity and clarity into the experiences that we’re designing to create really personalized, tailored conversation, thanks to all the great data that we have on customers and how they’re interacting with their money. So let’s do it.”
True to form, the Alexa team is exactly as I just described. A beautiful, cross-functional team of product managers, engineers, and designers who absolutely have to understand the context in which someone who is probably walking around their kitchen or their living room wants to talk about their money when they’re reading the Sunday paper or have just finished asking Alexa to stop playing NPR.
It’s an entirely new context and it’s a brave new frontier for our cross-functional team to explore.
How does the Capital One Alexa skill integrate into people’s lives?
Money is already integrated in people’s lives.
You’re thinking about money when you’re buying groceries, when you’re cooking dinner, when you’re watching TV, and in so many other instances. It sits in the background of people’s lives all the time. Alexa, smart homes, virtual systems, and AI all have the promise of being backgrounded in that same way to create experiences that naturally integrate into your life, your daily behavior, and the triggers that influence it, like watching TV or cooking dinner.
For us, wherever Alexa goes, Capital One could go. We wanted to be easily accessible to users, so they can talk about their money whenever needed.
How did you conduct user research for the project?
We run regular user interviews as part of the design process. We talk to customers every day.
To better understand the natural flow of human conversation, we also dived into raw transcripts of call center research and studied Google search keywords. We learned about people’s transactional needs and their emotional context when talking about money.
What interesting insights did you uncover about human behavior?
People use different language to end a conversation, and we have to design for that.
We can suggest terms, like “all set,” but ultimately it’s up to us to understand when the customer wants to continue and when they want to be done. That’s a pure language design challenge.
Can you describe the process of designing a conversation when no UI is available?
This project is the purest design challenge I’ve ever taken on.
All fidelity is removed. It’s only emotion and language.
One use case was hearing your balances. So what if you’re a credit card-only customer? What if you’re a credit card, plus checking account, plus savings account customer? What does the word “balance” mean in both of those cases? Balance, when it comes to a credit card, is what you owe. On the other hand, the balance in a checking account is what you have, like available cash. Even that word alone needs to change, and you start getting into this and realize that you have no UI to delineate that.
The challenge kept getting deeper and deeper.
In Amazon’s world, “utterances” form the language for what customers say to Alexa. We made that language more flexible by building word clouds around people’s intentions. We had to ask: did they mean “how much money do I have” or “how much money do I have to spend”? We started designing use cases instead of only conversation.
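To make the idea of grouping utterances around an intention concrete, here is a minimal Python sketch. It is only an illustration: the intent names, the phrase lists, and the `match_intent` function are hypothetical, and it does not use Amazon's actual Alexa tooling or reflect how the Capital One team built the skill. It simply shows why "how much money do I have" and "how much money do I have to spend" need to resolve to different intentions.

```python
# Hypothetical "word clouds": each intent is described by several phrasings.
INTENT_PHRASES = {
    "GetBalance": [
        "how much money do i have",
        "what is my balance",
    ],
    "GetAvailableToSpend": [
        "how much money do i have to spend",
        "how much can i spend",
    ],
}


def match_intent(utterance):
    """Return the intent whose longest phrase appears in the utterance."""
    text = utterance.lower().strip(" ?.!")
    best = None  # (phrase length, intent name)
    for intent, phrases in INTENT_PHRASES.items():
        for phrase in phrases:
            # Prefer the longest matching phrase so that the more specific
            # "... to spend" question wins over the shorter balance question.
            if phrase in text and (best is None or len(phrase) > best[0]):
                best = (len(phrase), intent)
    return best[1] if best else None


print(match_intent("How much money do I have?"))
print(match_intent("How much money do I have to spend?"))
```

Preferring the longest match is one simple way to keep the more specific question from being swallowed by the shorter one; a production system would rely on a trained language model rather than substring checks.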
One of our core principles was the establishment of assumptions. Let’s design to answer their question first, and then ask clarifying questions next.
Next, we needed to get as specific as possible with the persona, based on as much real data as we could. For example, Alexa can answer your question better if we know you’re just a credit card customer or a savings account customer.
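The two principles above (answer first, then clarify, and tailor the wording to the accounts a customer actually holds) can be sketched in a few lines of Python. This is an assumption-laden illustration, not Capital One's implementation: the `Account` type, the `describe_balances` function, and the exact phrasing are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    kind: str      # hypothetical values: "credit_card", "checking", "savings"
    amount: float  # amount owed for a credit card, amount available otherwise


def describe_balances(accounts):
    """Answer the balance question first, wording it per account type."""
    parts = []
    for acct in accounts:
        if acct.kind == "credit_card":
            # For a credit card, "balance" means what the customer owes.
            parts.append(f"you owe ${acct.amount:,.2f} on your credit card")
        else:
            # For deposit accounts, "balance" means what the customer has.
            parts.append(f"you have ${acct.amount:,.2f} available in {acct.kind}")
    if not parts:
        return "I couldn't find any accounts."
    answer = "Right now, " + " and ".join(parts) + "."
    # Only ask a clarifying question after the answer, and only if it helps.
    if len(accounts) > 1:
        answer += " Would you like details on one of them?"
    return answer


print(describe_balances([Account("credit_card", 250.0)]))
print(describe_balances([Account("credit_card", 250.0), Account("checking", 1200.5)]))
```

A credit-card-only customer gets a single direct sentence, while a customer with several accounts hears the full answer before any follow-up question.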
We also needed to pay close attention to Alexa’s inflection. What looked great on paper when we designed it wouldn’t always sound right because of the cadence of her statements and pauses. Engineering, product marketing, legal, and design were all on the phone together every day creating content for these conversations. We tested the conversations in the labs, listening to the way they sounded on Alexa, and then ultimately released them.
What were some of the most important iterations you completed after user testing?
When people first paired their accounts, Amazon didn’t have voice recognition, so Alexa didn’t know if she was talking to you or me. She’s just sitting on somebody’s counter in an open environment. So we asked: should we actually build in an extra layer of security that would act on top of the username and password?
With Amazon, the answer was, “Yes. Let’s go ahead and build that in.”
We also invented what’s called a personal key. It’s a four-digit passcode that you can create to further protect your financial information. If you make that personal key, then Alexa’s going to prompt you for it every time you try to talk to her about money.
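As a rough sketch of how an optional four-digit key could gate money conversations, consider the Python below. Everything here is hypothetical: the `PersonalKey` class, the `handle_money_request` function, the prompts, and the choice to store a salted hash are assumptions made for illustration, not a description of how Capital One or Amazon implemented the feature.

```python
import hashlib
import hmac
import os


class PersonalKey:
    """Stores a salted hash of a four-digit key rather than the key itself."""

    def __init__(self, key):
        if not (len(key) == 4 and key.isascii() and key.isdigit()):
            raise ValueError("personal key must be exactly four digits")
        self._salt = os.urandom(16)
        self._digest = self._hash(key)

    def _hash(self, key):
        return hashlib.pbkdf2_hmac("sha256", key.encode(), self._salt, 100_000)

    def verify(self, spoken_key):
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(self._hash(spoken_key), self._digest)


def handle_money_request(personal_key, spoken_key, answer):
    """Return the answer only if no key is set or the spoken key matches."""
    if personal_key is None:
        return answer  # the customer never created a personal key
    if spoken_key is None:
        return "What's your personal key?"
    if personal_key.verify(spoken_key):
        return answer
    return "Sorry, that key doesn't match."


key = PersonalKey("4821")
print(handle_money_request(key, None, "You owe $250.00."))
print(handle_money_request(key, "4821", "You owe $250.00."))
```

Because the key is optional, customers who never create one go straight to the answer, while everyone else is prompted on every money request.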
That’s really where having a cross-functional team comes into play: to accommodate user context and data. That’s the only way to design well for a conversational UI. Our first release is in market now and we’re learning from it, so there’s only more to come.
Where do you see IoT and AI headed in the future? What’s most exciting for you as a design leader?
Being untethered is most exciting. It means we’re available 24/7 – you just live your life knowing we’ve got your back.
That’s what immersive design looks like. We shift the paradigm and go wherever you are, and we’re there in such a frictionless and adaptive way, you can’t believe you ever existed without it.