Voice-based interaction has become a major disruptor in the world of human and machine interaction. Gone are the days of the “command and control” Voice User Interface (VUI); the focus has shifted to voice-based conversational UIs. Devices like Amazon Alexa, Google Home, the Apple HomePod and others are not just redefining how we live our lives, but are fundamentally changing how end users experience and engage with technology.


What everybody is aiming for is a conversational platform between device and human. Voice is everywhere, and interacting with our voices is far more natural and intuitive than typing or tapping on a keyboard. Speech conveys more meaning than text: we can hear inflections in how people interact with various systems. Those same nuances that make us human, however, pose a challenge when designing a VUI that can seamlessly communicate with users across devices.


VUI enables users to communicate and interact with voice-enabled systems through speech-recognition technology, hands-free. When deciding between a VUI and a Graphical User Interface (GUI), ask a simple question: is it more natural for users to respond with speech, or through input that only a machine will recognize? Conventional UIs force users to translate their intent into machine-friendly actions, so it helps to minimize how heavily new products and services rely on them. That doesn’t mean a VUI should completely replace a GUI; in certain situations, combining text input with conversation creates the best user experience. For example, graphical interfaces can be quite efficient for filling in a lengthy form or comparing multiple products. But asking a device to order ice cream or book a taxi to a destination is much easier with a VUI.


Why Voice UI
There are several reasons designers might choose voice UI over graphical UI:

  • Talking is faster than typing
  • Voice is omnidirectional
  • The user experience remains consistent in a VUI; only the device or medium of interaction changes
  • Interacting with technology can seem more human


Designing a voice-based conversational UI is not about mimicking how humans interact with each other. It is about applying human-to-computer interaction design principles within the technical constraints we face in our day-to-day lives. In short, it leverages how we talk to each other to build a voice-based interface that makes communicating with machines more intuitive for users.


While designing a VUI, the aim should be to make the user experience simple, engaging and reliable. For an engaging and delightful user experience, the VUI designer must consider several aspects:

  • When and where the interaction will take place
  • The level of information being exchanged between the user and the system
  • The flow of the dialogue between the user and the system, and how to handle it
  • How to make the conversation strategy as natural as possible by avoiding anything superficial


To build an engaging VUI, here are four tips that can help designers combine human-centred design techniques with leading-edge technologies regardless of development platform.

  1. Identify Your Users
    Identifying the end-user persona is key to designing an effective VUI. A persona is a representation of your end user base. Each persona represents a specific type of user who will interact with the system under various scenarios based on their behavioural patterns. Each persona will expect an easy solution to their queries, which will vary from user to user depending on the context and situation. Identifying these users and creating specific personas will make the rest of the VUI design process much easier to manage.
  2. Build a Sample Dialogue or an Experience Flow
    A sample dialogue helps anticipate how users will interact with the voice-enabled device. The flow should include what users are going to say and how the system will answer under various scenarios, such as a happy conversational path or a complicated one. For example, let’s assume a user is ordering ice cream. The sample dialogue should cover the various scenarios a user goes through while interacting with the device, from the user initiating the ice cream-ordering process to the device confirming via voice prompt that the order is complete.

    For example, if a user says “Order a banana peanut butter chip ice cream,” the system should reply, “Ok, ordering banana peanut butter chip ice cream.” Building a sample dialogue helps visualize how the user and system will interact, and minimizes the user’s cognitive load during the exchange.

  3. Choose a Prototyping Tool

    Choosing the right tool to prototype a VUI is important. Various tools exist for building interactive wireframes or high-fidelity interactive interfaces for non-voice-based applications; similarly, it’s important to choose an appropriate tool to prototype the VUI based on the sample dialogue or experience flow. Many tools are available, such as Adobe Sayspring, Tortu and Botsociety, and it’s important to pick the one that meets the project’s requirements.
  4. Error-Handling Dialogues

    While designing a VUI, make sure to have a solid error-handling strategy in place. For a great end-user experience, that strategy should be based on context-specific scenarios, be transparent, and guide the user accordingly. Building trust is critical: users should always feel that they are in control. Think of errors as opportunities to create meaningful conversations between the system and the user. Playing a related sound (an earcon) before the voice-based error message can also make the error handling more meaningful.
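To make the sample dialogue from tip 2 concrete, here is a minimal sketch of the ice cream happy path plus a fallback reply. The matching logic and phrasings are purely illustrative assumptions, not any voice platform’s actual API:

```python
# Illustrative happy-path dialogue for the ice cream example (tip 2).
# The matching logic and phrasings are hypothetical.

def respond(user_utterance: str) -> str:
    """Return the system's reply for a recognized utterance."""
    utterance = user_utterance.lower().rstrip(".")
    if utterance.startswith("order a "):
        # Echo back what was understood so the user can confirm it.
        item = user_utterance.rstrip(".")[len("Order a "):]
        return f"Ok, ordering {item}."
    if "cancel" in utterance:
        return "Your order has been cancelled."
    # A fallback keeps the conversation going instead of failing silently.
    return "Sorry, I didn't catch that. What would you like to order?"

print(respond("Order a banana peanut butter chip ice cream"))
# -> Ok, ordering banana peanut butter chip ice cream.
```

Real systems map utterances to intents with a natural-language model rather than string matching, but the shape of the dialogue (user request, system confirmation, fallback) is the same.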
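One common pattern behind such an error-handling strategy is escalating reprompts: each failed recognition attempt gets a more helpful prompt, and the system eventually exits gracefully instead of looping forever. A minimal sketch, with hypothetical prompts:

```python
# Escalating reprompts for error handling (tip 4). All prompt text
# here is hypothetical and would be tailored to the actual context.

REPROMPTS = [
    "Sorry, which flavour would you like?",
    "I still didn't catch that. You can say, for example, 'vanilla' or 'chocolate'.",
    "Let's try again later. Goodbye!",
]

def error_prompt(failed_attempts: int) -> str:
    """Pick a reprompt based on how many times recognition has failed."""
    # Clamp to the last (graceful-exit) prompt after repeated failures.
    index = min(failed_attempts, len(REPROMPTS)) - 1
    return REPROMPTS[max(index, 0)]
```

The second prompt offers concrete examples, which reflects the advice above: the error message guides the user rather than merely reporting a failure.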


Several other areas deserve consideration when designing a VUI. For instance, it’s important to provide contextual help when needed and to allow users to respond, without overwhelming them with too many interaction choices. Likewise, effective VUIs should provide concise information and ask users whether they want to hear more before offering additional options. A mechanism for remembering a user’s previous interactions in certain contexts can also be helpful, as can prompting for a verbal response when the system has waited too long for input. Finally, just like real-life verbal reminders, it is useful for the system to provide feedback on progress for a task that requires completing multiple steps.
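The last two suggestions, remembering earlier interactions and reporting progress on a multi-step task, can be sketched with simple session state. The class, step names and prompt wording below are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical session state for a multi-step voice task, so the
# system can report progress ("Step 2 of 3...") as suggested above.

class OrderSession:
    STEPS = ["choose flavour", "choose size", "confirm payment"]

    def __init__(self) -> None:
        self.completed = 0  # how many steps the user has finished

    def complete_step(self) -> None:
        if self.completed < len(self.STEPS):
            self.completed += 1

    def progress_prompt(self) -> str:
        """Tell the user where they are in the task."""
        if self.completed == len(self.STEPS):
            return "All done! Your order is placed."
        next_step = self.STEPS[self.completed]
        return f"Step {self.completed + 1} of {len(self.STEPS)}: {next_step}."
```

Because the session object persists between turns, the same state can also be used to recall what the user has already said, so the VUI never asks for the same information twice.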


The relationship between voice and screen has added a new dimension to VUI design. In spite of the high profiles of certain branded devices, voice-based UI still has a long way to go in terms of efficiency. The evolution of VUI can be considered the beginning of a new kind of service design, and one with enormous possibilities. Aside from natural-language issues, identifying the environments in which a VUI will be used remains a key challenge, yet it also presents unique opportunities. Voice technology supported by a well-designed VUI will only become more productive as voice-enabled devices continue to multiply.

Anindya Sengupta



Anindya is a strategy-focused user-experience professional who uses a human-centred design approach to solve business problems. He encourages participatory and iterative design techniques that help his clients to determine the future direction of their products.
