A new kind of museum for a new kind of media: Voice.

The Audio Museum of Art is a hands-free, eyes-free museum toured with your voice.

Challenge

The voice web was a brave new media channel: an oft-wobbly technology scrambling to walk on spring legs. But it was also a landscape full of fresh opportunities to define an emerging medium and explore formats of information-sharing that were previously unavailable.

Discovery

Given the industry’s newly emerging status, the project challenge was to select the right voice platform and product category, then create an engaging experience that would stand out from current offerings.

To select the right voice platform, the first critical criterion was a large installed user base.

While the voice industry encompasses both smart speakers and mobile-based assistants, it was smart speakers that were capturing the excitement in the industry. And in the smart speaker category, Amazon’s Echo devices held an installed user base lead of 3x over its nearest competitor, the Google Home.

With emerging markets it’s often difficult to validate future potential. This is especially true in the controlled, walled-garden environment of the voice industry, where each platform operates independently. However, new product development activity is a good indicator of each platform’s potential, including essential elements like technology infrastructure and developer support. A continuing growth rate of new skills indicated brands and developers were impressed with the platform and willing to invest their time and resources toward it.

Solution

With a foundation of research, the next step was determining a specific approach. To impart a sense of artistry and creativity, the concept became an audio museum toured by voice, an idea that projected an artistic image and charted new territory in the voice space.

App development and prototyping took place in Voiceflow, where user experience flows and content examples could be worked out through an Agile approach.

With the concept and structure in place, there were several additional considerations.

A key decision was whether to use synthetic or recorded speech. To demonstrate that brand personas can be projected through synthetic speech, the project used text-to-speech (TTS) technology.

A synthetic voice casting session ensued, with a review of all available voices in Polly, Amazon’s TTS tool. A foreign-accented voice was initially considered as a way of adding a sense of distinction. However, it was ruled out because too many of its pronunciations were difficult to understand.

For the MC, the Kendra voice in Polly was selected for its proximity to the brand persona: a doyenne of the art world. With SSML, the pacing was slowed to give an introspective, thoughtful sensibility to the character.

In contrast to the seriousness of the MC, a humorous male docent voice guided users through the first gallery. For this, the Matthew voice was selected.

To attain the right voice characteristics for the specific personas, Speech Synthesis Markup Language (SSML) was used extensively.

Using SSML to achieve a particular distinction for Amazon Polly TTS voices
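As a rough illustration of the persona work described above, the sketch below builds SSML that selects a Polly voice and slows its pacing. The `voice` and `prosody` tags are standard Alexa SSML, but the helper name `persona_ssml`, the sample lines, and the specific rate values are assumptions for illustration; the actual markup used in the skill is not shown in this case study.

```python
from xml.sax.saxutils import escape


def persona_ssml(text: str, voice: str, rate: str = "medium") -> str:
    """Wrap text in SSML that picks a Polly voice and sets its speaking rate.

    Hypothetical helper: the real skill's markup may differ.
    """
    body = escape(text)  # escape &, <, > so the SSML stays well-formed
    return (
        "<speak>"
        f'<voice name="{voice}">'
        f'<prosody rate="{rate}">{body}</prosody>'
        "</voice>"
        "</speak>"
    )


# Assumed values: "slow" stands in for the slowed, introspective MC pacing.
curator = persona_ssml(
    "Welcome to the Audio Museum of Art.", voice="Kendra", rate="slow"
)

# The docent keeps the default rate for a lighter, conversational feel.
docent = persona_ssml(
    "Right this way, and mind the sculptures & the velvet rope.",
    voice="Matthew",
)

print(curator)
print(docent)
```

Keeping the voice and rate as parameters means each persona is defined in one place, so pacing can be tuned by ear without touching the script text.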

Lastly, sound effects, uncommon in voice apps, were used to establish a feeling of atmosphere and place.
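One way such atmosphere could be layered in is with the SSML `audio` tag, played ahead of the narration. This is only a sketch: the helper name `scene_ssml`, the example.com URL, the pause length, and the narration line are all placeholders, not details from the project.

```python
from xml.sax.saxutils import escape, quoteattr


def scene_ssml(ambience_url: str, narration: str, pause: str = "1s") -> str:
    """Play an ambience clip, pause briefly, then speak the narration.

    Hypothetical helper: the clip URL and pause length are assumptions.
    """
    return (
        "<speak>"
        f"<audio src={quoteattr(ambience_url)}/>"  # quoteattr adds the quotes
        f'<break time="{pause}"/>'
        f"{escape(narration)}"
        "</speak>"
    )


# Placeholder URL; a real skill would point at its own hosted audio file.
gallery_intro = scene_ssml(
    "https://example.com/audio/gallery-ambience.mp3",
    "You are now entering the first gallery.",
)

print(gallery_intro)
```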

The Audio Museum of Art. A museum toured with your voice.

“Alexa, open Audio Museum of Art”
