Official Blog
Insights from Googlers into our products, technology, and the Google culture
Speech technology at Google: teaching machines to talk and listen
February 22, 2011
This is the latest post in our series profiling entrepreneurial Googlers working on products across the company and around the world. Here, you’ll get a behind-the-scenes look at how one Googler built an entire R&D team around voice technology that has gone on to power products like YouTube transcriptions and Voice Search. - Ed.
When I first interviewed at Google during the summer of 2004, mobile was just making its way onto the company’s radar. My passion was speech technology, the field in which I’d already worked for 20 years. After 10 years of speech research at SRI, followed by 10 years helping build Nuance Communications, the company I co-founded in 1994, I was ready for a new challenge. I felt that mobile was an area ripe for innovation, with a need for speech technology, and destined to be a key platform for delivery of services.
During my interview, I shared my desire to pursue the mobile space and mentioned that if Google didn’t have any big plans for mobile, then I probably wouldn’t be a good fit for the company. Well, I got the job, and I started soon after, without a team or even a defined role. In classic Google fashion, I was encouraged to explore the company, learn about what various teams were working on and figure out what was needed.
After a few months, I presented an idea to senior management to build a telephone-based spoken interface to local search. Although there was a diversity of opinion at the meeting about what applications made the most sense for Google, all agreed that I should start to build a team focused on speech technology. With help from a couple of Google colleagues who also had speech backgrounds, I began recruiting, and within a few months people were busily building our own speech recognition system.
Six years later, I’m excited by how far we’ve come and, in turn, how our long-term goals have expanded. When I started, I had to sell other teams on the value of speech technology to Google's mission. Now, I’m constantly approached by other teams with ideas and needs for speech. The biggest challenge is scaling our effort to meet the opportunities. We've advanced from GOOG-411, our first speech-driven service, to Voice Search, Voice Input, Voice Actions, a Voice API for Android developers, automatic captioning of YouTube videos, automatic transcription of voicemail for Google Voice and speech-to-speech translation, amongst others. In the past year alone, we’ve ported our technology to more than 20 languages.
Speech technology requires an enormous amount of data to feed our statistical models and lots of computing power to train our systems—and Google is the ideal place to pursue such technical approaches. With large amounts of data, computing power and an infrastructure focused on supporting large-scale services, we’re encouraged to launch quickly and iterate based on real-time feedback.
I’ve been exploring speech technology for nearly three decades, yet I see huge potential for further innovation. We envision a comprehensive interface for voice and text communication that defies all barriers of modality and language and makes information truly universally accessible. And it’s here at Google that I think we have the best chance to make this future a reality.
Update 9:39 PM: Changed title of post to clarify that speech technology is not only used on mobile phones but also for transcription tasks like YouTube captioning and voicemail transcription. -Ed.
Posted by Mike Cohen, Manager, Speech Technology