Forbes recently published an article offering insight into the turmoil inside Facebook surrounding Portal, the company’s newly announced in-home camera device.
In an article titled “On Mute: How Facebook Floundered On Voice Technology,” Forbes writer Parmy Olson outlines how Facebook’s newest product, which is designed to take on home assistants such as Amazon Alexa and Google Home, launched without a key element: a digital voice assistant of its own. Instead, the device comes integrated with Amazon Alexa, essentially making it a Facebook-produced, Amazon-controlled in-home webcam.
Facebook’s Portal looked like a slick alternative to the Amazon Echo speaker when it launched it earlier this month, but problems abounded behind the scenes. Facebook had already delayed the video-calling device due to privacy concerns around the Cambridge Analytica scandal. And when it finally did launch, there was a glaring omission: no voice assistant from Facebook. Instead it came with Alexa, meaning anyone who bought the 15.6-inch version for $350 got an awkward gateway to Amazon, whose competing Echo Show cost at least $100 less. It also meant Facebook was blocked from collecting any speech data to train its voice technology further.
Olson notes that the missing voice assistant isn’t for lack of trying on Facebook’s part. The company has been investing heavily in voice recognition technology since 2013:
Facebook started investing heavily in voice tech from 2013. Yet despite that early start, being one of the world’s biggest technology companies with 30,275 employees and booking nearly $16 billion in 2017 profits, the company has yet to plant a stake in technology that lets you talk to computers, widely-regarded as the next wave of human-to-machine interfaces.
The omission points to Facebook’s broader difficulties in turning innovative technology into products. Among its previous misfires: Android launcher Home, which shut down in April 2013; virtual currency Credits (closed in Sept. 2013); Snapchat competitors Poke and Slingshot (2014 and 2015) and mobile development platform Parse (2017). In the field of voice, Facebook bought multiple speech-based companies and hired experts specializing in voice technology over the last five years, but it has struggled to turn those investments into useful services, two senior sources who worked at the company told Forbes, largely due to chaotic product priorities and confusion over where researchers should focus their time. “After five years, to still not have a product is shameful,” said one.
Olson attributes the problems in part to internal conflict at Facebook over the direction its voice projects should take:
Facebook has in fact worked on voice technology, but its efforts have suffered from confusing directions between product managers and voice engineers, as well as pressure to move more quickly than the development of voice-recognition technology allows. Product managers often wanted voice-based research to turn into products “within half a year,” said another senior engineering source, who asked not to be named due to non-disclosure agreements. Group-based product reviews held every six months would typically spur a change of direction, from voice-based search, to news transcription, to a voice-assistant for Messenger — all internal projects that never turned into products.
The problem was that building voice technology takes much longer than half a year due to its sheer complexity. Voice data is constantly changing. There are different types of microphones, varying accents and different processing hardware between phones. To build software that recognizes speech, you also need to train it on a database of voices first, then put it out in the wild and train it some more on real voices.
Read the full article in Forbes here.