Optical sensor measures speaker’s skin vibrations and transforms them into audio signal
Israeli startup VocalZoom has developed a sensor that it says allows “near perfect” voice control performance, even when there is a lot of background noise, such as in a car with an open window.
“VocalZoom’s sensor can hear your voice inside your mouth, so it is not contaminated by background noises,” VocalZoom CEO Tal Bakish said in a phone interview. This is done by measuring the vibrations on the skin of the person who is talking, he said.
“The challenge was to make something accurate and low cost,” he said.
The aim is to improve on existing voice recognition technologies and voice biometric systems, in which people are recognized by their voice. The voice recognition systems in use today, such as those from Google and Apple’s Siri, “are not good enough, they are used by geeks but not anyone else,” he said, because these systems do not always interpret voice commands correctly.
VocalZoom’s technology is an optical sensor that is immune to acoustic noise. It measures the vibrations that are created only when a person is speaking — from the throat and face of the speaker — and converts the data into an audio signal.
VocalZoom has been working with Honda Xcelerator to apply its technology to the in-car experience. The two companies presented the results of their collaboration last week at the CES 2017 consumer electronics trade show in Las Vegas.
The Honda Xcelerator is an innovation program designed to promote collaborations between early-stage technology startups and global Honda.
“The technology can be used anywhere and in any gadget,” Bakish said, “but the automotive industry is our primary focus,” because the industry has a high demand for technology that allows accurate and clear communication through voice commands, making the experience more convenient and safer for drivers.
VocalZoom’s optical sensor can be installed in a car’s rear-view mirror, dashboard, seats or ceiling, where it picks up tiny vibrations in the driver’s facial skin while the driver issues voice commands. This data is measured and converted to an isolated, “near-perfect reference signal” that automotive voice control systems can understand and quickly respond to, regardless of noise levels, the company said.
Tests have shown performance improvements of at least 50 percent compared to traditional speech-recognition technology in a quiet automotive environment, and even better in noisy environments, VocalZoom said in a statement.
The Yokne’am, Israel-based company, founded in 2010 to focus on human-to-machine communication, has been working with a number of original equipment manufacturers to integrate its technology into their products. Honda is just one of the car manufacturers it is working with, Bakish said.
The first application of VocalZoom’s speech recognition technology will be in the consumer electronics market, in products like virtual reality headsets and virtual reality helmets for motorcycles, sometime during 2017 and 2018, he said. The technology is expected to be deployed in cars, including Honda models, sometime in 2019.
VocalZoom investors include 3M Ventures, OurCrowd, Motorola Solutions and Japan’s Fuetrek Co.