GRENOBLE, France – Feb. 21, 2024 – CEA-Leti has developed a keyword-spotting system that dramatically improves accuracy in always-on, voice-activated Edge-AI systems while consuming less power and occupying a far smaller silicon footprint than current technology.
Presented in a paper at ISSCC 2024 in San Francisco, the new architecture uses time-domain signal processing based on injection-locked oscillators and is suitable for devices powered by energy harvesters, which supply voltages below 0.5V. The paper, "0.4V 988nW Time-Domain Audio Feature Extraction for Keyword Spotting Using Injection-Locked Oscillators", reports accurate speech recognition at power consumption below one microwatt.
It describes the first injection-locked, oscillator-based time-domain audio feature extraction (TD-FEx) unit to demonstrate keyword spotting at supply voltages down to 0.4V, achieving 91 percent accuracy on 10 words. In TD-FEx, information is coded not as a voltage but as the time delay between two clock signals. In addition to being well suited to advanced nodes, the approach offers a digital-like implementation at low supply voltage and better noise immunity than current systems.
Some analog-based audio feature extraction (FEx) units using multi-channel Gm-C bandpass filters can deliver 10 times the power efficiency of digital FEx units in a comparable silicon area. "However, analog FEx circuits have not demonstrated KWS with more than four keywords," the paper reports. "They also suffer from a large footprint, challenging technology migration and limited dynamic range at low supply voltage, while speech signals have inherently a high dynamic range."