Design & Reuse

Processor architecture for edge AI/ML inference workloads

Signal processing IP provider CEVA has announced its latest generation processor architecture for artificial intelligence and machine learning (AI/ML) inference workloads on edge AI and edge compute devices.

www.smart2zero.com, Jan. 11, 2022 – 

The NeuPro-M processor architecture is a self-contained heterogeneous architecture composed of multiple specialized co-processors and configurable hardware accelerators that seamlessly and simultaneously process diverse deep neural network workloads, boosting performance by 5-15x over its predecessor. NeuPro-M supports both system-on-chip (SoC) and heterogeneous SoC (HSoC) scalability to reach up to 1,200 TOPS, and offers optional robust secure boot and end-to-end data privacy.

NeuPro-M compliant processors initially include the following pre-configured cores:

  • NPM11 – single NeuPro-M engine, up to 20 TOPS at 1.25GHz
  • NPM18 – eight NeuPro-M engines, up to 160 TOPS at 1.25GHz
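The quoted throughput scales linearly with engine count. A quick sanity check of the arithmetic (the TOPS figures are the vendor's peak numbers; the engine count needed for the 1,200-TOPS ceiling is our derived estimate, not a published configuration):

```python
# Peak throughput figures quoted for NeuPro-M cores at 1.25 GHz.
NPM11_ENGINES, NPM11_TOPS = 1, 20
NPM18_ENGINES, NPM18_TOPS = 8, 160

# Per-engine throughput is consistent across the two cores.
tops_per_engine = NPM11_TOPS / NPM11_ENGINES
assert NPM18_ENGINES * tops_per_engine == NPM18_TOPS  # 8 * 20 = 160

# Reaching the quoted 1,200-TOPS ceiling via multi-core/HSoC scaling
# would take 60 such engines (a derived figure, not a vendor spec).
engines_for_ceiling = 1200 / tops_per_engine
print(engines_for_ceiling)  # 60.0
```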

A single NPM11 core, when processing a ResNet50 convolutional neural network, achieves a 5x performance increase and a 6x memory bandwidth reduction versus its predecessor, resulting in power efficiency of up to 24 TOPS per watt. NeuPro-M can process all known neural network architectures and offers integrated native support for next-generation networks such as transformers, 3D convolution, self-attention and all types of recurrent neural networks.
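The efficiency claim can be cross-checked against the throughput figure: at 24 TOPS per watt, a 20-TOPS NPM11 would imply a peak draw of under one watt (a derived estimate from the two quoted numbers, not a published power specification):

```python
# Vendor-quoted figures for a single NPM11 core.
NPM11_TOPS = 20
TOPS_PER_WATT = 24

# Implied peak power draw (derived, approximate; real draw depends
# on workload, clocks, and process node).
implied_watts = NPM11_TOPS / TOPS_PER_WATT
print(round(implied_watts, 2))  # 0.83
```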

NeuPro-M has been optimized to process more than 250 neural networks, more than 450 AI kernels and more than 50 algorithms. The embedded vector processing unit (VPU) provides future-proof, software-based support for new neural network topologies and new advances in AI workloads. In addition, the CDNN offline compression tool can increase the FPS/Watt of NeuPro-M by a factor of 5-10x on common benchmarks, with minimal impact on accuracy.
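CEVA has not published the internals of the CDNN compression tool. As a generic illustration of the kind of offline compression that trades a small accuracy loss for large efficiency gains, the sketch below applies symmetric 8-bit post-training quantization to a weight tensor (the function name and the method choice are our assumptions, not CDNN's actual pipeline):

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor 8-bit quantization (generic illustration,
    not CEVA's CDNN algorithm). Returns int8 weights plus the scale
    needed to dequantize them."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)

# Storage shrinks 4x (float32 -> int8); worst-case reconstruction
# error is bounded by half a quantization step.
error = float(np.abs(q.astype(np.float32) * scale - w).max())
print(w.nbytes, q.nbytes)
```

Reduced-precision weights shrink memory traffic as well as storage, which is why this style of offline compression translates into FPS/Watt gains on bandwidth-bound inference hardware.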
