The Raspberry Pi of AI: Korean startup wants to bring ultra-cheap, high performance NPUs to billions of devices using its own secret quantization sauce, with $1/TOPS target on the horizon


DeepX is a South Korean AI technology company that specializes in deep learning solutions across industries like autonomous systems, robotics, and healthcare. At the recent 2024 Embedded Vision Summit, DeepX presented its first-generation chips, the V1 and M1, designed for different applications, and hinted at its upcoming next-generation chip focused on AI for on-device and autonomous robot applications.

The V1 SoC (previously called L1) pairs the DeepX 5-TOPS NPU with four RISC-V CPU cores and a 12-MP image signal processor. This sub-$10 SoC is built on Samsung’s 28-nm process and runs the YOLOv7 model at 30 fps while consuming just 1-2 watts. It supports the latest CNN algorithms for computer vision and is designed for products like IP and CCTV cameras, robotic cameras, and drones.

The M1 is a larger accelerator designed to work alongside a host CPU. It reportedly achieves the highest cost efficiency (inference/$), power efficiency (TOPS/W), and performance efficiency (FPS/TOPS). It delivers 25 TOPS of AI performance while consuming 5 watts. It is suited to consumer and industrial robots, machine vision, IPC and HPC systems that require AI, smart factories, and edge computing.
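To make the efficiency metrics concrete, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. The `Chip` class and `tops_per_watt` helper are hypothetical names introduced for illustration. The results are simple ratios of peak numbers, not DeepX's own measurements, and the V1 price is treated as a ceiling because the article only says "sub-$10".

```python
from dataclasses import dataclass


@dataclass
class Chip:
    name: str
    tops: float        # peak NPU throughput quoted in the article
    watts_low: float   # lower bound of the quoted power draw
    watts_high: float  # upper bound of the quoted power draw


def tops_per_watt(chip: Chip) -> tuple[float, float]:
    """Return (worst case, best case) TOPS/W over the quoted power range."""
    return chip.tops / chip.watts_high, chip.tops / chip.watts_low


v1 = Chip("V1", tops=5, watts_low=1, watts_high=2)
m1 = Chip("M1", tops=25, watts_low=5, watts_high=5)

for chip in (v1, m1):
    worst, best = tops_per_watt(chip)
    print(f"{chip.name}: {worst:.1f} to {best:.1f} TOPS/W")

# V1 only: the article gives a price ceiling (sub-$10) and a single
# frame rate (YOLOv7 at 30 fps) for this chip, so these are rough bounds.
v1_max_dollars_per_tops = 10 / v1.tops  # price ceiling divided by peak TOPS
v1_fps_per_tops = 30 / v1.tops          # YOLOv7 frames per second per TOPS
```

On these numbers the V1 comes in under $2 per TOPS, which gives a sense of how far the $1/TOPS target in the headline still is.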

Partnering with LG

DeepX CEO Lokwon Kim told Sally Ward-Foxton from EE Times that the company is collaborating with LG to port LLMs to DeepX’s chip for use in mobile devices, cars, and white goods. “[AI in the device] really makes sense for their business model for LLMs, that’s why we’re collaborating,” Kim said. “They are providing their LLM technology so we can learn about the model’s characteristics and optimize for on-device applications.” The result will be an NPU chip optimized for running LLMs on-device, but initially, it will function solely as an accelerator. It is expected to take another 3-5 years to develop a fully LLM-capable SoC.

The next chip on DeepX’s roadmap is the V3, developed in response to feedback from Chinese and Taiwanese customers. The V3 will reportedly feature a 15-TOPS dual-core DeepX NPU with four Arm Cortex A52 CPU cores and will operate below 5 watts on average. “Previously we used a RISC-V CPU, but customers wanted to have Arm,” Kim told Ward-Foxton. “That’s why we targeted an Arm quad-core there. Customers also wanted USB 3.1, a more powerful ISP, not an upgrade on the NPU. That’s why we redesigned it.”

As EE Times explains, “Customers wanted Arm CPUs in part because the Arm ecosystem can provide better security solutions - many customers are building security camera systems. Other customers want to run the robot operating system, which is now supported on Arm, though it has not come to RISC-V yet.”

DeepX says it will continue to offer the RISC-V-based V1 alongside the Arm-based V3 (samples of which are expected by the end of 2024), promising to support both architectures well into the future.
