DeepSeek’s open-source AI models have made significant technological strides and can now run on relatively affordable hardware, such as the Raspberry Pi.
In a notable development, ITHome reports that the DeepSeek V3 and R1 models are compatible with GPUs from Moore Threads, a Chinese chipmaker. The compatibility marks a milestone for DeepSeek, for Moore Threads, and for China more broadly: it could widen the market for Moore Threads hardware and reduce DeepSeek’s, and China’s, dependence on Nvidia.
According to the reports, Moore Threads has successfully deployed the DeepSeek-R1-Distill-Qwen-7B distilled model on its MTT S80 consumer graphics card and on the MTT S4000, a higher-end datacenter-grade graphics card. The deployment used Ollama, a lightweight framework for running large language models on macOS, Linux, and Windows, coupled with a highly optimized inference engine to deliver ‘high’ performance levels.
The reports highlight ‘excellent’ and ‘high’ performance levels when describing the MTT S80 and MTT S4000’s capabilities with the DeepSeek-R1-Distill-Qwen-7B distilled model. However, specific performance metrics or comparisons with other hardware were not provided, making it challenging to validate these claims. Moreover, the limited availability of the MTT S80 outside of China further complicates verification efforts.
Ollama supports a variety of models, including Llama 3.3, DeepSeek-R1, Phi-4, Mistral, and Gemma 2, running them efficiently on local hardware with no need for cloud-based resources. Originally developed for macOS, Ollama uses Metal for GPU acceleration on Apple hardware, CUDA for Nvidia GPUs, and ROCm for AMD GPUs.
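For readers unfamiliar with the tool, Ollama’s local workflow looks roughly like the sketch below. The tag `deepseek-r1:7b` is Ollama’s published name for the Qwen-based 7B DeepSeek-R1 distill; the commands assume a local Ollama installation with its background server running.

```shell
# Download the 7B DeepSeek-R1 distill from the Ollama model library
ollama pull deepseek-r1:7b

# Run a one-off prompt against the model from the command line
ollama run deepseek-r1:7b "Summarize what a distilled language model is."

# Ollama also serves a local HTTP API (default port 11434)
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Hello",
  "stream": false
}'
```

On supported hardware, Ollama picks the acceleration backend (Metal, CUDA, or ROCm) automatically; Moore Threads’ claim is essentially that its CUDA-compatibility layer lets this same workflow run on its own GPUs.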
While Ollama does not officially support Moore Threads GPUs, Moore Threads asserts that its graphics processors can execute code written for CUDA GPUs. According to the reports, this compatibility has now been demonstrated in practice, indicating that Moore Threads GPUs can handle CUDA code and are well suited for AI tasks, particularly Chinese-language applications.
To further boost performance, Moore Threads built a proprietary inference engine with custom computational optimizations and enhanced memory management. This raises computing performance and resource efficiency and should smooth the deployment of future AI models. It is important to note, however, that these results concern a distilled model, so direct performance comparisons between Moore Threads GPUs and those from AMD, Apple, or Nvidia are not yet feasible.

Avery Carter explores the latest in tech and innovation, delivering stories that make cutting-edge advancements easy to understand. Passionate about the digital age, Avery connects global trends to everyday life.