A Pentium II with 128MB of RAM achieved a remarkable 35.9 tokens per second running a tiny 260K-parameter Llama-architecture model.
EXO Labs recently shared an intriguing blog post detailing how it got the Llama AI model running on a vintage Windows 98 system, along with a short social media video. The clip shows an old Elonex Pentium II @ 350 MHz booting into Windows 98, after which EXO launches its custom C-based inference engine, derived from Andrej Karpathy's llama2.c, and prompts the LLM to write a story about "Sleepy Joe." Impressively, it does so quite swiftly.
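For context on what such an engine does: llama2.c-family programs run a simple autoregressive loop, executing the transformer forward pass for the latest token and then sampling the next token from the resulting scores. The sketch below illustrates only that loop shape; the names are hypothetical and the forward pass is stubbed with random scores, so this is not EXO's llama98.c code:

```c
#include <stdio.h>
#include <stdlib.h>

#define VOCAB_SIZE 32  /* toy vocabulary size */

/* Stand-in for the transformer forward pass: in a llama2.c-style
 * engine this consumes one token at a position and produces a
 * score (logit) for every vocabulary entry. Stubbed with random
 * scores here so the sketch runs end to end. */
static void forward(int token, int pos, float *logits)
{
    int i;
    (void)token;
    (void)pos;
    for (i = 0; i < VOCAB_SIZE; i++) {
        logits[i] = (float)rand() / (float)RAND_MAX;
    }
}

/* Greedy argmax sampling, the simplest strategy available. */
static int sample_argmax(const float *logits)
{
    int i, best = 0;
    for (i = 1; i < VOCAB_SIZE; i++) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    return best;
}

int main(void)
{
    float logits[VOCAB_SIZE];
    int token = 1;  /* BOS-style start token */
    int pos;

    for (pos = 0; pos < 20; pos++) {
        forward(token, pos, logits);    /* one transformer step */
        token = sample_argmax(logits);  /* choose the next token */
        printf("%d ", token);           /* a real engine decodes and prints text */
    }
    printf("\n");
    return 0;
}
```

The real forward pass multiplies through the model's weight matrices, and the sampler can apply temperature and top-p, but the control flow is essentially this simple, which is what makes a pure C port to Windows 98 feasible at all.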
"LLM operating on a 26-year-old Windows 98 PC with an Intel Pentium II CPU and 128MB RAM. Utilizing llama98.c, our tailored pure C inference engine based on @karpathy llama2.c. Code and DIY guide: pic.twitter.com/pktC8hhvva" (EXO Labs, December 28, 2024)
This groundbreaking achievement is just the beginning for EXO Labs. Emerging from obscurity in September, EXO Labs announced its mission to make AI accessible to all. Founded by a group from Oxford University, EXO is driven by the conviction that AI controlled by a few large corporations is detrimental to culture, truth, and societal foundations. Their goal is to create open infrastructure to train cutting-edge models and enable anyone to operate them on virtually any device. This demonstration using Windows 98 is a prime example of what’s possible with minimal resources.
The video shared on Twitter is quite brief, but thankfully, EXO's blog post, "Running Llama on Windows 98," offers more details. It is part of their "12 days of EXO" series, so there's more to look forward to.
Acquiring an old Windows 98 PC on eBay was straightforward for EXO, but setting it up was not without challenges. Getting data onto the machine proved particularly tricky, so the team fell back on "good old FTP" to transfer files over the retro machine's Ethernet connection.
Adapting modern code to run on Windows 98 was another significant hurdle. Fortunately, they found Andrej Karpathy's llama2.c, a lean, roughly 700-line C program that runs inference on Llama 2-architecture models. Using the vintage Borland C++ 5.02 IDE and compiler, and with a few adjustments, they compiled a Windows 98-compatible executable. The completed code is available on GitHub.
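The blog post is the definitive reference for the exact changes, but one class of adjustment is easy to anticipate: Borland C++ 5.02 predates the C99 standard, so habits like declaring variables mid-block or inside a for header have to be rewritten in older style. A hypothetical illustration (not EXO's actual diff), using the kind of dot-product routine that dominates inference code:

```c
#include <stdio.h>

/* Hypothetical example of pre-C99 style that a 1997-era compiler
 * such as Borland C++ 5.02 accepts: declarations sit at the top
 * of each block, before any statements, and loop counters are
 * not declared inside the for header. */
static float dot(const float *a, const float *b, int n)
{
    int i;       /* declared up front, not as "for (int i = ..." */
    float sum;

    sum = 0.0f;
    for (i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

int main(void)
{
    float x[3] = {1.0f, 2.0f, 3.0f};
    float y[3] = {4.0f, 5.0f, 6.0f};

    printf("dot = %f\n", dot(x, y, 3));
    return 0;
}
```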
"35.9 tok/sec on Windows 98. This is a 260K LLM with Llama architecture. We also tried larger models. Results in the blog post. https://t.co/QsViEQLqS9 pic.twitter.com/lRpIjERtSr" (EXO Labs, December 28, 2024)
Alex Cheema of EXO credited Andrej Karpathy's lean code for letting a 260K-parameter LLM run at 35.9 tok/sec on the old Windows 98 system. Karpathy, a former AI director at Tesla and a founding member of OpenAI, has contributed significantly to the field. A 260K-parameter model is tiny, but it still performed well on the dated 350 MHz single-core PC. According to the EXO blog, scaling up to a 15M-parameter LLM slowed generation to just over 1 tok/sec, while Llama 3.2 1B was glacial at 0.0093 tok/sec; at that rate, a 100-token reply would take roughly three hours.
BitNet: A Vision for the Future
The story goes beyond merely running an LLM on a Windows 98 system. EXO concludes its post by discussing its future aspirations for BitNet, a transformer architecture that uses ternary weights. With this design, a 7B-parameter model needs only 1.38GB of storage, a manageable load even for a 26-year-old Pentium II and trivial for more recent hardware. BitNet is also CPU-first, sidestepping the need for costly GPUs; it is claimed to be 50% more efficient than full-precision models and to support a 100B-parameter model on a single CPU at human-like reading speeds (around 5 to 7 tok/sec).
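The 1.38GB figure is consistent with simple arithmetic: a ternary weight carries log2(3), roughly 1.58 bits, of information, so 7 billion weights need about 1.4 billion bytes. A quick back-of-envelope check (our arithmetic, not code from EXO or the BitNet authors):

```c
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* Back-of-envelope check: ternary weights need log2(3) ~= 1.58
     * bits each, so a 7B-parameter model fits in about 1.4GB. */
    double params = 7e9;
    double bits_per_weight = log2(3.0);            /* ~1.585 bits */
    double bytes = params * bits_per_weight / 8.0; /* total storage */

    printf("~%.2f GB of storage\n", bytes / 1e9);  /* prints ~1.39 GB */
    return 0;
}
```

Real implementations pack ternary values into fixed-width bit fields, so the exact on-disk size depends on the packing scheme, but the order of magnitude holds.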
Before wrapping up, it's worth noting that EXO is still seeking collaborators. If you're interested in preventing the monopolization of AI by large corporations and think you can help, consider reaching out. For more casual engagement, EXO hosts a retro channel on its Discord server where enthusiasts discuss running LLMs on vintage hardware like old Macs, Game Boys, Raspberry Pis, and more.

Avery Carter explores the latest in tech and innovation, delivering stories that make cutting-edge advancements easy to understand. Passionate about the digital age, Avery connects global trends to everyday life.