TL;DR

A software engineer has demonstrated that it is possible to run certain AI language models locally on a MacBook Pro with 24GB RAM. While these models are not state-of-the-art, they can perform basic tasks with acceptable speed, reducing dependence on cloud services. This development highlights potential for more accessible local AI use, though with limitations.

Software engineer Johanna Larsson ran a smaller AI language model, Qwen 3.5 9B, locally on an M4 MacBook Pro with 24GB of memory and got functional performance on basic tasks without internet access.

The experiment involved configuring models in several local AI tools, such as LM Studio and Pi. The best results came from Qwen 3.5 9B with Q4 (4-bit) quantization and specific settings for thinking mode and context window size. The model runs at approximately 40 tokens per second, which is fast enough for interactive tasks like coding assistance and research.
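
To put the roughly 40 tokens per second figure in perspective, here is a minimal back-of-the-envelope sketch. It assumes a steady generation rate and ignores prompt processing time. The function name and the sample response lengths are illustrative and do not come from the original experiment. Note that with thinking mode enabled, the reasoning tokens count toward the total as well.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float = 40.0) -> float:
    """Estimate wall-clock time to generate num_tokens at a steady rate.

    Assumption: the rate stays constant and prompt processing is ignored.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Illustrative response lengths (hypothetical, not measured).
for n in (100, 500, 2000):
    print(f"{n} tokens: ~{generation_seconds(n):.1f} s")
```

At this rate a short answer arrives in a few seconds, while a long response with extended reasoning can take close to a minute.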

Compared with larger state-of-the-art (SOTA) models, the local setup is less capable at complex, multi-step reasoning and long-running problem solving. The engineer notes that it still offers meaningful utility for basic research and coding, with the added advantages of offline operation and less reliance on cloud services.

Why It Matters

The experiment shows that smaller AI models can run locally on consumer hardware, which gives privacy-conscious users more options and reduces dependence on large cloud providers. It also points to a possible shift toward more flexible AI deployment, with the performance limitations noted above.

Background

Recent years have seen rapid advances in large language models (LLMs), primarily hosted in the cloud due to their computational demands. Smaller models have been available but often require significant tuning and configuration to run efficiently on consumer hardware. The experiment by Johanna Larsson, a software engineer, builds on ongoing efforts to democratize AI by making it more accessible and private, especially as cloud costs and privacy concerns grow.

“It’s surprisingly good for something that can run on a 24GB MacBook Pro while leaving space for lots of other things running too.”

— Johanna Larsson

“While it’s not a 10x productivity boost, it’s something, and it’s interesting.”

— Johanna Larsson

What Remains Unclear

It is not yet clear how well these local models will perform across diverse real-world tasks or how scalable the setup is for more complex applications. The long-term stability and ease of use also remain to be tested across different hardware configurations and user expertise levels.

What’s Next

Further testing and optimization are expected to improve performance and usability. Developers and researchers may explore integrating these models into workflows for specific tasks, and hardware improvements could expand the capabilities of local AI deployment.

Key Questions

Can I run larger models locally on my M4 MacBook?

Currently, models larger than around 20-30B parameters are unlikely to run efficiently on 24GB RAM, but ongoing optimizations may extend this limit in the future.
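
The arithmetic behind that limit can be sketched roughly as follows. This is a simplified estimate, not a measurement: it counts only the model weights, treats Q4 as exactly 4 bits per weight (real quantization formats use slightly more), and ignores the context cache and runtime overhead. The `reserved_gb` headroom for the operating system and other applications is a hypothetical value chosen for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         total_ram_gb: float = 24.0, reserved_gb: float = 8.0) -> bool:
    """Rough check: do the weights fit in RAM minus reserved headroom?

    reserved_gb is an assumed allowance for the OS, other apps,
    and the context cache; adjust it for your own machine.
    """
    return weight_memory_gb(params_billions, bits_per_weight) <= total_ram_gb - reserved_gb


# Model sizes below are examples, not specific released models.
for size in (9, 30, 70):
    gb = weight_memory_gb(size, 4)
    print(f"{size}B at 4-bit: ~{gb:.1f} GB weights, fits in 24GB: {fits(size, 4)}")
```

Under these assumptions a 9B model at 4-bit leaves plenty of room for other applications, a 30B model sits near the edge of what fits, and much larger models do not fit at all.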

What are the main limitations of running local models like Qwen 3.5 9B?

These models are less capable of complex reasoning, long-term memory, and multi-step problem solving compared to state-of-the-art cloud models. They also require careful configuration and tuning.

Is this setup suitable for production or critical tasks?

No, these models are primarily experimental and suitable for research, coding, or basic tasks. They are not reliable substitutes for larger, cloud-hosted models for critical applications.

What hardware is needed to run these models locally?

The experiment used an M4 MacBook Pro with 24GB of memory. A comparable laptop needs similar memory, a capable CPU, and sufficient storage for the model files. GPU acceleration is not strictly necessary but can improve performance.
