If you find this useful, please ⭐ the repo! Also check out Vesta AI Explorer — my full-featured native macOS AI app.
> [!NOTE]
> 5 Mar, 2026. Apologies: there were a few glitches with the deployed brew and pip packages. They should be fixed now. Please report any remaining issues.
**Attention M-series Mac AI enthusiasts!** You don't need to be a Swift developer to explore this project. Vibe coding lets anyone participate, and a lot of the hype is real: it does work.
Fork this repo first, then clone your fork to submit PRs:
To just experiment locally
`/build-afm` is an AI skill that performs the initial build so that you can start coding
Start vibe coding! Support for skills in more coding agents will be added in the future.
afm — Run Any LLM on Your Mac, 100% Local
Extensive testing of Qwen3.5-35B-A3B with afm, using an experimental evaluation technique with Claude and Codex as judges for scoring. Click the link below to view the test results.
afm-next Nightly Test Report — Qwen3.5-35B-A3B Focus
Run open-source MLX models **or** Apple's on-device Foundation Model through an OpenAI-compatible API. Built entirely in Swift for maximum Metal GPU performance. No Python runtime, no cloud, no API keys.
Install
| | Stable (v0.9.6) | Nightly (afm-next) |
|---|---|---|
| **Homebrew** | `brew install scouzi1966/afm/afm` | `brew install scouzi1966/afm/afm-next` |
| **pip** | `pip install macafm` | — |
| **Release notes** | v0.9.6 | Latest nightly |
> [!TIP]
> **Switching between stable and nightly:**
What's new in afm-next
> [!IMPORTANT]
> The nightly build is the future stable release. It includes everything in v0.9.6 plus:
> - No new features yet; nightly is currently in sync with the stable release
Quick Start
Use with OpenCode
OpenCode is a terminal-based AI coding assistant. Connect it to afm for a fully local coding experience: no cloud, no API keys, and no Internet required (other than initially downloading the model, of course!)
**1. Configure OpenCode** (~/.config/opencode/opencode.json):
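A minimal sketch of what such a provider entry might look like, assuming OpenCode's OpenAI-compatible custom-provider convention (the `npm`, `options.baseURL`, and `models` field names should be checked against your OpenCode version) and afm's default port of 9999:

```json
{
  "provider": {
    "macafm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "macafm (local)",
      "options": {
        "baseURL": "http://localhost:9999/v1"
      }
    }
  }
}
```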
**2. Start afm with a coding model:**
**3. Launch OpenCode** and type `/connect`. Scroll to the very bottom of the provider list; macafm (local) will likely be the last entry. Select it, and when prompted for an API key, enter any value (e.g. `x`): tokenized access is not yet implemented in afm, so the key is ignored. All inference runs locally on your Mac's GPU.
---
28+ MLX Models Tested
28 models tested and verified, including Qwen3, Gemma 3/3n, GLM-4/5, DeepSeek V3, LFM2, SmolLM3, Llama 3.2, MiniMax M2.5, Nemotron, and more. See the test reports.
---
⭐ Star History
Related Projects
Vesta AI Explorer — full-featured native macOS AI chat app
AFMTrainer — LoRA fine-tuning wrapper for Apple's toolkit (Mac M-series & Linux CUDA)
Apple Foundation Model Adapters — Apple's adapter training toolkit
🌟 Features
**🔗 OpenAI API Compatible** - Works with existing OpenAI client libraries and applications
**🧠 MLX Local Models** - Run any Hugging Face MLX model locally (Qwen, Gemma, Llama, DeepSeek, GLM, and 28+ tested models)
**🌐 API Gateway** - Auto-discovers and proxies Ollama, LM Studio, Jan, and other local backends into a single API
**⚡ LoRA adapter support** - Fine-tune with LoRA adapters using Apple's adapter training toolkit
**📱 Apple Foundation Models** - Uses Apple's on-device 3B parameter language model
**👁️ Vision OCR** - Extract text from images and PDFs using Apple Vision (afm vision)
**🖥️ Built-in WebUI** - Chat interface with model selection (afm -w)
**🔒 Privacy-First** - All processing happens locally on your device
**⚡ Fast & Lightweight** - No network calls, no API keys required
**🛠️ Easy Integration** - Drop-in replacement for OpenAI API endpoints
**📊 Token Usage Tracking** - Provides accurate token consumption metrics
📋 Requirements
**macOS 26 (Tahoe) or later**
**Apple Silicon Mac** (M1/M2/M3/M4 series)
**Apple Intelligence enabled** in System Settings
**Xcode 26** (for building from source)
🚀 Quick Start
Installation
Option 1: Homebrew (Recommended)
Option 2: pip (PyPI)
Option 3: Build from Source
Running
MLX Local Models
Run open-source models locally on Apple Silicon using MLX:
Models are downloaded from Hugging Face on first use and cached locally. Any model from the mlx-community collection is supported.
📡 API Endpoints
Chat Completions
**POST** /v1/chat/completions
Compatible with OpenAI's chat completions API.
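Because the endpoint follows OpenAI's request schema, a call can be composed with just the Python standard library. The port assumes afm's default of 9999 (as used in the troubleshooting section), and the model name is a placeholder to be replaced with an entry from `GET /v1/models`:

```python
import json
import urllib.request

AFM_URL = "http://localhost:9999/v1/chat/completions"  # default port assumed

def build_chat_request(prompt: str, model: str = "placeholder-model") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the local afm server."""
    payload = {
        "model": model,  # replace with a real name from GET /v1/models
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        AFM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the server running: response = urllib.request.urlopen(build_chat_request("Hello"))
```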
List Models
**GET** /v1/models
Returns available Foundation Models.
Health Check
**GET** /health
Server health status endpoint.
💻 Usage Examples
Python with OpenAI Library
JavaScript/Node.js
curl Examples
Single Prompt & Pipe Examples
🏗️ Architecture
🔧 Configuration
Command Line Options
Environment Variables
The server respects standard logging environment variables:
`LOG_LEVEL` - Set logging level (`trace`, `debug`, `info`, `notice`, `warning`, `error`, `critical`)
⚠️ Limitations & Notes
**Model Scope**: Apple Foundation Model is a 3B parameter model (optimized for on-device performance)
**macOS 26+ Only**: Requires the latest macOS with Foundation Models framework
**Apple Intelligence Required**: Must be enabled in System Settings
**Token Estimation**: Uses word-based approximation for token counting (Foundation model only; proxied backends report real counts)
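The word-based approximation mentioned above is not specified further, so the ~1.3-tokens-per-word factor in this sketch is an illustrative assumption, not afm's actual heuristic:

```python
def approximate_tokens(text: str) -> int:
    """Estimate token usage from word count.

    Assumes roughly 1.3 tokens per whitespace-separated English word;
    this factor is an illustrative guess, not afm's actual heuristic.
    """
    return round(len(text.split()) * 1.3)
```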
🔍 Troubleshooting
"Foundation Models framework is not available"
Ensure you're running **macOS 26 or later**
Enable **Apple Intelligence** in System Settings → Apple Intelligence & Siri
Verify you're on an **Apple Silicon Mac**
Restart the application after enabling Apple Intelligence
Server Won't Start
Check if the port is already in use: `lsof -i :9999`
Try a different port: `afm -p 8080`
Enable verbose logging: `afm -v`
Build Issues
Ensure you have **Xcode 26** installed
Install or update the Xcode command line tools: `xcode-select --install`
Clean and rebuild: `swift package clean && swift build -c release`
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first…