MLX-VLM: Run Vision AI Models Natively on Your Mac
Vision-language inference and fine-tuning directly on Apple Silicon, for startups and Mac-first founders building multimodal AI applications.
4,840 stars · 554 forks · Python · Quality 8/10 · Updated 6/1/2026 · 100% free · open source
What it does
MLX-VLM lets startup founders run and fine-tune vision-language models (VLMs) locally on a Mac, using Apple's MLX framework, so multimodal AI applications can be developed and tested without a cloud GPU.
When to use it
• You're building an AI application that requires both image and text understanding
• You need to fine-tune a pre-trained vision-language model for your specific use case
• You prefer to develop and test AI models on your Mac rather than relying on cloud services
Quick start
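1. Install MLX-VLM using pip: `pip install mlx-vlm`
2. Optionally clone the MLX-VLM GitHub repository to access example code and documentation: `git clone https://github.com/Blaizzy/mlx-vlm`
3. Import the library and load a pre-converted model from Hugging Face: `from mlx_vlm import load; model, processor = load('mlx-community/Qwen2-VL-2B-Instruct-4bit')`
4. Build a chat-formatted prompt with `apply_chat_template` (from `mlx_vlm.prompt_utils`) and run inference: `generate(model, processor, prompt=prompt, image=['path/to/image.jpg'])`. For fine-tuning, see the LoRA training instructions in the repository.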
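The quick-start steps above can be sketched as one small helper. This is a minimal sketch, not the library's official example: `is_apple_silicon` and `describe_image` are hypothetical names introduced here, the model id is just one of the pre-converted checkpoints published under `mlx-community` on Hugging Face, and the handling of the return value assumes that newer releases return a result object with a `.text` field while older ones return a plain string. Check the repository README for the signatures in your installed version.

```python
import platform


def is_apple_silicon() -> bool:
    """Heuristic check: macOS on an arm64 CPU (False under Rosetta)."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


def describe_image(image_path: str, question: str, model_id: str) -> str:
    """Load a vision-language model and answer `question` about one image."""
    if not is_apple_silicon():
        raise RuntimeError("This sketch assumes an Apple Silicon Mac.")

    # Imported lazily so the module can still be imported on other machines.
    from mlx_vlm import generate, load
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    # The first call downloads the weights from Hugging Face.
    model, processor = load(model_id)
    config = load_config(model_id)

    # Wrap the plain question in the chat format the model expects.
    prompt = apply_chat_template(processor, config, question, num_images=1)

    # Keyword arguments avoid relying on positional order, which has
    # differed between releases.
    result = generate(
        model, processor, prompt=prompt, image=[image_path], verbose=False
    )

    # Assumption: newer releases return an object with `.text`,
    # older ones return the generated string directly.
    return getattr(result, "text", result)
```

On a Mac with the package installed, a call might look like `describe_image("path/to/image.jpg", "Describe this image.", "mlx-community/Qwen2-VL-2B-Instruct-4bit")`. Loading the model once and reusing it across several images would be faster; the helper reloads it on each call only to keep the sketch short.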
Ready-to-paste prompt
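from mlx_vlm import load, generate; from mlx_vlm.prompt_utils import apply_chat_template; from mlx_vlm.utils import load_config; path = 'mlx-community/Qwen2-VL-2B-Instruct-4bit'; model, processor = load(path); prompt = apply_chat_template(processor, load_config(path), 'Describe this image.', num_images=1); print(generate(model, processor, prompt=prompt, image=['path/to/image.jpg']))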