Using mlx-lm to run a local LLM

Introduction

mlx-lm is a library from Apple's MLX project for running Large Language Models directly on Apple Silicon chips. Compared to Ollama, mlx-lm can offer a performance advantage because it works directly with the chip's Unified Memory and makes full use of the Apple GPU, giving Mac users faster generation and better energy efficiency.

Prerequisites

Because mlx-lm was developed specifically for Apple Silicon, the following instructions apply only if you are using a Mac with an Apple Silicon chip.

Detail

First, install mlx-lm:

pip install mlx-lm

Then, visit the mlx-community page on Hugging Face. This is a reputable community that shares LLMs converted to the MLX format so they run well on Macs with Apple Silicon chips. You can search for a model that fits your use case and machine configuration. Here, I will use the model mlx-community/Qwen2.5-Coder-7B-Instruct-4bit. The model name mlx-community/Qwen2.5-Coder-7B-Instruct-4bit consists of th...
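As a minimal sketch of the steps above, the commands below install mlx-lm and run a one-shot generation from the terminal. This assumes a Mac with Apple Silicon and a working Python environment; the example prompt is my own, and the model is downloaded from Hugging Face on first use (roughly 4 GB for this 4-bit 7B model).

```shell
# Install mlx-lm into the current Python environment
pip install mlx-lm

# Generate a completion from the command line; the model is fetched
# from the Hugging Face Hub and cached locally on the first run
mlx_lm.generate \
  --model mlx-community/Qwen2.5-Coder-7B-Instruct-4bit \
  --prompt "Write a Python function that reverses a string"
```

mlx-lm also exposes a Python API (`from mlx_lm import load, generate`) if you prefer to drive the model from a script rather than the CLI.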