r/LocalLLaMA Daily Update (24h, 2026-03-24 JST)
Top concrete r/LocalLLaMA updates: notable model releases/signals, tooling updates, and practical resources.
Window: last 24 hours (reported on 2026-03-24 JST)
Models
- Qwen3.5-9B finetune/export with Opus 4.6 reasoning distillation + mixed extras — new community finetune drop aimed at stronger reasoning behavior in a small local footprint.
- Looks like MiniMax M2.7 weights will be released in ~2 weeks — a roadmap signal (timeline claim, weights not yet out) discussed broadly in-thread.
- Mistral-4-Small UNCENSORED (MLX / Mac-focused release) — additional local deployment option for Apple Silicon users.
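The Qwen3.5-9B item above mentions reasoning distillation but the post doesn't describe the recipe. For readers unfamiliar with the term, a minimal sketch of the textbook approach (Hinton-style soft-label distillation, where the student matches the teacher's token distribution) — this is NOT the specific pipeline used for that finetune, just the general technique:

```python
# Generic sketch of logit distillation: train the student toward the
# teacher's softened token distribution. Textbook recipe only --
# the actual finetune's method is not described in the post.
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) at one token position, with the usual
    T^2 scaling so gradient magnitude matches the hard-label loss."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Identical teacher/student logits give zero loss; any mismatch is > 0.
zero_loss = distill_kl([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
gap_loss = distill_kl([3.0, 0.0, 0.0], [0.0, 3.0, 0.0])
```

In practice this KL term is summed over the teacher's chain-of-thought tokens, which is what makes it "reasoning" distillation rather than plain output matching.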
Tools/Frameworks
- Introducing oQ: data-driven mixed-precision quantization for Apple Silicon (mlx-lm compatible) — new quantization workflow for MLX stacks on Mac.
- Reworked LM Studio plugins out now (Plug’n’Play Web Research, fully local) — concrete plugin/tooling refresh for local-agent workflows.
- Phone Whisper: push-to-talk dictation for Android with local Whisper (sherpa-onnx) — practical mobile local STT utility release.
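The oQ item above describes "data-driven mixed-precision quantization" but not its algorithm. As a hedged illustration of what data-driven bit allocation generally means — give more bits to layers whose outputs are most sensitive to quantization, under an average-bits budget — here is a toy sketch; all names and numbers are hypothetical and this is not oQ's actual method:

```python
# Illustrative sketch of data-driven mixed-precision allocation:
# upgrade the most quantization-sensitive layers to higher precision
# until an average bits-per-weight budget is spent.
# Hypothetical example only -- NOT oQ's actual algorithm.

def allocate_bits(sensitivities, budget_bits_per_weight=5.0, choices=(4, 8)):
    """Greedy plan: start every layer at the lowest precision, then
    upgrade layers in order of decreasing sensitivity while the
    average bit width stays within budget."""
    n = len(sensitivities)
    bits = {name: min(choices) for name in sensitivities}
    for name in sorted(sensitivities, key=sensitivities.get, reverse=True):
        candidate = dict(bits)
        candidate[name] = max(choices)
        if sum(candidate.values()) / n <= budget_bits_per_weight:
            bits = candidate
    return bits

# Toy per-layer sensitivities (e.g. output error measured on a small
# calibration set when that layer alone is quantized to 4 bits).
sens = {"embed": 0.9, "attn.0": 0.4, "mlp.0": 0.1, "lm_head": 0.8}
plan = allocate_bits(sens, budget_bits_per_weight=6.0)
```

With a 6-bit average budget, the two most sensitive layers here end up at 8-bit and the rest at 4-bit; real tools make the same trade layer by layer, just with measured statistics instead of toy scores.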
Resources
- Exa AI introduces WebCode, a new open-source benchmarking suite for web-focused coding/eval scenarios.
- Awesome-Autoresearch (collection around Karpathy-style autoresearch workflows) — curated references and tools for autonomous research pipelines.
- WMB-100K: open-source benchmark for AI memory systems at 100K turns — long-context/memory evaluation resource with explicit test framing.
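The WMB-100K entry implies a needle-over-turns test design: plant a fact early in a conversation and probe for it many turns later. WMB-100K's real protocol is not detailed in the post; the following is a self-contained sketch of that general style of memory evaluation, with all names hypothetical:

```python
# Hypothetical sketch of a long-horizon memory probe in the style
# such benchmarks imply: plant one fact at an early turn, run many
# filler turns, then test recall. Not WMB-100K's actual protocol.
import random

def run_memory_probe(agent, num_turns=1000, plant_turn=10, seed=0):
    """Drive `agent` (any object with .observe(msg) and .answer(q))
    through filler turns, planting one fact early and probing late."""
    rng = random.Random(seed)
    code = str(rng.randint(1000, 9999))
    for t in range(num_turns):
        if t == plant_turn:
            agent.observe(f"Remember: the passcode is {code}.")
        else:
            agent.observe(f"Filler message {t}.")
    reply = agent.answer("What is the passcode?")
    return code in reply  # exact-recall scoring

class TranscriptAgent:
    """Trivial baseline that keeps the full transcript: perfect recall
    but O(turns) memory -- exactly what such benchmarks stress-test."""
    def __init__(self):
        self.log = []
    def observe(self, msg):
        self.log.append(msg)
    def answer(self, question):
        hits = [m for m in self.log if "passcode" in m]
        return hits[-1] if hits else "unknown"

recalled = run_memory_probe(TranscriptAgent(), num_turns=1000)
```

Scaling `num_turns` toward 100K is where bounded-memory agents (summarizers, retrieval stores) start to diverge from the transcript baseline, which is the regime a 100K-turn benchmark targets.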