r/LocalLLaMA Daily Update (24h, 2026-03-24 JST)
Top concrete r/LocalLLaMA updates: notable model releases/signals, tooling updates, and practical resources.
Window: last 24 hours (reported on 2026-03-24 JST)
Models
- Qwen3.5-9B finetune/export with Opus 4.6 reasoning distillation + mixed extras — new community finetune drop aimed at stronger reasoning behavior in a small local footprint.
- Looks like MiniMax M2.7 weights will be released in ~2 weeks — a roadmap signal (timeline claim, weights not yet out) discussed broadly in-thread.
- Mistral-4-Small UNCENSORED (MLX / Mac-focused release) — additional local deployment option for Apple Silicon users.
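The Qwen3.5-9B item above mentions reasoning distillation but the post doesn't describe the recipe. For readers unfamiliar with the term, a minimal sketch of the textbook approach (Hinton-style soft-label distillation, where the student matches the teacher's token distribution) — this is NOT the specific pipeline used for that finetune, just the general technique:

```python
# Generic sketch of logit distillation: train the student toward the
# teacher's softened token distribution. Textbook recipe only --
# the actual finetune's method is not described in the post.
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) at one token position, with the usual
    T^2 scaling so gradient magnitude matches the hard-label loss."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Identical teacher/student logits give zero loss; any mismatch is > 0.
zero_loss = distill_kl([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
gap_loss = distill_kl([3.0, 0.0, 0.0], [0.0, 3.0, 0.0])
```

In practice this KL term is summed over the teacher's chain-of-thought tokens, which is what makes it "reasoning" distillation rather than plain output matching.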
Tools/Frameworks
- Introducing oQ: data-driven mixed-precision quantization for Apple Silicon (mlx-lm compatible) — new quantization workflow for MLX stacks on Mac.
- Reworked LM Studio plugins out now (Plug’n’Play Web Research, fully local) — concrete plugin/tooling refresh for local-agent workflows.
- Phone Whisper: push-to-talk dictation for Android with local Whisper (sherpa-onnx) — practical mobile local STT utility release.
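The oQ item above describes "data-driven mixed-precision quantization" but not its algorithm. As a hedged illustration of what data-driven bit allocation generally means — give more bits to layers whose outputs are most sensitive to quantization, under an average-bits budget — here is a toy sketch; all names and numbers are hypothetical and this is not oQ's actual method:

```python
# Illustrative sketch of data-driven mixed-precision allocation:
# upgrade the most quantization-sensitive layers to higher precision
# until an average bits-per-weight budget is spent.
# Hypothetical example only -- NOT oQ's actual algorithm.

def allocate_bits(sensitivities, budget_bits_per_weight=5.0, choices=(4, 8)):
    """Greedy plan: start every layer at the lowest precision, then
    upgrade layers in order of decreasing sensitivity while the
    average bit width stays within budget."""
    n = len(sensitivities)
    bits = {name: min(choices) for name in sensitivities}
    for name in sorted(sensitivities, key=sensitivities.get, reverse=True):
        candidate = dict(bits)
        candidate[name] = max(choices)
        if sum(candidate.values()) / n <= budget_bits_per_weight:
            bits = candidate
    return bits

# Toy per-layer sensitivities (e.g. output error measured on a small
# calibration set when that layer alone is quantized to 4 bits).
sens = {"embed": 0.9, "attn.0": 0.4, "mlp.0": 0.1, "lm_head": 0.8}
plan = allocate_bits(sens, budget_bits_per_weight=6.0)
```

With a 6-bit average budget, the two most sensitive layers here end up at 8-bit and the rest at 4-bit; real tools make the same trade layer by layer, just with measured statistics instead of toy scores.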
Resources
- Exa AI introduces WebCode, a new open-source benchmarking suite for web-focused coding/eval scenarios.
- Awesome-Autoresearch (collection around Karpathy-style autoresearch workflows) — curated references and tools for autonomous research pipelines.
- WMB-100K: open-source benchmark for AI memory systems at 100K turns — long-context/memory evaluation resource with explicit test framing.
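The WMB-100K entry implies a needle-over-turns test design: plant a fact early in a conversation and probe for it many turns later. WMB-100K's real protocol is not detailed in the post; the following is a self-contained sketch of that general style of memory evaluation, with all names hypothetical:

```python
# Hypothetical sketch of a long-horizon memory probe in the style
# such benchmarks imply: plant one fact at an early turn, run many
# filler turns, then test recall. Not WMB-100K's actual protocol.
import random

def run_memory_probe(agent, num_turns=1000, plant_turn=10, seed=0):
    """Drive `agent` (any object with .observe(msg) and .answer(q))
    through filler turns, planting one fact early and probing late."""
    rng = random.Random(seed)
    code = str(rng.randint(1000, 9999))
    for t in range(num_turns):
        if t == plant_turn:
            agent.observe(f"Remember: the passcode is {code}.")
        else:
            agent.observe(f"Filler message {t}.")
    reply = agent.answer("What is the passcode?")
    return code in reply  # exact-recall scoring

class TranscriptAgent:
    """Trivial baseline that keeps the full transcript: perfect recall
    but O(turns) memory -- exactly what such benchmarks stress-test."""
    def __init__(self):
        self.log = []
    def observe(self, msg):
        self.log.append(msg)
    def answer(self, question):
        hits = [m for m in self.log if "passcode" in m]
        return hits[-1] if hits else "unknown"

recalled = run_memory_probe(TranscriptAgent(), num_turns=1000)
```

Scaling `num_turns` toward 100K is where bounded-memory agents (summarizers, retrieval stores) start to diverge from the transcript baseline, which is the regime a 100K-turn benchmark targets.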