#llamacpp

Grigory Shepelev:
I am an #AI-enhanced coding believer now, since I started working at my new place (3-4 months ago). Using #openrouter is corporate practice there and is more or less obligatory.
Now I want to enhance my #guix setup with every MCP possible, upgrade the video card in my desktop, start a local #llamacpp server, and share it with some friends.
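For anyone sketching out a similar setup, a shared local server could look roughly like this; the model path, context size, and port are placeholders, not details from the post:

# Expose an OpenAI-compatible API on the local network (model and port are placeholders)
llama-server -m ./models/your-model.gguf --host 0.0.0.0 --port 8080 -c 8192
# Friends on the same LAN (or over a VPN/tunnel) can then point any OpenAI-style
# client at http://<your-ip>:8080/v1/chat/completions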
risse:
Running a privacy-friendly local LLM on a Raspberry Pi? It's possible; check out my video below.
https://www.youtube.com/watch?v=TNxIIDkP2Zg
#raspberrypi #ai #llamacpp #makersgonnamake #linux #tech
Saemon Zixel:
I got llama.cpp running on another motherboard, this one with an AMD E2-3000 CPU. It's roughly an Intel Atom analogue, but a bit more modern.
Prompt processing and response generation got slightly faster, by about 10 percent, even though the DDR3 memory runs on a 1600 MHz bus and is 1.5x faster than the previous DDR2 on a 1066 MHz bus. Then again, the old CPU ran at 2.6 GHz, while this one is only 1.6 GHz.
I recompiled llama.cpp on this CPU and the speed practically doubled.
Vikhr-Llama-3.2-1B-Q8_0 produces 2 tokens per second.
QwQ-500M.Q8_0 produces 6 tokens per second and writes its answers quite briskly. The model is rather dim, though: it tends to ramble and rarely answers correctly.
As far as I can tell, this is all thanks to the CPU's AVX and F16C support. RAM speed, unfortunately, plays almost no role here.
#llamacpp #vikhr #qwq #amd
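A rebuild that targets the host CPU is what picks up instruction sets like AVX and F16C. A minimal sketch of such a build, assuming a current llama.cpp checkout (the commands are illustrative, not the poster's exact ones):

# Build llama.cpp natively so GGML compiles for this CPU's instruction set
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_NATIVE=ON
cmake --build build --config Release -j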
Eric Curtin:
RamaLama just got multimodal! 🚀 See, understand and respond to visual info with new VLM capabilities. Shoutout to Xuan-Son Nguyen! #RamaLama #AI #llamacpp
https://developers.redhat.com/articles/2025/06/20/unleashing-multimodal-magic-ramalama
Eric Curtin:
Stef Walter utilising one of #RamaLama's latest features, containerised multimodal inferencing. We make great use of Xuan-Son Nguyen's demo application. #llamacpp
Olivier Chafik:
llama.cpp streaming support for tool calling and thoughts was just merged: please test and report any issues 😅
https://github.com/ggml-org/llama.cpp/pull/12379
#llamacpp
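A rough way to exercise the new streaming path against llama-server's OpenAI-compatible endpoint; the port, question, and get_weather tool below are invented for illustration and assume the server was started with --jinja:

# Tool-call and reasoning fragments should now arrive incrementally as SSE chunks
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "stream": true,
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'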
Eric Curtin:
En route to #redhatsummit, watch out for: "AI inferencing for developers and administrators", "Securing AI workloads with RamaLama", and "RamaLama: Making developing AI Boring". We may even see a VLM demo; very accurate models, as we can see here. #ramalama #llamacpp
Boiling Steam:
Vision Now Available in Llama.cpp: https://github.com/ggml-org/llama.cpp/blob/master/docs/multimodal.md
#linux #update #foss #release #llamacpp #vision #ai #llm
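Going by the linked multimodal docs, a minimal vision run pairs a GGUF model with its mmproj projector file; the file names here are placeholders:

# Describe an image with a vision-capable model plus its multimodal projector
llama-mtmd-cli -m ./models/vision-model.gguf \
  --mmproj ./models/mmproj-vision-model.gguf \
  --image ./photo.jpg -p "Describe this image."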
Winbuzzer:
Microsoft Clippy Returns as AI Assistant, Empowered by LLMs You Can Run Locally on Your PC
#AI #Clippy #AIClippy #AIAssistants #LLMs #LocalAI #OpenSource #ElectronJS #LlamaCpp #GGUF #Gemma3 #Llama3 #Phi4 #Qwen3 #RetroTech #MicrosoftOffice #OnDeviceAI
https://winbuzzer.com/2025/05/06/microsoft-clippy-returns-as-ai-assistant-empowered-by-llms-you-can-run-locally-on-your-pc-xcxwbn/
Saemon Zixel:
llama.cpp compiled quite easily and simply on my 32-bit ALT Linux. Dependencies are minimal; I didn't have to install or build anything extra. And it runs stably: no complaints, no segfaults.
I tested it with Vikhr-Llama-3.2-1B-Q8_0.gguf, which is about 1.2 GB and knows Russian. Prompt "reading" speed is 2 tokens/sec, and response generation speed is 1 token/sec. Usable for questions that can wait, but the answer quality is so-so.
Note that my computer is rather old: a Pentium D E6300 at 2.8 GHz, supporting at most SSSE3, with 4 GB of DDR2 memory. So I'm already pleased with what I get :)
#llama #llamacpp #linux #vikhr
Phil:
Big hopes for Qwen3. IF the 30A3B model works well, gptel-org-tools will be very close to what I envision as a good foundation for the package.
It's surprisingly accurate, especially with reasoning enabled.
At the same time, I'm finding that #gptel struggles a lot with handling LLM output that contains reasoning, content, and tool calls at once.
I'm stumped. These new models are about as good as it's ever been for local inference, and they work great in both the llama-server and LM Studio UIs.
Changing the way I prompt doesn't work. I tried taking an axe to gptel-openai.el, but I frankly don't understand the code nearly well enough to get a working version going.
So... yeah. Kinda stuck.
Not sure what's next. Having seen Qwen3, I'm not particularly happy to go back to older models.
#emacs #gptelorgtools #llamacpp
Hassan Habib:
Run AI completely offline with Llama-CLI and C#! 🚀
No cloud. Full control.
Watch the full guide here: https://www.youtube.com/watch?v=lc6lVCe0XHI
#AI #CSharp #OfflineAI #LlamaCpp
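For context, the plain command-line equivalent of a fully offline run looks roughly like this; the model file and prompt are placeholders, and the C#-specific wiring is covered in the video rather than here:

# Runs entirely locally once the GGUF file is on disk; no network access needed
llama-cli -m ./models/llama-3.2-3b-instruct-Q4_K_M.gguf \
  -p "Summarize what GGUF is in one sentence." -n 128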
Peter Lord:
Started preparing for my next talk on @u3acommunities.org.
Will outline running #generativeai locally, mainly for privacy reasons.
Will include #llamacpp #ollama #AUTOMATIC1111 #openwebui and probably others.
Any pointers on things to mention are appreciated!
N-gated Hacker News:
🐪🤯 Oh, the riveting saga of Llama.cpp's heap: it's like watching paint dry, but with more compiler errors. Our intrepid hacker spent 30 hours (yes, you read that right) dissecting code so niche, even the bugs were disinterested. 🐛💤
https://retr0.blog/blog/llama-rpc-rce
#LlamaCpp #Debugging #CodeNiche #CompilerErrors #HackerNews #ngated
Hacker News:
Heap-overflowing Llama.cpp to RCE
https://retr0.blog/blog/llama-rpc-rce
#HackerNews #HeapOverflow #LlamaCpp #RCE #CyberSecurity #Exploit #TechNews
Nexus6:
I've just published the second part of my guide on setting up an AI/LLM stack in Haiku. If you've been curious about running AI models on alternative operating systems, this one's for you!
🔗 https://blog.nexus6.me/new%20adventures%20in%20ai/Setup-an-environment-for-AI-in-Haiku-Part-2/
#HaikuOS #langchain #openai #llamacpp
Nexus6:
I've just published the first part of my guide on setting up an AI/LLM stack in Haiku. If you've been curious about running AI models on alternative operating systems, this one's for you!
🔗 https://blog.nexus6.me/new%20adventures%20in%20ai/Setup-an-environment-for-AI-in-Haiku-Part-1/
#HaikuOS #langchain #openai #llamacpp
Hacker News:
Llama.cpp AI Performance with the GeForce RTX 5090 Review
https://www.phoronix.com/review/nvidia-rtx5090-llama-cpp
#HackerNews #LlamaCPP #AI #GeForceRTX5090 #NVIDIA #Review #TechNews
Todd A. Jacobs | Rubyist:
It seems like Metal-enabled #llamacpp using #gguf is faster than llama.cpp with #mlx on my #AppleSilicon. #Ollama is mlx-only and slower, so it's not just a tool optimization.
MLX was designed for Metal, so it should be faster. Maybe it helps more with Apple Intelligence or something? I now choose GGUF over MLX unless I specifically need Ollama.
Anyone else had similar experiences? Do newer M-series chips do a better job with it, or did I not account for something?
https://github.com/ggerganov/llama.cpp
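One way to put numbers on a comparison like that is llama-bench from llama.cpp; the model file below is a placeholder, and a default Apple Silicon build picks up the Metal backend automatically:

# Measure prompt-processing (pp) and token-generation (tg) throughput on this machine
llama-bench -m ./models/qwen2.5-7b-instruct-Q4_K_M.gguf -p 512 -n 128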
Olivier Chafik:
llama.cpp now supports tool calling (OpenAI-compatible).
https://github.com/ggerganov/llama.cpp/pull/9639
On top of generic support for *all* models, it supports 8+ models' native formats:
- Llama 3.x
- Functionary 3
- Hermes 2/3
- Qwen 2.5
- Mistral Nemo
- Firefunction 3
- DeepSeek R1
Runs anywhere (incl. Raspberry Pi 5).
On a Mac:
brew install llama.cpp
llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
Still fresh / lots of bugs to discover: feedback welcome!
#llamacpp