Raptor Computing Systems Community Forums (BETA)
Software => Applications and Porting => Topic started by: icbts on February 18, 2025, 11:11:59 am
-
After reading the great post on VivaPowerPC about running Ollama on PPC64LE, I got excited and made a video following their steps and showing the results I got on my Raptor Blackbird (yes, there is a shout-out to VivaPowerPC for their excellent article). Wanted to share the link with the community :)
Has anyone else run Ollama on their PPC64LE rigs? Have benchmark results to share?
If I can get another RTX A1000 (PCIe x8) card, I'll see if I can get the Nvidia ppc64le drivers to work with Ollama.
How to run Ollama on PPC64LE:
https://www.youtube.com/watch?v=P4iEZiwfLm8
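For anyone who wants the gist without watching first, the article's approach boils down to compiling Ollama from source with the Go toolchain. A sketch, wrapped in a function so it can be sourced and run on the target machine (assumes a recent Go toolchain and gcc are installed; `llama3.2` is just an example model):

```shell
# Sketch of the build-from-source steps (run on the ppc64le box itself):
build_and_run_ollama() {
    git clone https://github.com/ollama/ollama.git
    cd ollama || return 1
    go build .              # produces a native ppc64le ./ollama binary
    ./ollama serve &        # start the API server in the background
    sleep 2                 # give the server a moment to come up
    ./ollama run llama3.2   # pull a model and start chatting
}
```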
-
The article in question (https://vivapowerpc.eu/20250204-1600_LLMs_on_ppc64le_with_Ollama). If it can be built and run entirely contained within its own directory, I will give it a try. I'm still running a Polaris-generation AMD GPU, so it may or may not be worth trying with acceleration.
-
Yeah, I was able to get Ollama up and running following the instructions.
I didn't get very good performance though. I was getting only around 0.10 tokens/s using deepseek-r1:32b on CPU only.
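For anyone who wants to compare numbers, ollama can report its own timing stats: the `--verbose` flag makes `ollama run` print prompt/eval rates after each reply. A sketch (the function needs ollama installed on the target machine; the awk line below it is just the arithmetic behind the reported rate):

```shell
# Print per-reply timing stats from ollama (run on the target machine):
bench_ollama() {
    ollama run deepseek-r1:32b --verbose <<'EOF'
Explain what a GGUF file is in one sentence.
EOF
    # --verbose appends stats including an "eval rate" in tokens/s
}

# The eval rate is just generated tokens over generation seconds;
# e.g. 50 tokens in 500 s comes out to the ~0.10 tokens/s reported above:
awk 'BEGIN { printf "%.2f tokens/s\n", 50/500 }'
```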
I did try to get my RTX A4000 GPU to work but I couldn't get the driver to work properly. nvidia-smi just hangs and eventually gives an error message.
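Not a fix, but when nvidia-smi hangs like that, these are the checks I'd start with to see whether the card enumerates and the kernel modules loaded at all (a sketch; run as root on the affected machine):

```shell
# First diagnostics for a hanging nvidia-smi (run as root on the machine):
check_nvidia() {
    lspci -nn | grep -i nvidia                  # is the card visible on the PCIe bus?
    lsmod | grep -E '^nvidia'                   # did the nvidia kernel modules load?
    dmesg | grep -iE 'nvidia|nvrm' | tail -20   # recent driver init/error messages
}
```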
-
I got llama.cpp working both with Vulkan and on CPU alone, and I also noticed poor performance.
-
For llama3.2, CPU only, on my Blackbird (4-core, 32GB) I get 5.94 tokens/s, but on my Talos II (18-core, 368GB) I only get 0.34 tokens/s.
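A gap that large between the 4-core and 18-core boxes could be a thread-count or core-placement effect rather than the hardware itself; llama.cpp's bundled llama-bench tool makes it easy to sweep thread counts and find the sweet spot. A sketch (the model path is a placeholder; `-t` accepts a comma-separated list of thread counts to try):

```shell
# Sweep thread counts with llama.cpp's benchmark tool (run from the
# llama.cpp checkout after building; model path is a placeholder):
sweep_threads() {
    ./build/bin/llama-bench -m models/llama-3.2-3b.Q4_K_M.gguf -t 4,8,18
}
```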
-
llama.cpp with Vulkan works well on my older AMD card as long as the model fits in its VRAM and I supply the correct command-line arguments (the llama.cpp wiki has good examples).
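For reference, the arguments that matter here boil down to enabling the Vulkan backend at build time and offloading layers to the GPU at run time. A sketch (cmake flag per llama.cpp's build docs; the model path is a placeholder, and `-ngl 99` just means "offload everything"):

```shell
# Build llama.cpp with the Vulkan backend and run fully offloaded
# (run from a llama.cpp checkout on the target machine):
build_and_run_vulkan() {
    cmake -B build -DGGML_VULKAN=ON
    cmake --build build --config Release -j
    # -ngl 99 offloads all layers; lower it if the model doesn't fit in VRAM
    ./build/bin/llama-cli -m models/model.Q4_K_M.gguf -ngl 99 -p "Hello"
}
```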