Raptor Computing Systems Community Forums (BETA)

Software => Applications and Porting => Topic started by: icbts on February 18, 2025, 11:11:59 am

Title: Running Ollama on PPC64LE
Post by: icbts on February 18, 2025, 11:11:59 am
After reading the great post on VivaPowerPC about running Ollama on PPC64LE, I got excited and made a video following their steps, showing the results I got on my Raptor Blackbird (yes, there is a shout-out to VivaPowerPC for their excellent article). Wanted to share a link with the community :)

Has anyone else run Ollama on their PPC64LE rigs? Have benchmark results to share?
If I can get another RTX A1000 (PCIe x8) card I'll see if I can get the Nvidia ppc64le drivers to work with Ollama.

How to run Ollama on PPC64LE:
https://www.youtube.com/watch?v=P4iEZiwfLm8
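For anyone who wants to share comparable benchmark numbers: Ollama's /api/generate response reports eval_count (tokens generated) and eval_duration (nanoseconds), so tokens/s can be computed directly. A minimal sketch, assuming a local server on the default port 11434 (the model name in a real run would be whatever you pulled):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def tokens_per_second(resp: dict) -> float:
    """Generation throughput from an Ollama API response.

    eval_count is the number of tokens generated; eval_duration is
    the generation time in nanoseconds.
    """
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)

def benchmark(model: str, prompt: str) -> float:
    """Run one non-streaming generation and return tokens/s."""
    payload = json.dumps({"model": model, "prompt": prompt,
                          "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as f:
        return tokens_per_second(json.load(f))

if __name__ == "__main__":
    # Fabricated response just to show the arithmetic; a real run
    # would call benchmark("llama3.2", "some prompt") instead.
    sample = {"eval_count": 120, "eval_duration": 20_000_000_000}
    print(f"{tokens_per_second(sample):.2f} tokens/s")  # 6.00 tokens/s
```

Posting the number from the same model and quantization makes the CPU-only results across different POWER9 boxes directly comparable.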
Title: Re: Running Ollama on PPC64LE
Post by: Borley on February 18, 2025, 10:36:22 pm
The article in question (https://vivapowerpc.eu/20250204-1600_LLMs_on_ppc64le_with_Ollama). If it can be built and run in an entirely directory-contained way, I'll give it a try. I'm still running a Polaris-generation AMD GPU, so it may or may not be worth trying with acceleration.
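On the directory-contained point: the ollama binary itself can live anywhere, and the model store can be redirected with the OLLAMA_MODELS environment variable, so nothing has to land under $HOME. A sketch, with placeholder paths:

```shell
# Keep Ollama's model blobs inside the current directory instead of
# the default location under $HOME; `ollama serve` reads OLLAMA_MODELS.
export OLLAMA_MODELS="$PWD/models"
mkdir -p "$OLLAMA_MODELS"

# ./ollama serve    # run the locally built binary from this directory
echo "$OLLAMA_MODELS"
```

Deleting that one directory (plus the binary) then removes everything.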
Title: Re: Running Ollama on PPC64LE
Post by: witsu on February 19, 2025, 06:56:46 pm
Yea, I was able to get Ollama up and running by following the instructions.
I didn't get very good performance though: only around 0.10 tokens/s using deepseek-r1:32b on CPU only.

I did try to get my RTX A4000 GPU working, but I couldn't get the driver to behave properly. nvidia-smi just hangs and eventually gives an error message.
Title: Re: Running Ollama on PPC64LE
Post by: adaptl on February 19, 2025, 10:55:30 pm
I got llama.cpp working with Vulkan and on CPU alone, and I also noticed poor performance.
Title: Re: Running Ollama on PPC64LE
Post by: atomicdog on February 20, 2025, 12:18:45 am
For llama3.2, CPU only, I get 5.94 tokens/s on my Blackbird (4-core, 32GB) but only 0.34 tokens/s on my Talos II (18-core, 368GB).
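That inversion (the 18-core box far slower than the 4-core) smells like a threading issue rather than raw compute; on SMT-4 POWER9 the default thread count may oversubscribe the physical cores. One knob worth trying is Ollama's documented num_thread option, which can be passed per request. A sketch of building such a request (the model name is a placeholder, and whether this actually fixes the Talos II result is untested):

```python
import json

def generate_payload(model: str, prompt: str, num_thread: int) -> bytes:
    """Build a /api/generate request body that pins the CPU thread count.

    num_thread is a documented Ollama model option; on an SMT-4
    POWER9 machine, setting it to the physical core count (rather
    than every hardware thread) may help.
    """
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_thread": num_thread},
    }).encode()

# e.g. 18 physical cores on the Talos II mentioned above
payload = generate_payload("llama3.2", "Hello", num_thread=18)
print(json.loads(payload)["options"])  # {'num_thread': 18}
```

The same payload POSTed to http://localhost:11434/api/generate behaves like a normal generation, just with the thread count overridden.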
Title: Re: Running Ollama on PPC64LE
Post by: rheaplex on February 21, 2025, 10:37:50 pm
llama.cpp with Vulkan is good on my older AMD card as long as the model fits in its VRAM and I supply the correct command line arguments (the llama.cpp wiki has good examples).
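For readers hunting for those arguments: the flags that control offloading in llama.cpp's llama-cli are -m (model file), -ngl (number of layers to offload to the GPU) and -c (context size). A sketch of the shape of the invocation; the model path and layer count are placeholders, and the command is only echoed here since it needs a built binary and a downloaded GGUF:

```shell
# -ngl sets how many layers go to the GPU (lower it if the model
# overflows VRAM, as noted above); -c sets the context size.
# Model path below is hypothetical.
CMD='./llama-cli -m ./models/llama-3.2-3b-Q4_K_M.gguf -ngl 99 -c 4096 -p "Hello"'
echo "$CMD"
```

With a Vulkan build, llama-cli prints which device it picked at startup, which is a quick way to confirm the GPU is actually being used.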