Do you hate #AI ? Think about #LLM as a lossy database. Which can be also queried by human language.
What Linux needs right now to enjoy this DB?
1. Reasonable support for accelerating models on #GPU (hello #OpenCL ), #NPU (TensorFlow Lite delegate #Mesa3D ) across various hardware
2. #Linux distributions shipping tools to run models, but most importantly stable interface to communicate with these models ( #OpenAI API on expected port) by default
3. #GNOME or #KDE levearaging it
4. and almost most importantly, distros shipping models trained on #OpenSource datasets by default.
@okias These models are all researched on Linux, they are trained on Linux. The GP-GPU frameworks like Cuda and HIP work on Linux. All the Python code to run the models runs on Linux, also PyTorch with GPU acceleration. Optimized C/C++-based engines like llama.cpp run on Linux on Cuda and ROCm (I worked on this), as well as other projects that use llama.cpp underneath like Ollama.
NPUs are garbage for LLMs, only good for 10 year old image recognition models. Very hard to develop against because there is no standard APIs like we have for graphics.
KDE and GNOME cannot leverate it because they don't have a model that has an appropriate license. There is also no suitable model for GNOME's or KDE's usecase for a DE-level integration. I also can't think of a usecase at all.
1/2 1. I agree; however, when I install my distro, I'm not getting the experience of having an LLM ready to use. #CUDA is proprietary and cannot be distributed with Linux distributions, and #ROCm only works with AMD cards. Thus #OpenCL is so far only vendor-agnostic option.
2. Perhaps @tomeu could jump in here? :)
@slyecho 2/2
3. Environments could leverage it even without the model in place, offering the user the option to download it, but the infrastructure is lacking. Once the user downloads the model, GNOME/KDE could start using it automatically.
Use cases include everything from code generation in IDEs to offline translation, etc.
My main concern is regular user.
I can set up the LLM + OpenAI API + apps to be performant and useful, but it costs an arm and a leg
@Paralyses2834 @tomeu Thanks, I need to look more into OneAPI. Anyway statement OpenCL is quite slow doesn't make much sense to me, could you elaborate bit?