All posts by DH

AI Image Generation on RX 580 Using Vulkan: A Cost-Effective Solution

This guide explores how to leverage the AMD Radeon RX 580 graphics card for AI image generation using Vulkan compute capabilities, without requiring the ROCm software stack. By utilizing stable-diffusion.cpp compiled with Vulkan support, users can take advantage of their existing hardware to run modern AI image generation models.

The approach focuses on maximizing the capabilities of older but still capable hardware, specifically targeting the 8GB VRAM of the RX 580 for efficient model execution. This method provides a cost-effective alternative to more expensive GPU options while maintaining reasonable performance for image generation tasks.

Prerequisites and Vulkan Setup

Before beginning the AI image generation setup, it is essential to have Vulkan properly installed and configured on the system. The installation process for Vulkan can be found in our related guide: Running Large Language Models on Cheap Old RX 580 GPUs with llama.cpp and Vulkan.

This prerequisite ensures that the system has the necessary graphics runtime and compute capabilities required for the Vulkan-based AI image generation framework. The Vulkan API provides a cross-platform solution for leveraging GPU compute resources, making it ideal for running AI workloads on AMD hardware.

Installing stable-diffusion.cpp with Vulkan Support

The core of this setup involves compiling and installing stable-diffusion.cpp with Vulkan support enabled. This specialized version of the stable diffusion framework is designed to utilize Vulkan compute capabilities for image generation tasks.

The installation begins by cloning the repository from GitHub, which includes all necessary submodules and dependencies:

git clone --recursive https://github.com/leejet/stable-diffusion.cpp

After cloning, navigate into the project directory and create a build directory to maintain clean separation between source and compiled files:

cd stable-diffusion.cpp
mkdir build && cd build

The compilation process requires enabling Vulkan support through CMake configuration. This step is crucial for ensuring that the application can utilize the GPU compute capabilities:

cmake .. -DSD_VULKAN=ON

Following the CMake configuration, build the project in Release mode to optimize performance:

cmake --build . --config Release

This compilation process generates the necessary executables and libraries required for running AI image generation tasks with Vulkan acceleration.

Model Preparation and Hardware Considerations

To run AI image generation on the RX 580, users must download appropriate model files in GGUF format. These models are specifically designed for efficient execution on hardware with limited VRAM. The process requires careful consideration of memory constraints, as each instance will operate on a single GPU with no ability to combine VRAM from multiple GPUs.

The 8GB VRAM of the RX 580 limits the size of models that can be fully loaded into memory. Some components of the generation process must be offloaded to the CPU, which affects overall performance but allows for operation within hardware constraints.

Model files typically include diffusion models, VAE components, CLIP encoders, and T5XXL text encoders in safetensors format. These files must be organized in a directory structure that the application can access during execution.

Sample Usage Commands

Once the system is properly configured with stable-diffusion.cpp compiled with Vulkan support, users can begin generating images using various command-line options. The following examples demonstrate different approaches to image generation with varying model configurations:

sd --diffusion-model  SD-Models/flux1-schnell-q4_0.gguf --vae SD-Models/ae.safetensors --clip_l SD-Models/clip_l.safetensors --t5xxl SD-Models/t5xxl_fp16.safetensors  -p "a lovely beagle holding a sign says 'hello'" --cfg-scale 1.0 --sampling-method euler -v --steps 4 --clip-on-cpu

This command demonstrates basic image generation with the flux1-schnell model, using CPU offloading for CLIP processing to accommodate memory limitations.

sd --diffusion-model  SD-Models/flux1-dev-q4_0.gguf --vae SD-Models/ae.safetensors --clip_l SD-Models/clip_l.safetensors --t5xxl SD-Models/t5xxl_fp16.safetensors  -p "a lovely beagle holding a sign says 'hello'" --cfg-scale 1.0 --sampling-method euler -v --steps 4 --clip-on-cpu

This example uses the flux1-dev model, which may offer different quality characteristics compared to the schnell variant.

For users interested in enhanced realism or artistic styles, LoRA (Low-Rank Adaptation) models can be incorporated:

sd --diffusion-model  SD-Models/flux1-dev-q4_0.gguf --vae SD-Models/ae.safetensors --clip_l SD-Models/clip_l.safetensors --t5xxl SD-Models/t5xxl_fp16.safetensors  -p "a lovely beagle holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir SD-Models --clip-on-cpu

This command demonstrates the integration of LoRA models for enhanced image generation quality and style control.

The final example combines both the flux1-schnell model with LoRA support:

sd --diffusion-model  SD-Models/flux1-schnell-q4_0.gguf --vae SD-Models/ae.safetensors --clip_l SD-Models/clip_l.safetensors --t5xxl SD-Models/t5xxl_fp16.safetensors  -p "a lovely beagle holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir SD-Models --clip-on-cpu

These commands illustrate the flexibility of the stable-diffusion.cpp framework in supporting various model configurations and enhancement techniques while working within the constraints of the RX 580’s hardware specifications.

Performance Considerations

The performance of AI image generation on the RX 580 with Vulkan support will vary based on several factors including model size, generation parameters, and system configuration. The 8GB VRAM limitation means that larger models may require additional CPU offloading or reduced resolution settings to function effectively.

You should expect longer generation times compared to systems with more powerful GPUs, but the approach provides a viable solution for those working with older hardware. The Vulkan implementation helps optimize compute operations and can provide better performance than traditional CPU-based approaches while utilizing the GPU’s parallel processing capabilities.

With these steps completed, you can successfully run AI image generation on their RX 580 graphics card using Vulkan compute capabilities. This setup provides an accessible pathway for leveraging existing hardware investments for modern AI applications without requiring expensive upgrades or specialized software stacks like ROCm.

Installing ComfyUI with Python 3.12 on Debian 13 (Trixie) with CUDA

This guide provides instructions for installing and configuring ComfyUI on Debian 13 (Trixie) using Python 3.12. The process encompasses system preparation, Python version management, dependency installation, and configuration for optimal performance with NVIDIA GPU support.

The installation assumes that NVIDIA graphics hardware and CUDA are properly installed and configured on the system. For users who need guidance on setting up CUDA specifically for Debian 13 (Trixie), a related tutorial is available at: Building Llama.cpp with CUDA on Debian 13 (Trixie).

Prerequisites and System Preparation

Before initiating the ComfyUI installation process, it is crucial to ensure that the system has all necessary dependencies installed. This foundational step involves updating the package repository and installing development tools and libraries required for building and running the ComfyUI application effectively.

The initial system preparation begins with updating the package list to access the latest available packages:

sudo apt update

Following this update, a comprehensive set of build tools and libraries must be installed. These dependencies are fundamental for compiling software, managing Python environments, and supporting graphical operations that ComfyUI requires:

sudo apt install -y build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev git gcc bc

In addition to the core development dependencies, several system-level packages are essential for proper functionality. These include utilities for managing Python virtual environments, graphics libraries for rendering, and core system libraries:

sudo apt install wget git python3 python3-venv libgl1 libglib2.0-0

These packages establish the necessary foundation for Python version management, Git operations, and graphical interface support that ComfyUI requires for optimal performance.

Installing Python 3.12 Using pyenv

ComfyUI requires Python 3.12 for full compatibility with its latest features and performance optimizations. Since Debian 13 (Trixie) may not include this specific Python version in its default repositories, we utilize pyenv to manage the installation and execution of the required Python environment.

The installation process begins with downloading and executing the official pyenv installation script from the pyenv repository:

curl https://pyenv.run | bash

This command fetches and executes the installation script, setting up the pyenv environment in the user’s home directory. Following the installation, proper shell configuration is essential to initialize pyenv correctly for each terminal session.

The configuration involves appending specific environment variable exports to the .bashrc file. These settings ensure that pyenv is properly initialized and that the appropriate Python version paths are included in the system’s PATH:

echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
source ~/.bashrc

With the environment properly configured, the specific Python version can be installed using pyenv. The command below installs Python 3.12.12, which is compatible with ComfyUI requirements:

pyenv install 3.12.12

Creating and Configuring the ComfyUI Environment

After establishing the Python environment, the next step involves creating a dedicated directory for ComfyUI and setting up the project structure. This organization ensures proper isolation of dependencies and facilitates easy management of the installation.

The creation of the ComfyUI directory and navigation into it follows these commands:

mkdir ComfyUI
cd ComfyUI

To ensure that the correct Python version is used for this specific project, set the local Python version to 3.12.12 using pyenv:

pyenv local 3.12.12

This command creates a .python-version file in the current directory, which pyenv will automatically use when entering this directory in future sessions.

With the environment properly configured, the next step involves installing the ComfyUI command-line interface tool. This utility simplifies the installation and management of ComfyUI components:

pip install comfy-cli

Following the installation of the CLI tool, it is recommended to install shell completion support for enhanced usability:

comfy --install-completion

The final step in the initial setup process involves installing all necessary ComfyUI dependencies and components:

comfy install

This command downloads and configures all required packages and models, which may take considerable time depending on network speed and system resources.

Configuring CUDA Support

For users with NVIDIA graphics hardware, configuring CUDA support is essential for optimal performance. The installation process checks for the presence of CUDA by verifying the nvcc compiler version.

To determine if CUDA is properly installed, execute the following command:

nvcc --version

If CUDA is correctly installed, the output will display information similar to:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0

If CUDA is detected, install the appropriate PyTorch version with CUDA support using the following command:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

The cu124 suffix corresponds to the CUDA compilation tools release 12.4, as shown in the example output. This ensures that PyTorch is compiled with support for the installed CUDA version, enabling GPU acceleration for ComfyUI operations.

Launching ComfyUI

With all dependencies properly installed and configured, ComfyUI can be launched using the command-line interface. The basic launch command starts the application with default settings:

comfy launch

For users who require remote access to the ComfyUI interface, the application can be configured to listen on all network interfaces and specific ports. This configuration enables access from other machines on the network:

comfy launch -- --listen 0.0.0.0 --port 8080

This command configures ComfyUI to accept connections from any IP address (0.0.0.0) on port 8080, making it accessible across the network while maintaining security through proper firewall configuration.

It is important to always ensure that you are working within the ComfyUI directory before launching the application. This practice guarantees that the correct Python version and dependencies are used, preventing potential conflicts or errors during execution.

With these steps completed, ComfyUI is successfully installed and configured to run with Python 3.12 on Debian 13 (Trixie). The system is now ready for use with NVIDIA graphics hardware and CUDA support, providing users with a powerful and flexible interface for creating complex image generation workflows.

Installing Stable Diffusion WebUI with Python 3.10 on Debian 13 (Trixie)

This guide provides detailed instructions for installing and configuring the Stable Diffusion WebUI on Debian 13 (Trixie), utilizing Python 3.10. The process involves several key steps including system preparation, Python version management, repository cloning, and configuration adjustments to enable network accessibility.

The installation assumes that NVIDIA graphics hardware and CUDA are already properly installed and configured on the system. For users who need guidance on setting up CUDA specifically for Debian 13 (Trixie), a related tutorial is available at: Building Llama.cpp with CUDA on Debian 13 (Trixie).

Prerequisites and System Preparation

Before beginning the installation process, it is essential to ensure that the system has all necessary dependencies installed. This includes development tools and libraries required for building and running the Stable Diffusion WebUI application.

The first step involves updating the package list to ensure access to the latest available packages. This is followed by installing a comprehensive set of build tools and libraries that are fundamental for compiling software and managing Python environments:

sudo apt update
sudo apt install -y build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev git gcc bc

In addition to the core development dependencies, several system-level packages are required for proper functionality. These include utilities for managing Python virtual environments, graphics libraries for rendering, and core system libraries:

sudo apt install wget git python3 python3-venv libgl1 libglib2.0-0

These packages provide the foundation necessary for Python version management, Git operations, and graphical interface support that the Stable Diffusion WebUI requires.

Installing Python 3.10 Using pyenv

The Stable Diffusion WebUI specifically requires Python 3.10, which may not be available in the default repositories for Debian 13 (Trixie). To address this requirement, we utilize pyenv, a powerful tool for managing multiple Python versions on a single system.

The installation of pyenv begins with downloading and executing the official installation script from the pyenv repository:

curl https://pyenv.run | bash

This command fetches the installation script and executes it, setting up the pyenv environment in the user’s home directory. Following the installation, it is necessary to configure the shell environment to properly initialize pyenv each time a new terminal session is started.

The configuration involves appending specific environment variable exports to the .bashrc file. These settings ensure that pyenv is correctly initialized and that the appropriate Python version paths are included in the system’s PATH:

echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
source ~/.bashrc

Once the environment variables are properly configured, the specific Python version can be installed using pyenv. The command below installs Python 3.10.6, which is compatible with the Stable Diffusion WebUI requirements:

pyenv install 3.10.6

Cloning and Configuring the Stable Diffusion WebUI Repository

With the Python environment properly established, the next step involves obtaining the source code for the Stable Diffusion WebUI. This is accomplished by cloning the official repository from GitHub, which contains all necessary files and dependencies for running the web interface.

The cloning process retrieves the complete repository including all branches, commit history, and configuration files:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

After successfully cloning the repository, navigate into the newly created directory. This is where all subsequent configuration and setup operations will take place:

cd stable-diffusion-webui

To ensure that the correct Python version is used for this specific project, set the local Python version to 3.10.6 using pyenv:

pyenv local 3.10.6

This command creates a .python-version file in the current directory, which pyenv will automatically use when entering this directory in future sessions.

Launching the WebUI Application

With all prerequisites met and the environment properly configured, the final step involves starting the Stable Diffusion WebUI application. This is accomplished by executing the webui.sh script, which handles the initialization process including dependency installation and server startup:

webui.sh

The execution of this script may take some time as it downloads required model files and dependencies, initializes the Python environment, and prepares the web server for operation. Users should allow sufficient time for this process to complete fully.

Configuring Network Accessibility

By default, the Stable Diffusion WebUI is configured to only accept connections from the local machine. For users who wish to access the interface from other devices on the network, a configuration change is necessary.

The configuration file webui-user.sh contains various settings that can be adjusted to modify the behavior of the web application. To enable network accessibility, this file must be edited:

nano webui-user.sh

Within this file, locate the line that begins with #export COMMANDLINE_ARGS="". This line is commented out by default and serves as a placeholder for additional command-line arguments. To modify the behavior to accept external connections, change this line to:

export COMMANDLINE_ARGS="--listen"

This configuration change instructs the web application to listen on all available network interfaces rather than restricting access to localhost only. This modification enables remote access to the Stable Diffusion WebUI from other machines within the same network. The interface can be reached at http://<server-ip>:7860

With these comprehensive steps completed, the Stable Diffusion WebUI is successfully installed and configured to run with Python 3.10 on Debian 13 (Trixie). The system is now ready for use with NVIDIA graphics hardware and CUDA support, providing users with a fully functional interface for generating images using stable diffusion models.

Building llama.cpp with CUDA on Debian 13 “Trixie”

If you’ve recently upgraded to Debian 13 or are fresh on a Trixie system, you may be eager to tap the power of your NVIDIA GPU for machine‑learning workloads. This post walks you through every step required to set up the necessary drivers, libraries, and build environment.


Why Enable CUDA in llama.cpp?

The original binaries of llama.cpp run on the CPU, which is perfectly fine for small models but can become a bottleneck with larger weights. By enabling the -DGGML_CUDA=ON flag, the project compiles the CUDA kernels that allow your NVIDIA GPU to perform inference. The result is a dramatic reduction in latency and a higher throughput for text generation tasks.


Prerequisites

  • A Debian 13 machine with an NVIDIA GPU that supports CUDA 11 or later.
  • Sudo access (or root) to install packages and modify system configuration.
  • An active internet connection so the package manager can fetch the necessary files.

Step 1 – Update Kernel Headers

Your system needs the headers that match the running kernel so that the NVIDIA driver can compile its kernel modules.

apt install linux-headers-$(uname -r)

This command pulls the headers for the current kernel release and installs them into the standard package locations.


Step 2 – Add Non‑Free Firmware Repositories

The Debian base repositories do not expose the proprietary firmware and driver packages needed for NVIDIA GPUs. By creating an additional source list file, we allow apt to pull the required non‑free components.

Create the file /etc/apt/sources.list.d/non‑free.sources and paste the following content:

Types: deb deb-src
URIs: http://deb.debian.org/debian/
Suites: trixie
Components: non-free-firmware contrib
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb deb-src
URIs: http://security.debian.org/debian-security/
Suites: trixie-security
Components: non-free-firmware contrib
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb deb-src
URIs: http://deb.debian.org/debian/
Suites: trixie-updates
Components: non-free-firmware contrib
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

After saving the file, refresh the package lists so the new entries become available:

apt update

Step 3 – Install the NVIDIA Driver and CUDA Toolkit

3.1 Bring in the NVIDIA Keyring

The NVIDIA distribution for Debian ships a keyring package that allows your system to verify the authenticity of the driver packages.

wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb

3.2 Install Driver Packages

apt -V install nvidia-driver-cuda nvidia-kernel-dkms

The meta‑package nvidia-driver-cuda pulls the latest driver binaries and the CUDA toolkit for the current kernel. It also installs nvidia-kernel-dkms, which provides a Dynamic Kernel Module Support interface so the driver can be built against any future kernel version.

3.3 Regenerate Initramfs and Update GRUB

After installing the driver modules, you must ensure that the initramfs contains the new driver and that GRUB will boot into the updated kernel configuration.

update-initramfs -u -k all
update-grub

Reboot the machine to let the new driver take effect.

3.4 Install the CUDA Toolkit

With the driver in place, install the toolkit components that provide nvcc, libraries, and headers used by llama.cpp.

apt install nvidia-cuda-toolkit

Step 4 – Install Build Dependencies

The build process for llama.cpp requires several libraries and developer tools. Installing them up front keeps the compile step straightforward.

apt install libtcmalloc-minimal4 libcurl4-openssl-dev glslc cmake make git pkg-config

These packages provide memory allocation utilities, SSL support, the GLSL compiler, CMake, Make, Git, and generic build configuration tools.


Step 5 – Clone and Compile llama.cpp

With the environment prepared, fetch the source code and build it.

cd ~
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. \
  -DGGML_AVX=ON \
  -DGGML_AVX_VNNI=ON \
  -DGGML_AVX2=ON \
  -DGGML_CUDA=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLAMA_CURL=ON
make -j8
echo 'export PATH=$PATH:'$(realpath bin) >> ~/.bashrc

After the build finishes, log out and back in again so the newly added binaries become visible in your shell path.


Step 6 – Keep the Driver in Sync with Kernel Updates

Kernel upgrades are common, and the driver must be rebuilt against each new kernel. The following routine ensures the driver modules stay current.

apt install linux-headers-$(uname -r)
apt install --reinstall nvidia-driver-cuda nvidia-kernel-dkms
apt install nvidia-cuda-toolkit
update-initramfs -u -k all
update-grub

Running this sequence after any kernel upgrade guarantees that the driver continues to load correctly.


Step 7 – Updating the Source Tree

When the upstream llama.cpp project publishes a new release or a bug fix, refresh your local copy and rebuild:

cd ~
cd llama.cpp/

# Clean the working directory
git clean -xdf
mkdir build

# Pull the latest changes and submodules
git pull
git submodule update --recursive

# Rebuild
cd build/
cmake .. \
  -DGGML_AVX=ON \
  -DGGML_AVX_VNNI=ON \
  -DGGML_AVX2=ON \
  -DGGML_CUDA=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLAMA_CURL=ON
make -j8

Proxmox VE GPU Passthrough on an AMD RX 560

This guide walks you through every step required to expose an AMD RX 560 graphics card to a Proxmox Virtual Environment (VE) virtual machine. The same procedure applies to other AMD GPUs such as the RX 570, RX 580, RX 7600, RX 7700, RX 7900 XT, and many others.

1. Pulling the ROM from the GPU

  1. Install the GPU
    Insert the card into any PCI‑e slot (some systems require it not to be the first slot).
  2. Boot Proxmox VE
    Log into the console.
  3. Locate the device
    Run lspci -nnk and look for a line similar to:
   01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] [1002:67ef] (rev e5)
     Subsystem: Gigabyte Technology Co., Ltd Device [1458:230a]
     Kernel driver in use: amdgpu
     Kernel modules: amdgpu
   01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
     Subsystem: Gigabyte Technology Co., Ltd Device [1458:aae0]
     Kernel driver in use: snd_hda_intel
     Kernel modules: snd_hda_intel

The PCI bus address of the VGA controller is 01:00.0. To form the sysfs path, prepend 0000::

   /sys/bus/pci/devices/0000:01:00.0/
  1. Extract the ROM
   cd /sys/bus/pci/devices/0000\:01\:00.0/
   echo 1 > rom
   cat rom > /usr/share/kvm/RX560-4096.rom
   echo 0 > rom

The file /usr/share/kvm/RX560-4096.rom now contains the GPU ROM.


2. Configuring the Proxmox Server for PCIe Passthrough

  1. Load required kernel modules
    Edit /etc/modules and add:
   vfio
   vfio_iommu_type1
   vfio_pci
   vfio_virqfd
  1. Blacklist the native driver
    Edit /etc/modprobe.d/pve-blacklist.conf and add:
   blacklist amdgpu
  1. Create VFIO configuration
    Create /etc/modprobe.d/vfio.conf with:
   options vfio-pci ids=1002:67ff,1002:aae0 disable_vga=1
   softdep amdgpu pre: vfio-pci

The IDs come from the lspci -nnk output: 1002:67ef (VGA) and 1002:aae0 (Audio).

  1. Enable IOMMU in the kernel
    Edit /etc/default/grub and add either intel_iommu=on or amd_iommu=on to the GRUB_CMDLINE_LINUX_DEFAULT line, e.g.:
   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
  1. Apply changes
   update-grub
   update-initramfs -u -k all
  1. Reboot
    Shut down the Proxmox host. If necessary, move the GPU to the first PCI‑e slot before powering on again.

3. Configuring a VM for PCIe Passthrough

  1. Create the VM
    Use the usual VM creation flow in Proxmox, but set the Machine type to q35.
  2. Add the GPU as a Raw Device
    In the VM hardware list:
  • Type: PCI Device
  • Bus address: the same as the GPU (e.g., 0000:01:00)
  • Enable ROM‑Bar, PCI‑Express, and Primary GPU
  1. Edit the VM configuration file
    For a VM with ID 101, edit:
   /etc/pve/nodes/pve/qemu-server/101.conf

Add or modify the hostpci0 line to reference the ROM file:

   hostpci0: 0000:01:00,pcie=1,x-vga=1,romfile=RX560-4096.rom
  1. Start the VM
    The guest will now use the passthrough GPU.

Following these steps will give you a fully functional AMD RX 560 passthrough in Proxmox VE, and the same methodology works for other AMD GPUs such as the RX 570, RX 580, RX 7600, RX 7700, RX 7900 XT, etc.


Sources

R2-D2 Revival: Hacking a Vintage Star Wars Light into an IoT Device

A 10-Year-Old R2-D2 Light

A decade ago, my wife gifted me what looked like a charming 3D Light FX Star Wars R2-D2 3D-Deco LED Wall Light (yes, that was the name). I never mounted it to a wall, I just had it resting on a shelf waiting for a good moment. When I finally decided to mount my R2-D2 light in 2025, I was disappointed by its basic capabilities. The internal inspection revealed a simple circuit board containing only an infrared sensor, two basic LEDs, and a manual switch. This minimalistic approach, while charming, left much to be desired.

Full Hardware Overhaul

This wasn’t going to be a simple upgrade. The original board was not worth keeping, so I decided to gut it completely. Here’s what I installed:

LED System: Three WS2812B addressable RGB LEDs controlled by an ESP8266. These provide the classic R2-D2 with RGB lighting effects with full color control.

Audio System: A Adafruit Audio FX Sound Board with 2x2W amplifier and 16MB storage for WAV/OGG audio files. This handles the authentic R2-D2 sounds with crystal clear audio quality.

Smart Control: The ESP8266 runs custom Arduino firmware that connects to WiFi and listens for MQTT commands. This allows integration with my smart home ecosystem.

Implementation

The Arduino code running on the ESP8266 handles multiple tasks simultaneously:

  • WiFi connection management
  • MQTT protocol for external control
  • WS2812B LED sequencing and color effects
  • Audio trigger management through the Adafruit board

The Arduino code manages LED color sequences, intensity variations, and audio playback synchronization. Each sound trigger is carefully timed to match the visual effects, creating an immersive experience that captures the essence of the beloved astromech droid.

The true magic lies in the integration with my existing smart home infrastructure. Other systems can now trigger specific R2-D2 responses based on:

  • Motion detection from security cameras
  • Scheduled calendar events
  • Random sound and lighting effect triggers to simulate R2-D2’s intermittent consciousness, due to damage.

What started as a simple wall light is now a sophisticated IoT device. The transformation took about a weekend of work, but the end result is worth every minute.

Hardware Summary:

  • ESP8266 (NodeMCU clone)
  • 3x WS2812B LEDs
  • Adafruit Audio FX Sound Board + 2x2W Amp with 16MB of storage
  • Standard USB power supply

Running Large Language Models on Cheap Old RX 580 GPUs with llama.cpp and Vulkan

LLMs and GPUs

In recent years, the landscape of artificial intelligence has shifted dramatically with the rise of large language models (LLMs). These models are incredibly powerful but also resource-intensive — typically requiring high-end GPUs like NVIDIA’s RTX 4090s or AMD’s latest Radeon Instinct series to run effectively.

But what if you don’t have access to such hardware? What if your budget is limited, or you already own older GPUs like the AMD Radeon RX 580? Surprisingly, there’s still a way to get meaningful performance out of these aging cards — especially with the right software stack and a bit of ingenuity.

This guide walks through how to leverage the AMD Radeon RX 580 — an aging yet capable GPU — to run large language models using llama.cpp via Vulkan API support, even though ROCm (the newer AMD compute framework) no longer supports it.


Hardware Overview: The Radeon RX 580

The Radeon RX 580 is part of AMD’s Polaris generation, released in 2016. While not cutting-edge today, it still offers:

  • 8 GB GDDR5 memory (sufficient for many smaller models)
  • 2,304 stream processors
  • 14nm process
  • Good PCIe 3.0 bandwidth

Although it’s no longer officially supported in newer versions of ROCm, the RX 580 retains full compatibility with Vulkan drivers, making it ideal for running modern AI inference engines.


Software Stack: llama.cpp + Vulkan

llama.cpp is a lightweight C++ implementation of the LLaMA architecture that allows you to run LLMs directly on your CPU or GPU.

It supports multiple backends including:

  • CPU (default)
  • CUDA (NVIDIA)
  • Metal (Apple Silicon)
  • Vulkan (AMD & Intel GPUs)

By enabling Vulkan support during compilation, we can tap into the RX 580’s full potential.


Installing Vulkan Drivers on Debian 12

Before we build llama.cpp, we need to ensure the system has proper Vulkan support:

sudo apt install vulkan-tools libtcmalloc-minimal4 libcurl4-openssl-dev glslc cmake make git pkg-config libvulkan-dev

These packages provide:

  • vulkan-tools: Tools for testing Vulkan applications
  • libtcmalloc-minimal4: Memory allocator for performance
  • libcurl4-openssl-dev: For downloading models via HTTP
  • glslc: GLSL shader compiler (needed for Vulkan)
  • cmake, make, git, pkg-config: Build dependencies
  • libvulkan-dev: Required for Vulkan development

Once installed, you can verify Vulkan support:

vulkaninfo | grep -i RX

You should see your GPU listed in the output.


Installing llama.cpp with Vulkan Support

Let’s walk through the full installation process.

Step 1: Clone the Repository

cd ~
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build

Step 2: Configure CMake for Vulkan

Build llama.cpp with Vulkan enabled:

cmake .. \
  -DGGML_AVX=ON \
  -DGGML_AVX_VNNI=ON \
  -DGGML_AVX2=ON \
  -DGGML_VULKAN=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLAMA_CURL=ON

This configuration enables:

  • AVX instructions for faster CPU ops
  • AVX2 / VNNI optimizations (for better performance on supported CPUs)
  • Vulkan backend support for AMD GPUs
  • Curl support for downloading GGUF models from Hugging Face

Step 3: Compile and Install

make -j8
echo 'export PATH=$PATH:'$(realpath bin) >> ~/.bashrc

Log out and back in to update your environment variables so llama-cli and llama-server are available in your terminal.


Running Models with llama-cli and llama-server

Now that everything is built, let’s test it out with some sample commands.

Using llama-cli

Run a model using the CLI interface:

llama-cli -m deepseek-r1:8B --device Vulkan0 -ngl 99

This command:

  • Loads a model named deepseek-r1:8B
  • Uses device Vulkan0 (first Vulkan-compatible GPU detected)
  • Sets -ngl 99 to offload all layers to GPU

You can optionally specify the full model path or use Hugging Face URLs (with the -hf flag if supported).

Using llama-server

To expose your model via an API endpoint:

llama-server --host 0.0.0.0 -hf unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF:Q4_K_M --device Vulkan0 -ngl 99

This starts a server listening on all interfaces (0.0.0.0) and uses:

  • unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF:Q4_K_M as the model (quantized to 4-bit)
  • Device Vulkan0
  • All layers (-ngl 99) loaded into GPU memory

Multi-GPU Setup

If you have more than one RX 580 (or other Vulkan-compatible GPUs), you can split the model across multiple devices:

llama-server --host 0.0.0.0 -hf unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF:Q8_K_XL --device Vulkan0,Vulkan1

And for even larger models, like Qwen3-Coder-30B-A3B-Instruct-GGUF:

llama-server \
  --host 0.0.0.0 \
  -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q8_K_XL \
  -ngl 99 \
  --threads -1 \
  --ctx-size 32684 \
  --temp 0.7 \
  --min-p 0.0 \
  --top-p 0.80 \
  --top-k 20 \
  --repeat-penalty 1.05 \
  --device Vulkan0,Vulkan1,Vulkan2,Vulkan3,Vulkan4

This will use up to five GPUs, distributing load across them and enabling inference of 30B parameter models.


Updating llama.cpp

When new updates are released, just run:

cd ~/llama.cpp/
git clean -xdf
git pull
git submodule update --recursive
cd build/
cmake .. \
  -DGGML_AVX=ON \
  -DGGML_AVX_VNNI=ON \
  -DGGML_AVX2=ON \
  -DGGML_VULKAN=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLAMA_CURL=ON
make -j8

Performance Notes: RX 580 Limitations and Workarounds

While the RX 580 isn’t the fastest GPU on the market, it can still run impressive models when properly configured. Here are some key takeaways:

  • Small to medium-sized models (e.g., 7B–13B parameters) run smoothly with minimal latency.
  • Larger models (like 30B) require:
  • Quantized weights (Q4, Q8_K_XL)
  • Multi-GPU setup
  • Longer wait times for responses
  • Threading optimization (--threads -1)
  • Higher context sizes (--ctx-size)

Despite limitations, a cluster of 5 RX 580s can handle a 30B parameter model, which is quite remarkable for such older hardware.


Final Thoughts

The RX 580 may be old, but it still holds value in the world of AI inference. Thanks to the llama.cpp project’s Vulkan backend support, it’s possible to run large language models on low-cost hardware that would otherwise be unusable for AI workloads.

With careful configuration and the right software stack, you can build a capable local LLM inference rig using nothing more than a few secondhand GPUs. Whether you’re training, experimenting, or just curious about AI, this setup provides a great foundation to get started.

If you’re looking to repurpose an old rig or build a cost-effective edge AI box, the RX 580 + Vulkan + llama.cpp combination is worth exploring — and you might be surprised at what it can do.


Have questions or need help setting up your own RX 580-based LLM cluster? Leave a comment below or share your experience in the comments!

Running Supabase in a Proxmox Docker VM: A Step-by-Step Guide

Supabase and Proxmox

In today’s rapidly evolving tech landscape, developers often need flexible and scalable solutions for hosting applications — especially when looking to self-host services like Supabase. One powerful approach is to deploy Supabase using a Proxmox VE Docker VM. This setup not only offers flexibility and isolation but also allows for easy updates and maintenance.

Why Use Proxmox VE?

Proxmox VE stands out as a free and open-source virtualization platform that supports both KVM and LXC containers. What makes it particularly appealing for developers is its ability to manage virtual machines (VMs) with full OS support, unlike some containerized alternatives.

Furthermore, Proxmox allows for Docker in a VM, which means you get the best of both worlds: the isolation and management of a VM with the lightweight efficiency of Docker containers. Since Docker in LXC containers isn’t as straightforward to maintain, deploying Docker within a Proxmox VM is the recommended way to go.

Deploying a Docker VM in Proxmox

To simplify the process, the community-scripts project provides a convenient script to create a Docker-ready VM. You can find detailed documentation at https://community-scripts.github.io/ProxmoxVE/scripts?id=docker-vm.

Here’s how to get started:

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/vm/docker-vm.sh)"

This script will set up a Docker-enabled VM in Proxmox, complete with the necessary tools and configurations. Once deployed, you can access the VM via SSH or console, using the default login:

  • Username: root
  • Password: docker

Of course, you should change the default password immediately to ensure security.

🔐 Security Tip: Always update default credentials right after deployment to prevent unauthorized access.

Installing Supabase in Your Docker VM

Now that you have your Docker VM running, it’s time to install Supabase — an open-source Firebase alternative that provides real-time databases, authentication, and more. For detailed installation instructions, refer to the official Supabase documentation: https://supabase.com/docs/guides/self-hosting/docker

Step-by-Step Installation

First, you need to clone the Supabase repository and set up your project directory:

git clone --depth 1 https://github.com/supabase/supabase
mkdir supabase-project

Your folder structure should now look like this:

.
├── supabase
└── supabase-project

Next, copy the necessary Docker Compose files and environment variables:

cp -rf supabase/docker/* supabase-project
cp supabase/docker/.env.example supabase-project/.env

Now, switch to your project directory and pull the latest images:

cd supabase-project
docker compose pull

Then, start all the services in detached mode:

docker compose up -d

You can verify that all services are running correctly with:

docker compose ps

If any service is not running, try starting it manually:

docker compose start <service-name>

Accessing Supabase Studio

Once everything is up and running, you can access Supabase Studio — the admin UI for managing your Supabase project — through the API gateway on port 8000.

For example, if your VM’s IP address is 192.168.1.100, you would visit:

http://192.168.1.100:8000

By default, the login credentials are:

  • Username: supabase
  • Password: this_password_is_insecure_and_should_be_updated

As mentioned earlier, it’s critical to update the default password immediately for security reasons.

Why Choose This Approach?

Using Proxmox VE to host Supabase offers several advantages:

  • Full OS support: Unlike containers, VMs allow for full control over the guest OS.
  • Easy updates: VMs can be upgraded independently, making them more manageable than LXC containers.
  • Isolation: Each VM functions as a separate unit, improving stability and security.
  • Scalability: You can easily scale resources or replicate your setup for development and production.

Conclusion

Deploying Supabase in a Proxmox VE Docker VM is an efficient and secure way to self-host your Supabase infrastructure. It leverages the strengths of both virtualization and containerization, offering scalability and maintenance benefits.

Whether you’re a developer looking to host a custom backend or a team managing multiple projects, this setup provides an excellent foundation.


Reolink RLC-520A Camera Review: Cutting-Edge POE Security Surveillance

The Reolink RLC-520A is a feature-packed POE (Power over Ethernet) security camera designed to keep your property safe and secure. In this review, we’ll explore its specifications and compare it to two similar POE security cameras to help you make an informed decision.

Specifications (4.5/5):

  • Resolution: The Reolink RLC-520A boasts an impressive 5-megapixel Super HD resolution, capturing sharp and clear video footage. It’s a notable upgrade from standard 1080p cameras, offering enhanced detail.
  • Lens and Field of View: It features a 4mm lens with a wide 80-degree field of view, ideal for covering large areas with a single camera. The lens is fixed, so there’s no optical zoom, but the high resolution compensates for this limitation.
  • Night Vision: Equipped with 18 infrared LEDs, the RLC-520A provides outstanding night vision with a range of up to 100 feet (30 meters), ensuring around-the-clock surveillance.
  • POE Connectivity: The Power over Ethernet feature simplifies installation by allowing both power and data transmission through a single cable, reducing clutter and the need for additional power sources.
  • Weather Resistance: This camera is rated IP66 weatherproof, making it suitable for outdoor use, capable of withstanding rain, snow, and extreme temperatures.
  • Motion Detection and Alerts: The RLC-520A supports customizable motion detection zones and sends alerts to your smartphone or email when motion is detected. This helps reduce false alarms.
  • Audio: It features a built-in microphone for capturing audio, which can be a valuable addition to your surveillance setup.

Performance (4/5):

The Reolink RLC-520A performs admirably, offering high-resolution video quality, reliable night vision, and robust weather resistance. Its POE capability simplifies installation and ensures a stable connection. The motion detection and alert system is effective, but users may find it a bit sensitive at times, leading to more notifications than desired.

Comparison to Similar POE Security Cameras:

  1. Reolink RLC-520A vs. Amcrest IP8M-T2499EW:
    • The Amcrest IP8M-T2499EW is a strong contender with similar specifications, including 5-megapixel resolution, POE support, and excellent night vision. However, it offers a slightly wider 97-degree field of view, which can be advantageous for certain setups. Both cameras are weatherproof and suitable for outdoor use, making the choice between them largely dependent on your field of view preferences.
  2. Reolink RLC-520A vs. Hikvision DS-2CD2143G0-I:
    • The Hikvision DS-2CD2143G0-I is another 4MP POE camera with good image quality and night vision capabilities. It offers a slightly lower resolution than the RLC-520A but makes up for it with a wider 98-degree field of view. Hikvision is known for its reliability and durability, making it a worthy competitor. The choice between these two cameras may come down to your specific requirements and brand preferences.

Conclusion: The Reolink RLC-520A is a dependable POE security camera with impressive specifications, offering high-resolution video, excellent night vision, and weather-resistant construction. When compared to similar POE cameras, it holds its own, but the choice between them depends on your specific needs, such as field of view requirements and brand preferences. Overall, the RLC-520A is a solid choice for those seeking advanced surveillance capabilities.

Conway AP-1512HH Air Purifier Review: A Breath of Fresh Air

The Conway AP-1512HH Air Purifier is a popular choice among homeowners seeking clean and fresh indoor air. In this review, we’ll dive into its specifications and compare it to two similar air purifiers to help you make an informed decision.

Specifications (4/5):

  • CADR (Clean Air Delivery Rate): The Conway AP-1512HH boasts a CADR of 246 for dust, 240 for pollen, and 233 for smoke. This means it can effectively filter out particles of varying sizes, making it suitable for rooms up to 360 square feet.
  • Filtration System: Equipped with a 4-stage filtration process, including a pre-filter, a True HEPA filter, an optional charcoal filter, and a UV-C light. This comprehensive filtration system tackles dust, allergens, pet dander, smoke, and even common household odors.
  • Fan Speeds: The purifier offers four fan speeds, allowing you to adjust the filtration intensity to your liking. The lower speeds are quieter and energy-efficient, while the highest setting provides rapid purification.
  • Noise Level: The Conway AP-1512HH is relatively quiet, especially on lower fan speeds, with a noise range of 29-53 dB. This makes it suitable for use in bedrooms or quiet spaces.
  • Dimensions: Compact and space-saving, measuring 16.8 x 9.1 x 18.3 inches, it can easily fit into most rooms without being obtrusive.

Performance (4/5): The Conway AP-1512HH excels in its primary mission: improving indoor air quality. Its True HEPA filter efficiently captures particles as small as 0.3 microns, effectively reducing allergens and contaminants. The UV-C light adds an extra layer of protection against germs and bacteria. It’s especially suitable for allergy sufferers and pet owners.

Comparison to Similar Air Purifiers:

  1. Coway AP-1512HH vs. Levoit LV-H132:
    • The Levoit LV-H132 is a compact and budget-friendly air purifier. It has a smaller coverage area (129 square feet), making it ideal for small rooms or offices. However, it lacks the comprehensive 4-stage filtration system of the Coway AP-1512HH and doesn’t have a UV-C light. If you’re on a budget and have a smaller space, the Levoit is a decent choice, but the Coway offers more comprehensive filtration and coverage.
  2. Coway AP-1512HH vs. Honeywell HPA300:
    • The Honeywell HPA300 is a powerful air purifier with a CADR of 300 for dust, pollen, and smoke, making it suitable for larger rooms (up to 465 square feet). It features a True HEPA filter and a turbo setting for quick purification. However, it is bulkier and noisier compared to the Coway AP-1512HH. If you need a purifier for a larger space and don’t mind the size and noise, the Honeywell is a strong contender, but the Coway offers better noise levels and a more compact design.

Conclusion: The Conway AP-1512HH Air Purifier impresses with its efficient 4-stage filtration system and a balanced set of features. It’s a solid choice for medium-sized rooms, offering quiet operation and good air quality improvement. When compared to similar air purifiers, it competes favorably in terms of performance and features. However, your choice ultimately depends on your specific room size and filtration needs.