Storage Optimization for High-Performance AI Servers in Proxmox
Don't let a few missed check marks starve your GPU. A deep dive into the storage configurations that maximize AI training performance on Proxmox.

The Scenario: The "Adequate" Server
Picture this: You are a DevOps engineer who has just commissioned a beast of a server. We’re talking about a modern Intel Xeon or Core processor, fast DDR5 ECC RAM, enterprise-grade NVMe storage, and a high-end GPU like an RTX 5090.
On paper, this machine is a Ferrari.
You install Proxmox, spin up a VM for AI testing, and launch a PyTorch training run. It works. The epochs are cycling. It’s... fine.
But "fine" isn't what you paid for. You open nvtop and notice the GPU utilization isn't pinned at a solid 100%. Instead, it fluctuates—dipping to 85%, then 92%, then back up. Every dip represents milliseconds where your expensive GPU is waiting for data to be read from the disk and pre-processed by the CPU.
Even if your server is performing adequately, running with default storage settings means you are likely leaving free performance on the table. Before you go chasing complex kernel tweaks or blaming the Python code, you need to ensure your storage foundation is solid.
In Proxmox VE, the defaults are designed for maximum compatibility, not maximum speed. This guide breaks down the architecture of VirtIO SCSI Single and provides the exact configurations you need to ensure your storage layer isn't the weak link.
TL;DR: The Optimization Cheat Sheet
If you are just here for the settings, apply these to your VM hardware config immediately to stop the I/O bottleneck.
For NVMe / SSDs (Fast Storage):
Controller: VirtIO SCSI Single (Essential for parallelism)
IO Thread: Enabled (Prevents VM stutter)
Discard: Enabled (Required for TRIM/Health)
SSD Emulation: Enabled
For HDDs (Spinning Rust):
Controller: VirtIO SCSI Single
IO Thread: Enabled
Discard: Disabled (Prevents fragmentation)
SSD Emulation: Disabled
Why this matters for AI: Without IO Thread and VirtIO SCSI Single, disk I/O competes with the VM's own work for CPU time, so data loading falls behind and your GPU sits idle (starvation).
1. The Controller Architecture: Standard vs. Single
The most fundamental choice you make when creating a VM is the SCSI Controller type. This defines how the Guest OS (the VM) talks to the storage subsystem.
VirtIO SCSI (Standard)
Architecture: Shared. One single virtual controller handles every disk attached to the VM.
The Bottleneck: Even if you enable "IO Thread," a single thread manages the queue for all drives. If Drive A is under heavy load, Drive B must wait in line.
VirtIO SCSI Single (Recommended)
Architecture: Dedicated. Proxmox creates a separate, independent virtual PCI controller for every disk attached to the VM.
The Advantage: This unlocks true parallelism.
IO Thread Optimized: Because every disk has its own controller, enabling "IO Thread" gives each disk its own dedicated worker thread on the host CPU. Heavy database writes on one drive will not stutter the OS on another.
Verdict: Always use VirtIO SCSI Single for modern deployments.
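To check which controller type a VM actually uses, you can read its configuration. The sketch below assumes the usual `key: value` layout of a Proxmox VM config file (typically found under /etc/pve/qemu-server/). The function names and the sample config are illustrative, not part of any Proxmox API.

```python
from typing import Optional


def scsi_controller_type(config_text: str) -> Optional[str]:
    """Return the value of the `scsihw` key from VM config text, or None if unset."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # Snapshot sections come after the live config; stop before them.
            break
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() == "scsihw":
            return value.strip()
    return None


def uses_dedicated_controllers(config_text: str) -> bool:
    """True when the VM is set to VirtIO SCSI Single (one controller per disk)."""
    return scsi_controller_type(config_text) == "virtio-scsi-single"


# Hypothetical sample config used only for illustration.
SAMPLE_CONFIG = """\
cores: 8
memory: 32768
scsihw: virtio-scsi-single
scsi0: local-lvm:vm-100-disk-0,discard=on,iothread=1,size=64G,ssd=1
"""

print(uses_dedicated_controllers(SAMPLE_CONFIG))  # → True
```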
2. The "Big Four" Settings Explained
To optimize performance, you must understand four key checkboxes in the Hard Disk configuration.
IO Thread: Offloads disk processing from QEMU's main event loop to a dedicated thread. This prevents "interface stutter" during heavy file transfers.
Discard: Enables TRIM support. It allows the Guest OS to tell the host which blocks are deleted.
SSD Emulation: A flag that tells the Guest OS, "I am a solid-state drive." This changes how the Guest OS handles maintenance (TRIM vs. Defrag).
Cache: Usually set to Write Back for the best balance of speed and safety (assuming a UPS is present).
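The four settings show up as comma-separated options on each disk line of the VM config. As a rough sketch, the helper below splits such a line and reports the state of each setting. It assumes booleans are written as `1` and Discard as `on`, which is how these options are commonly stored. The function names and the sample value are made up for illustration.

```python
def parse_disk_options(disk_value: str) -> dict:
    """Split 'volume,key=value,...' into the volume name and an options dict."""
    volume, *pairs = disk_value.split(",")
    options = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if sep:
            options[key.strip()] = value.strip()
    return {"volume": volume.strip(), "options": options}


def big_four(disk_value: str) -> dict:
    """Report IO Thread, Discard, SSD Emulation and Cache for one disk line."""
    opts = parse_disk_options(disk_value)["options"]
    return {
        "iothread": opts.get("iothread") == "1",  # assumption: stored as 1/0
        "discard": opts.get("discard") == "on",   # assumption: stored as on/ignore
        "ssd": opts.get("ssd") == "1",
        "cache": opts.get("cache", "(not set)"),
    }


# Hypothetical disk line for illustration.
line = "local-lvm:vm-100-disk-0,cache=writeback,discard=on,iothread=1,size=64G,ssd=1"
print(big_four(line))
# → {'iothread': True, 'discard': True, 'ssd': True, 'cache': 'writeback'}
```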
Visualizing the IO Thread Advantage
[Figure: VM performance under heavy disk load, with and without IO Thread enabled.]

3. Scenario A: The Modern Standard (SSD / NVMe)
If your underlying storage is Flash (SSD or NVMe), your goal is to minimize latency and maintain drive health via TRIM.
The "Gold Standard" Configuration
| Setting | Value | Reasoning |
| Controller | VirtIO SCSI Single | Maximizes parallel throughput. |
| IO Thread | Enabled | Prevents high I/O from freezing the VM. |
| Discard | Enabled | Critical. Passes TRIM commands to the physical SSD to prevent write amplification. |
| SSD Emulation | Enabled | Critical. Forces the OS (especially Windows) to recognize the drive as flash, enabling TRIM and disabling Defrag. |
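If you prefer the command line to the GUI, the same configuration can be applied with `qm set`. The sketch below only builds the command so you can review it before running it on the host. The VM ID, volume name and disk slot are placeholders, and the helper name is hypothetical.

```python
import shlex


def ssd_profile_command(vmid: int, volume: str, slot: str = "scsi0") -> list:
    """Build a `qm set` command applying the flash-storage settings to one disk."""
    disk_spec = f"{volume},iothread=1,discard=on,ssd=1"
    return [
        "qm", "set", str(vmid),
        "--scsihw", "virtio-scsi-single",  # one controller per disk
        f"--{slot}", disk_spec,            # IO Thread, Discard, SSD Emulation
    ]


# Placeholder VM ID and volume; substitute your own before running on the host.
cmd = ssd_profile_command(100, "local-lvm:vm-100-disk-0")
print(shlex.join(cmd))
# → qm set 100 --scsihw virtio-scsi-single --scsi0 local-lvm:vm-100-disk-0,iothread=1,discard=on,ssd=1
```

Returning the command as a list keeps it safe to hand to `subprocess.run` without shell quoting issues.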
4. Scenario B: The "Spinning Rust" Config (HDD)
Mechanical drives function differently. They excel at sequential reads but struggle with random access (seeking). Applying SSD optimizations here will actively degrade performance.
The "Thick Provisioning" Configuration
| Setting | Value | Reasoning |
| Controller | VirtIO SCSI Single | Still the best choice for thread management. |
| IO Thread | Enabled | Crucial. Mechanical seek times are slow; offloading this prevents the VM from locking up while waiting for the disk head. |
| Discard | Disabled | Critical. "Hole punching" a mechanical drive creates massive fragmentation, forcing the drive head to seek constantly. |
| SSD Emulation | Disabled | Critical. You want the OS to see a rotational drive so it runs background defragmentation to keep files contiguous. |
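If you are not sure which scenario a host disk falls into, Linux exposes a rotational flag in sysfs. The sketch below reads it and maps the result to the flag values from the two tables. The helper names are illustrative, and the flag can be misleading for disks hidden behind a hardware RAID controller, so treat the result as a hint.

```python
from pathlib import Path


def is_rotational(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Read <sysfs_root>/<device>/queue/rotational; '1' means a spinning disk."""
    flag = Path(sysfs_root, device, "queue", "rotational").read_text().strip()
    return flag == "1"


def disk_flags(rotational: bool) -> dict:
    """Map the drive type to the disk option values recommended in this guide."""
    if rotational:
        # Scenario B: HDD, so no TRIM and no SSD emulation.
        return {"iothread": 1, "discard": "ignore", "ssd": 0}
    # Scenario A: SSD / NVMe.
    return {"iothread": 1, "discard": "on", "ssd": 1}


# On a Proxmox host you might call: disk_flags(is_rotational("sda"))
print(disk_flags(True))
print(disk_flags(False))
```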
5. Advanced: Physical Disk Passthrough
When passing physical hardware to a VM, the rules change slightly depending on how you pass it.
Method 1: Controller Passthrough (PCIe)
You pass an entire HBA/SAS card to the VM.
Impact: Proxmox settings are irrelevant. The VM sees the raw hardware directly.
Note: This uses VFIO to isolate the hardware. The host OS (Proxmox) completely loses visibility of the controller and any attached drives; they vanish from the host to appear exclusively inside the VM.
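One way to confirm that a controller has been handed over is to check which kernel driver is bound to it on the host. A minimal sketch, assuming the standard sysfs layout for PCI devices. The PCI address is a placeholder and the function names are hypothetical. Depending on your setup, the device may only be bound to `vfio-pci` while the VM is running.

```python
import os
from typing import Optional


def bound_driver(pci_address: str,
                 sysfs_root: str = "/sys/bus/pci/devices") -> Optional[str]:
    """Return the name of the kernel driver bound to a PCI device, or None."""
    link = os.path.join(sysfs_root, pci_address, "driver")
    if not os.path.islink(link):
        return None  # device missing or no driver bound
    return os.path.basename(os.readlink(link))


def is_isolated_for_passthrough(pci_address: str,
                                sysfs_root: str = "/sys/bus/pci/devices") -> bool:
    """True when the device is currently claimed by vfio-pci."""
    return bound_driver(pci_address, sysfs_root) == "vfio-pci"


# Placeholder address; find the real one with `lspci` on the host.
print(is_isolated_for_passthrough("0000:ff:1f.7"))
```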
Method 2: Disk Passthrough (qm set)
You map a specific physical block device to a VM.
For SSDs: Enable Discard and SSD Emulation. Without these, the VM cannot send TRIM commands to the physical SSD, leading to degraded health over time.
For HDDs: Disable both. Mechanical drives generally do not support TRIM, and sending these commands can cause errors or "stalls" on certain controllers.
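As a rough illustration of Method 2, the helper below builds the `qm set` command for mapping a physical disk. It adds the Discard and SSD Emulation options only for flash drives. It assumes you reference the disk by its stable /dev/disk/by-id/ path, which is common practice. The VM ID, slot and disk ID shown are placeholders.

```python
def passthrough_command(vmid: int, by_id_path: str, is_ssd: bool,
                        slot: str = "scsi1") -> list:
    """Build a `qm set` command that maps a physical block device into a VM."""
    if not by_id_path.startswith("/dev/disk/by-id/"):
        # Names like /dev/sdb can change between boots; by-id paths do not.
        raise ValueError("expected a /dev/disk/by-id/ path")
    spec = by_id_path
    if is_ssd:
        spec += ",discard=on,ssd=1"  # let the guest send TRIM to the SSD
    return ["qm", "set", str(vmid), f"--{slot}", spec]


# Placeholder disk ID; list real ones with `ls -l /dev/disk/by-id/`.
print(" ".join(passthrough_command(100, "/dev/disk/by-id/ata-EXAMPLE_SERIAL", is_ssd=True)))
# → qm set 100 --scsi1 /dev/disk/by-id/ata-EXAMPLE_SERIAL,discard=on,ssd=1
```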
6. The High Stakes: AI, Machine Learning, and "GPU Starvation"
While these settings improve the user experience for general productivity VMs, they are mission-critical for AI workloads.
If a Windows Desktop VM has a storage hiccup, the user might notice a 500ms delay when opening a folder. It is annoying, but not catastrophic. In AI training, however, storage latency translates directly to wasted money and time.
The "Feeding the Beast" Problem
Deep Learning pipelines (PyTorch, TensorFlow) function like a high-speed assembly line. This pipeline must move at the speed of the GPU. If the Storage or CPU falters, the GPU stops.
Storage: Fetches massive datasets.
CPU: Pre-processes data (decompression, augmentation).
GPU: Crunches the matrices.
How Bad Storage Configs Cause GPU Starvation
Remember those fluctuating graphs in nvtop mentioned in the introduction? A common cause is the CPU being pulled away from data preparation to service storage I/O.
If you fail to enable IO Threads or use the wrong controller settings:
Interrupt Storms: Every disk read triggers a CPU interrupt.
Context Switching: The CPU cores—which should be busy preparing data for the GPU—are forced to pause and handle storage I/O logic.
The Stall: The CPU falls behind. The GPU finishes its batch, turns around to ask for more data, and finds the "CPU Chef" is still busy reading from the disk.
The Result: Your GPU sits at 0% utilization for milliseconds at a time, thousands of times per hour. Over a week-long training run, this adds up: at an average of 90% utilization, for example, roughly 17 of the run's 168 hours are spent with the GPU waiting.
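To put a number on the stall, you can time how long each training step waits for its next batch. The sketch below is framework-agnostic and uses a simulated loader. The function name and the sleep-based stand-ins are illustrative only. In a real PyTorch loop you would likely need to synchronize the GPU inside the step (for example with `torch.cuda.synchronize()`) for the compute time to be meaningful.

```python
import time


def measure_data_wait(batches, train_step, clock=time.perf_counter) -> float:
    """Return the fraction of loop time spent waiting for the next batch."""
    wait = 0.0
    compute = 0.0
    iterator = iter(batches)
    while True:
        t0 = clock()
        try:
            batch = next(iterator)  # blocks while storage/CPU prepare data
        except StopIteration:
            break
        t1 = clock()
        train_step(batch)           # stand-in for the forward/backward pass
        t2 = clock()
        wait += t1 - t0
        compute += t2 - t1
    total = wait + compute
    return wait / total if total > 0 else 0.0


def slow_loader(n: int, delay: float):
    """Simulated data loader that takes `delay` seconds to produce each batch."""
    for i in range(n):
        time.sleep(delay)
        yield i


fraction = measure_data_wait(slow_loader(5, 0.01), lambda batch: time.sleep(0.01))
print(f"Time spent waiting for data: {fraction:.0%}")
```

A waiting fraction that drops after changing the storage settings is a more direct signal than watching the utilization graph.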
The AI Recommendation
For AI/ML rigs, "good enough" isn't enough.
Minimum: Use VirtIO SCSI Single + IO Thread to keep CPU cores free for data loading.
Optimal: Use PCIe Controller Passthrough for your NVMe drives. By bypassing the virtual storage layer entirely, you remove its CPU overhead from the data path, so virtual disk handling no longer competes with data loading for CPU time.
Summary Checklist
| Underlying Storage | Controller | IO Thread | Discard | SSD Emulation | Provisioning |
| SSD / NVMe | VirtIO SCSI Single | ON | ON | ON | Thin (Fine) |
| HDD (Mechanical) | VirtIO SCSI Single | ON | OFF | OFF | Thick (Recommended) |
| AI Workstation | Passthrough (Best) | N/A | N/A | N/A | N/A |
By aligning your VM settings with the physics of your underlying hardware, you ensure that your Proxmox environment—whether a humble file server or a cutting-edge AI rig—remains stable, responsive, and efficient.