External RTX 4070 Super on Unraid via OCuLink

I’ve been wanting to run larger AI models locally on my Unraid server — but I’d run out of free PCIe slots, and I didn’t want to rebuild my entire setup. Full-length GPUs also don’t fit well in most server cases. So I decided to try something different:

Run a full-size GPU in an external enclosure and connect it to my Unraid server using an OCuLink PCIe extension.

And you know what? It worked. Really well. No BIOS hacks, no PCIe bifurcation, no soldering — just hardware and Docker containers.

Here’s how I added an external NVIDIA RTX 4070 SUPER to Unraid 7.2 and got it working with Ollama + Open-WebUI to run local AI models like LLaMA 3, Phi-3, and more — all accelerated by the GPU.

🧱 Hardware + Software Setup

Here’s everything that worked in my build:

🖥️ Server OS

  • Unraid 7.2

🎮 External GPU

  • NVIDIA RTX 4070 SUPER 12GB

📦 External GPU Enclosure

  • Aegeus PCIe 4.0 x4 to PCIe x16 eGPU enclosure https://www.amazon.com/dp/B0F9FBN5P5
    • Fits full-size GPUs
    • Includes cooling fans
    • Requires a standard PSU (ATX or SFX)

🔌 PCIe Host Adapter

  • PCIe 4.0 x4 to external OCuLink port https://a.co/d/5Wp3mEz
    • No PCIe bifurcation needed
    • Just plug into any free x4/x8/x16 slot

🔗 Cable

  • Shielded OCuLink SFF-8612 cable (50cm)

🔋 Power for Enclosure

  • Old Corsair ATX 650W PSU

🧠 AI Stack

  • Ollama (model inference)
  • Open-WebUI (chat interface)
  • NVIDIA Driver plugin for GPU support in Unraid
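On Unraid these two containers are usually installed from Community Apps templates, but for reference, a roughly equivalent docker-compose sketch looks like this (image names and ports are the projects’ defaults; treat the volume/port choices as assumptions to adapt):

```yaml
# Sketch of the Ollama + Open-WebUI stack with the GPU passed through.
# The "deploy.resources" block is how Compose requests NVIDIA GPUs;
# Unraid's template UI exposes the same thing as "--gpus all".
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"          # Ollama API
    volumes:
      - ollama:/root/.ollama   # downloaded models persist here
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"            # web chat UI
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama: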

🤔 Why Do This?

I wanted to:

  • Run local LLMs (LLaMA, Mistral, Phi-3, etc.) for offline experiments
  • Keep Unraid cool and clean without rebuilding the server
  • Use an extra GPU I already had lying around
  • Avoid the BIOS limitations around PCIe bifurcation

This external OCuLink method gave me:

  • ✅ ~8GB/s PCIe bandwidth (PCIe Gen4 x4)
  • ✅ Full NVIDIA GPU support in Docker
  • ✅ Clean separation between server and GPU hardware
  • ✅ No mods, no hacks — it just works
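That ~8 GB/s figure isn’t a benchmark, just the theoretical link ceiling — a quick sanity check of the math:

```python
# Theoretical PCIe Gen4 x4 bandwidth (one direction).
gt_per_s = 16          # PCIe 4.0 signals at 16 GT/s per lane
lanes = 4              # OCuLink adapter uses an x4 link
encoding = 128 / 130   # Gen3+ 128b/130b line encoding overhead
gbytes = gt_per_s * lanes * encoding / 8
print(f"{gbytes:.2f} GB/s")  # ~7.88 GB/s, i.e. "~8 GB/s"
```

That’s a quarter of a full x16 slot, but for LLM inference the model sits in VRAM after the initial load, so the narrower link mostly just slows model loading, not token generation.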

🛠️ Setup Walkthrough

1. Install the PCIe → OCuLink Adapter in Unraid

I plugged the host adapter into an open PCIe x4 slot and screwed the bracket into place.

2. Connect the External GPU Enclosure

  • Installed the RTX 4070 SUPER into the enclosure
  • Connected it to a spare ATX power supply
  • Plugged in the OCuLink cable to the enclosure and into the PCIe host card

3. Boot Unraid and Verify the GPU

Once the enclosure was powered on, I rebooted the server and ran:

```shell
lspci | grep -i nvidia
```

And there it was — the GPU showed up like it was installed directly inside the case. 🔥

4. Install NVIDIA Drivers (Unraid Plugin)

Installed the NVIDIA Driver plugin from Community Apps, then rebooted so the driver could bind to the new GPU.

5. Confirm GPU Availability

Ran:

```shell
nvidia-smi
```

Got a clean status output with the RTX 4070 SUPER listed, showing its full 12GB of VRAM along with the driver and CUDA versions.

💥 Real-World Performance

I’ve successfully run:

  • LLaMA 3 8B
  • Phi-3 Medium (14B)
  • Qwen 2.5
  • Some multimodal/image models too

With the RTX 4070 SUPER, I regularly get:

  • ~50–120 tokens per second in local chat models
  • Smooth inference even with bigger models in 4-bit or 8-bit quantization
  • Snappy, low-latency responses in the web chat over my LAN

✅ Final Thoughts

Adding an external GPU to Unraid using OCuLink was way easier than I thought. No PCIe hacks, no soldering, no custom drivers — and I now have a full GPU available for AI workloads, Docker tasks, or media transcoding.

If you’re running out of room in your Unraid server or just want a modular GPU setup, I highly recommend this approach. The external enclosures are affordable, and OCuLink is like the “PCIe extension cable” you never knew existed.

Let me know if you’d be interested in a follow-up with:

  • Full build photos
  • Links to my Docker templates
  • Benchmarks across a few local AI models

Happy tinkering 🛠️
