Remote Proxmox Server

Overview

This project began as an attempt to reduce friction in my daily lab workflow.

Running a lab on my primary desktop created constant tradeoffs. Spinning up multiple VMs required shutting down other applications, restarting VMs between sessions discouraged long-running projects, and keeping a high-performance desktop powered on 24/7 introduced heat, noise, and power concerns that were impractical in a dorm environment.

I set out to build a dedicated, always-on virtualization server that could host labs continuously, survive power events, and be fully managed remotely without physical access.

Core Objectives

Reliable 24/7 VM hosting
Strong data protection and recovery
Remote out-of-band management
Cost efficiency using repurposed hardware where possible

System Design and Constraints

Rather than purchasing used enterprise hardware, I focused on maximizing performance per dollar while implementing enterprise-like reliability and management features.

CPU: AMD Ryzen 9 3950X. High core count for virtualization with acceptable power consumption and heat output. It required no additional investment, as I already owned it from a prior project.
RAM: 64 GB ECC DDR4. Chosen explicitly to reduce silent data corruption risk with ZFS.
Storage:
- 2x NVMe SSDs as boot drives: one for Proxmox VE, one for Proxmox Backup Server
- 3x 1 TB NAS-rated SATA SSDs in RAIDZ1 for VM storage
- 2x 4 TB HDDs in a ZFS mirror dedicated to backup storage
UPS: APC Back-UPS series for affordability and strong community support.
Out-of-Band Management: JetKVM and Tailscale for remote access and troubleshooting.

Architecture

Platform Selection and Early Friction

This project was originally scoped around Hyper-V Server.

In practice, running Hyper-V in a workgroup environment introduced persistent friction. Windows Admin Center required domain services for meaningful management, and the MMC-based tools required unreliable WinRM and CredSSP workarounds. Both were a chore to use from macOS clients and inconvenient for off-site management.

Proxmox VE offered:

Platform-agnostic management infrastructure
First-class ZFS support
Robust backup tooling
A strong documentation and community ecosystem

Switching platforms reduced administrative overhead and enabled a more appliance-like server model.

Installer Compatibility Issue

During initial deployment, the graphical Proxmox installer failed during early driver initialization due to a known NVIDIA compatibility issue. Installation was completed in text mode using a documented workaround.

Storage Architecture and Data Integrity

Virtual machine storage is hosted on a ZFS RAIDZ1 pool backed by NAS-grade SSDs.

RAIDZ1 was selected to balance cost, storage efficiency, and acceptable performance for lab workloads. Given the relatively small pool size, ECC memory, and regular ZFS scrubbing, this tradeoff was considered appropriate for the system’s intended use.
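The storage-efficiency side of that tradeoff comes down to simple arithmetic. The sketch below is only an approximation: it ignores ZFS metadata, padding, and reserved space, and the function names are my own for illustration.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int = 1) -> float:
    """Approximate usable capacity of one RAIDZ vdev (parity=1 for RAIDZ1)."""
    if disks <= parity:
        raise ValueError("a RAIDZ vdev needs more disks than parity devices")
    return (disks - parity) * disk_tb


def mirror_usable_tb(disk_tb: float) -> float:
    """A mirror stores one disk's worth of data regardless of mirror width."""
    return disk_tb


vm_pool_tb = raidz_usable_tb(disks=3, disk_tb=1.0)  # 3x 1 TB SSDs in RAIDZ1
backup_pool_tb = mirror_usable_tb(4.0)              # 2x 4 TB HDDs mirrored

print(f"VM pool:     ~{vm_pool_tb:.0f} TB usable, {vm_pool_tb / 3.0:.0%} of raw")
print(f"Backup pool: ~{backup_pool_tb:.0f} TB usable, {backup_pool_tb / 8.0:.0%} of raw")
```

With three disks, RAIDZ1 gives up one disk to parity and tolerates a single disk failure, while a three-way layout built from mirrors would give up more raw capacity for the same pool.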

Snapshots are used heavily for experimentation and rollback during lab work.

Backup Strategy and Recovery

Rather than relying solely on Proxmox's built-in backup tooling, I deployed Proxmox Backup Server as a VM with a dedicated storage pool. The OS and backup disks are passed directly through to the PBS VM using stable disk identifiers. This allows the entire PBS VM to be migrated to bare metal or another system if the host fails catastrophically.
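To illustrate what "stable disk identifiers" means here, the sketch below lists the persistent names under /dev/disk/by-id and builds a `qm set` passthrough command of the kind described in the Proxmox wiki. The VM ID, bus slot, and disk name in the usage lines are hypothetical placeholders, not values from my system, and the helper names are my own.

```python
import os
from pathlib import Path


def stable_disk_ids(by_id_dir: str = "/dev/disk/by-id") -> dict[str, str]:
    """Map each whole-disk by-id name to the kernel device it currently resolves to."""
    root = Path(by_id_dir)
    if not root.is_dir():
        return {}  # not a Linux host, or udev has not populated the directory
    mapping = {}
    for entry in sorted(root.iterdir()):
        # Skip partition links such as "...-part1"; only whole disks are passed through.
        if entry.is_symlink() and "-part" not in entry.name:
            mapping[entry.name] = os.path.realpath(entry)
    return mapping


def passthrough_command(vmid: int, bus_slot: str, disk_id: str) -> str:
    """Build (but do not run) a qm command that attaches a disk by its stable ID."""
    return f"qm set {vmid} -{bus_slot} /dev/disk/by-id/{disk_id}"


for name, device in stable_disk_ids().items():
    print(f"{name} -> {device}")

# Hypothetical VM ID and disk name, for illustration only.
print(passthrough_command(100, "scsi1", "ata-EXAMPLE_MODEL_SERIAL123"))
```

Because the by-id name follows the physical disk rather than its enumeration order, the passthrough keeps pointing at the same drive even if kernel names like /dev/sdb change between boots.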

Access Control and Least Privilege

PBS access from Proxmox VE is performed using a dedicated service account with only the 'DatastoreBackup' role assigned. This enforces least-privilege access.

Backup Characteristics

Incremental, deduplicated backups
Daily backups retained for the most recent period
Weekly and monthly retention for historical recovery points
Automated verification and garbage collection
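The tiered retention above can be pictured with a small selection routine. This is a simplified sketch of a keep-daily / keep-weekly / keep-monthly scheme, not PBS's exact prune algorithm, and the retention counts in the usage lines are hypothetical rather than my actual settings.

```python
from datetime import datetime, timedelta


def select_kept(snapshots, keep_daily, keep_weekly, keep_monthly):
    """Return the set of snapshot timestamps a tiered retention policy would keep."""
    rules = [
        (keep_daily, lambda t: t.date().isoformat()),        # one per calendar day
        (keep_weekly, lambda t: tuple(t.isocalendar())[:2]),  # one per ISO week
        (keep_monthly, lambda t: (t.year, t.month)),          # one per month
    ]
    kept = set()
    for limit, bucket_of in rules:
        covered = set()  # periods that already have a representative
        added = 0
        for ts in sorted(snapshots, reverse=True):  # newest first
            bucket = bucket_of(ts)
            if bucket in covered:
                continue
            covered.add(bucket)
            if ts in kept:
                continue  # an earlier rule already keeps this period's newest snapshot
            if added < limit:
                kept.add(ts)
                added += 1
    return kept


# Sixty nightly backups, then a hypothetical 7 daily / 4 weekly / 3 monthly policy.
nightly = [datetime(2024, 1, 1, 2, 0) + timedelta(days=n) for n in range(60)]
kept = select_kept(nightly, keep_daily=7, keep_weekly=4, keep_monthly=3)
for ts in sorted(kept):
    print(ts.date())
```

Each rule keeps only the newest snapshot in a period, and later rules reach further back in time, so recent history stays dense while older history thins out.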

Known Tradeoff

Virtualizing the backup server introduces a shared failure domain, which I explicitly acknowledge. This tradeoff was accepted to ship a working system quickly, with plans to move PBS to dedicated hardware in a future revision.

Remote Management and Out-of-Band Access

Remote access was a hard requirement from the start.

Out-of-Band management: JetKVM provides hardware-level access to UEFI, bootloader, display, and ISO mounting.
Secure Remote Access: Tailscale, backed by YubiKey and passkey login, provides encrypted remote access without exposing services publicly.
In-band management: Proxmox web UI and SSH for routine administration.

Lab traffic is separated from home traffic using a dedicated VLAN.

Power Management and Graceful Shutdown

As this server is hosted remotely, power loss was one of the most important failure modes to handle correctly.

Network UPS Tools (NUT) runs in a server-client configuration: the UPS reports its state to the Proxmox host, which triggers an orderly shutdown sequence during extended outages.

Integration Challenge

Initial NUT configuration failed due to USB device permission and service ownership issues. Community documentation highlighted common pitfalls when running NUT on Proxmox VE. Aligning udev rules and service permissions resolved the issue and ensured clean and reliable NUT driver initialization.

Outcomes:

Virtual machines shut down cleanly
Backup jobs terminate safely
The host powers off last, protecting ZFS integrity

Shutdown timing and battery thresholds were tuned based on observed runtime rather than defaults.
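The threshold logic can be sketched as a small decision function over the variables that NUT's `upsc` command reports (`ups.status`, `battery.charge`, `battery.runtime`). In a real deployment NUT's own monitoring daemon makes this call; the code below only illustrates the decision, and the threshold values are hypothetical, not the ones I tuned.

```python
def parse_upsc(output: str) -> dict[str, str]:
    """Parse `upsc` output, which is one "name: value" pair per line."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            values[key.strip()] = value.strip()
    return values


def should_shut_down(status: dict[str, str],
                     min_charge_pct: float = 40,
                     min_runtime_s: float = 600) -> bool:
    """Decide whether to begin an orderly shutdown (thresholds are assumptions)."""
    flags = status.get("ups.status", "").split()
    if "OB" not in flags:
        return False  # not on battery, nothing to do
    if "LB" in flags:
        return True   # the UPS itself reports low battery
    charge = float(status.get("battery.charge", "100"))
    runtime = float(status.get("battery.runtime", "inf"))
    return charge <= min_charge_pct or runtime <= min_runtime_s


# Sample text in the shape upsc prints; the values are made up for illustration.
sample = "battery.charge: 35\nbattery.runtime: 900\nups.status: OB DISCHRG\n"
print(should_shut_down(parse_upsc(sample)))
```

Checking both charge and estimated runtime guards against an aging battery that still reports a healthy percentage but drains faster than expected.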

UPS status is also surfaced through HomeKit via a Homebridge container, providing at-a-glance visibility into power state.

Validation and Testing

To ensure reliability beyond initial configuration, key failure scenarios were validated:

UPS-triggered shutdowns
Backup restores from Proxmox Backup Server
Remote recovery using JetKVM during host reboots

What I Learned

This project reinforced that technical success often comes from choosing the path with the least long-term friction, not the one that looks best on paper or is easiest to get running.

Pivoting away from an all-Windows stack improved reliability and manageability, even though it required deeper Linux administration and research. Documentation, community knowledge, and troubleshooting skills were invaluable.

Most importantly, I learned that failure is a natural part of the learning process, and that it's important to be willing to pivot and try new things.

What I'd Do Differently

Native IPMI: A motherboard with a built-in BMC would simplify telemetry and reduce reliance on external hardware.
Simplified Remote Network Access: A dedicated Tailscale subnet router would reduce per-host configuration overhead.
Dedicated Backup Server: My one true regret. Separating PBS onto a dedicated, low-power system would remove the shared failure domain.

Future Roadmap

Active Directory lab environment
Isolated cybersecurity and DFIR sandboxes
Potential local AI workloads as hardware prices improve

Conclusion

This project represents a transition from ad-hoc experimentation to deliberate system design.

By repurposing hardware, embracing open-source tooling, and prioritizing recoverability, I built a platform that mirrors many of the operational realities of enterprise infrastructure at a personal scale.

This system now serves as a stable foundation for coursework, security labs, and future enterprise-focused experimentation.

Demonstrated Technologies & Skills

Virtualization (Proxmox VE)
System Architecture
Backup & Disaster Recovery
Remote & Out-of-Band Management
Power & Availability Engineering
Linux Systems Administration
Security Fundamentals (RBAC, Least Privilege)
Operational Validation & Testing