Lenovo ThinkSystem SR685a V3 — 8x MI300X for Large-Memory Inference
192GB-per-GPU inference node tuned for high-throughput ROCm serving.
We help you choose, configure, and deliver the right system — no obligation.




Configuration at a Glance
Tailored per engagement. Full technical overview below.
Configuration Options
Core specifications for this system. Every component is configurable to your workload — request a quote for a tailored build.
8x AMD Instinct MI300X, 192GB HBM3 each (1.5TB total)
Dual AMD EPYC 9004/9005 (Genoa/Turin)
Lenovo TruDDR5 up to 6000 MHz (24 DIMMs)
Up to 16x hot-swap PCIe Gen5 NVMe
Overview
The Lenovo ThinkSystem SR685a V3 is an 8U platform that pairs eight MI300X accelerators with dual AMD EPYC processors. It is optimized for serving large models where per-GPU memory capacity is the binding constraint. Nexus Compute configures the ROCm inference stack, GPUDirect networking, and storage, then delivers each tested system with warranty coverage through authorized Lenovo channels.
Who This Solution Is For
Business Benefits
Capacity-led inference
192GB per GPU and 1.5TB of aggregate HBM3 across the node keep large models resident in GPU memory, so serving needs fewer shards and sustains high throughput.
Tuned serving stack
Configured with ROCm and a vLLM-class serving engine so the system is ready for production serving on delivery.
Lenovo platform support
Sourced through authorized Lenovo channels with enterprise warranty and serviceability.
Typical Business Use Cases
High-concurrency LLM and generative-model serving
Private, on-premises inference for regulated data
Retrieval-augmented generation at scale
Long-context inference workloads
Industry Applications
Technical Overview
Built on the Lenovo ThinkSystem SR685a V3 8U platform with eight MI300X accelerators linked by AMD Infinity Fabric at 6.4TB/s GPU-to-GPU bandwidth, driven by dual AMD EPYC 9004/9005 processors. Lenovo TruDDR5 memory, up to 16 PCIe Gen5 NVMe drives, and eight front-accessible GPUDirect PCIe Gen5 slots support inference-serving deployments.
| Component | Specification |
| --- | --- |
| Accelerator | 8x AMD Instinct MI300X, 192GB HBM3 each (1.5TB total) |
| GPU Interconnect | AMD Infinity Fabric, 6.4TB/s GPU-to-GPU |
| CPU | Dual AMD EPYC 9004/9005 (Genoa/Turin) |
| System Memory | Lenovo TruDDR5 up to 6000 MHz (24 DIMMs) |
| Storage | Up to 16x hot-swap PCIe Gen5 NVMe |
| Networking/Fabric | 8x front PCIe Gen5 FHHL slots with GPUDirect |
| Form Factor | 8U rackmount |
| Management | Lenovo XClarity Controller (XCC) |
| Warranty | Lenovo enterprise warranty (configurable term) |
Specifications are indicative and configured to each engagement. Request a quote for a configuration tailored to your requirements.
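As a rough sanity check on the capacity figures above, the sketch below estimates whether a model's weights alone fit in HBM. It is a minimal illustration rather than a sizing tool: the 90% usable-memory fraction, the function names, and the example parameter counts are assumptions, and the estimate ignores KV cache, activations, and runtime overhead.

```python
import math

# Figures taken from the specification list for this system.
GPUS_PER_NODE = 8
HBM_PER_GPU_GB = 192
NODE_HBM_GB = GPUS_PER_NODE * HBM_PER_GPU_GB  # 1536 GB, roughly 1.5 TB


def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return params_billion * bytes_per_param


def min_gpus_for_weights(
    params_billion: float,
    bytes_per_param: float,
    usable_fraction: float = 0.9,  # assumption: headroom left for the runtime
) -> int:
    """Smallest GPU count whose usable HBM holds the weights alone."""
    usable_per_gpu_gb = HBM_PER_GPU_GB * usable_fraction
    return math.ceil(weight_footprint_gb(params_billion, bytes_per_param) / usable_per_gpu_gb)


if __name__ == "__main__":
    # Hypothetical model sizes at 16-bit precision (2 bytes per parameter).
    for size_b in (70, 405):
        gpus = min_gpus_for_weights(size_b, bytes_per_param=2)
        print(f"{size_b}B parameters: {weight_footprint_gb(size_b, 2):.0f} GB of weights, "
              f"at least {gpus} GPU(s) for weights alone")
```

In practice the tensor-parallel degree is usually picked from values that divide the model's attention heads evenly (often 1, 2, 4, or 8), so a raw result such as 5 would typically be rounded up to 8.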
Warranty, Support & Fulfillment
Every system ships from an authorized channel, configured and tested, with the documentation enterprise buyers need — backed by warranty and a dedicated account team.
Enterprise Warranty
Full manufacturer warranty with optional on-site, next-business-day support and extended coverage.
Authorized Channel
Sourced through Tier-1 distribution and OEM partners — never grey market. Asset & warranty records included.
Lead Time & Deployment
Quotes typically within two business days, then systems are configured, burn-in tested, and delivered on a committed schedule.
Nationwide Fulfillment
Coordinated logistics, rack-and-stack, and delivery wherever your infrastructure lives.
Frequently Asked Questions
Why pick MI300X for inference specifically?
Its 192GB per GPU keeps large models resident with room for big KV caches and long contexts, so you can raise concurrency before adding nodes. We benchmark your model against your SLA to confirm fit.
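To make the KV-cache point concrete, the sketch below uses the common estimate of two tensors (keys and values) per layer per token to approximate how many full-length sequences fit in the memory left after weights. The model shape, the 90% usable fraction, and all names are hypothetical values chosen for illustration; real engines add paging and other overheads, so treat the result as an upper bound rather than a benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Attention shape of a hypothetical transformer (illustrative values only)."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    kv_bytes_per_element: int = 2  # assumption: 16-bit KV cache


def kv_bytes_per_token(shape: ModelShape) -> int:
    """Keys plus values, for every layer and KV head, for a single token."""
    return 2 * shape.num_layers * shape.num_kv_heads * shape.head_dim * shape.kv_bytes_per_element


def max_concurrent_sequences(
    shape: ModelShape,
    context_tokens: int,
    hbm_total_gb: float,
    weights_gb: float,
    usable_fraction: float = 0.9,  # assumption: headroom left for the runtime
) -> int:
    """Upper bound on full-length sequences whose KV cache fits after weights."""
    budget_bytes = (hbm_total_gb * usable_fraction - weights_gb) * 1e9
    if budget_bytes <= 0:
        return 0
    per_sequence_bytes = kv_bytes_per_token(shape) * context_tokens
    return int(budget_bytes // per_sequence_bytes)


if __name__ == "__main__":
    # Hypothetical 70B-class model with grouped-query attention.
    shape = ModelShape(num_layers=80, num_kv_heads=8, head_dim=128)
    bound = max_concurrent_sequences(
        shape,
        context_tokens=32_768,
        hbm_total_gb=8 * 192,  # eight MI300X at 192GB each
        weights_gb=140,        # 70B parameters at 2 bytes each
    )
    print(f"KV cache per token: {kv_bytes_per_token(shape) / 1024:.0f} KiB")
    print(f"Upper bound on concurrent 32K-token sequences: {bound}")
```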
Which serving software do you configure?
Commonly vLLM on ROCm, with PyTorch and your gateway of choice. We tune batching and parallelism to your latency and throughput targets before delivery.
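As an illustration of the knobs involved, the sketch below assembles a `vllm serve` command line from a small profile object. The flags shown are standard vLLM options, but the profile class, the placeholder model name, and every value are assumptions for this example and do not represent the settings Nexus Compute ships; the code only builds and prints the command and does not launch a server.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingProfile:
    """Hypothetical serving profile; values are placeholders, not tuned settings."""
    model: str
    tensor_parallel_size: int = 8         # shard the model across all eight GPUs
    max_model_len: int = 32_768           # longest context the server accepts
    max_num_seqs: int = 128               # cap on sequences batched together
    gpu_memory_utilization: float = 0.9   # fraction of HBM vLLM may claim


def build_vllm_command(profile: ServingProfile) -> list[str]:
    """Return the argv list for a `vllm serve` invocation."""
    return [
        "vllm", "serve", profile.model,
        "--tensor-parallel-size", str(profile.tensor_parallel_size),
        "--max-model-len", str(profile.max_model_len),
        "--max-num-seqs", str(profile.max_num_seqs),
        "--gpu-memory-utilization", str(profile.gpu_memory_utilization),
    ]


if __name__ == "__main__":
    # "your-org/your-model" is a placeholder, not a real model identifier.
    profile = ServingProfile(model="your-org/your-model")
    print(shlex.join(build_vllm_command(profile)))
```

Raising the sequence cap generally favors throughput, while lowering it tends to protect per-request latency, which is why these values are set against a specific latency and throughput target rather than fixed in advance.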
How does this differ from the training-focused MI300X servers?
Same accelerator, different tuning: this SR685a V3 build is optimized for serving (memory residency, throughput, availability), while liquid-cooled training nodes prioritize sustained clocks and multi-node fabric.
Hardware Assistance
Configure the Lenovo ThinkSystem SR685a V3 — 8x MI300X for Large-Memory Inference with Nexus Compute
Tell us your requirements and a hardware specialist will help you specify, configure, and quote the right system — typically within two business days. No obligation.