ThinkSystem SR655 V3 Single-Socket EPYC with 6x NVIDIA L4 Inference Server
Cost-efficient single-socket EPYC node for high-throughput AI inference
We help you choose, configure, and deliver the right system — no obligation.




Configuration at a Glance
Tailored per engagement. Full technical overview below.
Configuration Options
Core specifications for this system. Every component is configurable to your workload — request a quote for a tailored build.
6x NVIDIA L4 24GB (Ada Lovelace, single-width, ~72W)
1x AMD EPYC 9454 (4th Gen Genoa, 48 cores, PCIe Gen5)
768GB DDR5 (12x 64GB RDIMM, 4800 MT/s)
8x 2.5-inch NVMe Gen5 hot-swap SSDs
Overview
This SR655 V3 combines a single high-core-count AMD EPYC processor with six single-width NVIDIA L4 GPUs for dense, energy-efficient inference. Nexus Compute specifies, configures, and tests each unit, delivering it warranty-backed through authorized Lenovo channels.
Business Benefits
Single-socket efficiency
A single EPYC processor (48 cores as configured, up to 96 on this platform) feeds six GPUs without the cost and NUMA overhead of a second socket.
High inference density
Six NVIDIA L4 accelerators in 2U maximize concurrent inference streams per node.
Validated and warranty-backed
Each server is tested to spec and covered by Lenovo onsite warranty sourced through authorized distribution.
Typical Business Use Cases
High-throughput model serving
Batch and streaming video analytics
Recommendation and search ranking
Multi-tenant inference endpoints
Technical Overview
The Lenovo ThinkSystem SR655 V3 is a 1-socket 2U server built on AMD EPYC 9004 (Genoa) processors with up to 96 cores and 128 PCIe Gen5 lanes. That single-socket lane budget is enough to drive up to eight single-width GPUs; this build uses six NVIDIA L4 accelerators for dense, power-efficient inference.
| GPU | 6x NVIDIA L4 24GB (Ada Lovelace, single-width, ~72W) |
| CPU | 1x AMD EPYC 9454 (4th Gen Genoa, 48 cores, PCIe Gen5) |
| Memory | 768GB DDR5 (12x 64GB RDIMM, 4800 MT/s) |
| Storage | 8x 2.5-inch NVMe Gen5 hot-swap SSDs |
| Networking | OCP 3.0 dual-port 25GbE (up to 100GbE options) |
| Form Factor | 2U rack, single-socket |
| Management | Lenovo XClarity Controller (XCC) |
| Power | Dual redundant 1800W 80 PLUS Titanium PSUs |
| Warranty | 3-year Lenovo onsite limited warranty, 9x5 NBD |
Specifications are indicative and configured to each engagement. Request a quote for a configuration tailored to your requirements.
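As a rough sanity check on the table above, the per-node totals can be worked out directly from the listed parts. The sketch below is illustrative only: the counts and sizes come from the table, while the one-DIMM-per-channel layout and the 8-byte channel width used for the bandwidth figure are assumptions, and the result is a theoretical peak rather than a measured number.

```python
# Figures taken from the specification table.
GPU_COUNT = 6
GPU_MEMORY_GB = 24
GPU_POWER_W = 72        # approximate per-card power, as listed
DIMM_COUNT = 12
DIMM_SIZE_GB = 64
DDR5_MTS = 4800         # mega-transfers per second per channel

# Assumption (not stated in the source): one DIMM per memory channel,
# each channel moving 8 bytes per transfer.
BYTES_PER_TRANSFER = 8


def aggregate_specs():
    """Return per-node totals derived from the listed components."""
    return {
        "gpu_memory_gb": GPU_COUNT * GPU_MEMORY_GB,
        "gpu_power_w": GPU_COUNT * GPU_POWER_W,
        "system_memory_gb": DIMM_COUNT * DIMM_SIZE_GB,
        # MB/s divided by 1000 gives GB/s (theoretical peak only).
        "peak_memory_bandwidth_gbs": DIMM_COUNT * DDR5_MTS * BYTES_PER_TRANSFER / 1000,
    }


if __name__ == "__main__":
    for key, value in aggregate_specs().items():
        print(f"{key}: {value}")
```

Under these assumptions the six L4 cards contribute 144 GB of GPU memory and roughly 432 W of GPU power per node, which is a useful starting point when comparing against model size and rack power limits.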
Warranty, Support & Fulfillment
Every system ships from an authorized channel, configured and tested, with the documentation enterprise buyers need — backed by warranty and a dedicated account team.
Enterprise Warranty
Full manufacturer warranty with optional on-site, next-business-day support and extended coverage.
Authorized Channel
Sourced through Tier-1 distribution and OEM partners — never grey market. Asset & warranty records included.
Lead Time & Deployment
Quotes typically arrive within two business days; systems are then configured, burn-in tested, and delivered on a committed schedule.
Nationwide Fulfillment
Coordinated logistics, rack-and-stack, and delivery wherever your infrastructure lives.
Frequently Asked Questions
Why a single-socket server for inference?
A single EPYC processor supplies up to 96 cores and 128 PCIe Gen5 lanes, enough to feed six GPUs while reducing cost, power, and NUMA complexity versus a dual-socket node.
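The lane argument can be checked with simple arithmetic. The following is a minimal sketch, assuming each L4 is attached at its full x16 link width; the `Device` class and `lane_budget` helper are hypothetical names for illustration, and the actual slot widths depend on the riser and backplane configuration, which is not stated here.

```python
from dataclasses import dataclass

CPU_PCIE_LANES = 128  # single-socket lane count quoted in the text


@dataclass
class Device:
    name: str
    count: int
    lanes_each: int  # assumed link width per device


def lane_budget(devices, total_lanes=CPU_PCIE_LANES):
    """Return (lanes used, lanes remaining) for a list of devices."""
    used = sum(d.count * d.lanes_each for d in devices)
    return used, total_lanes - used


# Assumption: six L4 cards, each on an x16 link.
gpus = [Device("NVIDIA L4", count=6, lanes_each=16)]

if __name__ == "__main__":
    used, remaining = lane_budget(gpus)
    print(f"GPU lanes used: {used}, remaining: {remaining}")
```

With six x16 GPUs this uses 96 of the 128 lanes, leaving 32 for storage and networking. Adding further `Device` entries with the link widths of a specific build shows whether that configuration fits the budget.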
How does this compare to the SR630 V3 L4 build?
The SR655 V3 doubles GPU count to six in 2U for higher aggregate inference throughput, whereas the 1U SR630 V3 with three L4s targets space-constrained edge sites.
What is the right node count for my throughput target?
We benchmark your model and batch profile against L4 throughput, then size node count and networking to meet your queries-per-second and latency SLOs.
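The sizing step can be sketched as a simple calculation once a per-GPU throughput has been measured. This is an illustrative sketch, not the actual sizing procedure: `nodes_needed` and its parameters are hypothetical, the 20% headroom default is an assumption, and the numbers in the example are placeholders rather than benchmark results.

```python
import math


def nodes_needed(target_qps, per_gpu_qps, gpus_per_node=6, headroom=0.2):
    """Estimate how many nodes are needed to meet a throughput target.

    per_gpu_qps should be measured at a batch size that still meets the
    latency SLO. headroom is the fraction of capacity held in reserve
    for traffic spikes and node failures.
    """
    if target_qps <= 0 or per_gpu_qps <= 0 or gpus_per_node <= 0:
        raise ValueError("throughput values and GPU count must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_per_node = per_gpu_qps * gpus_per_node * (1 - headroom)
    return math.ceil(target_qps / usable_per_node)


if __name__ == "__main__":
    # Placeholder figures: 10,000 QPS target, 400 QPS measured per GPU.
    print(nodes_needed(target_qps=10_000, per_gpu_qps=400))
```

Rounding up and reserving headroom keeps the estimate conservative; the measured per-GPU figure matters far more than the formula, since it varies widely with model, precision, and batch size.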
Hardware Assistance
Configure the ThinkSystem SR655 V3 Single-Socket EPYC with 6x NVIDIA L4 Inference Server with Nexus Compute
Tell us your requirements and a hardware specialist will help you specify, configure, and quote the right system — typically within two business days. No obligation.