ThinkSystem SR655 V3 Single-Socket EPYC with 6x NVIDIA L4 Inference Server

Cost-efficient single-socket EPYC node for high-throughput AI inference

Full manufacturer warranty

Authorized channel

48-hour quote

We help you choose, configure, and deliver the right system — no obligation.

Configuration at a Glance

GPU: 6x NVIDIA L4 24GB (Ada Lovelace, single-width, ~72W)

CPU: 1x AMD EPYC 9454 (4th Gen Genoa, 48 cores, PCIe Gen5)

Memory: 768GB DDR5 (12x 64GB RDIMM, 4800 MT/s)

Storage: 8x 2.5-inch NVMe Gen5 hot-swap SSDs

Tailored per engagement. Full technical overview below.

Configuration Options

Core specifications for this system. Every component is configurable to your workload — request a quote for a tailored build.

GPU / Accelerator

6x NVIDIA L4 24GB (Ada Lovelace, single-width, ~72W)

Processor

1x AMD EPYC 9454 (4th Gen Genoa, 48 cores, PCIe Gen5)

Memory

768GB DDR5 (12x 64GB RDIMM, 4800 MT/s)

Storage

8x 2.5-inch NVMe Gen5 hot-swap SSDs

Overview

This SR655 V3 combines a single high-core-count AMD EPYC processor with six single-width NVIDIA L4 GPUs for dense, energy-efficient inference. Nexus Compute specifies, configures, and tests each unit, delivering it warranty-backed through authorized Lenovo channels.

Who This Solution Is For

AI teams optimizing inference cost-per-query

SaaS platforms serving models at scale

Teams favoring single-socket simplicity

Computer vision and analytics pipelines

Business Benefits

Single-socket efficiency

One EPYC processor (48 cores as configured, up to 96 on this platform) avoids dual-socket overhead while feeding six GPUs.

High inference density

Six NVIDIA L4 accelerators in 2U maximize concurrent inference streams per node.

Validated and warranty-backed

Each server is tested to spec and covered by Lenovo onsite warranty sourced through authorized distribution.

Typical Business Use Cases

High-throughput model serving

Batch and streaming video analytics

Recommendation and search ranking

Multi-tenant inference endpoints

Industry Applications

AI & Machine Learning

SaaS & Software

Media & Entertainment

Telecom

Technical Overview

The Lenovo ThinkSystem SR655 V3 is a 1-socket 2U server built on AMD EPYC 9004 (Genoa) processors with up to 96 cores and 128 PCIe Gen5 lanes. That single-socket lane budget cleanly drives up to eight single-width GPUs, here six NVIDIA L4 accelerators, for dense and power-efficient inference.
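The lane budget described above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes each L4 sits on a x16 link, and the `Device` and `lanes_used` names are hypothetical helpers, not part of any Lenovo tool. Actual slot widths depend on the riser configuration chosen for a given build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    count: int
    lanes_each: int  # assumed link width, not a published slot map


def lanes_used(devices):
    """Total PCIe lanes consumed by a list of devices."""
    return sum(d.count * d.lanes_each for d in devices)


CPU_LANES = 128  # PCIe Gen5 lanes quoted for the single EPYC 9004 socket

# Assumption: every L4 is attached at x16.
six_l4 = [Device("NVIDIA L4", count=6, lanes_each=16)]

used = lanes_used(six_l4)
print(f"GPU lanes used: {used}")                   # 96
print(f"Lanes remaining: {CPU_LANES - used}")      # 32
```

Under the same x16 assumption, eight single-width GPUs would consume all 128 lanes, so storage and networking would need narrower links or a different riser layout in that case.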

GPU	6x NVIDIA L4 24GB (Ada Lovelace, single-width, ~72W)
CPU	1x AMD EPYC 9454 (4th Gen Genoa, 48 cores, PCIe Gen5)
Memory	768GB DDR5 (12x 64GB RDIMM, 4800 MT/s)
Storage	8x 2.5-inch NVMe Gen5 hot-swap SSDs
Networking	OCP 3.0 dual-port 25GbE (up to 100GbE options)
Form Factor	2U rack, single-socket
Management	Lenovo XClarity Controller (XCC)
Power	Dual redundant 1800W 80 PLUS Titanium PSUs
Warranty	3-year Lenovo onsite limited warranty, 9x5 NBD

Specifications are indicative and configured to each engagement. Request a quote for a configuration tailored to your requirements.
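As a rough sanity check on the power figures in the table, the sketch below adds up nominal component power against a single 1800W supply, since a redundant pair has to carry the full load on one PSU. Only the GPU wattage and PSU rating come from the table. The CPU figure and the allowance for memory, drives, fans, and networking are assumptions for illustration, and the function name is hypothetical. Measured draw under your workload is what should drive a real power plan.

```python
def estimated_load_watts(gpu_count, gpu_watts, cpu_watts, other_watts):
    """Sum nominal component power for one node (a rough upper-level estimate)."""
    return gpu_count * gpu_watts + cpu_watts + other_watts


PSU_WATTS = 1800        # rating of one PSU, from the specification table
GPU_WATTS = 72          # approximate L4 board power, from the table
CPU_TDP_WATTS = 290     # assumption: nominal TDP for the configured CPU
OTHER_WATTS = 400       # assumption: memory, NVMe drives, fans, NIC

total = estimated_load_watts(6, GPU_WATTS, CPU_TDP_WATTS, OTHER_WATTS)
headroom = PSU_WATTS - total

print(f"Estimated load: {total} W")           # 1122 W
print(f"Headroom on one PSU: {headroom} W")   # 678 W
```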

Warranty, Support & Fulfillment

Every system ships from an authorized channel, configured and tested, with the documentation enterprise buyers need — backed by warranty and a dedicated account team.

Enterprise Warranty

Full manufacturer warranty with optional on-site, next-business-day support and extended coverage.

Authorized Channel

Sourced through Tier-1 distribution and OEM partners — never grey market. Asset & warranty records included.

Lead Time & Deployment

48-hour quotes, then configured, burn-in tested, and delivered on a committed schedule.

Nationwide Fulfillment

Coordinated logistics, rack-and-stack, and delivery wherever your infrastructure lives.

Frequently Asked Questions

Why a single-socket server for inference?

A single EPYC processor supplies up to 96 cores and 128 PCIe Gen5 lanes, enough to feed six GPUs while reducing cost, power, and NUMA complexity versus a dual-socket node.

How does this compare to the SR630 V3 L4 build?

The SR655 V3 doubles GPU count to six in 2U for higher aggregate inference throughput, whereas the 1U SR630 V3 with three L4s targets space-constrained edge sites.

What is the right node count for my throughput target?

We benchmark your model and batch profile against L4 throughput, then size the node count and networking to meet your queries-per-second and latency SLOs.
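The sizing step can be sketched as a short calculation. This is a simplified model, not our benchmarking method: the per-GPU throughput and the utilization ceiling are hypothetical inputs that would come from measuring your own model, and `nodes_needed` is an illustrative name.

```python
import math


def nodes_needed(target_qps, qps_per_gpu, gpus_per_node=6, max_utilization=0.7):
    """Nodes required to serve target_qps while keeping each GPU below
    max_utilization, which leaves headroom for latency SLOs and bursts."""
    if qps_per_gpu <= 0 or gpus_per_node <= 0:
        raise ValueError("throughput and GPU count must be positive")
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    usable_qps_per_node = qps_per_gpu * gpus_per_node * max_utilization
    return math.ceil(target_qps / usable_qps_per_node)


# Hypothetical example: 5,000 QPS target, 120 QPS measured per L4.
print(nodes_needed(target_qps=5000, qps_per_gpu=120))  # 10
```

The utilization ceiling is the main design choice here. Sizing to 100% of measured throughput leaves no room for traffic spikes or a node being out of service.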

Hardware Assistance

Configure the ThinkSystem SR655 V3 Single-Socket EPYC with 6x NVIDIA L4 Inference Server with Nexus Compute

Tell us your requirements and a hardware specialist will help you specify, configure, and quote the right system — typically within two business days. No obligation.