Hardware Datasheet · GPU Server
Nexus Compute
Nexus MI300X Training Pod — Multi-Node Cluster (8-Rail 400G Fabric)
Rack-scale MI300X cluster engineered for distributed model training.
Overview
The Nexus MI300X Training Pod links multiple 8-GPU MI300X nodes over an 8-rail 400G fabric with shared parallel storage and ROCm-based orchestration, delivered as one engineered system rather than a parts list. Nexus Compute designs, sources, stages, and tests the full pod across compute, fabric, and storage, then delivers it warranty-backed through authorized channels.
Specifications
| Specification | Details |
| --- | --- |
| Compute Nodes | Multiple 8x MI300X servers (192 GB HBM3 per GPU) |
| Intra-Node Interconnect | AMD Infinity Fabric, all-to-all per node |
| Cluster Fabric | 8-rail fat tree, 400G RoCEv2 or InfiniBand NDR (1:1 GPU:NIC) |
| Shared Storage | High-throughput parallel filesystem |
| Orchestration | Slurm or Kubernetes with ROCm |
| Scale | 16 to 64+ MI300X GPUs (configurable) |
| Monitoring | GPU, fabric, and job health monitoring |
| Deployment | Design, sourcing, staging, and commissioning support |
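The figures in the specification table imply some simple sizing arithmetic, sketched below for illustration only. The constants come from the table. The names `PodSizing`, `size_pod`, and `rail_for_gpu` are hypothetical, and the mapping of GPU i to NIC i on rail i is an assumed rail-aligned convention rather than a confirmed detail of this product. Bandwidth is nominal line rate before protocol overhead.

```python
from dataclasses import dataclass

# Values taken from the specification table.
GPUS_PER_NODE = 8      # 8x MI300X per server
HBM_PER_GPU_GB = 192   # 192 GB HBM3 per GPU
RAILS = 8              # 8-rail fabric
LINK_GBPS = 400        # 400G per NIC, 1:1 GPU:NIC


@dataclass(frozen=True)
class PodSizing:
    nodes: int
    gpus: int
    total_hbm_gb: int
    node_injection_gbps: int  # nominal per-node line rate into the fabric
    links_per_rail: int       # assumes one NIC per node on each rail


def size_pod(total_gpus: int) -> PodSizing:
    """Derive node count, aggregate HBM, and nominal fabric figures."""
    if total_gpus <= 0 or total_gpus % GPUS_PER_NODE != 0:
        raise ValueError(f"GPU count must be a positive multiple of {GPUS_PER_NODE}")
    nodes = total_gpus // GPUS_PER_NODE
    return PodSizing(
        nodes=nodes,
        gpus=total_gpus,
        total_hbm_gb=total_gpus * HBM_PER_GPU_GB,
        node_injection_gbps=GPUS_PER_NODE * LINK_GBPS,
        links_per_rail=nodes,
    )


def rail_for_gpu(local_gpu_index: int) -> int:
    """Assumed rail-aligned mapping: GPU i uses NIC i, which sits on rail i."""
    if not 0 <= local_gpu_index < GPUS_PER_NODE:
        raise ValueError("local GPU index out of range")
    return local_gpu_index % RAILS


if __name__ == "__main__":
    for gpus in (16, 64):
        print(size_pod(gpus))
```

Under these assumptions, a 16-GPU pod is 2 nodes with 3,072 GB of aggregate HBM3, a 64-GPU pod is 8 nodes with 12,288 GB, and each node has a nominal 3,200 Gb/s (400 GB/s) of injection bandwidth into the fabric.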
Typical Use Cases
- Distributed training (FSDP, Megatron, DeepSpeed) on ROCm
- Foundation and large custom model training
- Shared multi-team research compute
- Building an owned AMD AI training platform
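For the distributed training use case under Slurm, a launcher typically has to translate Slurm's per-task environment into the rank variables that PyTorch's `env://` initialization reads. The sketch below shows one way this might look, assuming one Slurm task per GPU and 8 tasks per node. The helper name `ranks_from_slurm` is hypothetical and is not part of any Nexus Compute tooling. `MASTER_ADDR` and `MASTER_PORT` would still need to be set separately.

```python
import os
from typing import Mapping

GPUS_PER_NODE = 8  # from the spec table; assumes one Slurm task per GPU


def ranks_from_slurm(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Map Slurm task variables to the rank variables torch.distributed reads."""
    rank = int(env["SLURM_PROCID"])         # global task index
    local_rank = int(env["SLURM_LOCALID"])  # task index within the node
    world_size = int(env["SLURM_NTASKS"])   # total tasks in the job

    if not 0 <= local_rank < GPUS_PER_NODE:
        raise ValueError("more tasks on this node than GPUs")
    if not 0 <= rank < world_size:
        raise ValueError("inconsistent Slurm rank and task count")

    return {
        "RANK": str(rank),
        "LOCAL_RANK": str(local_rank),
        "WORLD_SIZE": str(world_size),
    }


if __name__ == "__main__":
    # Example: the fourth task on the second node of a 2-node, 16-GPU job.
    fake_env = {"SLURM_PROCID": "11", "SLURM_LOCALID": "3", "SLURM_NTASKS": "16"}
    print(ranks_from_slurm(fake_env))
```

Passing the environment in as a mapping keeps the helper testable outside a Slurm allocation; inside a job it can be called with no arguments and the result merged into `os.environ` before the training framework initializes.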
Warranty & Support
Supplied through authorized channels with full manufacturer warranty. On-site, next-business-day support options are available. Every system is configured, tested, and documented before delivery, with asset and warranty records provided for enterprise audit requirements.
Request a tailored quote
Configurations are tailored per engagement; contact us for pricing and lead times.
sales@nexus-compute.com
+1 737 276 1016
nexus-compute.com
Specifications are indicative and are finalized for each engagement. All product names, logos, and trademarks are the property of their respective owners. Nexus Compute is an independent enterprise hardware supplier.