This GPU cluster features 8 x H100 (80GB SXM5) GPUs per node with a 3.2 Tb/s network. It is currently set up with RoCEv2, but InfiniBand (IB) can be configured for larger, long-term customers. Each node includes dual Intel Xeon Platinum 8468 48-core processors, 2TB of RAM, and 15.36TB of additional SSD storage, alongside a 1.8TB SSD for the OS.
Networking is flexible, starting with a 1Gbps internet connection that can scale up to 10Gbps or even 100Gbps for specific needs. Inter-node networking runs on a 400G Ethernet/IP/RoCEv2 fabric, with dual 25G links for north/south traffic and a dedicated 100G Ethernet/IP fabric for storage (2 x 100G per node). For long-term leases, a 400G IB fabric can also be accommodated.
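For capacity planning, the per-node figures above can be totalled with a short Python sketch. The constant and function names are illustrative, and the one-400G-link-per-GPU layout is an assumption made here to show how 8 GPUs could add up to the quoted 3.2 Tb/s; the listing does not state the link count per node.

```python
# Per-node figures copied from the listing.
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 80
CPU_SOCKETS = 2
CORES_PER_SOCKET = 48
STORAGE_LINKS = 2
STORAGE_LINK_GBPS = 100
NORTH_SOUTH_LINKS = 2
NORTH_SOUTH_LINK_GBPS = 25
INTERNODE_LINK_GBPS = 400

# ASSUMPTION (not stated in the listing): one 400G inter-node link per GPU.
ASSUMED_INTERNODE_LINKS = GPUS_PER_NODE


def node_totals():
    """Return aggregate per-node capacities as a dict of integers."""
    return {
        "gpu_memory_gb": GPUS_PER_NODE * GPU_MEMORY_GB,
        "cpu_cores": CPU_SOCKETS * CORES_PER_SOCKET,
        "internode_gbps": ASSUMED_INTERNODE_LINKS * INTERNODE_LINK_GBPS,
        "storage_gbps": STORAGE_LINKS * STORAGE_LINK_GBPS,
        "north_south_gbps": NORTH_SOUTH_LINKS * NORTH_SOUTH_LINK_GBPS,
    }


if __name__ == "__main__":
    for name, value in node_totals().items():
        print(f"{name}: {value}")
```

Under that assumption the inter-node total comes to 3200 Gb/s, which matches the 3.2 Tb/s figure quoted for the cluster; the actual NIC layout should be confirmed before relying on it.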
The storage system is powered by Ceph, offering 2.5PB of high-speed NVMe SSD storage and 19.9PB of scalable object storage.
This cluster is coming online now. We can facilitate proof-of-concept (PoC) access and accommodate specific requirements. Pricing is flexible and can be tailored to term length and individual needs. For more details, feel free to reach out.