Executive Summary

AI infrastructure increasingly requires higher-performance computing. Companies are creating and training increasingly complex deep learning neural networks, or DNNs, in their datacenters. These DNNs are getting a lot of attention because they can outperform humans at classifying images and can beat the world’s best Go player. Training DNNs is very compute-intensive, so it is worthwhile to reduce training times with hardware accelerators and software tuning. Last year, a group of researchers from industry and universities developed MLPerf, a suite of compute-intensive AI benchmarks representing real-world AI workloads, for measuring the performance of AI infrastructure.

In Supermicro’s Cloud Center of Excellence (CCoE) in San Jose, CA, we created a reference architecture running Red Hat OpenShift Container Platform software and ran the latest version of MLPerf Training to assess its performance relative to published results from NVIDIA. The Red Hat/Supermicro MLPerf Training v0.6 benchmark results closely match the NVIDIA DGX-1 closed-division published results, within -6.13% to +2.29%, where negative values mean slower and positive values mean faster than the NVIDIA results. This outcome demonstrates that application containerization provides the benefits of software portability, improved collaboration, and data reproducibility without significantly affecting performance.

The Supermicro Red Hat OpenShift Deep Learning solution is based on industry-leading GPU servers with the latest Intel® Xeon® processors, NVIDIA® Volta® GPUs, and NVLink technology, making it an ultimate powerhouse for all your AI needs. By incorporating Supermicro’s BigTwin as a fundamental building block for OpenShift alongside Supermicro’s GPU servers, this solution becomes one of the industry’s leading containerized solutions for AI and deep learning.
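The -6.13% to +2.29% range can be read as a relative difference in time-to-train, the metric MLPerf Training reports (lower is better). A minimal sketch of that interpretation in Python; the function name and the sample times are illustrative, not figures from the whitepaper:

```python
def relative_perf(nvidia_minutes: float, tested_minutes: float) -> float:
    """Percent difference versus a published NVIDIA time-to-train.

    MLPerf Training measures time-to-train, so lower is better:
    a negative result means the tested system was slower than NVIDIA's,
    a positive result means it was faster.
    """
    return (nvidia_minutes - tested_minutes) / nvidia_minutes * 100.0

# Hypothetical times, for illustration only:
print(round(relative_perf(100.0, 106.13), 2))  # -6.13 -> slower than NVIDIA
print(round(relative_perf(100.0, 97.71), 2))   #  2.29 -> faster than NVIDIA
```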
In this whitepaper, we run the AI workload MLPerf Training v0.6 on the Red Hat® OpenShift® Container Platform with Supermicro® hardware and compare it to the MLPerf Training v0.6 results published by NVIDIA. Red Hat OpenShift is a leading enterprise Kubernetes platform, powered by open source innovation and designed to make container adoption and orchestration simple. This Supermicro and Red Hat reference architecture for OpenShift with NVIDIA GPUs describes how this AI infrastructure allows you to run and monitor MLPerf Training v0.6 in containers based on Red Hat® Enterprise Linux®. To our knowledge, this is the first time Red Hat Enterprise Linux-based containers were created for MLPerf v0.6 and run on a Supermicro multi-GPU system with a 100G network, as opposed to the commonly used NVIDIA NGC containers on an NVIDIA DGX-1. In addition to excellent performance, we demonstrate how OpenShift provides easy access to high-performance machine learning model training when running on this Supermicro reference architecture.
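On OpenShift, a training container typically requests GPUs through the extended resource exposed by the NVIDIA device plugin. A minimal illustrative pod fragment follows; the pod name, image reference, and GPU count are placeholders and not the actual containers used in this work:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mlperf-training            # placeholder name
spec:
  restartPolicy: Never
  containers:
  - name: trainer
    image: registry.example.com/mlperf-rhel:v0.6   # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 4          # schedule onto a node with 4 free GPUs
```

The `nvidia.com/gpu` resource ensures Kubernetes places the pod only on nodes with available GPUs and makes those devices visible inside the container.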