Osaka University (Osaka U) is a leading research university in Japan. Its Cybermedia Center (CMC) hosts the university’s supercomputing resources. Historically, supercomputers at Osaka U were built to support both research and general education needs. To continue to attract leading researchers, CMC built a world-class, heterogeneous cluster targeted at scientific computing for a variety of workloads programmed for different architectures. The OCTOPUS cluster now attracts new users running a wide variety of workloads, from simulation to AI and machine learning.
Innovation in research often begins with brilliant minds supported by latest-generation High-Performance Computing (HPC) resources. Osaka U’s CMC supports a large variety of scientific fields that rely on supercomputing resources for breakthroughs, including high-energy physics, molecular dynamics, materials science, life sciences, dentistry, and social sciences. Recently, a researcher used CMC systems to understand vortex breakdowns in supersonic flows. His breakthroughs are expected to contribute to the development of a supersonic combustion ramjet engine for air and space planes. Other activities are described in the university’s research profile.
“There is a growing demand for supercomputing in every field of science,” stated Susumu Date, Associate Professor at Osaka U’s CMC, “because researchers today rely heavily on scientific computing prior to the experimental stage and afterwards to analyze and correlate the results of observations.”
CMC’s earlier computing resources were designed to support both HPC and non-research needs. Users ran into conflicts when those systems were partitioned between general users and parallel computing users, which made them an unreliable resource for scientific computing. Seeking to continue to support important research areas, and guided by feedback from its users, Osaka U needed to expand its parallel computing capabilities beyond the existing systems in its data center.
“Our users’ biggest challenge, in most cases, is to achieve inter-node and intra-node parallelism,” added Professor Date. “Many are working with MPI and OpenMP coding to achieve greater parallelism. We needed to deliver more resources that supported their work.”
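The two levels Professor Date mentions can be pictured as nested block decompositions: the global index space is first split across processes on different nodes, and each process then splits its share across threads within the node. The following is a minimal sketch of that index arithmetic only, written in plain Python rather than with MPI or OpenMP. The names `block_range`, `rank_partial`, and `hybrid_sum_of_squares` are hypothetical, the "ranks" are simulated by a simple loop, and the sum of squares is just a placeholder workload.

```python
from concurrent.futures import ThreadPoolExecutor


def block_range(n, parts, idx):
    """Return the half-open range [lo, hi) of block `idx` when n items are
    split into `parts` near-equal contiguous blocks."""
    base, extra = divmod(n, parts)
    lo = idx * base + min(idx, extra)
    hi = lo + base + (1 if idx < extra else 0)
    return lo, hi


def thread_partial(lo, hi):
    # Work done by one "thread": sum of squares over its slice (placeholder workload).
    return sum(i * i for i in range(lo, hi))


def rank_partial(n, num_ranks, rank, threads_per_rank):
    # Inter-node level: this simulated "rank" owns one block of the global range.
    r_lo, r_hi = block_range(n, num_ranks, rank)
    local_n = r_hi - r_lo
    # Intra-node level: split the rank's block again across threads.
    with ThreadPoolExecutor(max_workers=threads_per_rank) as pool:
        futures = []
        for t in range(threads_per_rank):
            t_lo, t_hi = block_range(local_n, threads_per_rank, t)
            futures.append(pool.submit(thread_partial, r_lo + t_lo, r_lo + t_hi))
        return sum(f.result() for f in futures)


def hybrid_sum_of_squares(n, num_ranks, threads_per_rank):
    # Stand-in for the final reduction across ranks (a reduce step in real MPI code).
    return sum(rank_partial(n, num_ranks, r, threads_per_rank)
               for r in range(num_ranks))


if __name__ == "__main__":
    print(hybrid_sum_of_squares(1000, num_ranks=4, threads_per_rank=3))
```

In a real hybrid code the outer loop would be replaced by MPI ranks running concurrently on separate nodes and the inner pool by an OpenMP parallel loop with a reduction; the block boundaries would be computed the same way.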
CMC’s research and user feedback resulted in the building of a new petascale heterogeneous supercomputer that supports a variety of scientific computing domains—simulation, visualization, AI/machine learning, and HPDA—on a single system.
Built on Intel® Xeon® Scalable processors, Intel® Xeon Phi™ processors, and GPUs, OCTOPUS supports a wide range of scientific research.
The Osaka University Cybermedia Center’s Over-Petascale Universal Supercomputer (OCTOPUS) supports researchers using a wide variety of coding and application environments, from open-source and commercial codes written for x86 Intel® Architecture (IA) to CUDA-based GPU codes, targeting traditional simulation, AI frameworks, genomics, and other fields of research.
“We had to explore the architecture of a new HPC system in terms of both hardware and software,” explained Professor Date, “so more people could take advantage of supercomputing resources. In particular, we had to look at an integrated architecture approach for HPC and HPDA, using x86 and other architectures.”
One of the key challenges in designing the system was to increase compute capacity within the data center’s power and cooling budget. By leveraging the performance and power efficiency of latest-generation CPUs and GPUs and integrating Asetek’s RackCDU Direct-to-Chip liquid cooling on all compute nodes (including GPUs), CMC could maintain reliable and stable performance across the cluster without increasing operational and power budgets.
The new system delivers 1.463 petaFLOPS¹ of throughput using multiple types of processor architectures and a Lustre* filesystem interconnected with InfiniBand* Architecture at 100 Gbps. OCTOPUS was built by NEC using Intel® Xeon® Scalable processors, Intel® Xeon Phi™ 7210 processors based on Many Integrated Core (MIC) architecture, Tesla* P100 GPUs (CUDA architecture), and a DataDirect Networks (DDN) EXAScaler* storage system. It went into production in December 2017.
Osaka University OCTOPUS Supercomputer at a Glance:
- Heterogeneous supercomputer to meet widely diverse research needs in simulation, visualization, AI/machine learning, and high-performance data analytics (HPDA)
- Intel® Xeon® Gold 6126 processors (236 nodes), Intel® Xeon® Platinum 8153 processors (2 nodes), Intel Xeon Phi 7210 processors (44 nodes)
- Intel Xeon Gold 6126 processors with four NVIDIA Tesla P100 GPUs per node, connected using NVIDIA NVLink* (37 nodes)
- 5X the compute capacity of the previous system at lower cost¹
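As a rough check on how the node counts above relate to the quoted 1.463 petaFLOPS, the following sketch adds up a theoretical double-precision peak for each node type. Only the node counts and processor models come from this document. The sockets per node, core counts, clock speeds, FLOPs per cycle, and the 5.3 TFLOPS per-GPU figure are assumptions based on typical published specifications for these parts, and the function names are hypothetical.

```python
# Each entry: (label, nodes, sockets per node, cores per socket, clock in GHz,
#              double-precision FLOPs per cycle per core, accelerator TFLOPS per node)
# Node counts are from the list above; every other number is an assumption.
NODE_GROUPS = [
    ("Xeon Gold 6126 CPU nodes",         236, 2, 12, 2.6, 32, 0.0),
    ("Xeon Platinum 8153 large nodes",     2, 8, 16, 2.0, 32, 0.0),
    ("Xeon Phi 7210 nodes",               44, 1, 64, 1.3, 32, 0.0),
    ("Xeon Gold 6126 + 4x P100 nodes",    37, 2, 12, 2.6, 32, 4 * 5.3),
]


def node_peak_tflops(sockets, cores, ghz, flops_per_cycle, accel_tflops):
    """Theoretical peak of one node in TFLOPS: CPU part plus accelerator part."""
    cpu_gflops = sockets * cores * ghz * flops_per_cycle
    return cpu_gflops / 1000.0 + accel_tflops


def system_peak_pflops(groups):
    """Sum the per-node peaks over all node groups and convert to PFLOPS."""
    total_tflops = 0.0
    for _label, nodes, sockets, cores, ghz, fpc, accel in groups:
        total_tflops += nodes * node_peak_tflops(sockets, cores, ghz, fpc, accel)
    return total_tflops / 1000.0


if __name__ == "__main__":
    for label, nodes, *spec in NODE_GROUPS:
        print(f"{label}: {nodes} x {node_peak_tflops(*spec):.4f} TFLOPS")
    print(f"Total: {system_peak_pflops(NODE_GROUPS):.3f} PFLOPS")
```

Under these assumptions the total comes to about 1.463 PFLOPS, with the 37 GPU nodes contributing more than half of it. The figure is a theoretical peak, not a measured benchmark result.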
Osaka University Cybermedia Center.
The new supercomputer boosts Osaka U’s scientific computing capacity by five times, which has given researchers a new level of resources to work with.
“The new system is leading to an increase of users, which is a good impact,” concluded Professor Date.
Because OCTOPUS is heterogeneous, users can choose the resources they need based on their particular codes and research: IA or MIC Intel CPUs, or CUDA GPUs. CMC has completed user surveys, in which users have reported higher performance than on the previous system.
“Today, OCTOPUS is running machine learning and other AI-related jobs, which we have not seen before,” said Professor Date. “Plus, we are seeing other new types of work from users. We designed the new system for these new workloads.”
Osaka U’s CMC needed to enhance its computing capabilities to keep and attract researchers from around the world. Based on research and user feedback, it specified a one-plus petaFLOPS supercomputer with a heterogeneous architecture. Built on Intel® Xeon® Scalable processors, Intel® Xeon Phi™ processors, and the latest GPUs, the new OCTOPUS cluster delivers 1.463 petaFLOPS, supporting a wide variety of workloads across many scientific fields and drawing new users to the university.
Osaka’s OCTOPUS cluster supports a wide variety of workloads, from simulation to AI and machine learning.
- NEC LX* 406Rh-2 servers with Intel® Xeon® Scalable processors
- NEC LX* 102Rh-1G servers with Intel® Xeon® Scalable processors and NVIDIA P100 GPUs
- NEC Express5800/HR110c-M* servers with Intel® Xeon Phi™ processors
- NEC LX* 116Rg servers with Intel® Xeon® Scalable processors
- DDN EXAScaler (3.1 PB) Lustre storage cluster