Treaties forbid the test detonation of nuclear weapons, which creates a problem for national defense developers who need to efficiently certify the effectiveness of their arsenals. Luckily for them, a powerful new supercomputer can now replicate the physical impact of nuclear explosions, albeit digitally. And luckily for the rest of us, the resulting innovations may spill over into more constructive areas.

The number-crunching required to simulate an actual nuclear explosion is staggering. Computer scientists need to simulate molecular-scale reactions taking place over the course of milliseconds. To get this level of detail, researchers at Purdue and the National Nuclear Security Administration's (NNSA) Lawrence Livermore National Laboratory had to coordinate over 100,000 machines. They also had to run many processes in parallel on separate machines in large computer clusters.

And as any computational scientist worth their salt will tell you, once you start to scale this high, you are virtually guaranteed to experience failed error detection and bottlenecks in communication and computation — and that is exactly what happened. The researchers discovered that natural faults in the execution environment frequently produced errors, corrupting memory and breaking communication between machines. The challenge, therefore, was managing the scale.


Their solution: Scalable supercomputer clustering.

In their revised configuration, each machine in the supercomputer cluster contained several processors, each running a "process" during a simulation. The researchers created an automated method for "clustering," or grouping the large number of processes into a smaller number of "equivalence classes" with similar traits. Grouping the processes into equivalence classes made it possible to quickly detect and pinpoint problems.
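The core idea can be illustrated with a minimal sketch. The details below are assumptions for illustration only — the actual traits the researchers used to define equivalence classes are not described here, so this example invents a simple "behavioral signature" per process and flags processes that fall into unusually small classes as likely culprits:

```python
from collections import defaultdict

def cluster_processes(signatures):
    """Group process IDs into equivalence classes keyed by a
    behavioral signature (hypothetical trait tuple per process)."""
    classes = defaultdict(list)
    for pid, sig in signatures.items():
        classes[sig].append(pid)
    return classes

def flag_outliers(classes, threshold=1):
    """Processes in very small classes diverge from the common
    behavior, so they are the first place to look for a fault."""
    return [pid
            for sig, pids in classes.items()
            if len(pids) <= threshold
            for pid in pids]

# Hypothetical signatures: (code region, messages sent) per process.
# Process 3 behaves differently from the rest.
sigs = {0: ("A", 5), 1: ("A", 5), 2: ("A", 5), 3: ("B", 0), 4: ("A", 5)}
classes = cluster_processes(sigs)
print(flag_outliers(classes))  # → [3]
```

The payoff is in the search cost: instead of inspecting every one of 100,000-plus processes, a debugger only needs to compare a handful of equivalence classes and examine the small, anomalous ones.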


Their breakthrough marked an important step forward in the development of ultra-precise simulations. It is thought that the same simulation architecture used by these researchers could eventually be applied to such areas as climate modeling and studying the dynamic changes in a protein's shape.

The findings will be presented during the Annual IEEE/IFIP International Conference on Dependable Systems and Networks from June 25-28 in Boston. Recent research findings were detailed in two papers last year, one presented during the IEEE Supercomputing Conference and the other during the International Symposium on High-Performance Parallel and Distributed Computing.
