Terrill Dicki
May 13, 2026 17:28
NVIDIA’s XANI workflow slashes nanoscale imaging data analysis from 9 months to under 4 hours using Grace Blackwell Superchips.
NVIDIA has unveiled a major breakthrough in nanoscale imaging with its Accelerated X-ray Analysis for Nanoscale Imaging (XANI) workflow. Using its Grace Blackwell Superchips, the company has cut data processing time for X-ray free-electron laser (XFEL) facilities from 9 months to under 4 hours, an improvement of over 1,000x.
XFEL facilities, such as LCLS-II in the U.S. and European XFEL in Germany, generate massive datasets while probing the atomic and electronic dynamics of advanced materials like semiconductors, batteries, and catalysts. These facilities produce up to 1 million X-ray pulses per second, capturing structural shifts at the atomic level in real time. However, analyzing the resulting terabytes of multidimensional data has traditionally been a computational bottleneck.
NVIDIA’s XANI solution leverages the GB200 Grace Blackwell Superchips to accelerate this process. By combining GPU-based processing with CUDA Python and distributed computing, the team compressed the analysis of 42 terabytes of data to under 4 hours while maintaining precision. This is a stark contrast to traditional CPU-bound workflows, which often process just 10% of a dataset during experiments.
Key Innovations in XANI
Several technical advancements underpin XANI’s performance:
GPU Acceleration: XANI achieved a 43x speedup on a single GPU and a 1,000x speedup on 64 GPUs compared to previous CPU-based methods.
cuPyNumeric Libraries: New libraries, like LMFIT and multithreaded HDF5, improved GPU utilization and enabled 165x faster I/O throughput.
GPUDirect Storage (GDS): By loading data directly into GPU memory, XANI bypasses CPU bottlenecks, enabling read speeds of up to 700 GB/s across 16 Grace Blackwell nodes.
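The figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only numbers from the article and assumes a 30-day month, decimal terabytes, and that the 43x and 1,000x speedups share the same CPU baseline; the variable names are illustrative.

```python
# Back-of-the-envelope checks using only figures quoted in the article.
HOURS_PER_MONTH = 30 * 24              # assumption: a 30-day month

baseline_hours = 9 * HOURS_PER_MONTH   # "9 months" of CPU-bound analysis
accelerated_hours = 4                  # "under 4 hours" (upper bound)
end_to_end_speedup = baseline_hours / accelerated_hours

single_gpu_speedup = 43                # quoted speedup on 1 GPU
multi_gpu_speedup = 1000               # quoted speedup on 64 GPUs
num_gpus = 64
# Gain from going 1 -> 64 GPUs, assuming both use the same CPU baseline.
scaling_gain = multi_gpu_speedup / single_gpu_speedup
parallel_efficiency = scaling_gain / num_gpus

dataset_bytes = 42e12                  # 42 TB, decimal
peak_read_bytes_per_s = 700e9          # 700 GB/s quoted for 16 nodes
min_read_seconds = dataset_bytes / peak_read_bytes_per_s
# Average throughput needed to get through 42 TB in 4 hours.
required_bytes_per_s = dataset_bytes / (accelerated_hours * 3600)

print(f"end-to-end speedup:   {end_to_end_speedup:.0f}x")
print(f"1 -> 64 GPU gain:     {scaling_gain:.1f}x "
      f"({parallel_efficiency:.0%} parallel efficiency)")
print(f"42 TB at peak read:   {min_read_seconds:.0f} s")
print(f"required average I/O: {required_bytes_per_s / 1e9:.2f} GB/s")
```

Under these assumptions the 9-month-to-4-hour claim works out to roughly 1,600x, consistent with "over 1,000x", and reading all 42 TB at the quoted peak rate would take about a minute. These are derived estimates, not measurements reported by NVIDIA.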
The workflow also introduces a distributed memory architecture that simplifies scientific computing. By swapping NumPy imports for cuPyNumeric, researchers can automatically parallelize operations across clusters without writing complex MPI code. This makes XANI accessible to fields beyond physics, including materials chemistry and quantum computing.
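The import swap described here can be sketched as follows. This is a minimal illustration, not code from XANI: the function `mean_subtracted_energy` and the toy data are hypothetical, and the fallback to NumPy is only there so the snippet runs on a machine without cuPyNumeric installed.

```python
# Drop-in import swap: use cuPyNumeric when it is installed, otherwise NumPy.
try:
    import cupynumeric as np  # distributed, GPU-accelerated NumPy replacement
except ImportError:
    import numpy as np        # fallback so the sketch still runs anywhere


def mean_subtracted_energy(frames):
    """Per-pixel signal energy of a (time, height, width) detector stack."""
    baseline = frames.mean(axis=0)          # time-averaged image
    fluctuations = frames - baseline        # broadcast over the time axis
    return (fluctuations ** 2).sum(axis=0)  # reduce over time, one value per pixel


# Hypothetical toy data standing in for detector frames.
frames = np.arange(24, dtype=np.float64).reshape(4, 2, 3)
energy = mean_subtracted_energy(frames)
print(energy.shape)
```

The array code itself contains no communication or device-management calls. With cuPyNumeric, how the work is spread across GPUs and nodes is decided by the underlying runtime when the script is launched rather than in the script, which is the simplification the article refers to.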
Scaling for Next-Gen Research
The XANI architecture is designed for scalability. With its GPU-centric distributed model, scientists can now analyze data in real time, providing live feedback during experiments. This capability could redefine how XFEL facilities operate, reducing delays between data collection and actionable insights.
Thanks to advances in nonlinear least-squares algorithms and batched GPU computation, XANI can process high-resolution imaging data down to the pixel level. The workflow’s ability to fit damped oscillations to detector data in parallel delivers faster and more precise results than before.
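To make the idea of a batched per-pixel fit concrete, here is a minimal sketch of a Levenberg-Marquardt loop that fits a damped cosine to every pixel at once. It is not XANI's implementation: the model form A·exp(-g·t)·cos(w·t), the function names, the damping schedule, and the synthetic data are all assumptions chosen for illustration, and the code is written against plain NumPy.

```python
import numpy as np


def model(params, t):
    """Damped cosine A * exp(-g t) * cos(w t) for a batch of pixels.

    params: (P, 3) array of [A, g, w]; t: (T,) sample times. Returns (P, T).
    """
    A, g, w = (params[:, i:i + 1] for i in range(3))  # each (P, 1)
    return A * np.exp(-g * t) * np.cos(w * t)


def jacobian(params, t):
    """Analytic derivatives of the model w.r.t. [A, g, w]; shape (P, T, 3)."""
    A, g, w = (params[:, i:i + 1] for i in range(3))
    decay = np.exp(-g * t)
    cos_wt, sin_wt = np.cos(w * t), np.sin(w * t)
    dA = decay * cos_wt
    dg = -t * A * decay * cos_wt
    dw = -t * A * decay * sin_wt
    return np.stack([dA, dg, dw], axis=-1)


def batched_fit(y, t, p0, n_iter=50):
    """Levenberg-Marquardt with one damping factor per pixel, all pixels at once."""
    params = p0.astype(np.float64).copy()
    n_pix, n_par = params.shape
    lam = np.full(n_pix, 1e-3)                      # per-pixel damping
    eye = np.eye(n_par)
    cost = ((y - model(params, t)) ** 2).sum(axis=1)
    for _ in range(n_iter):
        r = y - model(params, t)                    # residuals, (P, T)
        J = jacobian(params, t)                     # (P, T, 3)
        JtJ = np.einsum("pti,ptj->pij", J, J)       # normal matrices, (P, 3, 3)
        Jtr = np.einsum("pti,pt->pi", J, r)         # gradients, (P, 3)
        lhs = JtJ + lam[:, None, None] * eye
        step = np.linalg.solve(lhs, Jtr[..., None])[..., 0]
        trial = params + step
        trial_cost = ((y - model(trial, t)) ** 2).sum(axis=1)
        better = trial_cost < cost                  # accept only improving steps
        params = np.where(better[:, None], trial, params)
        cost = np.where(better, trial_cost, cost)
        lam = np.where(better, lam * 0.5, lam * 10.0)
    return params


# Synthetic, noise-free data (hypothetical values, not from XANI).
rng = np.random.default_rng(0)
n_pixels = 1000
t = np.linspace(0.0, 5.0, 200)
true_params = np.column_stack([
    rng.uniform(1.0, 2.0, n_pixels),   # amplitude A
    rng.uniform(0.3, 0.8, n_pixels),   # decay rate g
    rng.uniform(4.0, 6.0, n_pixels),   # angular frequency w
])
y = model(true_params, t)
# Start each pixel within 5% of its true parameters.
p0 = true_params * (1.0 + 0.05 * rng.uniform(-1.0, 1.0, true_params.shape))
fitted = batched_fit(y, t, p0)
max_error = np.abs(fitted - true_params).max()
print(f"largest parameter error: {max_error:.2e}")
```

Every pixel solves its own small 3x3 system, but the residuals, Jacobians, and solves are expressed as array operations over the whole batch, which is the pattern that maps well onto GPUs. Real detector data would add noise, a phase and offset term, and a strategy for choosing initial guesses, none of which are covered here.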
Implications for Scientific Discovery
NVIDIA’s XANI workflow represents a paradigm shift for high-performance computing in scientific research. By reducing analysis times from months to hours, it accelerates discoveries in materials science, quantum physics, and beyond. XFEL facilities worldwide now stand to benefit from these efficiencies, unlocking new opportunities for real-time experimentation.
For researchers, the implications are clear: advanced GPU-based systems like Grace Blackwell Superchips are becoming indispensable tools in tackling the data challenges of modern science.
Image source: Shutterstock







