Mu2e Simulation Campaign: Utilizing the Open Science Grid to increase sensitivity to new physics

When designing an experiment, one of the greatest challenges is having confidence that you have optimized the sensitivity of your measurement, squeezing every last bit of precision out of your resources. The Mu2e experiment is preparing for DOE Critical Decision 3c review next summer, and they face this challenge in designing their apparatus, including the solenoid system, shielding, and detector components. To model the various beam transport options and detector designs and to estimate background rates, the experiment estimated they would need to run millions of CPU hours of simulation in the months leading up to their review. This was well beyond the Mu2e allocation of computing resources at Fermilab, so Mu2e turned to the Open Science Grid (OSG) for opportunistic CPU hours.
The daily CPU hours utilized by the Mu2e experiment on the Open Science Grid during Oct 1 - 8, 2015. Peak utilization exceeded 500,000 CPU hours in a single day, and the average CPU usage was equivalent to 13.5K cores running continuously, which is more than the entirety of Fermilab GPGrid resources.
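For readers who want to relate the CPU-hour and core figures in the caption, the conversion is simple arithmetic: one core running for a full day delivers 24 CPU hours. The sketch below is illustrative only; the function names are hypothetical, and it assumes every core is busy for the entire period.

```python
HOURS_PER_DAY = 24


def cpu_hours_to_cores(cpu_hours, days=1.0):
    """Average number of continuously busy cores needed to deliver
    `cpu_hours` over `days` days (assumes 100% utilization)."""
    return cpu_hours / (days * HOURS_PER_DAY)


def cores_to_cpu_hours(cores, days=1.0):
    """CPU hours delivered by `cores` continuously busy cores over `days` days."""
    return cores * days * HOURS_PER_DAY


# Peak day from the caption: 500,000 CPU hours in one day.
peak_day_cores = cpu_hours_to_cores(500_000)

# Average from the caption: 13.5K cores, expressed as CPU hours per day.
avg_daily_cpu_hours = cores_to_cpu_hours(13_500)

print(f"Peak day is equivalent to about {peak_day_cores:,.0f} cores")
print(f"13.5K cores deliver {avg_daily_cpu_hours:,.0f} CPU hours per day")
```

Under these assumptions, the peak day corresponds to roughly 20,800 cores busy around the clock, and the 13.5K-core average corresponds to 324,000 CPU hours per day.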
The simulations and calculations needed by Mu2e fit perfectly within the model of OSG operations. The calculations had relatively little input (small text files to configure the simulation parameters), heavy CPU utilization (modeling the transport of millions of particles through the solenoid), and small output files consisting of art-event ROOT files of ~100 MB per job. With the help of the FIFE group, Mu2e was able to distribute their simulation code to worker nodes with CVMFS and then opportunistically run jobs on 15 different OSG sites across the country. Mu2e was able to utilize more than 20 million CPU hours in only 5 months, averaging about 8,000 concurrent jobs with peak usage as high as 20,000 simultaneous jobs, all without any direct capital expenditures from Mu2e. Mu2e has already met their base goal, and this successful campaign continues as they work to achieve their stretch goal of simulating 24 times their planned operational live time for three critical backgrounds. The results of these calculations have drastically increased the understanding of the experiment design, well beyond what could have been achieved with Fermilab CPU resources alone. If your experiment, proposal, or idea needs more computing resources from the OSG, please contact the FIFE group to learn how you too can access all this "free" computing.
Layout of the Mu2e Apparatus from http://mu2e.fnal.gov/public/gen/
- Mike Kirby