Physicists at Brookhaven National Laboratory manage and analyze petabytes of data from the ATLAS Experiment at CERN's Large Hadron Collider in search of answers about the universe's smallest constituents.
One hundred meters below the Swiss-French border, an enclosed 27-kilometer ring serves as an exclusive race track of sorts. Instead of cars with souped-up engines, bunches of protons race around this track and physicists, not pit crews, keep things running smoothly. The goal here is not to cross the finish line first, but rather to cause collisions. When bunches of protons collide, the ATLAS (A Toroidal LHC Apparatus) detector records the photo finish—that is, the particles that are the byproducts of the collision.
The ATLAS Experiment, part of the Large Hadron Collider (LHC)—the world's biggest particle accelerator—at CERN (the European Organization for Nuclear Research), represents a worldwide effort to answer big questions about the smallest particles. "We want to know, what are the fundamental particles in the universe, what are they made of, could there be additional particles that we don't know about, and how do they interact?" says Howard Gordon, U.S. ATLAS deputy operations program manager at Brookhaven National Lab (BNL) in New York, the host laboratory for ATLAS in the U.S.
Finding the answers to these questions, such as what gives particles mass and what makes up dark matter, could greatly advance not only the world's knowledge of high-energy physics but also a variety of fields in science and beyond. "We don't necessarily know what the applications of this research will be at this point; this is inquiry-based research," says Gordon.
The task of identifying the universe's fundamental particles is complex enough that it takes 3,000 scientists at 174 institutions around the world to operate the ATLAS detector and analyze the data generated. Data generated at CERN, the Tier-0 computing facility, is transferred to Tier-1, -2, and -3 centers through a federated grid-based computing system. BNL, the largest ATLAS Tier-1 center in the world, is responsible for 23% of the total Tier-1 computing and storage capacity, says Michael Ernst, manager, Relativistic Heavy Ion Collider/ATLAS Computing Facility at BNL.
Srini Rajagopalan, a physicist at BNL who is currently transitioning to Gordon's position, describes the ATLAS detector as a "gigantic, multibillion pixel camera." Bunches of protons pass through each other at the center of the detector 20 million times every second (currently every 50 nanoseconds). "Imagine that you take that many pictures over and over, every 50 nanoseconds of every minute, every hour for months," says Rajagopalan. The amount of data that is produced is far too much to record and store, especially given ATLAS' limited running time (about 30% of the year).
"We just don't have the technology to write out that many events so we have to run algorithms to get the numbers down," says Rajagopalan, who for the past five years has been working on ATLAS' trigger—the name for the algorithms that are programmed to look for specific patterns associated with different physics phenomena. The algorithms must be carefully programmed to suppress what Rajagopalan calls "fakes" (or background events) and to keep events which could lead to new physics results. Even with the aid of the trigger, "We write 300 1.5 megabyte (MB) events to disk every second. That's a Justin Bieber CD's worth of information every second," says the father of a teenage girl with a laugh.
When the LHC first started, there were significantly fewer protons per bunch, but as time goes on the luminosity of the LHC (a measure of how many proton collisions occur) is increasing. More collisions mean more data. Rajagopalan's challenge is to fine-tune the trigger to keep up with this influx of data while preserving the most potentially useful information (events that are not picked up by the trigger are not stored for future use because of the immense volume of data coming in). "We have to look at 20 million events every second, pick around 300 events most interesting to physics under study, and trash the rest immediately." To do this well, the algorithms through which the data passes must be both fast and accurate.
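The figures quoted above can be checked with a little arithmetic. The following is only an illustrative sketch: the function and constant names are made up for this example, and the numbers are the ones given in the article (50-nanosecond spacing, 300 events kept per second, 1.5 MB per event), not values taken from any ATLAS software.

```python
# Back-of-the-envelope check of the trigger figures quoted in the article.
# All names here are hypothetical; only the input numbers come from the text.

BUNCH_SPACING_NS = 50      # time between bunch crossings, in nanoseconds
EVENTS_KEPT_PER_S = 300    # events written to disk each second
EVENT_SIZE_MB = 1.5        # size of one recorded event, in megabytes


def crossing_rate_hz(spacing_ns: float) -> float:
    """Crossings per second for a given spacing in nanoseconds."""
    return 1e9 / spacing_ns


def output_rate_mb_s(events_per_s: float, event_size_mb: float) -> float:
    """Data written to disk per second, in megabytes."""
    return events_per_s * event_size_mb


def kept_fraction(kept_per_s: float, total_per_s: float) -> float:
    """Fraction of crossings that survive the trigger."""
    return kept_per_s / total_per_s


if __name__ == "__main__":
    total = crossing_rate_hz(BUNCH_SPACING_NS)
    print(f"Crossings per second: {total:,.0f}")
    print(f"Disk output: {output_rate_mb_s(EVENTS_KEPT_PER_S, EVENT_SIZE_MB):.0f} MB/s")
    print(f"Fraction kept: {kept_fraction(EVENTS_KEPT_PER_S, total):.1e}")
```

With these inputs, a 50-nanosecond spacing gives 20 million crossings per second, the disk output comes to 450 MB per second, and the trigger keeps roughly 1 crossing in every 67,000.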
Data management and analysis
ATLAS raw data, which originates at the LHC, is maintained in a distributed fashion at Tier-1 centers through the Worldwide LHC Computing Grid. A grid system (as opposed to one central data repository) is ideal not only because of the sheer amount of data generated, but because a federated grid spurs a diversity of approaches that leads to the adoption of best practices, says Ernst. "Scientific innovation in data-intensive science requires distributed access to that data."
Through the Worldwide LHC Computing Grid, BNL receives its share of raw data almost instantly from CERN. BNL reprocesses this data with improved processing capabilities (interpreting the electronic signals produced by the detector to determine the original particles that passed through, their momenta and directions, and the primary vertex of the event) and then adds it back into the complete data set (composed of the results from all Tier-1 facilities). While a complete dataset is kept at CERN, it is not stored in a form that can be used for analysis. Instead, BNL and other Tier-1 facilities create derived datasets for physicists to use for further analysis, including duplicate datasets for the most popular data. "We must provide access to a huge data volume, requested by several thousand users worldwide simultaneously, in addition to managing more than 50,000 concurrently running user analysis jobs," says Ernst. The data hosted at BNL was replicated to Tier-1, -2, and -3 sites at a rate of about 200 MB/second over the past six months.
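To get a feel for what a 200 MB/second replication rate adds up to, here is a rough sketch. It rests on assumptions that are not in the article: that the rate is a sustained average, that six months is about 182.5 days, and that a petabyte is counted in decimal units (one billion megabytes). The function name is hypothetical.

```python
# Rough estimate of the total volume replicated from BNL over six months.
# Assumptions (not from the article): the 200 MB/s figure is a sustained
# average, six months is 182.5 days, and 1 PB = 1e9 MB (decimal units).

SECONDS_PER_DAY = 86_400
MB_PER_PB = 1e9


def replicated_volume_pb(rate_mb_s: float, days: float) -> float:
    """Total data moved, in petabytes, at a constant rate over a number of days."""
    return rate_mb_s * days * SECONDS_PER_DAY / MB_PER_PB


if __name__ == "__main__":
    volume = replicated_volume_pb(200, 182.5)
    print(f"Approximate volume replicated: {volume:.1f} PB")
```

Under these assumptions the total comes to a little over 3 petabytes, which gives a sense of scale next to the yearly data volumes the experiment reports.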
The volume of raw ATLAS data totals about 1 petabyte a year, but that volume is multiplied several times over (there was a total of 7 petabytes of ATLAS data created in 2010) when secondary data analysis and derived datasets are included.
The hunt for Higgs
One of the highest-profile ATLAS projects is the search for the Higgs boson, a hypothetical elementary particle. "We are looking for the Higgs because we believe it is what gives particles mass," says Rajagopalan. Multiple triggers have been designed, each focusing on specific particles into which the Higgs might decay. Each trigger selects certain physics events that could provide evidence of the existence of the Higgs.
About 1 petabyte of raw data has been filtered through the Higgs triggers so far, and Ernst estimates it will take another petabyte of data (amounting to about another year of data collection) before physicists can confirm or rule out the existence of the Higgs. Thus far, physicists have noted a slight excess of events that might point to a Higgs particle, but it is not sufficient to say definitively whether it exists or not.
Either way, work related to the mass of particles is far from over, says Rajagopalan. "If the Higgs is discovered, we'll know that it's there, but we'll need to understand its properties, how it works, how it interacts with other particles. If it's ruled out, we have to work to discover an alternative explanation of what gives particles mass."
The future of ATLAS
"The ATLAS detector will run for as long as the next 20 years," says Gordon, given periodic breaks for maintenance and upgrades. Many of the physicists working on ATLAS at BNL contributed to the original construction of ATLAS detector parts. Now, they have a role in upgrading the parts to allow for more physics capabilities. "We have to improve the trigger to extract the events of interest," says Gordon. "We have some ideas about how to improve the trigger [which is currently running at 20% to 30% of the intended design intensity] when the intensity of beams gets higher."
The next scheduled ATLAS shutdowns are in 2013–2014, 2018, and 2022. In 2022, physicists are scheduled to replace parts that will have become damaged by radiation. The upgrades are not inexpensive—but they are necessary, says Rajagopalan. "The investment in science provides a foundation for our future. Where we are today—all of the advancements in technology, science, and medicine, is because of a solid foundation in basic research. It's important to continue to build that foundation so we have a brighter future tomorrow."
— Diana Friedman
Top image: A view of the ATLAS detector at CERN's Large Hadron Collider.