Berkeley Lab technology dramatically speeds up searches of large databases

DOE/Lawrence Berkeley National Lab : 16 July, 2006  (Company News)
In the world of physics, one of the most elusive events is the creation and detection of 'quark-gluon plasma,' the theorized atomic outcome of the 'Big Bang' which could provide insight into the origins of the universe. By using experiments that involve millions of particle collisions, researchers hope to find unambiguous evidence of quark-gluon plasma.
Scientists describe such a collision with unambiguous evidence as a 'rare event,' which may be an understatement. For example, out of hundreds of millions of particle collisions in one experiment, an analysis found that only 80 collisions or 'events' merited further study as scientists search for evidence of 'jet quenching,' a phenomenon that may indicate the existence of quark-gluon plasma. Other research into such exotic physics phenomena as 'strangelets' needs to go through similar search processes.

In this full-energy collision between gold ions at Brookhaven Lab's Relativistic Heavy Ion Collider, as captured by the STAR detector, the tracks indicate the paths taken by thousands of subatomic particles produced in the collision as they pass through STAR's Time Projection Chamber, a large, 3-D digital camera designed at Berkeley Lab.

Compounding the complexity of the search is the fact that the data files are on mass storage systems around the world, so locating and extracting these scientific needles from a virtual haystack of information would be very time-consuming and labor-intensive. For example, the brute-force approach of reading every record of the petabytes of distributed data from the Relativistic Heavy Ion Collider experiment called STAR at Brookhaven National Laboratory could take weeks at a time. The key to speeding up the searching process is to quickly locate those interesting events while ignoring millions of others so the important data can be extracted for further analysis.

Now, a search technology developed by researchers at the U.S. Department of Energy's Lawrence Berkeley National Laboratory makes the job much easier. The technology, known as the Word-Aligned Hybrid compression method, was developed and recently patented by John Wu, Arie Shoshani and Ekow Otoo of Berkeley Lab's Scientific Data Management Research Group.

WAH is currently used in a software package called FastBit to compress bitmap indexes. A bitmap index is a method of reducing the response time of queries involving common types of conditions in data objects, such as 'state = CA' and 'age >= 21.' It achieves this by storing certain pre-computed answers as bitmaps. For example, a bitmap index for 'state' might have one bitmap for each state in the U.S. Because computers can manipulate bitmaps efficiently, bitmap indexes can quickly find the interesting records in large datasets.
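The idea can be illustrated with a minimal Python sketch. It is not FastBit code: the record layout, the function names `build_bitmap_index` and `matching_rows`, and the use of Python integers as uncompressed bitmaps are all assumptions made for illustration. Each distinct column value gets one bitmap, and the two example conditions are answered with bitwise OR and AND instead of a scan over the records.

```python
from collections import defaultdict


def build_bitmap_index(records, column):
    """Map each distinct value of `column` to an int bitmap.

    Bit i of a bitmap is set when record i holds that value.
    """
    index = defaultdict(int)
    for row_id, record in enumerate(records):
        index[record[column]] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Return the row ids whose bit is set in `bitmap`."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Hypothetical sample data, invented for this example.
records = [
    {"state": "CA", "age": 34},
    {"state": "NY", "age": 19},
    {"state": "CA", "age": 18},
    {"state": "TX", "age": 52},
    {"state": "CA", "age": 21},
]

state_index = build_bitmap_index(records, "state")
age_index = build_bitmap_index(records, "age")

# 'age >= 21' is answered by OR-ing the bitmaps of every qualifying age value.
age_ge_21 = 0
for age, bitmap in age_index.items():
    if age >= 21:
        age_ge_21 |= bitmap

# 'state = CA' AND 'age >= 21' is then a single bitwise AND.
hits = state_index.get("CA", 0) & age_ge_21
print(matching_rows(hits))  # [0, 4]
```

A column with many distinct values, such as 'age', needs many bitmaps, which is one reason compressing them matters.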

WAH compression makes the bitmap index optimal in terms of computational complexity. A small number of the most efficient indexing schemes have this optimality property. What makes the new technology unique is that WAH-compressed indexes significantly outperform other schemes in tests.
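The article does not describe how WAH encodes a bitmap, so the following is only a simplified sketch of word-aligned run-length compression in the spirit of the method's name, not the patented algorithm or FastBit's implementation. The layout is an assumption for illustration: the bitmap is cut into 31-bit groups, a group with mixed bits is stored as a literal 32-bit word, and consecutive groups that are all 0s or all 1s are merged into a single "fill" word that carries a counter.

```python
GROUP = 31                   # payload bits carried by one 32-bit word
FILL_FLAG = 1 << 31          # top bit set: the word is a fill, not a literal
FILL_BIT = 1 << 30           # in a fill word: the bit value being repeated
MAX_RUN = (1 << 30) - 1      # low 30 bits of a fill word count the groups
ALL_ONES = (1 << GROUP) - 1


def compress(bits):
    """Encode a list of 0/1 values as a list of 32-bit words."""
    words = []
    for start in range(0, len(bits), GROUP):
        chunk = bits[start:start + GROUP]
        chunk = chunk + [0] * (GROUP - len(chunk))  # zero-pad the last group
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        if value in (0, ALL_ONES):
            fill = FILL_FLAG | (FILL_BIT if value else 0)
            last = words[-1] if words else 0
            # Extend the previous fill word if it repeats the same bit value.
            if (last & ~MAX_RUN) == fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] += 1
            else:
                words.append(fill | 1)
        else:
            words.append(value)  # literal word: top bit stays 0
    return words


def decompress(words, length):
    """Invert compress(); `length` is the original number of bits."""
    bits = []
    for word in words:
        if word & FILL_FLAG:
            bit = 1 if word & FILL_BIT else 0
            bits.extend([bit] * (GROUP * (word & MAX_RUN)))
        else:
            bits.extend((word >> s) & 1 for s in range(GROUP - 1, -1, -1))
    return bits[:length]


# A sparse bitmap, typical of one value in a high-cardinality column.
bitmap = [0] * 1000 + [1, 0, 1] + [1] * 93 + [0] * 500
words = compress(bitmap)
assert decompress(words, len(bitmap)) == bitmap
print(len(words), "words instead of", -(-len(bitmap) // GROUP), "groups")
```

Because every encoded unit is a whole machine word, logical operations on such bitmaps can proceed word by word without unpacking individual bits, which is the property the "word-aligned" name points to.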

'In tests conducted using actual data from high-energy physics experiments, we confirmed that our FastBit software is an order of magnitude faster than the best-known bitmap indexing schemes on average,' according to Wu, the lead developer of FastBit.

Thanks to work led by Berkeley Lab, the physicists working on the Solenoidal Tracker at RHIC (STAR) high-energy physics experiment at Brookhaven now have a much more efficient tool in their search for evidence of the quark-gluon plasma. As their research evolves and the complexity of the problem becomes clearer, scientists are finding that a new state of matter was definitely created at STAR, but to unambiguously characterize this new state of matter as the quark-gluon plasma, they need more sophisticated search criteria to locate the 'rare' collision events that would contain the clear signatures of the plasma.

Grid Collector, the software module for the STAR analysis framework, uses two technologies to provide STAR analysts with a new way of accessing collision data. The first is FastBit's searching capability, and the second is Storage Resource Managers, which provide access to files stored on remote storage systems. Both technologies were developed as part of DOE's Scientific Discovery through Advanced Computing Program. Instead of selecting the data files that contain the desired events as was previously done, analysts can now select events based on physically meaningful attributes known as tags. Through Grid Collector, analysis programs only read the selected events, instead of every event in the selected data files. Since most analysis jobs use only a fraction of the events in data files, the Grid Collector can significantly improve the turnaround time.
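The difference between selecting files and selecting events can be sketched in a few lines of Python. This is not Grid Collector's interface: the catalog layout, the tag names `n_tracks` and `max_pt`, the file names, and the helper functions are all hypothetical, chosen only to show why reading just the selected events touches far less data.

```python
from collections import defaultdict

# Hypothetical tag catalog: one entry per event, recording which file
# holds the event, its position in that file, and a few tag values.
catalog = [
    {"file": "run1.dat", "pos": 0, "n_tracks": 120, "max_pt": 2.1},
    {"file": "run1.dat", "pos": 1, "n_tracks": 4100, "max_pt": 9.7},
    {"file": "run1.dat", "pos": 2, "n_tracks": 300, "max_pt": 1.4},
    {"file": "run2.dat", "pos": 0, "n_tracks": 3900, "max_pt": 8.2},
    {"file": "run2.dat", "pos": 1, "n_tracks": 95, "max_pt": 0.9},
]


def select_events(catalog, predicate):
    """Return {file: [positions]} for events whose tags satisfy `predicate`."""
    wanted = defaultdict(list)
    for entry in catalog:
        if predicate(entry):
            wanted[entry["file"]].append(entry["pos"])
    return dict(wanted)


def events_read_file_based(catalog, wanted):
    """File-level selection: every event in each selected file is read."""
    return sum(1 for entry in catalog if entry["file"] in wanted)


def events_read_event_based(wanted):
    """Event-level selection: only the matching events are read."""
    return sum(len(positions) for positions in wanted.values())


wanted = select_events(
    catalog, lambda e: e["n_tracks"] > 3000 and e["max_pt"] > 8.0
)
print(wanted)
print(events_read_file_based(catalog, wanted), "events read by file")
print(events_read_event_based(wanted), "events read by event")
```

In this toy catalog the saving is small, but the gap grows with the ratio of stored events to selected events, as in the search for 80 events among hundreds of millions of collisions.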

Without Grid Collector, many analysis jobs involving searches for rare events were considered nearly unfeasible. For example, Markus Oldenburg of Berkeley Lab's Nuclear Science Division and his colleagues were interested in 80 special events collected in 2001. Most participants in the project thought they could make more progress by pursuing alternative signatures, rather than spending the time to extract these 80 events. With Grid Collector, the researchers were able to extract the events in 15 minutes.

'The Grid Collector has opened new avenues for many challenging analysis jobs that we had to ignore or delay. These jobs are now practical with this innovative technology,' said Jerome Lauret, software coordinator for the RHIC/STAR experiment. 'By using FastBit, we may have very well abolished one limiting factor for our field.'