Please note that this session was withdrawn and is no longer available in the respective programme. This withdrawal might have been the result of a merge with another session.

ESSI2.11

Large scale data analytics and mining in the Earth Science domains
Convener: S. Fiore | Co-Conveners: G. Aloisio, D. Arctur, P. Fox

Over the last decade, advances in environmental sensor technology, data collection systems, supercomputing platforms, and modeling activity have led to the continuous production of huge amounts of data. Many traditional application domains in the geosciences can now be considered "data intensive", which is driving new requirements for peta/exa-scale data analysis and mining. In this context, the access, analysis, visualization, and mining of large volumes of distributed data play a key role in the scientific productivity of researchers. At the same time, the increasing scale and complexity of source data structures and data management frameworks can obscure the "fitness for use" of individual data sources and derived products. Provenance, uncertainty, trust, and other properties of source data must be captured and properly managed through processing workflows to enable, for example, accurate distinction of anomalies from valid data patterns. Significant improvements in data management will therefore be critical to solving complex scientific problems. Advances in database technologies, data mining, high performance data storage and management, data visualization, and metadata management are strongly needed to address e-Science data management issues in a scalable way.
The goal of this session is to present to researchers, professionals, and practitioners the state of the art in novel approaches and methodologies, in the Earth Science domains, for:
- extreme-scale data analytics,
- high performance data mining,
- high performance I/O,
- parallel data analysis tools,
- peta/exa-scale data processing,
- high performance database management,
- modern hardware to support analytical data processing,
- cloud-based data analysis and mining,
- visualization of massive datasets,
- distributed/grid data analytics and mining,
- performance comparisons between clouds and HPC systems to support data intensive applications,
- capture, management, and incorporation of provenance, uncertainty, trust, and other metadata needed for data consumers and decision makers to understand and utilize, as transparently as possible, the validity and fitness for use of original and derived data from distributed sources through various stages of processing.