Data Storage, Data Lakes, Mining, Curation, Analytics, Memory

Both Climate Modeling and Earth System Modeling entail petabytes (~10^15 bytes), if not exabytes (~10^18 bytes), of observational data and sensor network data, as well as the vast amounts of data output from the simulation process itself.

Generally speaking, observational data is best stored locally, near where it was collected, simply because moving data has a cost, and because the 'pride of ownership' factor helps preserve the quality and integrity of such data over the long term. Simulation output data, on the other hand, is best stored near the computing centres at which the simulations are run.
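To give a rough sense of the cost of moving data, the sketch below estimates how long a bulk transfer would take over a sustained network link. The link speeds and the `efficiency` factor are illustrative assumptions, not figures from any of the facilities discussed here, and the function name `transfer_days` is hypothetical.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 10**15  # bytes
EXABYTE = 10**18   # bytes


def transfer_days(size_bytes: float, link_bits_per_second: float,
                  efficiency: float = 1.0) -> float:
    """Estimate the days needed to move size_bytes over a sustained link.

    efficiency is the assumed fraction of the nominal link rate that is
    actually achieved (1.0 means the full nominal rate).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = (size_bytes * 8) / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed link speeds, chosen only for illustration.
    for label, size in (("1 petabyte", PETABYTE), ("1 exabyte", EXABYTE)):
        for gbits in (10, 100):
            days = transfer_days(size, gbits * 1e9)
            print(f"{label} over {gbits} Gbit/s: {days:,.1f} days")
```

Under these assumptions, one petabyte over a fully used 10 Gbit/s link takes a little over nine days, which is one reason to keep large collections near where they are produced or processed.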

When useful data is stored in several separate facilities, it needs to be 'federated' and 'harmonized' so as to become accessible and useful. Gaining access to remotely stored data through global networks is required in all such cases.

Data is only useful if it is of known quality and the circumstances of its collection are well understood. This descriptive information, or 'meta-data', is itself another level of data that needs to be carefully curated and kept along with the original underlying source data.
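One simple way to keep meta-data together with its source data is a 'sidecar' file stored next to each data file. The following is a minimal sketch of that idea, not a description of any particular archive's practice: the `.meta.json` suffix, the function names and the example fields are all hypothetical, and a checksum is included so that later changes to the data can be detected.

```python
import hashlib
import json
from pathlib import Path

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sidecar_path(data_path: Path) -> Path:
    """Location of the sidecar file that belongs to data_path."""
    return data_path.with_name(data_path.name + SIDECAR_SUFFIX)


def write_sidecar(data_path: Path, metadata: dict) -> Path:
    """Store the metadata plus file name and checksum next to the data file."""
    record = dict(metadata)
    record["file_name"] = data_path.name
    record["sha256"] = sha256_of(data_path)
    target = sidecar_path(data_path)
    target.write_text(json.dumps(record, indent=2, sort_keys=True))
    return target


def verify(data_path: Path) -> bool:
    """Check that the data file still matches the checksum in its sidecar."""
    record = json.loads(sidecar_path(data_path).read_text())
    return record["sha256"] == sha256_of(data_path)
```

A call such as `write_sidecar(Path("station_042.csv"), {"instrument": "thermistor", "collected_by": "field team A"})` would record the collection circumstances (the file name and field values here are made up), and `verify` can be run after any copy or transfer to confirm the data is unchanged.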

This section covers many aspects of the entire data life cycle process: 

A geodata fabric for the 21st Century   

The evolution of the Data Scientist       The New Era of Computing: An Interview with "Dr. Data"

Pulling insights from unstructured data     Five steps to de-mystify Big Data Analytics    Big data, big dreams

Data warehouse modernization in the age of big data analytics

Is GOLAP the next wave for big data warehousing?

Data lakes and overcoming the waste of 'data janitor' duties

The polyglot problem: solving the paradox of the 'right' database

DOE exascale roadmap highlights big data    Mission analytics: data-driven decision making in government 

Different databases for different strokes    Software as a service for data scientists 

How to move 80 petabytes of data without down time     Data-Intensive System Evolution

What exactly is Big Data - if it's neither Big nor Data?    Global datasphere to hit 175 zettabytes by 2025

Expert Panel: What’s Around the Bend for Big Data?    5 factors driving the graph database explosion

Array databases: the next big thing in data analytics?     Rating the advanced analytics vendors

Software-defined storage takes off as big data gets bigger

The CAP Theorem's growing impact    EarthServer  

Tool Enables Scientists to Uncover Patterns in Vast Data Sets

New Techniques Turbo-Charge Data Mining    The Evolving Art (and Business) of Data Curation

Codesign challenges for exascale systems: performance, power and reliability

DOE Focuses on Scientific Data Integration      Why science really needs big data

The top 5 reasons to use multi-tier storage for managing scientific data

Understanding data intensive analysis on large-scale HPC compute systems

Stepping up to the life science storage system challenge

To Know, but Not Understand: David Weinberger on Science and Big Data

From Data to Knowledge: machine-learning with real-time and streaming applications

Self-driving databases are coming: what next for DBAs?

Los Alamos releases file index product to software community

From Microprocessors to Nanostores: Rethinking Data-Centric Systems    

What CIOs and CTOs need to know about Big Data and Data Intensive Computing

Availability in Globally Distributed Storage Systems   High performance scalable unified storage 

Big Data, Big Demand: Navigating the Cloud Storage Landscape    SDSC Cloud Storage Services

Supercomputer sails through world history

Storage system for 'big data' dramatically speeds access to information

Big data revolution in astrophysics     Astronomers Leverage "Unprecedented" Data Set

Big Data in Space: Martian Computational Archeology      

Next Generation Team Science Platform     Optimize Storage Placement in Sensor Networks

As Supercomputers Approach Exascale, Experts Wrestle with Big Data    Storage at Exascale

The switch that could double USB memory 

Vendor specific storage:

The future of storage: hardware

HP: Exascale Data Center     Multiparadigm Data Storage for Enterprise Applications

Fujitsu Lets Big Data Cloud Flag Fly     Fujitsu Develops World's First Cloud Platform to Leverage Big Data

IBM storage breakthrough paves way for 330TB tape cartridges
IBM big data VP surveys landscape

IBM Design Wins the Storage Challenge at SC10

IBM Demos Record-Breaking Parallel File System Performance

Parallel File System OrangeFS Starts to Build a Following

IBM Announces HPC Storage Solution for Streaming Data

The Complexity of VMware storage management

ArangoDB reaping the fruits of its multi-modal labor

Forrester reshuffles the deck on BI and analytic tools

MINE: Maximal Information-based Nonparametric Exploration

MINE: Detecting novel associations in large data sets

Big Data file formats demystified


The State of the Lustre Community

Why Lustre Is Set to Excel in Exascale 

Xyratex announces acquisition of Oracle's Lustre assets


Hate Hadoop? Then you are doing it wrong

Hadoop: Big Data, Big Analytics, Big Insights

Large-scale seismic signal processing with Hadoop

Why Hadoop isn't the Big Data solution that you think it is

Spark just passed Hadoop in popularity on the web - here's why

SQL vs non-SQL:

Crowded NoSQL wave shows abundant options

How SQL++ makes JSON more queryable

The new math driving NoSQL analytics

Graph Databases:

A look at the Graph Database landscape

AWS unveils 'Neptune' graph database

In-Memory Computing:

In-memory computing is the key to real-time analytics

Using an In-Memory Data Grid for near real-time data analysis

Using In-Memory Data Grids for global data integration

In-memory boosts Oracle OLTP by 2X, analytics by 1000X 

Memory Technologies:

Next generation photonic memory devices are 'light-written', ultrafast and energy efficient

Making steps toward improved data storage

New 3D chip combines computing and data storage

Samsung now mass producing industry's first 2nd generation, 10-nanometer class DRAM 

Room-temperature operation of low-voltage, non-volatile, compound semiconductor memory cells 

Industry leaders join forces to promote new high performance interconnect    Future memory technology

T-rays will 'speed up' computer memory by a factor of 1,000 

Hybrid memory cube angles for exascale    

New angle for optical memories    New technology of ultrahigh density optical storage

Towards data storage at the single molecule level

A single-atom magnet breaks new ground for future data storage

Write speeds for phase-change memory reach record limits

UK Researchers develop super-fast memory chip

Patent Granted for Super-Fast MRAM Data Storage     

Rice, UCLA slash energy needs for next-generation memory

Tantalizing discovery may boost memory technology

Storage approach mimics DNA in fossils    World's smallest hard disk

IBM Scientists Demonstrate Phase-Change Memory Breakthrough    IBM announces 3 bits/cell PCM 

Phase Change Memory-Based Moneta System Points to the Future of Computer Storage   

Battery and memory device in one    New computer memory can hold data 20 years without power

5D 'Superman memory crystal' heralds unlimited lifetime data storage

Solid state quantum memories set endurance records

Data storage using individual molecules   

Data storage in DNA becomes a reality    DNA storage crams 700 terabytes of data into a single gram

Scientists stored an Amazon gift card on some DNA