The Evolving Data Lifecycle

on Sat, 10/19/2013 - 01:16

This essay is based on my presentation at the eResearch Conference, Brisbane, Australia, 10/21/2013.

The spotlight is on Data

Data has now taken center stage in the research process. The amount of data ranges from the enormous quantities produced by large planned science missions to the smaller amounts produced by individual researchers, the so-called long tail of science. While the current focus is on data, it is important to look at data in the context of the research process itself: the data life cycle.

Looking at the Data Life Cycle

A scientific research process can be represented as a data lifecycle consisting of a series of stages through which data passes during its lifetime. These stages include data processing, archiving, discovery, and finally use. Use itself encompasses several sub-stages: access, integration, visualization, analysis, and sharing. The stages may vary slightly between science domains and applications, but in general they remain consistent across many domains. The goal of informatics researchers is to make this process efficient for researchers, address existing gaps and hurdles, seamlessly integrate new and evolving technology, and enable new types of research capabilities.

Factors Impacting Data Life Cycle

The data life cycle is dynamic and constantly evolving, driven by several factors. These factors change the life cycle at both the micro and the macro level. At the micro level, the changes affect individual steps within the cycle, whereas at the macro level, the steps that constitute the cycle may themselves be modified. While these factors may overlap, they can be categorized based on four different perspectives. These are:

1. Data Perspective