
Science Informatics – What is in a name?

RxR, Fri, 07/05/2013 - 21:10
“What's in a name? that which we call a rose / By any other name would smell as sweet.” –Shakespeare, Romeo and Juliet
 
Informatics has become a commonly used term in a wide variety of science domains, yet what is really meant when we use this term? Does informatics mean the same thing for all domains, or are there nuanced differences in scope and meaning associated with the term? 
 
In order to investigate this, the definition and scope of different science informatics terms were reviewed and then compared along the dimensions of their defined objectives and the data life cycle components which they encompass. These different terms and their definitions are presented below in chronological order:
 
Bioinformatics: The term was coined in 1970 and initially defined as “the study of informatic processes in biotic systems” [6]. As advances in the field led to exponential growth in sequence data, the definition evolved as well, eventually coming to mean the development and use of computational methods for the management and analysis of sequence data.
 
The current objective of Bioinformatics is to provide solutions for data management and analysis of biomedical datasets. Accordingly, it focuses on the data management and analysis components of the data life cycle.
 
Environmental Informatics: This term was proposed by Husar in 1990 [5] and defined as the application of scientific methods and contemporary technology to environmental information systems in order to improve their storage, retrieval, and optimal use (analysis). In keeping with this definition, the objective of Environmental Informatics was to improve efficiency by addressing the data access and manipulation issues that were viewed as roadblocks at the time. The data life cycle components covered three phases: data acquisition, data exploration and analysis, and the presentation/dissemination of data.
 
Ecological Informatics: The first reference to this term can be found online in the technical program for the 3rd Conference of the International Society of Ecological Informatics [4]. Ecological Informatics was considered a newly emerging discipline defined as the design and management of ecological databases and their exploration by machine learning of ecosystem analysis, synthesis, and prediction. The focus was also heavily on the design and application of biologically inspired computational techniques for ecological analysis, synthesis, and forecasting.
 
The 4th Conference of the International Society for Ecological Informatics was held in 2004 [9], and two years later the Ecological Informatics Journal began publication. Reflecting changes in the field, the definition of Ecological Informatics was modified to:
 
“[an] interdisciplinary framework promoting the use of advanced computational technology for the elucidation of principles of information processing at and between all levels of complexity of ecosystems—from genes to ecological networks and aiding transparent decision-making in relation to important issues in ecology such as sustainability, biodiversity, and global warming.” [9]
 
With this updated definition, the objective of Ecological Informatics now focused on data integration across ecosystem categories as well as levels of complexity, inference from data patterns to ecological processes, and the adaptive simulation and prediction of ecosystems.  Meanwhile, biologically-inspired computation techniques such as fuzzy logic, cellular automata, artificial neural networks, evolutionary algorithms, and adaptive agents were still considered important topics within the purview of Ecological Informatics. Data life cycle components of data archival, retrieval, analysis, visualization, synthesis, and forecasting remained in the scope of Ecological Informatics.
 
X Informatics: This term was proposed for more general use, with the ‘X’ standing for any science domain. Baker and Fox [8] define X Informatics as both the science and engineering that span the gap between Information Technology systems and Cyber-Infrastructure and the use of digital data, information, and related services for research and knowledge generation.
 
Three layers proposed within this definition are: cyberinformatics, the interface with computing infrastructures; core informatics, concerned with informatics as a discipline; and science informatics (X Informatics) which enables science-domain-specific customizations needed to deliver relevant and valuable services to a wide range of users such as research workers, decision makers, and the general public.
 
The objective of X Informatics is to design solutions focused on the organization of data and information, resulting in increased knowledge extraction. The data life cycle components of X Informatics therefore focus on data discovery, location, acquisition, format conversion, analysis, and visualization.
 
AstroInformatics: AstroInformatics is defined as a “formalization of data-intensive astronomy and astrophysics for research and education” [7]. In this “big data”-centric view, informatics serves as a layer between data and Knowledge Discovery in Databases (KDD) components, providing a standardized presentation of information and serving as a source of information for those KDD components.
As one would expect from this functionality, the data life cycle components within this definition focus primarily on information extraction for use within KDD processes.
 
Climate Informatics: Climate Informatics is defined as the “interface of climate science with machine learning, data mining, statistics, and related fields” [2]. The objective of Climate Informatics is to replicate the impact that machine learning has made on natural sciences such as bioinformatics. As with AstroInformatics, the data life cycle components central to Climate Informatics are analysis and knowledge extraction.
 
EcoInformatics:  EcoInformatics is defined by Michener [3] as “[a] framework that enables scientists to generate new knowledge through innovative tools and approaches for discovering, managing, integrating, analyzing, visualizing and preserving relevant biological, environmental, and socio-economic data and information”. The primary focus of EcoInformatics is on increasing scientific productivity by supporting faster and easier data discovery, integration, and analysis. All stages of the data life cycle are covered including planning, collection of metadata needed both to understand the context and for archiving, data discovery, integration/synthesis, and finally analysis.
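The life cycle coverage described in the definitions above can be tabulated in a short sketch. This is only an illustration: the short component labels, the name `LIFE_CYCLE`, and the helper `terms_by_component` are assumptions made for this example, paraphrased from the text rather than taken from any of the cited sources.

```python
from collections import defaultdict

# Data life cycle components named for each term in the definitions above.
# The short labels are an assumption of this sketch, paraphrased from the text.
LIFE_CYCLE = {
    "Bioinformatics": {"management", "analysis"},
    "Environmental Informatics": {
        "acquisition", "exploration", "analysis", "dissemination",
    },
    "Ecological Informatics": {
        "archival", "retrieval", "analysis", "visualization",
        "synthesis", "forecasting",
    },
    "X Informatics": {
        "discovery", "location", "acquisition", "format conversion",
        "analysis", "visualization",
    },
    "AstroInformatics": {"information extraction"},
    "Climate Informatics": {"analysis", "knowledge extraction"},
    "EcoInformatics": {
        "planning", "metadata collection", "discovery", "integration",
        "synthesis", "analysis",
    },
}


def terms_by_component(coverage):
    """Invert the mapping: component -> sorted list of terms that name it."""
    index = defaultdict(list)
    for term, components in coverage.items():
        for component in components:
            index[component].append(term)
    return {component: sorted(terms) for component, terms in index.items()}


if __name__ == "__main__":
    index = terms_by_component(LIFE_CYCLE)
    # List the most widely shared components first, then alphabetically.
    for component, terms in sorted(
        index.items(), key=lambda item: (-len(item[1]), item[0])
    ):
        print(f"{component}: {', '.join(terms)}")
```

Under these assumed labels, analysis is the component shared by the most terms, while components such as planning or forecasting each appear under a single term. A different choice of labels would shift the groupings, so the output is best read as a rough comparison aid.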
 
Observations
 
Reviewing these science informatics terms reveals two schools of thought. In the first, the scope of informatics is primarily the application of computational tools to analysis, knowledge extraction, and the data management activities that support knowledge extraction; this is a specialized view, focused on the linear progression from data to information to knowledge. In the second, technology is seen as playing an important role in every stage of the data life cycle; consequently, all aspects of informatics need to be considered. This view is more holistic, concerned with the “data life cycle” as an entity instead of its disparate parts and sequences.
 
We are then left with the question: just what is Science Informatics? One could use Tolliver’s definition [1]: “it is a focus on a specific science domain in which information and computational sciences (including information science, library science, computer science, cognitive science, organizational science, etc.) are utilized to support research, education, and application”. Adopting such a holistic view of informatics is of great benefit. As we slowly come to recognize data as a first-class research object within scientific research domains, as opposed to something that is consumed and discarded after knowledge extraction, understanding the data life cycle as a whole becomes increasingly important.
 
Technology now plays a crucial role within this data life cycle: it improves research productivity, reduces redundancy of effort, and enables many improvements to scientific tools. It also has the potential to change the data life cycle itself. Technology has opened the door to interdisciplinary research, allowing researchers from disparate disciplines to make effective use of data and tools from other domains. As these data and tools become increasingly accessible, they support not only research and education but also the communication of scientific results and information to the general public, which in turn builds awareness of research.
 
By viewing informatics through a holistic rather than specialized lens, we can influence every component within the data life cycle and, consequently, significantly impact current and future scientific research and education processes.
 
 
References
 
[1] J. Tolliver, “Musings on ‘informatics’ found on the Web,” Web Page, 2008. [Online]. Available: https://rhino.ncsa.illinois.edu/display/extra/How+others+define+informatics. [Accessed: 17-Jun-2013].
[2] The N. Y. Academy of Sciences, “The First International Workshop on Climate Informatics,” Web Page, 2011. [Online]. Available: http://www.nyas.org/Events/Detail.aspx?cid=462a8558-34c0-4e9e-8cca-97ffd.... [Accessed: 17-Jun-2013].
[3] W. K. Michener and M. B. Jones, “Ecoinformatics: supporting ecology as a data-intensive science,” Trends in Ecology & Evolution, vol. 27, no. 2, pp. 85–93, Feb. 2012.
[4] ISEI, “3rd Conference of the International Society of Ecological Informatics,” Web Page, 2001. [Online]. Available: http://www.isei3.org/. [Accessed: 17-Jun-2013].
[5] R. Husar, O. Todd, and H. Edwards, “Environmental Informatics: Implementation Through the Voyager Data Exploration Software,” in 83rd Annual Meeting of the Air and Waste Management Association, 1990.
[6] P. Hogeweg, “The roots of bioinformatics in theoretical biology,” PLoS Computational Biology, vol. 7, no. 3, p. e1002021, Mar. 2011.
[7] K. D. Borne, “Astroinformatics: data-oriented astronomy research and education,” Earth Science Informatics, vol. 3, no. 1–2, pp. 5–17, May 2010.
[8] D. N. Baker, C. E. Barton, W. K. Peterson, and P. Fox, “Informatics and the 2007–2008 Electronic Geophysical Year,” Eos, Transactions American Geophysical Union, vol. 89, no. 48, pp. 485–486, 2008.
[9] ISEI, “4th Conference of the International Society for Ecological Informatics (ISEI),” Web Page, 2004. 
 
Editing credit - Shannon Flynn UAH
 
