
Data Mining

RxR's picture

What is Analytics?

on Mon, 08/19/2013 - 18:47

Analytics is a term that is now often used interchangeably with Data Mining.

The best-known definition of data mining is that of Fayyad et al. [1]: “the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data”.

Analytics, however, does not have a well-known definition. Most attribute the term “analytics” to the Kohavi et al. [2] paper, which focuses on business analytics. In a footnote, the authors state that they use analytics and data mining interchangeably, yet the body of the paper draws some nuanced differences between the two. While analytics is indeed applied data mining, three important distinctions remain.

First, analytics focuses on effective customization of data mining via “verticalization”. Verticalization means incorporating task-related domain knowledge into the analytic tools, removing the data analyst from the loop, and optimizing the tools for execution speed.

Second, analytics focuses on the usability of results. Results must be presented so that business users can quickly gain insight from intuitive visualizations rather than sophisticated statistical plots.

Finally, analytics is itself an integral part of the data collection and decision support system rather than being an activity that is conducted outside of this system using separate sets of tools.

Given these distinctions, the question to ask within Earth Science is: do we really have analytics capability? Unfortunately, the answer is no.

[1] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “From Data Mining to Knowledge Discovery: An Overview,” Advances in Knowledge Discovery and Data Mining. The MIT Press, 1996.

rramachandran's picture

Data Mining and Semantic Web losing steam?

on Fri, 03/09/2012 - 23:15

If you look at Google Trends data for the terms Data Mining and Semantic Web, you find interesting results.

Data Mining vs Data Analytics

Based on the search volumes, interest in data mining seems to be slowly eroding, whereas interest in "data analytics" has picked up since 2007. However, news references to data mining seem to be increasing, and news references to data analytics are definitely trending upwards.

Possible inference: data mining is now so well understood and commonly used that most people don't use a search engine to find related resources. Alternatively, the relabeling to data analytics has shifted interest to a new term meaning the same thing. The latter is supported by the increase in news references to data analytics.

Semantic Web vs Linked Data

The trends for these two terms are really interesting. The steady decline in search volume for the term Semantic Web suggests possible disillusionment. It is also interesting to note that the number of Linked Data searches has almost caught up with the number of Semantic Web searches. Clearly, the momentum seems to have shifted to Linked Data (aka practical Semantic Web), even though the news references remain about the same.
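For readers who want to check a comparison like this themselves, here is a minimal sketch of how one might detect the point where one term's search interest overtakes another's. The function name `first_crossover` and the numbers are made up for illustration. They are not real Google Trends data, and the sample series is built to include a crossing, although the charts discussed here only show the two terms getting close.

```python
from typing import Optional, Sequence


def first_crossover(a: Sequence[float], b: Sequence[float]) -> Optional[int]:
    """Return the first index at which series b rises above series a
    after having been at or below it, or None if that never happens."""
    for i in range(1, min(len(a), len(b))):
        if b[i - 1] <= a[i - 1] and b[i] > a[i]:
            return i
    return None


# Hypothetical relative search-interest values (0-100 scale), one per period.
# These are illustrative placeholders, NOT values taken from Google Trends.
semantic_web = [90, 80, 70, 60, 50, 40]
linked_data = [5, 10, 20, 30, 45, 50]

idx = first_crossover(semantic_web, linked_data)
if idx is None:
    print("No crossover in the sampled range")
else:
    print(f"Linked Data overtakes Semantic Web at period index {idx}")
```

With real exported trend data, the same check could be run on the two columns. A result of `None` would correspond to the "almost crossed over" situation described above.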