Data mining is the process of discovering patterns in large data sets using methods at the intersection of machine learning, statistics, and database systems.[1] It is an interdisciplinary subfield of computer science and statistics whose overall goal is to extract information (with intelligent methods) from a data set and transform that information into a comprehensible structure for further use.[1][2][3][4] Data mining is the analysis step of the "knowledge discovery in databases" (KDD) process.[5] Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[1]
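
As a rough illustration of how some of these aspects fit together, the following minimal Python sketch runs a toy pipeline: pre-processing, pattern discovery filtered by an interestingness metric (here, support), and post-processing. The transaction data, the function names (`preprocess`, `frequent_pairs`, `report`), and the `min_support` threshold are all hypothetical and chosen only for this example; real data mining systems use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each transaction is a list of purchased items.
RAW_TRANSACTIONS = [
    ["Bread", "milk ", "eggs"],
    ["bread", "Milk"],
    ["milk", "eggs", "eggs"],
    ["bread", "milk", "butter"],
]


def preprocess(transactions):
    """Pre-processing: normalize case and whitespace, drop duplicate items."""
    return [frozenset(item.strip().lower() for item in t) for t in transactions]


def frequent_pairs(transactions, min_support=0.5):
    """Pattern discovery: count co-occurring item pairs and keep those whose
    support (fraction of transactions containing the pair) is at least
    min_support. Support serves as a simple interestingness metric here."""
    counts = Counter()
    for items in transactions:
        # Sorting gives each pair one canonical ordering.
        counts.update(combinations(sorted(items), 2))
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def report(patterns):
    """Post-processing: order discovered patterns by support, highest first."""
    return sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))


clean = preprocess(RAW_TRANSACTIONS)
patterns = frequent_pairs(clean, min_support=0.5)
for pair, support in report(patterns):
    print(pair, support)
```

On this toy input, only the pairs that appear in at least half of the transactions survive the filter. Raising `min_support` narrows the result to the most common pairs, while lowering it admits rarer and possibly less meaningful ones.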