1 Introduction to Data Mining

1.1 What is Data Mining?

• The process of discovering interesting patterns and knowledge from large amounts of data.

• Interesting patterns: Valid, Novel, Useful, Understandable

Example

• Retailers collect data about customer purchases at the checkout counters

• Customer purchasing patterns: identify which items are frequently sold together.

• Products that are likely to be purchased together.
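The example above can be sketched as a simple co-occurrence count. This is a minimal illustration, not a full frequent-itemset algorithm: the basket contents, the function name `frequent_pairs`, and the `min_count` threshold are all assumptions made for the sketch.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout data: each inner list is one customer's basket.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

def frequent_pairs(baskets, min_count):
    """Return item pairs that appear together in at least min_count baskets."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Pairs bought together in at least 3 of the 5 baskets.
pairs = frequent_pairs(transactions, min_count=3)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

Counting every pair is fine for a handful of baskets but grows quickly with the number of distinct items, which is one reason scalability matters in practice.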

Why is it useful?

• Retailers can make purchase suggestions to their customers.

• It suggests how to arrange items in a store as a strategy for boosting sales.

Challenges

• Scalability

• High dimensionality

• Heterogeneous and complex data

• Data ownership and distribution

Data mining tasks

• Predictive tasks: predict the value of a particular attribute based on the values of other attributes.

• Descriptive tasks: find human-interpretable patterns that describe the data.
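To make the distinction concrete, here is a small sketch of one predictive and one descriptive task on the same data. The records, attribute names, and both function names are hypothetical, and the majority-vote predictor is only one simple choice among many possible methods.

```python
from collections import Counter, defaultdict

# Hypothetical records: (age_group, has_loyalty_card, bought_premium)
records = [
    ("young", True, "yes"),
    ("young", False, "no"),
    ("senior", True, "yes"),
    ("senior", False, "no"),
    ("young", True, "yes"),
    ("senior", True, "no"),
    ("senior", True, "no"),
]

def predict_bought_premium(age_group, has_card):
    """Predictive: guess the target attribute from the other attributes
    by majority vote among records with the same attribute values."""
    votes = Counter(t for a, c, t in records if (a, c) == (age_group, has_card))
    if not votes:
        # No matching record: fall back to the overall majority.
        votes = Counter(t for _, _, t in records)
    return votes.most_common(1)[0][0]

def describe_by_card():
    """Descriptive: fraction of premium buyers per loyalty-card status."""
    groups = defaultdict(list)
    for _, has_card, target in records:
        groups[has_card].append(target == "yes")
    return {card: sum(flags) / len(flags) for card, flags in groups.items()}

print(predict_bought_premium("young", True))
print(describe_by_card())
```

The predictive function outputs a value for an unseen attribute, while the descriptive function outputs a summary that a person can read and interpret directly.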

Many more…