Problem: Customer Signature
Data mining algorithms assume records are independent and identically distributed (i.i.d.)
Need to summarize each customer's transactions/clickstreams into one record
Solutions:
- Provide aggregation/rollup operations
  - Avg/min/max for numeric values (e.g., transaction price)
  - Counts/percentages for each value of a discrete attribute (e.g., credit card brand)
- Provide a powerful expression language
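
The rollup idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any particular tool's implementation: the field names (`customer_id`, `price`, `card`), the sample rows, and the helper `build_signatures` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical transaction records: several rows per customer.
transactions = [
    {"customer_id": "c1", "price": 20.0, "card": "Visa"},
    {"customer_id": "c1", "price": 40.0, "card": "Visa"},
    {"customer_id": "c1", "price": 30.0, "card": "Amex"},
    {"customer_id": "c2", "price": 10.0, "card": "MasterCard"},
]


def build_signatures(rows):
    """Roll up many transaction rows into one summary record per customer."""
    # Group the raw rows by customer.
    by_customer = defaultdict(list)
    for row in rows:
        by_customer[row["customer_id"]].append(row)

    signatures = {}
    for cid, txns in by_customer.items():
        prices = [t["price"] for t in txns]
        card_counts = Counter(t["card"] for t in txns)

        # Numeric attribute: avg/min/max.
        sig = {
            "n_transactions": len(txns),
            "avg_price": sum(prices) / len(prices),
            "min_price": min(prices),
            "max_price": max(prices),
        }
        # Discrete attribute: one count and one percentage per observed value.
        for card, n in card_counts.items():
            sig[f"count_{card}"] = n
            sig[f"pct_{card}"] = n / len(txns)

        signatures[cid] = sig
    return signatures


signatures = build_signatures(transactions)
```

Each entry of `signatures` is a single record per customer, which is the shape a mining algorithm expects. In this sketch a customer only gets columns for card brands they actually used, so a real pipeline would likely fill the missing count/percentage columns with zeros before modeling.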