More Challenges
How can we prune the number of associations generated?
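One common way to prune is to keep only rules that clear minimum support and confidence thresholds. The sketch below is a minimal illustration limited to pairwise rules. The toy transactions, the function name `pair_rules`, and the threshold values are all assumptions chosen for the example, not taken from any particular system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "cereal"},
    {"bread", "milk", "cereal"},
]

def pair_rules(transactions, min_support=0.0, min_confidence=0.0):
    """Return (lhs, rhs, support, confidence) for pairwise rules lhs -> rhs
    that meet both thresholds."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # prune rare pairs before generating any rules from them
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

all_rules = pair_rules(transactions)
pruned = pair_rules(transactions, min_support=0.4, min_confidence=0.7)
print(len(all_rules), "rules before pruning,", len(pruned), "after")
```

Raising either threshold shrinks the rule set further, at the risk of discarding rare but valuable associations.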
What is the loss metric for displaying a cross-sell?
- If you display the product most likely to be purchased, would you display bread on every page until bread was purchased?
- If you display the product with the highest lift, would you still display it if its purchase probability were only 0.01% (up from 0.00001%)?
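The tension between these two questions can be made concrete with a small sketch. The rare product uses the probabilities quoted above. The figures for bread (30% baseline, 31% on this page) are hypothetical, and the "incremental probability" score is only one possible candidate metric, offered as an assumption rather than an answer.

```python
def lift(p_in_context, p_baseline):
    """Ratio of the purchase probability in context to the baseline probability."""
    return p_in_context / p_baseline

def incremental_probability(p_in_context, p_baseline):
    """Absolute gain in purchase probability; one possible metric (an assumption)."""
    return p_in_context - p_baseline

# (probability in context, baseline probability)
candidates = {
    "bread": (0.31, 0.30),               # hypothetical figures
    "rare_item": (0.0001, 0.0000001),    # 0.01%, up from 0.00001%
}

by_probability = max(candidates, key=lambda name: candidates[name][0])
by_lift = max(candidates, key=lambda name: lift(*candidates[name]))
by_increment = max(
    candidates, key=lambda name: incremental_probability(*candidates[name])
)

print("highest probability:", by_probability)
print("highest lift:", by_lift)
print("highest incremental probability:", by_increment)
```

Ranking by raw probability always favors bread, while ranking by lift favors an item that almost nobody buys. Any usable metric has to weigh both, and probably other factors such as margin as well.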
Are there algorithms that could mine a star schema directly, without flattening it (e.g., Query flocks)?
Bots and crawlers tend to skew statistics dramatically.
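As a rough illustration of the size of the effect, the sketch below compares average page views per session with and without a single crawler session. The session records, the `BOT_MARKERS` substrings, and the user-agent heuristic are all hypothetical. Real bot detection usually needs more than a substring match.

```python
# Hypothetical session log, for illustration only.
sessions = [
    {"user_agent": "Mozilla/5.0", "page_views": 12},
    {"user_agent": "Mozilla/5.0", "page_views": 8},
    {"user_agent": "ExampleBot/1.0 (crawler)", "page_views": 5000},
    {"user_agent": "Mozilla/5.0", "page_views": 5},
]

# Assumed substrings that mark a crawler's user agent.
BOT_MARKERS = ("bot", "crawler", "spider")

def looks_like_bot(session):
    """Naive heuristic: flag sessions whose user agent contains a bot marker."""
    user_agent = session["user_agent"].lower()
    return any(marker in user_agent for marker in BOT_MARKERS)

def mean_page_views(rows):
    return sum(row["page_views"] for row in rows) / len(rows)

human_sessions = [s for s in sessions if not looks_like_bot(s)]

mean_all = mean_page_views(sessions)
mean_human = mean_page_views(human_sessions)
print(f"all sessions: {mean_all:.2f} page views on average")
print(f"bots removed: {mean_human:.2f} page views on average")
```

Here one crawler session out of four moves the average by two orders of magnitude, which is why bot filtering typically has to happen before any mining.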
How can marketing campaigns be taken into account?