Challenge IV - Scalability
Yahoo had 465 million page views per day in December of 1999 (*)
That’s about 2-4 GB of clickstream data per hour, depending on how much information is stored per page view
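A quick back-of-the-envelope check of what that estimate implies per page view. This is only a sketch: it assumes traffic is spread evenly over the 24 hours and that 1 GB means 10^9 bytes, neither of which the slide states.

```python
# Figure from the slide: 465 million page views per day (December 1999).
PAGE_VIEWS_PER_DAY = 465_000_000
GB = 10**9  # assumption: decimal gigabytes


def bytes_per_page_view(gb_per_hour: float) -> float:
    """Average clickstream bytes per page view for a given hourly volume."""
    views_per_hour = PAGE_VIEWS_PER_DAY / 24  # assumption: uniform traffic
    return gb_per_hour * GB / views_per_hour


low = bytes_per_page_view(2)   # lower end of the 2-4 GB/hour range
high = bytes_per_page_view(4)  # upper end of the range
print(f"{low:.0f} to {high:.0f} bytes per page view")
```

Under these assumptions the range works out to roughly 100-200 bytes recorded per page view, i.e. on the order of one short log line each.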
What can we do with such volumes?
Are there useful aggregations of such data that can be done on the fly?
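One possible shape for an on-the-fly aggregation is a single pass that counts page views per hour and URL, so that only the counters are kept and not the raw clicks. This is a minimal sketch, not a description of any real system: the record layout `(timestamp, visitor_id, url)` and the function name `hourly_page_counts` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical record layout: (unix_timestamp_seconds, visitor_id, url)
Click = Tuple[int, str, str]


def hourly_page_counts(clicks: Iterable[Click]) -> Counter:
    """Count page views per (hour, url) in one pass over the stream.

    Memory grows with the number of distinct (hour, url) pairs,
    not with the number of clicks seen.
    """
    counts: Counter = Counter()
    for timestamp, _visitor_id, url in clicks:
        hour = timestamp // 3600  # bucket by whole hours since the epoch
        counts[(hour, url)] += 1
    return counts


# Tiny made-up sample, for illustration only.
sample = [
    (0, "a", "/home"),
    (10, "b", "/home"),
    (20, "a", "/mail"),
    (3600, "a", "/home"),
]
result = hourly_page_counts(sample)
```

Because the function accepts any iterable, it can consume a generator that reads log lines as they arrive, so the raw stream never has to be stored in full.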