Suppose Twitter computes trending topics from the hashtag frequencies of the last 24 hours of tweets (a wild guess). To get the top 100 (let's say) trending topics, a MapReduce job can count every hashtag, and the 100 most frequent can then be picked. This can be further optimized by maintaining hourly aggregates of hashtag counts, so that only the newest hour has to be counted from scratch and the 24 hourly aggregates are merged.
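To make the hourly-aggregate idea concrete, here is a minimal single-machine sketch in Python. It is not how Twitter does it; the function names (hourly_counts, top_trending) and the whitespace-based hashtag extraction are my own assumptions, and a real MapReduce job would spread the counting across many machines.

```python
from collections import Counter
import heapq

def hourly_counts(tweets):
    """Count hashtags in one hour of tweets (the per-hour aggregate).

    Assumption: a hashtag is any whitespace-separated token starting
    with '#'. Punctuation is not stripped in this sketch.
    """
    counts = Counter()
    for text in tweets:
        for word in text.split():
            if word.startswith("#") and len(word) > 1:
                counts[word.lower()] += 1
    return counts

def top_trending(hourly_aggregates, k=100):
    """Merge the per-hour counters and return the k most frequent hashtags."""
    total = Counter()
    for counts in hourly_aggregates:
        total.update(counts)  # adds counts, does not overwrite them
    return heapq.nlargest(k, total.items(), key=lambda kv: kv[1])

# Two "hours" of made-up tweets.
hours = [
    hourly_counts(["#a #b", "#a"]),
    hourly_counts(["#b #b #c"]),
]
print(top_trending(hours, k=2))
```

Each hour is counted once and kept, so moving the 24-hour window forward only means dropping the oldest counter and adding the newest one before merging.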
But I am not sure it makes sense to compute the frequencies of all hashtags (probably millions) just to find the top 100 trending topics. There is probably a better way.
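One candidate for a "better way" is an approximate heavy-hitters algorithm such as the Misra-Gries summary, which keeps only a small, fixed number of counters instead of one per hashtag. I have no idea whether Twitter uses anything like it; the sketch below is just an illustration, and the function name misra_gries and the parameter k are my own choices.

```python
def misra_gries(stream, k):
    """Misra-Gries frequent-items summary using at most k - 1 counters.

    Any item that occurs more than n / k times in a stream of length n
    is guaranteed to be among the returned keys. The stored counts are
    underestimates, so the result is a candidate list, not an exact top-k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# '#a' makes up more than half of this stream, so it must survive with k=2.
stream = ["#a", "#b", "#a", "#c", "#a", "#d", "#a"]
print(misra_gries(stream, k=2))
```

The trade-off is that the answer is approximate: memory stays bounded by k, but rare hashtags can appear among the candidates and the counts are lower bounds, so a second exact pass over just the candidates may still be needed.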
A similar problem also arises in finding trends in Google search queries, and it's probably harder too.