Monday, January 23, 2012

Summarize large amounts of frequency data in sublinear space

The Count-Min Sketch is a sublinear-space data structure that gives approximate answers to queries over data streams, such as point queries and range queries. It can be used to find the most frequent items (approximately), and it can be extended to detect anomalies or differences between streams for monitoring.
Original paper: http://www.eecs.harvard.edu/~michaelm/CS222/countmin.pdf
Related paper: Finding significant differences in Network Data Streams
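
To make the idea concrete, here is a minimal sketch of the point-query part in Python. It is an illustration, not the paper's reference implementation: the class name CountMinSketch, the method names add and estimate, and the use of salted BLAKE2b as the per-row hash are all my own assumptions (the paper's error bounds are stated for pairwise-independent hash families). The defaults follow the paper's sizing rule, width = ceil(e / epsilon) and depth = ceil(ln(1 / delta)), with epsilon = delta = 0.01 chosen only as an example.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter backed by a depth x width table of counters."""

    def __init__(self, width=272, depth=5):
        # width = ceil(e / 0.01) = 272, depth = ceil(ln(1 / 0.01)) = 5 (example values)
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, obtained by salting BLAKE2b with the row number.
        # Assumption: this stands in for the pairwise-independent family in the paper.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over the rows
        # is the tightest available estimate and never under-counts.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["apple"] * 100 + ["pear"] * 10 + ["fig"]:
    sketch.add(word)

print(sketch.estimate("apple"))  # never less than the true count of 100
```

Memory use depends only on width and depth, not on the number of distinct items seen, which is where the sublinear space comes from. The price is that estimates can be too high when items collide, but with non-negative updates they are never too low.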
