Thursday, November 29, 2012

On (not) mixing static methods in stateful classes

Pure_function sometimes are modeled as static methods in Java should not be present in classes which also have state. Static methods are appropriate for things that don't have associated state. Some factory methods, "purely functional" methods like Math.min, etc are all perfectly acceptable static methods. Purely functional methods have no need for dependency injection, or to interact with the enclosing object and hence should be refactored into their own utility classes.
At the meager cost of a few more files and some more code we will have simpler objects and a better set of interfaces. Such a version might only be a couple of lines shorter than the other implementation but it knows much less about its constituent parts. This ensures, the functionality of helper functions is no longer locked in the context of another object. The Unix philosophy illustrates that small components that work together through an interface can be extraordinarily powerful. Nesting an aspect of your domain as an implementation detail of a specific model conflates responsibilities, bloats code, makes tests less isolated and slower, and hides concepts that should be first-class in your system. In "The Art of Unix Programming" Eric Raymond states Rule of Modularity as “write simple parts connected by clean interfaces.” This philosophy is a powerful strategy to manage complexity. Like Unix, systems/libraries should consist of many small classes each of which are focused on a specific task and work with each other to accomplish a larger task. Finally to quote Kent Beck from the Smalltalk Best Practice Patterns "Good code invariably has small methods and small objects. I get lots of resistance to this idea, especially from experienced developers, but no one thing I do to systems provides as much help as breaking it into more pieces.

Thursday, January 26, 2012

TED talk: Algorithms that shape our world

Kevin Slavin talks about how people will one day terraform the earth just to make algorithms get access to data faster. Talks about how algorithms
 i. on Amazon.com caused the price of the book "The Making of a Fly" to become 23 million USD
ii. caused 9% of wealth of US stock markets to disappear in the Flash Crash of 2:45



Monday, January 23, 2012

Summarize large amounts of frequency data in sublinear space

Count Min Sketch is a sublinear space datastructure which can be used for approximate answers to data streams for points, ranges and etc. It can be used for finding the most frequent items (approximately) and also extended to find anomalies or differences in streams for monitoring.
Original paper: http://www.eecs.harvard.edu/~michaelm/CS222/countmin.pdf
Related paper: Finding significant differences in Network Data Streams


Sunday, January 22, 2012

Discussion on Spatial indexing algorithms

For people interested in Spatial Databases there is an interesting list of algos used for indexing (Quadtrees, Geohashes and Hilbert curves) at http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves

LZMA algo and XZ Utils

XZ Utils is a data compression software with pretty high compression ratio. It uses the LZMA algorithm and has much better compression ratios than bzip2