Thursday, November 29, 2012

On (not) mixing static methods in stateful classes

Pure functions, which are sometimes modeled as static methods in Java, should not be present in classes that also have state. Static methods are appropriate for things that don't have associated state. Some factory methods and "purely functional" methods like Math.min are perfectly acceptable static methods. Purely functional methods have no need for dependency injection or for interaction with the enclosing object, and hence should be refactored into their own utility classes.
At the meager cost of a few more files and some more code we will have simpler objects and a better set of interfaces. Such a version might only be a couple of lines shorter than the other implementation, but it knows much less about its constituent parts. This ensures that the functionality of the helper functions is no longer locked in the context of another object. Nesting an aspect of your domain as an implementation detail of a specific model conflates responsibilities, bloats code, makes tests less isolated and slower, and hides concepts that should be first-class in your system.
The Unix philosophy illustrates that small components that work together through an interface can be extraordinarily powerful. In "The Art of Unix Programming" Eric Raymond states the Rule of Modularity as “write simple parts connected by clean interfaces.” This philosophy is a powerful strategy for managing complexity. Like Unix, systems and libraries should consist of many small classes, each of which is focused on a specific task and works with the others to accomplish a larger task.
Finally, to quote Kent Beck from "Smalltalk Best Practice Patterns": "Good code invariably has small methods and small objects. I get lots of resistance to this idea, especially from experienced developers, but no one thing I do to systems provides as much help as breaking it into more pieces."
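As a rough illustration of the refactoring argued for above, here is a minimal Java sketch. The names PriceMath, ShoppingCart, applyDiscount and StaticExtractionDemo are hypothetical and chosen only for this example. The pure function lives in its own stateless utility class, and the stateful class calls it instead of carrying it as a static helper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility class: pure functions only, no fields, no injected dependencies.
final class PriceMath {
    private PriceMath() {
        // Not instantiable: the class is just a namespace for pure functions.
    }

    // Pure: the result depends only on the arguments.
    static long applyDiscount(long cents, int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        return cents - (cents * percent) / 100;
    }
}

// Hypothetical stateful class: it owns the item list and delegates the math.
final class ShoppingCart {
    private final List<Long> itemPricesInCents = new ArrayList<>();

    void add(long cents) {
        itemPricesInCents.add(cents);
    }

    long total(int discountPercent) {
        long sum = 0;
        for (long price : itemPricesInCents) {
            sum += price;
        }
        return PriceMath.applyDiscount(sum, discountPercent);
    }
}

final class StaticExtractionDemo {
    public static void main(String[] args) {
        ShoppingCart cart = new ShoppingCart();
        cart.add(1000);
        cart.add(500);
        System.out.println(cart.total(10)); // prints 1350
    }
}
```

With this split, applyDiscount can be tested and reused without constructing a cart, and the cart no longer needs to know how a discount is computed.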

Thursday, January 26, 2012

TED talk: Algorithms that shape our world

Kevin Slavin talks about how people will one day terraform the earth just to give algorithms faster access to data. He also describes how algorithms:
 i. caused the price of the book "The Making of a Fly" to reach 23 million USD
ii. caused 9% of the value of the US stock market to disappear in the Flash Crash of 2:45

Monday, January 23, 2012

Summarize large amounts of frequency data in sublinear space

Count-Min Sketch is a sublinear-space data structure that gives approximate answers to point queries, range queries, and similar questions over data streams. It can be used to find the most frequent items (approximately), and it can be extended to detect anomalies or differences between streams for monitoring.
Original paper:
Related paper: Finding significant differences in Network Data Streams
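To make the idea concrete, here is a small Java sketch of a Count-Min Sketch for point queries. It is an illustration under assumptions, not the construction from the paper: the class and method names are made up for this post, and the per-row hash is a seeded bit-mix of String.hashCode() rather than a pairwise-independent hash family. Each item increments one counter per row, and the estimate is the minimum across rows, so with non-negative counts it can overestimate but never underestimate.

```java
import java.util.Random;

// Hypothetical minimal Count-Min Sketch for non-negative counts.
final class CountMinSketch {
    private final int depth;      // number of rows (hash functions)
    private final int width;      // counters per row
    private final long[][] table;
    private final int[] seeds;    // one seed per row

    CountMinSketch(int depth, int width, long seed) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
        this.seeds = new int[depth];
        Random rng = new Random(seed);
        for (int row = 0; row < depth; row++) {
            seeds[row] = rng.nextInt();
        }
    }

    // Assumption: a simple seeded bit-mix stands in for a proper hash family.
    private int bucket(int row, String item) {
        int h = item.hashCode() ^ seeds[row];
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        h *= 0xc2b2ae35;
        h ^= (h >>> 16);
        return Math.floorMod(h, width);
    }

    // Record `count` occurrences of `item` in every row.
    void add(String item, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        for (int row = 0; row < depth; row++) {
            table[row][bucket(row, item)] += count;
        }
    }

    // Point query: the minimum counter over all rows.
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(row, item)]);
        }
        return min;
    }
}

final class CountMinSketchDemo {
    public static void main(String[] args) {
        CountMinSketch sketch = new CountMinSketch(4, 1024, 42L);
        sketch.add("apple", 3);
        sketch.add("pear", 1);
        // The estimate is at least the true count of 3.
        System.out.println(sketch.estimate("apple"));
    }
}
```

Memory use is depth × width counters regardless of how many distinct items appear in the stream, which is where the sublinear space comes from. A wider table reduces collisions and more rows reduce the chance that every row collides.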

Sunday, January 22, 2012

Discussion on Spatial indexing algorithms

For people interested in Spatial Databases there is an interesting list of algos used for indexing (Quadtrees, Geohashes and Hilbert curves) at

LZMA algo and XZ Utils

XZ Utils is data compression software with a high compression ratio. It uses the LZMA algorithm and typically achieves better compression ratios than bzip2.

Tuesday, December 14, 2010

What winter in Pacific Northwest does to you

winter up north
Is it the same in every part of the high northern hemisphere? Does it always feel like you just woke up from a fever breaking, at noon, except it’s dark. Something about cloud cover, and axial tilt, and the moon, and magnets.

Wednesday, September 29, 2010

Frequency with which various adjectives are intensified with obscenities

"'Fucking ineffable' sounds like someone remembering how to do self-censorship halfway through a phrase."

Tuesday, March 16, 2010

AdSense now serving multiple ad networks

Here is the mail from AdSense announcing this.


We have launched a new capability in AdSense allowing Google-certified ad networks to compete directly within AdSense, which means that advertisers from these third-party networks will be able to compete with AdWords advertisers to show on the Google Content Network.

These new capabilities will automatically be enabled for your account, and you'll see a new section in your Ad Review Center where you can allow or block specific ad networks or all networks except AdWords. Please note that we'll gradually be adding new ad networks to AdSense accounts over the next few months, so you won't see any immediate impact on your ads or your earnings.

To ensure the quality of the ads that appear on AdSense publisher websites, Google will certify all participating ad networks for adherence to our standards for user privacy, ad quality, and speed. Some participating ad networks use targeting methods similar to Google's interest-based advertising to show more relevant ads to users on the sites they visit. These ad networks won't be permitted to collect data from your site for the purpose of subsequent interest-based advertising, but we'll allow networks that comply with user privacy guidelines to show ads using these tools. Publishers can opt out of user interest targeting from these ad networks, and Google has changed our requirements for third-party ad serving to reflect this.

We are currently only accepting ads from Google-certified ad networks in North America and Europe, but we will make this feature available to ad networks in additional parts of the world in the future.

To learn more and manage the ad networks appearing on your pages, visit the AdSense Help Center at and visit our blog post at

The Google AdSense Team

Wednesday, February 3, 2010

[Online Dating Advice] Exactly What To Say In A First Message

Ok, here’s the experiment. We analyzed over 500,000 first contacts on our dating site, OkCupid. Our program looked at keywords and phrases, how they affected reply rates, and what trends were statistically significant. The result: a set of rules for what you should and shouldn’t say when introducing yourself online. Let’s go:
Rule 1
Be literate
Netspeak, bad grammar, and bad spelling are huge turn-offs. Our negative correlation list is a fool’s lexicon: ur, u, wat, wont, and so on. These all make a terrible first impression. In fact, if you count hit (and we do!) the worst 6 words you can use in a first message are all stupid slang.
Language like this is such a strong deal-breaker that correctly written but otherwise workaday words like don’t and won’t have nicely above average response rates (36% and 37%, respectively).
Interesting exceptions to the “no netspeak” rule are expressions of amusement. haha (45% reply rate) and lol (41%) both turned out to be quite good for the sender. This makes a certain sense: people like a sense of humor, and you need to be casual to convey genuine laughter. hehe was also a successful word, but much less so (33%). Scientifically, this is because it’s a little evil sounding.
So, in short, it’s okay to laugh, but keep the rest of your message grammatical and punctuated.
Rule 2
Avoid physical compliments
Although the data shows this advice holds true for both sexes, it’s mostly directed at guys, because they are way more likely to talk about looks. You might think that words like gorgeous, beautiful, and sexy are nice things to say to someone, but no one wants to hear them. As we all know, people normally like compliments, but when they’re used as pick-up lines, before you’ve even met in person, they inevitably feel…ew. Besides, when you tell a woman she’s beautiful, chances are you’re not.
On the other hand, more general compliments seem to work well:
The word pretty is a perfect case study for our point. As an adjective, it’s a physical compliment, but as an adverb (as in, “I’m pretty good at sports.”) it’s just another word.
When used as an adverb it actually does very well (a phenomenon we’ll examine in detail below), but as pretty’s uses become more clearly about looks, reply rates decline sharply. You’re pretty and your pretty are phrases that could go either way (physical or non-). But very pretty is almost always used to describe the way something or someone looks, and you can see how that works out.
Rule 3
Use an unusual greeting
We took a close look at salutations. After all, the way you choose to start your initial message to someone is the “first impression of your first impression.” The results surprised us:
The top three most popular ways to say “hello” were all actually bad beginnings. Even the slangy holla and yo perform better, bucking the general “be literate” rule. In fact, it’s smarter to use no traditional salutation at all (which earns you the reply rate of 27%) and just dive into whatever you have to say than to start with hi. I’m not sure why this is: maybe the ubiquity of the most popular openings means people are more likely to just stop reading when they see them.
The more informal standard greetings: how’s it going, what’s up, and howdy all did very well. Maybe they set a more casual tone that people prefer, though I have to say, You had me at ‘what’s up’ doesn’t quite have the same ring to it.
Rule 4
Don’t take it outside
Obviously, all successful OkCupid relationships outgrow our in-site messaging feature. But an offer to chat or of an email address right off the bat is a sure turn off. One of the things online dating has going for it is its relative anonymity, and if you start chipping away at that too early, you’ll scare the other person off.
Also, don’t ask for or give away a cell number (10%). I thought that was a no-brainer.
Rule 5
Bring up specific interests
There are many words on the effective end of our list like zombie, band, tattoo, literature, studying, vegetarian (yes!), and metal (double yes!) that are all clearly referencing something important to the sender, the recipient, or, ideally, both. Talking about specific things that interest you or that you might have in common with someone is a time-honored way to make a connection, and we have proof here that it works. We’re presenting just a smattering: in fact every “niche” word that we have significant data on has a positive effect on messaging.
Even more effective are phrases that engage the reader’s own interests, or show you’ve read their profile:
Rule 6
If you’re a guy, be self-effacing
Awkward, sorry, apologize, kinda, and probably all made male messages more successful, yet none of them except sorry affects female messages. As we mentioned before, pretty, no doubt because of its adverbial meaning of “to a fair degree; moderately” also helps male messages. A lot of real-world dating advice tells men to be more confident, but apparently hemming and hawing a little works well online.
It could be that appearing unsure makes the writer seem more vulnerable and less threatening. It could be that women like guys who write mumbly. But either way: men should be careful not to let the appearance of vulnerability become the appearance of sweaty desperation: please is on the negative list (22% reply rate), and in fact it is the only word that is actually worse for you than its netspeak equivalent (pls, 23%)!
Rule 7
Consider becoming an atheist
Mentioning your religion helps you, but, paradoxically, it helps you most if you have no religion. We know that’s going to piss a lot of people off, and we’re more or less tongue-in-cheek with this advice, but it’s what the numbers say.
These are the religious terms that appeared a statistically significant number of times. Atheist actually showed up surprisingly often (342 times per 10,000 messages, second only to 552 mentions of christian and ahead of 278 for jewish and 142 for muslim).
Though very few people actually do it, invoking the sky-breaking thunderbolts of zeus does help a person get noticed (reply rate 56%), but maybe that shouldn’t be a surprise on a site that is itself named for a member of the Classical pantheon. So if you can’t bring yourself to deny the deity, consider opening yourself up to a whole wacky bunch of them. But ideally you should just disbelieve the whole thing. It can help your love life, and, besides, if there really was a god, wouldn’t first messages always get a reply?

Sunday, January 31, 2010

Moscow in slow-motion

Moscow in slow-motion - Amazing video