Wednesday, November 25, 2009

Sabina Stobrawe: Divorce lift

A typical wedding photo was affixed to the lift doors in a law firm. Unfortunately, every time the doors opened, the couple split up. But help was at hand for everyone in the same position as soon as they stepped into the lift: a sign showed the name of the law firm and which floor the office was on.


Tuesday, November 17, 2009

Closure Compiler

From Closure Compiler - Google Code:

"What is the Closure Compiler?

The Closure Compiler is a tool for making JavaScript download and run faster. It is a true compiler for JavaScript. Instead of compiling from a source language to machine code, it compiles from JavaScript to better JavaScript. It parses your JavaScript, analyzes it, removes dead code and rewrites and minimizes what's left. It also checks syntax, variable references, and types, and warns about common JavaScript pitfalls.

To help you work with your transformed code, you can also install the Closure Inspector, a tool that makes it easy to use the Firebug JavaScript debugger with the Compiler's output.

The Closure Compiler has also been integrated with Page Speed, which makes it easier to evaluate the performance gains you can get by using the compiler."

Monday, November 16, 2009

NoSQLs Compared

A good review of the various NoSQL products, comparing them on:
1) support for multiple datacenters,
2) the ability to add new machines to a live cluster transparently to your applications,
3) data model,
4) Query API and
5) Persistence Design.
The systems reviewed are: Cassandra, CouchDB, HBase, MongoDB, Neo4J, Redis, Riak, Scalaris, Tokyo Cabinet and Voldemort.
Article: http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosystem/#

Friday, November 13, 2009

Funny Facebook Groups (Part 3)

When I was your age, Pluto was a planet
People Who Always Have To Spell Their Names For Other People
I Use my Cell Phone to See in the Dark
I Flip My Pillow Over to Get To The Cold Side
Enough with the Poking, Lets Just Have Sex
If this group reaches 4,294,967,296 it might cause an integer overflow
Fuck Iraq, We have to catch Voldemort
I read the group name, I laugh, I join, I never look at it again.
Honestly, I write "lol" and I’m not Even Laughing
I Will Go Out of My Way To Step On a Leaf That Looks Particularly Crunchy
It wasnt awkward until you said "well, this is awkward". now its awkward.
If I Fail My Exams, Its Facebook’s Fault

Funny Facebook Groups (Part 2)

Bollywood gave me unrealistic expectations about desi women

I Thought You Were Hot Until I Clicked on "View More Pictures"

Alcohol Improves my Foreign Language!

I Wish I Were Your Derivative So I Could Lie Tangent To Your Curves!

When I was your age, we had to blow on the video games to make them work...

Do you believe in love at first sight, or should I walk by again?

For those who have ever pushed a "pull" door

Geometry can kiss my Angle-Side-Side

If I were an enzyme i would be DNA helicase so i could unzip your genes

I Secretly Want To Punch Slow Walking People In The Back Of The Head

Funny Facebook Groups (Part 1)

Anteater-Mushroom Alliance

Writing Papers Single Spaced First Makes My Double Spaced Result Climactic

Most Replys Inner Scrotum!!!

I Paint My Nails Like a Blind Parkinson's Patient
I Dont care How Comfortable Crocs Are, You Look Like A Dumbass.

If You're OCD And You Know It Wash Your Hands!

Dora the Explorer is soo an Illegal Immigrant...

BRB... IM NOT REALLY GOING ANYWHERE, BUT NEITHER IS THIS CONVERSATION

DAMM (drunks against mad mothers)

When I Was Your Age, Shoes Didn't Have Wheels

Worst. Facebook Group. Ever.

it's obvi you're just jeal cause I speak in abrevs so whatev

Monday, November 9, 2009

Prefix Hash Tree

Distributed Hash Tables are scalable, robust, and self-organizing peer-to-peer systems that support exact match lookups. This paper describes the design and implementation of a Prefix Hash Tree - a distributed data structure that enables more sophisticated queries over a DHT. The Prefix Hash Tree uses the lookup interface of a DHT to construct a trie-based structure that is both efficient (updates are doubly logarithmic in the size of the domain being indexed), and resilient (the failure of any given node in the Prefix Hash Tree does not affect the availability of data stored at other nodes).
Paper: http://berkeley.intel-research.net/sylvia/pht.pdf
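
The "doubly logarithmic" cost mentioned above can be illustrated with a small sketch. It assumes, as I understand the paper, that a lookup binary-searches over prefix lengths of the key, issuing one DHT lookup per probe. The `HashMap` standing in for the DHT, the class name, and the `Kind` enum are all hypothetical choices for illustration, not the paper's code.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixHashTreeSketch {
    enum Kind { INTERNAL, LEAF }

    // Stand-in for the DHT: maps a prefix label to the trie node stored under it.
    private final Map<String, Kind> dht = new HashMap<>();

    void put(String prefix, Kind kind) {
        dht.put(prefix, kind);
    }

    /** Returns the label of the leaf responsible for the key, or null if none is found. */
    String lookup(String key) {
        int lo = 0;
        int hi = key.length();
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            String prefix = key.substring(0, mid);
            Kind kind = dht.get(prefix);          // one DHT lookup per probe
            if (kind == Kind.LEAF) {
                return prefix;
            } else if (kind == Kind.INTERNAL) {
                lo = mid + 1;                     // the leaf must be deeper
            } else {
                hi = mid - 1;                     // no node here: the leaf is shallower
            }
        }
        return null;
    }
}
```

For keys of D bits the loop makes about log D probes, and D is itself the logarithm of the domain size, which is where the doubly logarithmic bound comes from.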

Sunday, November 8, 2009

A List of Peoples' Names That Are Dirty Sounding

From DirtySounding.com
  • Alotta Bush
  • Ben Dover
  • Dick N. Butts
  • Dixie Normous
  • Fonda Cox
  • Eaton Beaver
  • Giv M. Head
  • Harry Balls
  • Hugh Jorgan
  • Jack Schitt
  • Justin Yermouth
  • Master Bates
  • Moe Lester
  • Neil Down
  • Oliver Bush
  • Rolinda Joint
  • Sarah Tonin
  • Sawyer Crack
  • Seymour Butts
  • Willie Layer

Friday, November 6, 2009

Donkeys hired as zebras


In Gaza, there's a shortage of zebras. Donkeys get dye-job, take on zebra role: Two white donkeys dyed with black stripes delighted Palestinian kids at a small Gaza zoo on Thursday who had never seen a zebra in the flesh.

Nidal Barghouthi, whose father owns the Marah Land zoo, said the two female donkeys were striped using masking tape and women's hair dye, applied with a paint-brush.

'The first time we used paint but it didn't look good,' he said. 'The children don't know so they call them zebras and they are happy to see something new.'

A genuine zebra would have been too expensive to bring into Israel-blockaded Gaza via smuggling tunnels under the border with Egypt, said owner Mohammed Bargouthi. 'It would have cost me $40,000 to get a real one.'

Thursday, October 29, 2009

Memcached telnet interface

Memcached telnet commands

Command    Description                                     Example
get        Reads a value                                   get mykey
set        Set a key unconditionally                       set mykey 0 60 5
add        Add a new key                                   add newkey 0 60 5
replace    Overwrite existing key                          replace key 0 60 5
append     Append data to existing key                     append key 0 60 15
prepend    Prepend data to existing key                    prepend key 0 60 15
incr       Increments numerical key value by given number  incr mykey 2
decr       Decrements numerical key value by given number  decr mykey 5
delete     Deletes an existing key                         delete mykey
flush_all  Invalidate all items immediately                flush_all
           Invalidate all items in n seconds               flush_all 900
stats      Prints general statistics                       stats
           Prints memory statistics                        stats slabs
           Prints memory statistics                        stats malloc
           Print higher level allocation statistics        stats items
                                                           stats detail
                                                           stats sizes
           Resets statistics                               stats reset
version    Prints server version.                          version
verbosity  Increases log level                             verbosity
quit       Terminate telnet session                        quit
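
The numbers in the storage examples above follow the memcached text protocol: `set <key> <flags> <exptime> <bytes>`, with the data block sent on the next line. A minimal sketch of building such commands in Java follows; the class and method names are made up for illustration, and it only builds the strings rather than talking to a server.

```java
import java.nio.charset.StandardCharsets;

class MemcachedText {
    /** Builds "set <key> <flags> <exptime> <bytes>\r\n<data>\r\n". */
    static String set(String key, int flags, int exptimeSeconds, String value) {
        // The <bytes> field is the length of the data block in bytes, not characters.
        int byteCount = value.getBytes(StandardCharsets.UTF_8).length;
        return "set " + key + " " + flags + " " + exptimeSeconds + " " + byteCount
                + "\r\n" + value + "\r\n";
    }

    /** Builds "get <key>\r\n". */
    static String get(String key) {
        return "get " + key + "\r\n";
    }

    public static void main(String[] args) {
        System.out.print(set("mykey", 0, 60, "hello"));
        System.out.print(get("mykey"));
    }
}
```

So `set mykey 0 60 5` stores a 5-byte value under `mykey` with flags 0 and a 60-second expiry; in a telnet session you would type the 5 bytes on the following line.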

Wednesday, October 28, 2009

Tennis Pros: Hilarious Replies and One-Liers

Over the years, we have watched great tennis players who have played some beautiful tennis. But they have also given us great replies and one-liners. So, here we take a look at that.

Andy Roddick will definitely contribute to the list. Go A-Rod.

Hilarious Questions and Replies

After his Wimbledon win, Roger Federer had this conversation with an interviewer:

Interviewer: 'After you had won Wimbledon, you were given a cow called Juliette when you returned to Switzerland. Is there another Juliette waiting for you?'

Federer: 'I hope not. By the way, Juliette is expecting a calf.'

Interviewer: 'Congratulations!'

Federer: 'Thanks, but I’m not the father.'


Before the US Open '05, Roddick was asked:

Interviewer: 'What are your chances in the US Open?'

Roddick: 'As good as anybody not named Roger.'


Another Roddick gem:

Interviewer: 'You have a very fast serve.'

Roddick: 'It killed a small dog.'

After which comment, he claimed he was joking because she was not laughing at all...

Roddick: 'I'm joking, I am joking...The dog was huge.'


The press conference he did after losing to Roger in the Aussie '07 SF...had some interesting transcriptions...

Reporter: 'What was it like for you being on the end of that?'
Andy: 'It was frustrating. It was miserable. It sucked. It was terrible. Besides that it was fine.'

Reporter: 'What did Jimmy (Connors—his coach) say after?'
Andy: 'He gave me a beer.'

Reporter: 'Take us from 4-4, because up 'til then you were in the match. Then you got broken.'
Andy: 'Then I got broken three more times. And two more times in the third set, and it was over 26 minutes later. Is that about what you saw too?'

Reporter: 'How do you rate Haas's or Gonzalez's chances in the final?'
Andy: 'Slim.'

Reporter: 'Your performance on here is better than on court.'
Andy: 'No shit. If there were rankings for press conferences I wouldn't have to worry about falling out of the top five I hope.'

Reporter: 'After a defeat like this do you sleep well?'
Andy: 'Depends how much I drink tonight.'

Reporter: 'How much would you have paid in order to not have to come to the press conference tonight?'
Andy: 'That's about the best question that's been asked. Well, I can't really say an amount because I would have gotten fined $20,000 (for not coming to the press conference). So, it would have to be less than that, right? If we're thinking logically. But it wouldn't be about the money. It would be about running away and not facing it. I would pay a lot of money if people would make stuff up and pretend I said it. But my dad didn't raise me like that, so here I am.'

The last one of his replies; Roddick was invited to some show, and the conversation with the host was like this:

Host: 'Do you have any hint for me? Did you bring me a present of any sort?'
Roddick: 'A present?'
Host: 'Yeah'
Roddick: 'It's compulsory?'
Host: 'Ya, Great Agassi came to the show, and he gave me the racket he won the Davis Cup with.'
Roddick: 'Really!!'
Host: 'You didn't bring me anything?'
Roddick: 'I CAN'T BRING YOU SHIT...'

Reporter: 'What did Jimmy say? US Open he got on a real roll too. Did you talk about what to do if Roger got on a roll, change strategy, slow it down?'
Andy: 'There's a lot of strategy talk. But not if you're down 6-4 6-0 2-0. We didn't really talk about that. Oops.'

One-Liners

Now, some good one-liners:

'It's just unreal, I'm shocked myself. I've played good matches here, but never really almost destroyed somebody. It's a match for him to forget...and for me to remember!'
—Roger Federer, after defeating Andy Roddick in the AO '07 SF

'Hey—you guys with the ladder. If you come here I'll buy you pizza.'
—Andy Roddick, calling out to firefighters in the process of rescuing Roddick and other hotel guests from a fire in Rome.

'I've got to feel good because (Djokovic) has got about 16 injuries'
—Andy Roddick on Djokovic's injuries.


Roddick: 'Isn't it both of them? And a back? and a hip? And a cramp... bird flu... anthrax. SARS, common cough, and a cold.'

'My hobbies include underwater fire extinguishing.'
—Andy Roddick

“If Pete’s child is a girl, my son will like her; if he’s a boy, my son will defeat him.” —Agassi.

Asked what it feels like to be the World No. 1, Roger jokingly replied:

'It's great. Everybody suddenly rates my good strokes as outstanding, and my poorer strokes as almost outstanding.'

'When I was 40, my doctor advised me that a man in his 40s shouldn't play tennis. I heeded his advice carefully and could hardly wait until I reached 50 to start again.'
—Hugo L. Black

'I did my job, got a beautiful cup and a beautiful cheque. That’s it. I didn’t change the world'
—Marat on his Slam wins

“Yeah, I choked, but shit happens”
—Marat Safin

'I'd like just one time to see you guys step up and do something for us.'
—Andy Roddick venting on ATP Supervisor Gayle Bradshaw after getting no love from the chair umpire at the ATP Scottsdale event.

'I don't go out there to love my enemy. I go out there to squash him.' —Jimmy Connors

'I am the best tennis player who cannot play tennis.' —Ion Tiriac

'Winners aren't popular, losers often are.' —Virginia Wade

'If you put two monkeys on to play you'd still pack the centre court.' —Neale Fraser, commenting on Wimbledon's popularity

'One day when a linesman starts to laugh, I swear I will hit the guy over the head with my racket. I think it will be the end of my career, but I will be happy.' —Ilie Nastase

'When I won Wimbledon, I said to God: Just let me win this one tournament and I won't play another match. Maybe God's telling me to go home, but I don't want to go home. We are negotiating at the moment.' —Goran Ivanisevic

'If I can't serve on grass, I can maybe help cut the grass, paint the lines, and serve some strawberries.' —Goran Ivanisevic

'The best doubles pair in the world is John McEnroe and anyone else.' —Peter Fleming

'I can't believe he is dumping me, his buddy for seven years, for a kid he's never seen before.' —Paul Haarhuis complaining about his doubles partner Jacco Eltingh flying home from the US Open for the birth of his son

'How to shake hands.' —Bettina Bunge, on what she had learned from a series of rapid defeats to Martina Navratilova

'Experience is a great advantage. The problem is that when you get the experience, you're too damned old to do anything about it.' —Jimmy Connors

'I love Wimbledon. But why don't they stage it in the summer?' —Vijay Amritraj during the rain-drenched 2007 Championships

'If you want to talk, it's okay with me. I sit and relax.' —Gael Monfils taking a seat while Nicolas Almagro debated with umpire and match referee, Australian Open 2009

'I'm gonna have to start winning some of the matches to call it a rivalry!' —Andy Roddick on being asked whether he and Roger Federer had a rivalry that would last for years

'Pete is a step and a half slower.' —Greg Rusedski after losing to Pete Sampras in the US Open

'Against him I don't need to be a step and a half quicker.' —Pete Sampras responding to Greg Rusedski's criticism—he went on to win the title!

'Umpiring, the only job in the world where you can screw up on a daily basis and still have one!' —Andy Roddick

'She doesn't sleep. At night she seems to turn into a vampire.' —Goran Ivanisevic on the joys of fatherhood

'Look, Nastase, we used to have a famous cricket match in this country called Gentlemen versus Players. The Gentlemen were put down on the scorecard as 'Mister' because they were gentlemen. By no stretch of the imagination can anybody call you a gentleman.' —Wimbledon umpire, on being told to address Ilie Nastase as 'Mister'

'Thanks, but no. I want to be a winner.' —Maria Sharapova on being compared to Anna Kournikova

'Going to the dentist. On second thought, I would rather have a root canal than play Santoro.' —Marat Safin, on being asked his biggest fear

'If I don't win tonight, I guess the sun will still come up in the morning.' —Arthur Ashe

'I had a feeling today that Venus Williams would either win or lose.' —Martina Navratilova

'The difference between night and day is, er, night and day.' —Tim Henman

Andrew Castle: 'Where are all these Serbians from?'
Greg Rusedski: 'Serbia?' (during Wimbledon 2007)

'There are no excuses. I could blame it on a lack of match practice time, or on playing the world No. 10...I had a sore stomach as well.' —Sania Mirza, Australian Open 2009

'I broke all my rackets. I didn't have a racket for the fifth set. I broke four. Now, I hold the record. Now, I go home. No rackets. I really don't like these rackets.' —Nikolay Davydenko, US Open 2008

'You're on live TV, you know. You look like a real moron right now.'
—The lovable Andy Roddick, yelling at a chair umpire at Indianapolis

'You're an idiot! Stay in school kids, or you'll end up being an umpire.'


Xbox 360 Games

Check out the latest Xbox 360 games here and pre-order the upcoming Xbox games here

Ballmer Peak (Blood Alcohol Concentration Vs Programming Skill)

Ballmer Peak

Tuesday, October 27, 2009

Holidays 2009 Toy List

Apple iPod touch 64 GB, Age 12 and up
Bakugan 7 in 1 Maxus Dragonoid, Ages 5 - 15
Barbie Pink 3-Story Dream Townhouse, Ages 3 - 10
Bop It, Ages 8 - 18
Crayola Crayon Maker, Ages 8 - 15
D-Rex Interactive Dinosaur, Ages 8 - 12
Eyeclops Mini Projector, Ages 8 - 15
Fender Starcaster Strat Pack Electric Guitar with Amp and Accessories, Age 12 and up
Fisher Price Elmo Tickle Hands, Ages 2 - 6
Harry Potter and the Half-Blood Prince, Age 8 and up
Huffy Green Machine 2, Age 6 and up
Infantino Twist and Fold Activity Gym, Ages 1 - 12 months
Kodak Zi8 HD Pocket Video Camera, Age 12 and up
LeapFrog Zippity High-Energy Learning System, Ages 3 - 5
LEGO City Corner, Ages 5 - 12
Liv Fashion Doll Sophie, Ages 4 - 8
Manhattan Toy Baby Stella Doll, Age 12 months and up
Matchbox Mega Rig Pirates Ship in Amazon Frustration-Free Packaging, Ages 4 - 12
Maxell M&M Earbud, Ages 12 and up
Mindflex Game, Ages 8 - 15
Monopoly City Edition, Age 8 and up
Monsters vs. Aliens, Age 5 and up
Nerf N Strike Elite Bundle, Ages 8 - 11
New Super Mario Bros, Ages 8 - 11
Nintendo DSi, Ages 8 - 11
Playskool Chuck My Talking Truck, Ages 3 - 6
Razor A Kick Scooter, Age 5 and up
Razor Rip-Rider 360 Drifting Ride-On, Age 5 and up
Scene It? Twilight Deluxe Edition, Age 13 and up
Schwinn Roadster 12-Inch Trike, Ages 3 - 6
Scribblenauts, Ages 8 - 11
Snow White and the Seven Dwarfs, Age 5 and up
Speakal iPig 2.1 Stereo iPod Docking Station with 5 Speakers, Age 12 and up
Sprig Toys Eco Recycling-Truck in Amazon Frustration-Free Packaging, Ages 3 - 6
Transformers Movie 2 Combiner Construction Devastator, Ages 5 - 12
Up, Age 5 and up
Wii Fit Plus with Balance Board, Age 12 and up
Zune HD 32 GB Video MP3 Player, Age 12 and up

Internet rules and laws

Here are the rules in short:
  1. As an internet discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches certainty.
  2. Without a winking smiley or other blatant display of humor, it is impossible to create a parody of fundamentalism that someone won’t mistake for the real thing.
  3. If it exists, there is porn of it.
  4. 'Any post correcting an error in another post will contain at least one error itself' or 'the likelihood of an error in a post is directly proportional to the embarrassment it will cause the poster.'
  5. In any discussion involving science or medicine, citing Whale.to as a credible source loses the argument immediately, and gets you laughed out of the room.
  6. If you have to insist that you’ve won an internet argument, you’ve probably lost badly.
  7. A person’s mind can be changed by reading information on the internet. The nature of this change will be from having no opinion to having a wrong opinion.
  8. Anyone who posts an argument on the internet which is largely quotations can be very safely ignored, and is deemed to have lost the argument before it has begun.
  9. Whoever resorts to the argument that ‘whoever resorts to the argument that... ...has automatically lost the debate’ has automatically lost the debate.
  10. The more exclamation points used in an email (or other posting), the more likely it is a complete lie. This is also true for excessive capital letters.

Amazon Relational Database Service (Amazon RDS) Launched

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business.

Amazon RDS gives you access to the full capabilities of a familiar MySQL database. This means the code, applications, and tools you already use today with your existing MySQL databases work seamlessly with Amazon RDS. Amazon RDS automatically patches the database software and backs up your database, storing the backups for a user-defined retention period. You also benefit from the flexibility of being able to scale the compute resources or storage capacity associated with your relational database instance via a single API call. As with all Amazon Web Services, there are no up-front investments required, and you pay only for the resources you use.

Monday, October 26, 2009

SuperFreakonomics: Global Cooling, Patriotic Prostitutes, and Why Suicide Bombers Should Buy Life Insurance


SuperFreakonomics by Steven D. Levitt and Stephen J. Dubner is a really good read. Here are the highlights of the book on Amazon

Four years in the making, SuperFreakonomics asks not only the tough questions, but the unexpected ones: What's more dangerous, driving drunk or walking drunk? Why is chemotherapy prescribed so often if it's so ineffective? Can a sex change boost your salary?

SuperFreakonomics challenges the way we think all over again, exploring the hidden side of everything with such questions as:

* How is a street prostitute like a department-store Santa?
* Why are doctors so bad at washing their hands?
* How much good do car seats do?
* What's the best way to catch a terrorist?
* Did TV cause a rise in crime?
* What do hurricanes, heart attacks, and highway deaths have in common?
* Are people hard-wired for altruism or selfishness?
* Can eating kangaroo save the planet?
* Which adds more value: a pimp or a Realtor?

Levitt and Dubner mix smart thinking and great storytelling like no one else, whether investigating a solution to global warming or explaining why the price of oral sex has fallen so drastically. By examining how people respond to incentives, they show the world for what it really is – good, bad, ugly, and, in the final analysis, super freaky.

Types of Cache Explained

Inline cache: A memory cache that resides next to the CPU, but shares the same system bus as other subsystems in the computer. It is faster than lookaside cache, but slower than a backside cache.

Backside cache: A level 2 memory cache that has a dedicated channel to the CPU, enabling it to run at the full speed of the CPU.

Lookaside cache: A memory cache that shares the system bus with main memory and other subsystems. It is slower than inline caches and backside caches.

More info here Intel Cache Overview

Monday, October 19, 2009

Steve Jobs more popular than Jesus



Source: Is Steve Jobs more popular than Jesus?

New Michael Jackson Song Unveiled Online

Michael Jackson's new song, "This Is It," is now available on his official website. The single is from MJ's upcoming documentary and "will appear on a two-disc set due October 27 that features master versions of some of Jackson's biggest hits in the sequence they appear in the film," reports MTV.

Sunday, September 27, 2009

10 Useful Usability Findings and Guidelines

10 useful usability findings and guidelines that may help you improve the user experience on your websites from Smashing Magazine

1. Form Labels Work Best Above The Field
2. Users Focus On Faces
3. Quality Of Design Is An Indicator Of Credibility
4. Most Users Do Not Scroll
5. Blue Is The Best Color For Links
6. The Ideal Search Box Is 27-Characters Wide
7. White Space Improves Comprehension
8. Effective User Testing Doesn’t Have To Be Extensive
9. Informative Product Pages Help You Stand Out
10. Most Users Are Blind To Advertising

Tuesday, September 22, 2009

Facebook Open Sources FriendFeed’s Real-Time Tech

Facebook is open sourcing a portion of FriendFeed named Tornado, a real-time web framework for Python.
Facebook employee, Dave Recordon, explains the open-sourcing today on Facebook’s Developers blog.

Tornado is a relatively simple, non-blocking Web server framework written in Python, designed to handle thousands of simultaneous connections, making it ideal for real-time Web services.

While Tornado is similar to existing Web-frameworks in Python (Django, Google’s webapp, web.py), it focuses on speed and handling large amounts of simultaneous traffic.
Three key parts of Tornado:
  • All the basic site building blocks – Tornado comes with built-in support for a lot of the most difficult and tedious aspects of web development, including templates, signed cookies, user authentication, localization, aggressive static file caching, cross-site request forgery protection, and third party authentication like Facebook Connect. You only need to use the features you want, and it is easy to mix and match Tornado with other frameworks.
  • Real-time services – Tornado supports large numbers of concurrent connections. It is easy to write real-time services via long polling or HTTP streaming with Tornado. Every active user of FriendFeed maintains an open connection to FriendFeed’s servers.
  • High performance – Tornado is pretty fast relative to most Python web frameworks. Tornado’s baseline throughput was over four times higher than the other frameworks.

Amazon Offering All Non-iPhone AT&T Phones Today For A Penny With New 2-Year Contract

Amazon.com today is offering every AT&T wireless handset - other than the iPhone - for a penny with a new two-year service contract. The site lists 84 handsets that can be yours for one cent, including models from Research In Motion, Samsung, LG, HTC and various other manufacturers. They’re also waiving the activation fee, and promising two-day free shipping.

Memory Mapped Buffers and Non-blocking IO in Java

The new I/O (input/output) packages finally address Java's long-standing shortcomings in its high-performance, scalable I/O. The new I/O packages -- java.nio.* -- allow Java applications to handle thousands of open connections while delivering scalability and excellent performance. These packages introduce four key abstractions that work together to solve the problems of traditional Java I/O:
  1. A Buffer contains data in a linear sequence for reading or writing. A special buffer provides for memory-mapped file I/O.
  2. A charset maps Unicode character strings to and from byte sequences. (Yes, this is Java's third shot at character conversion.)
  3. Channels -- which can be sockets, files, or pipes -- represent a bidirectional communication pipe.
  4. Selectors multiplex asynchronous I/O operations into one or more threads.
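
As a preview, here is a minimal sketch that touches all four abstractions in one place. It is not the article's sample code: it uses an in-process `java.nio.channels.Pipe` instead of sockets so that it runs without a network, and the class and method names are invented for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

class NioAbstractionsDemo {
    /** Sends a short text through a pipe and reads it back once a Selector reports it readable. */
    static String roundTrip(String text) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            // Channel + Selector: register the non-blocking read end for read readiness.
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            // Charset + Buffer: encode characters into a ByteBuffer and write it out.
            // Keep the text short: the sink blocks if the pipe's internal buffer fills up.
            ByteBuffer outgoing = StandardCharsets.UTF_8.encode(text);
            int expected = outgoing.remaining();
            while (outgoing.hasRemaining()) {
                sink.write(outgoing);
            }

            ByteBuffer incoming = ByteBuffer.allocate(expected);
            while (incoming.hasRemaining()) {
                selector.select();                        // wait until something is ready
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isReadable()) {
                        source.read(incoming);            // returns immediately
                    }
                }
                selector.selectedKeys().clear();          // keys must be cleared by hand
            }
            incoming.flip();                              // switch from filling to draining
            return StandardCharsets.UTF_8.decode(incoming).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello nio"));
    }
}
```

A real server would register many channels with the same selector, which is what lets one thread service many connections.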

A quick review

Before diving into the new API's gory details, let's review Java I/O's old style. Imagine a basic network daemon. It needs to listen to a ServerSocket, accept incoming connections, and service each connection. Assume for this example that servicing a connection involves reading a request and sending a response. That resembles the way a Web server works. Figure 1 depicts the server's lifecycle. At each heavy black line, the I/O operation blocks -- that is, the operation call won't return until the operation completes.

Figure 1. Blocking points in a typical Java server
Let's take a closer look at each step.
Creating a ServerSocket is easy:
ServerSocket server = new ServerSocket(8001);


Accepting new connections is just as easy, but with a hidden catch:
Socket newConnection = server.accept();


The call to server.accept() blocks until the ServerSocket accepts an incoming network connection. That leaves the calling thread sitting for an indeterminate length of time. If this application has only one thread, it does a great impression of a system hang.
Once the incoming connection has been accepted, the server can read a request from that socket, as shown in the code below. Don't worry about the Request object. It is a fiction invented to keep this example simple.
InputStream in = newConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
LineNumberReader lnr = new LineNumberReader(reader);
Request request = new Request();
while(!request.isComplete()) {
  String line = lnr.readLine();
  request.addLine(line);
}
This harmless-looking chunk of code features problems. Let's start with blocking. The call to lnr.readLine() eventually filters down to call SocketInputStream.read(). There, if data waits in the network buffer, the call immediately returns some data to the caller. If there isn't enough data buffered, then the call to read blocks until enough data is received or the other computer closes the socket. Because LineNumberReader asks for data in chunks (it extends BufferedReader), it might just sit around waiting to fill a buffer, even though the request is actually complete. The tail end of the request can sit in a buffer that LineNumberReader has not returned.

This code fragment also creates too much garbage, another big problem. LineNumberReader creates a buffer to hold the data it reads from the socket, but it also creates Strings to hold the same data. In fact, internally, it creates a StringBuffer. LineNumberReader reuses its own buffer, which helps a little. Nevertheless, all the Strings quickly become garbage.
Now it's time to send the response. It might look something like this (imagine that the Response object creates its stream by locating and opening a file):
Response response = request.generateResponse();
OutputStream out = newConnection.getOutputStream();
InputStream in = response.getInputStream();
int ch;
while(-1 != (ch = in.read())) {
  out.write(ch);
}
newConnection.close();
This code suffers from only two problems. Again, the read and write calls block. Writing one character at a time to a socket slows the process, so the stream should be buffered. Of course, if the stream were buffered, then the buffers would create more garbage.
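
One common way to buffer the copy described above is to move data in chunks through a single reusable byte array. The sketch below is only an illustration (the class name and the 8 KB chunk size are arbitrary choices); it reduces the number of calls but still blocks on every read and write.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class BufferedCopy {
    /** Copies everything from in to out in chunks and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[8192];      // one array, reused for every read
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);         // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }
}
```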

You can see that even this simple example features two problems that won't go away: blocking and garbage.

The old way to break through blocks

The usual approach to dealing with blocking I/O in Java involves threads -- lots and lots of threads. You can simply create a pool of threads waiting to process requests, as shown in Figure 2.

Figure 2. Worker threads to handle requests
Threads allow a server to handle multiple connections, but they still cause trouble. First, threads are not cheap. Each has its own stack and receives some CPU allocation. As a practical matter, a JVM might create dozens or even a few hundred threads, but it should never create thousands of them.
In a deeper sense, you don't need all those threads. They do not efficiently use the CPU. In a request-response server, each thread spends most of its time blocked on some I/O operation. These lazy threads offer an expensive approach to keeping track of each request's state in a state machine. The best solution would multiplex connections and threads so a thread could order some I/O work and go on to something productive, instead of just waiting for the I/O work to complete.
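
For reference, the thread-pool approach described above might look like the following sketch. It uses `java.util.concurrent`, which postdates the article, and a trivial echo handler in place of the article's fictional `Request` and `Response` objects; the pool size of 8 and the class name are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledBlockingServer {
    /** Reads one line (blocking) and echoes it back; stands in for real request handling. */
    static void handle(InputStream in, OutputStream out) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
        String line = reader.readLine();          // the worker thread blocks here
        if (line != null) {
            out.write(("echo: " + line + "\r\n").getBytes(StandardCharsets.US_ASCII));
            out.flush();
        }
    }

    /** Accept loop: this thread blocks in accept(), each worker blocks in handle(). */
    static void serve(ServerSocket server, ExecutorService workers) throws IOException {
        while (!server.isClosed()) {
            Socket connection = server.accept();
            workers.execute(() -> {
                try (Socket c = connection) {
                    handle(c.getInputStream(), c.getOutputStream());
                } catch (IOException e) {
                    // A failed connection should not take the worker thread down.
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(8);
        try (ServerSocket server = new ServerSocket(8001)) {
            serve(server, workers);
        } finally {
            workers.shutdown();
        }
    }
}
```

Each worker spends nearly all of its time parked in `readLine()`, which is the waste the text describes.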

New I/O, new abstractions

Now that we've reviewed the classic approach to Java I/O, let's look at how the new I/O abstractions work together to solve the problems we've seen with the traditional approach.
Along with each of the following sections, I refer to sample code (available in Resources) for an HTTP server that uses all these abstractions. Each section builds on the previous sections, so the final structure might not be obvious from just the buffer discussion.

Buffered to be easier on your stomach

Truly high-performance server applications must obsess about garbage collection. The unattainable ideal server application would handle a request and response without creating any garbage. The more garbage the server creates, the more often it must collect garbage. The more often it collects garbage, the lower its throughput.

Of course, it's impossible to avoid creating garbage altogether; you need to just manage it the best way you know how. That's where buffers come in. Traditional Java I/O wastes objects all over the place (mostly Strings). The new I/O avoids this waste by using Buffers to read and write data. A Buffer is a linear, sequential dataset and holds only one data type according to its class:
Buffer                      Description
java.nio.Buffer             Abstract base class
java.nio.ByteBuffer         Holds bytes. Can be direct or nondirect. Can be read from a ReadableByteChannel. Can be written to a WritableByteChannel.
java.nio.MappedByteBuffer   Holds bytes. Always direct. Contents are a memory-mapped region of a file.
java.nio.CharBuffer         Holds chars. Cannot be written to a Channel.
java.nio.DoubleBuffer       Holds doubles. Cannot be written to a Channel.
java.nio.FloatBuffer        Holds floats. Can be direct or nondirect.
java.nio.IntBuffer          Holds ints. Can be direct or nondirect.
java.nio.LongBuffer         Holds longs. Can be direct or nondirect.
java.nio.ShortBuffer        Holds shorts. Can be direct or nondirect.

Table 1. Buffer classes
You allocate a buffer by calling allocate(int capacity) on a concrete subclass; ByteBuffer additionally offers allocateDirect(int capacity). As a special case, you can create a MappedByteBuffer by calling FileChannel.map(FileChannel.MapMode mode, long position, long size).
A direct buffer allocates a contiguous memory block and uses native access methods to read and write its data. When you can arrange it, a direct buffer is the way to go. Nondirect buffers access their data through Java array accessors. Sometimes you must use a nondirect buffer: for example, the wrap methods (like ByteBuffer.wrap(byte[])) construct a Buffer on top of a Java array, so the result is always nondirect.
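To make the direct/nondirect distinction concrete, here is a minimal sketch. The class name BufferKinds and the helper writeThroughWrap are made up for illustration; only the ByteBuffer calls come from the JDK.

```java
import java.nio.ByteBuffer;

final class BufferKinds {
    /** Wraps an existing array and writes through the buffer into it. */
    static byte writeThroughWrap(byte value) {
        byte[] array = new byte[8];
        ByteBuffer wrapped = ByteBuffer.wrap(array); // wrap() always gives a nondirect buffer
        wrapped.put(0, value);                       // absolute put; position stays at 0
        return array[0];                             // the backing array sees the write
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(8);
        ByteBuffer heap = ByteBuffer.allocate(8);
        System.out.println("allocateDirect -> isDirect: " + direct.isDirect());
        System.out.println("allocate       -> isDirect: " + heap.isDirect());
        System.out.println("allocate       -> hasArray: " + heap.hasArray());
        System.out.println("wrap shares its array: " + (writeThroughWrap((byte) 42) == 42));
    }
}
```

isDirect() is the quickest way to check which kind of buffer you were handed by some other piece of code.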
When you allocate the Buffer, you fix its capacity; you can't resize these containers. Capacity refers to the number of primitive elements the Buffer can contain. Although you can put multibyte data types (short, int, float, long, and so on) into a ByteBuffer, its capacity is still measured in bytes. The ByteBuffer converts larger data types into byte sequences when you put them into the buffer. (See the next section for a discussion about byte ordering.) Figure 3 shows a brand new ByteBuffer created by the code below. The buffer features a capacity of eight bytes.
ByteBuffer example = ByteBuffer.allocateDirect(8);

Figure 3. A fresh ByteBuffer


The Buffer's position is the index of the next element that will be written or read. As you can see in Figure 3, position starts at zero for a newly allocated Buffer. As you put data into the Buffer, position climbs toward the limit. Figure 4 shows the same buffer after the calls in the next code fragment add some data.
example.put( (byte)0xca );
example.putShort( (short)0xfeba );
example.put( (byte)0xbe );

Figure 4. ByteBuffer after a few puts


Another of the buffer's important attributes is its limit. The limit is the first element that should not be read or written. Attempting to put() past the limit causes a BufferOverflowException. Similarly, attempting to get() past the limit causes a BufferUnderflowException. For a new buffer, the limit equals the capacity.

There is a trick to using buffers. Between filling the buffer with data and writing it on a Channel, the buffer must flip. Flipping a buffer primes it for a new sequence of operations. If you've been putting data into a buffer, flipping it ensures that it's ready to read the data. More precisely, flipping the buffer sets its limit to the current position and then resets its position to zero. Its capacity does not change. The following code flips the buffer. Figure 5 depicts the effect of flipping the sample buffer.
example.flip();



Figure 5. The flipped ByteBuffer
After the flip, the buffer can be read. In this example, get() returns four bytes before it throws a BufferUnderflowException.
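The fragments above can be pulled together into one runnable sketch that prints position, limit, and capacity after the flip. The class FlipDemo and its two helper methods are illustrative names, not part of the article's sample server.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

final class FlipDemo {
    /** Formats the three attributes discussed in the text. */
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit()
                + " capacity=" + b.capacity();
    }

    /** Repeats the article's puts on an 8-byte buffer, then flips it. */
    static ByteBuffer filledAndFlipped() {
        ByteBuffer example = ByteBuffer.allocateDirect(8);
        example.put((byte) 0xca);
        example.putShort((short) 0xfeba);
        example.put((byte) 0xbe);
        example.flip(); // limit = old position (4), position = 0
        return example;
    }

    public static void main(String[] args) {
        ByteBuffer example = filledAndFlipped();
        System.out.println(state(example));
        while (example.hasRemaining()) {
            System.out.printf("%02x ", example.get() & 0xff);
        }
        System.out.println();
        try {
            example.get(); // a fifth get() runs past the limit
        } catch (BufferUnderflowException e) {
            System.out.println("underflow after four bytes");
        }
    }
}
```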

An aside about byte ordering

Any data type larger than a byte must be stored in multiple bytes. A short (16 bits) requires two bytes, while an int (32 bits) requires four bytes. For a variety of historical reasons, different CPU architectures pack these bytes differently. On big-endian architectures, the most significant byte goes in the lowest address, as shown in Figure 6. Big-endian order is often referred to as network order.

Figure 6. Big-endian byte ordering
Little-endian architectures put the least significant byte first, as in Figure 7.

Figure 7. Little-endian byte ordering


Anyone who programs networks in C or C++ can rant at length about byte-ordering problems. Host byte order, network byte order, big endian, little endian ... they're a pain. If you put a short into a byte array in big-endian ordering and remove it in little-endian ordering, you receive a different number than you put in! (See Figure 8.)

Figure 8. The result of mismatched byte ordering
You might have noticed that the call to example.putShort() illustrated in Figure 4 resulted in 0xFE at Position 1 and 0xBA at Position 2. In other words, the most significant byte went into the lowest numbered slot. Therefore, Figure 4 offers an example of big-endian byte ordering. java.nio.ByteBuffer defaults to big-endian byte ordering on all machines, no matter what the underlying CPU might use. (In fact, Intel microprocessors are little endian.) ByteBuffer uses instances of java.nio.ByteOrder to determine its byte ordering. The static constants ByteOrder.BIG_ENDIAN and ByteOrder.LITTLE_ENDIAN do exactly what you would expect.
Essentially, if you talk to another Java program, leave the byte ordering alone and it will work. If you talk to a well-behaved socket application in any language, you should also leave the byte ordering alone. You fiddle with byte ordering in only two instances: when you talk to a poorly-behaved network application that does not respect network byte ordering, or when you deal with binary data files created on a little-endian machine.
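The mismatch shown in Figure 8 is easy to reproduce. The sketch below writes a short in one byte order and reads it back in another; ByteOrderDemo and roundTrip are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderDemo {
    /** Writes a short using writeOrder, then reads it back using readOrder. */
    static short roundTrip(short value, ByteOrder writeOrder, ByteOrder readOrder) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.order(writeOrder).putShort(value);
        buf.flip();
        return buf.order(readOrder).getShort();
    }

    public static void main(String[] args) {
        short value = (short) 0xfeba;
        short matched = roundTrip(value, ByteOrder.BIG_ENDIAN, ByteOrder.BIG_ENDIAN);
        short mismatched = roundTrip(value, ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN);
        System.out.println("default order: " + ByteBuffer.allocate(2).order());
        System.out.println("matched:    " + Integer.toHexString(matched & 0xffff));
        System.out.println("mismatched: " + Integer.toHexString(mismatched & 0xffff));
    }
}
```

When the orders differ, the two bytes come back swapped, which is exactly the failure a C programmer guards against with htons() and ntohs().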

How do buffers help?



So how can buffers improve performance and cut down on garbage? You could create a pool of direct Buffers to avoid allocations during request processing. Or you could create Buffers for common situations and keep them around. The following fragment from our sample HTTP server illustrates the latter approach:
class ReadWriteThread extends Thread {
  ...
  private WeakHashMap fileCache = new WeakHashMap();
  private ByteBuffer[] responseBuffers = new ByteBuffer[2];
  ...
  public ReadWriteThread(Selector readSelector, 
                         ConnectionList acceptedConnections, 
                         File dir) 
      throws Exception 
  {
    super("Reader-Writer");
    ...
    responseBuffers[0] = initializeResponseHeader();
    ...
  }
  ...
  protected ByteBuffer initializeResponseHeader() throws Exception {
    // Pre-load a "good" HTTP response as characters.
    CharBuffer chars = CharBuffer.allocate(88);
    chars.put("HTTP/1.1 200 OK\n");
    chars.put("Connection: close\n");
    chars.put("Server: Java New I/O Example\n");
    chars.put("Content-Type: text/html\n");
    chars.put("\n");
    chars.flip();
    // Translate the Unicode characters into ASCII bytes.
    ByteBuffer buffer = ascii.newEncoder().encode(chars);
    ByteBuffer directBuffer = ByteBuffer.allocateDirect(buffer.limit());
    directBuffer.put(buffer);
    return directBuffer;
  }
  ...
}
The above code is an excerpt from the thread that reads requests and sends responses. In the constructor, we set up two ByteBuffers for the responses. The first buffer always contains the HTTP response header. This particular server always sends the same headers and the same response code. To send error responses, the method sendError() (not shown above) creates a similar buffer with an HTTP error response for a particular status code. It saves the error response headers in a WeakHashMap, keyed by the HTTP status code.

The initializeResponseHeader() method actually uses three buffers. It fills a CharBuffer with Strings. The character set encoder turns the Unicode strings into bytes. I will cover character conversion later. Since this header is sent at the beginning of every response from the server, it saves time to create the response once, save it in a buffer, and just send the buffer every time. Notice the call to flip the CharBuffer after we put our data into it.

The third buffer used in initializeResponseHeader() seems a bit odd. Why convert the characters into a ByteBuffer just to then copy them into another ByteBuffer? The answer: because CharsetEncoder creates a nondirect ByteBuffer. When you write a direct buffer to a channel, it immediately passes to native calls. However, when you pass a nondirect buffer to a channel, the channel provider creates a new, direct buffer and copies the nondirect buffer's contents. That means extra garbage and a data copy. It worsens when the buffer with the response header is sent in every HTTP response. Why let the channel provider create a direct buffer on every request if we can do it once and get it over with?
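As a rough, self-contained illustration of the "encode once, send many times" idea, the sketch below builds a small header in a direct buffer and writes it twice, rewinding in between. ReusableHeader, buildHeader, and sendTwice are invented names, the header text is shortened, and a ByteArrayOutputStream wrapped in a channel stands in for the client socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

final class ReusableHeader {
    /** Encodes the header once and copies it into a direct buffer. */
    static ByteBuffer buildHeader() {
        ByteBuffer encoded = StandardCharsets.US_ASCII.encode("HTTP/1.1 200 OK\r\n\r\n");
        ByteBuffer direct = ByteBuffer.allocateDirect(encoded.remaining());
        direct.put(encoded);
        direct.flip(); // ready for reading from position 0
        return direct;
    }

    /** Sends the same buffer twice; rewind() resets position but keeps the limit. */
    static String sendTwice() {
        ByteBuffer header = buildHeader();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        WritableByteChannel channel = Channels.newChannel(sink); // stand-in for a socket
        try {
            for (int i = 0; i < 2; i++) {
                header.rewind();
                while (header.hasRemaining()) {
                    channel.write(header);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.print(sendTwice());
    }
}
```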

Character encoding

When putting data into ByteBuffers, two related problems crop up: byte ordering and character conversion. ByteBuffer handles byte ordering internally using the ByteOrder class. It does not deal with character conversion, however. In fact, ByteBuffer doesn't even have methods for reading or writing strings.

Character conversion is a complicated topic, subject to many international standards, including the Internet Engineering Task Force's requests for comments, the Unicode Standard, and the Internet Assigned Numbers Authority (IANA). However, almost every time you deal with character conversion, you must convert Unicode strings to either ASCII or UTF-8. Fortunately, these are easy cases to handle.

ASCII and UTF-8 are examples of character sets. A character set defines a mapping from Unicode to bytes and back again. Character sets are named according to IANA standards. In Java, a character set is represented by an instance of java.nio.charset.Charset. As with most internationalization classes, you do not construct Charsets directly. Instead, you use the static factory method Charset.forName() to acquire an appropriate instance. Charset.availableCharsets() gives you a map of supported character set names and their Charset instances. The J2SE 1.4 beta includes eight character sets: US-ASCII, ISO-8859-1, ISO-8859-15, UTF-8, UTF-16, UTF-16BE (big endian), UTF-16LE (little endian), and Windows-1252.
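A short sketch of the Charset workflow described above: look up a character set by name, encode a string to bytes, and decode it back. CharsetDemo and its helpers are illustrative names; the unchecked exception wrapping is a choice made for this example only.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

final class CharsetDemo {
    /** Encodes text with the named charset; fails on unmappable characters. */
    static ByteBuffer encode(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        try {
            return cs.newEncoder().encode(CharBuffer.wrap(text));
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(text + " is not encodable as " + charsetName, e);
        }
    }

    /** Encodes and then decodes with the same charset. */
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        try {
            return cs.newDecoder().decode(encode(text, charsetName)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("GET / HTTP/1.0", "US-ASCII"));
        // One accented character needs two bytes in UTF-8.
        System.out.println(encode("\u00e9", "UTF-8").remaining());
        System.out.println(Charset.availableCharsets().containsKey("UTF-8"));
    }
}
```

An encoder obtained from newEncoder() reports unmappable input by throwing, so encoding an accented character as US-ASCII fails rather than silently substituting a question mark.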

Charset constructs CharsetEncoders and CharsetDecoders to convert character sequences into bytes and back again. Take another look at ReadWriteThread below. The encoder shows up twice for converting an entire CharBuffer into a ByteBuffer. readRequest, on the other hand, uses the decoder on the incoming request.
class ReadWriteThread extends Thread {
  ...
  private Charset ascii;
  ...
  public ReadWriteThread(Selector readSelector, 
                         ConnectionList acceptedConnections, 
                         File dir) 
      throws Exception 
  {
    super("Reader-Writer");
    ...
    ascii = Charset.forName("US-ASCII");
    responseBuffers[0] = initializeResponseHeader();
    ...
  }
  ...
  protected ByteBuffer initializeResponseHeader() throws Exception {
    ...
    // Translate the Unicode characters into ASCII bytes.
    ByteBuffer buffer = ascii.newEncoder().encode(chars);
    ...
  }
  ...
  protected String readRequest(SelectionKey key) throws Exception {
    SocketChannel incomingChannel = (SocketChannel)key.channel();
    Socket incomingSocket = incomingChannel.socket();
    ...
    int bytesRead = incomingChannel.read(readBuffer);
    readBuffer.flip();
    String result = asciiDecoder.decode(readBuffer).toString();
    readBuffer.clear();
    StringBuffer requestString = (StringBuffer)key.attachment();
    requestString.append(result);
    ...
  }
  ...
  protected void sendError(SocketChannel channel, 
                           RequestException error) throws Exception {
      ...
      // Translate the Unicode characters into ASCII bytes.
      buffer = ascii.newEncoder().encode(chars);
      errorBufferCache.put(error, buffer);
      ...
  }
}

Channel the new way



You might notice that none of the existing java.io classes know how to read or write Buffers. In Merlin, Channels read data into Buffers and send data from Buffers. Channels join Streams and Readers as a key I/O construct. A channel might be thought of as a connection to some device, program, or network. At the top level, the java.nio.channels.Channel interface just knows whether it is open or closed. A nifty feature of Channel is that one thread can be blocked on an operation, and another thread can close the channel. When the channel closes, the blocked thread awakens with an exception indicating that the channel closed. There are several Channel classes, as shown in Figure 9.

Figure 9. Channel interface hierarchy
Additional interfaces depicted in Figure 9 add methods for reading (java.nio.channels.ReadableByteChannel), writing (java.nio.channels.WritableByteChannel), and scatter/gather operations. A gathering write can write data from several buffers to the channel in one contiguous operation. Conversely, a scattering read can read data from the channel and deposit it into several buffers, filling each one in turn to its limit. Scatter/gather operations have been used for years in high-performance I/O managers in Unix and Windows NT. SCSI controllers also employ scatter/gather to improve overall performance. In Java, the channels quickly pass scatter/gather operations down to the native operating system functions for vectored I/O. Scatter/gather operations also ease protocol or file handling, particularly when you create fixed headers in some buffers and change only one or two variable data buffers.

You can configure channels for blocking or nonblocking operations. When blocking, calls to read, write, or other operations do not return until the operation completes. Large writes over a slow socket can take a long time. In nonblocking mode, a call to write a large buffer over a slow socket would just queue up the data (probably in an operating system buffer, though it could even queue it up in a buffer on the network card) and return immediately. The thread can move on to other tasks while the operating system's I/O manager finishes the job. Similarly, the operating system always buffers incoming data until the application asks for it. When blocking, if the application asks for more data than the operating system has received, the call blocks until more data comes in. In nonblocking mode, the application just gets whatever data is immediately available.

The sample code included with this article uses each of the following three channels at various times:
  • ServerSocketChannel
  • SocketChannel
  • FileChannel
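The gathering write mentioned above can be tried without a network connection, because FileChannel also accepts an array of buffers. This is only a sketch: GatherDemo and gather are hypothetical names, and a temporary file stands in for the client socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class GatherDemo {
    /** Writes a fixed header buffer and a variable body buffer in one gathering call. */
    static String gather(String header, String body) {
        ByteBuffer[] parts = {
            StandardCharsets.US_ASCII.encode(header),
            StandardCharsets.US_ASCII.encode(body)
        };
        try {
            Path tmp = Files.createTempFile("gather", ".txt");
            try (FileChannel out = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                // write(ByteBuffer[]) drains the buffers in array order.
                while (parts[0].hasRemaining() || parts[1].hasRemaining()) {
                    out.write(parts);
                }
            }
            String written = new String(Files.readAllBytes(tmp), StandardCharsets.US_ASCII);
            Files.delete(tmp);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(gather("Content-Type: text/plain\n\n", "hello\n"));
    }
}
```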


ServerSocketChannel

java.nio.channels.ServerSocketChannel plays the same role as java.net.ServerSocket. It creates a listening socket that accepts incoming connections. It cannot read or write. ServerSocketChannel.socket() provides access to the underlying ServerSocket, so you can still set socket options that way. As is the case with all the specific channels, you do not construct ServerSocketChannel instances directly. Instead, use the ServerSocketChannel.open() factory method.
ServerSocketChannel.accept() returns a java.nio.channels.SocketChannel for a newly connected client. (Note: Before Beta 3, accept() returned a java.net.Socket. Now the method returns a SocketChannel, which is less confusing for developers.) If the ServerSocketChannel is in blocking mode, accept() won't return until a connection request arrives. (There is an exception: you can set a socket timeout on the ServerSocket. In that case, a timed-out accept() eventually throws a java.net.SocketTimeoutException.) If the ServerSocketChannel is in nonblocking mode, accept() always returns immediately with either a SocketChannel or null. In the sample code, AcceptThread constructs a ServerSocketChannel called ssc and binds it to a local TCP port:
class AcceptThread extends Thread {
  private ServerSocketChannel ssc;
  public AcceptThread(Selector connectSelector, 
                      ConnectionList list, 
                      int port) 
      throws Exception 
  {
    super("Acceptor");
    ...
    ssc = ServerSocketChannel.open();
    ssc.configureBlocking(false);
    InetSocketAddress address = new InetSocketAddress(port);
    ssc.socket().bind(address);
    ...
  }
  ...
}
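To see the nonblocking behavior of accept() in isolation, the following sketch binds a channel to an ephemeral loopback port and calls accept() while no client is connecting. NonblockingAccept and its method are invented for this example, and it assumes the environment allows binding a local TCP port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class NonblockingAccept {
    /** Returns true if accept() came back with null instead of blocking. */
    static boolean acceptReturnsNullWhenIdle() {
        try (ServerSocketChannel ssc = ServerSocketChannel.open()) {
            ssc.configureBlocking(false);
            // Port 0 asks the operating system for any free port.
            ssc.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            SocketChannel client = ssc.accept(); // nobody is connecting
            return client == null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("accept() returned null: " + acceptReturnsNullWhenIdle());
    }
}
```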

SocketChannel

java.nio.channels.SocketChannel is the real workhorse in this application. It encapsulates a java.net.Socket and adds a nonblocking mode and a state machine.

SocketChannels can be created one of two ways. First, SocketChannel.open() creates a new, unconnected SocketChannel. Second, ServerSocketChannel.accept() returns a SocketChannel that is already open and connected. This code fragment, from AcceptThread, illustrates the second approach to acquiring a SocketChannel:
class AcceptThread extends Thread {
  private ConnectionList acceptedConnections;
  ...
  protected void acceptPendingConnections() throws Exception {
    ...
    for(Iterator i = readyKeys.iterator(); i.hasNext(); ) {
      ...
      ServerSocketChannel readyChannel = (ServerSocketChannel)key.channel();
      SocketChannel incomingChannel = readyChannel.accept();
      acceptedConnections.push(incomingChannel);
    }
  }
}
Like SelectableChannel's other subclasses, SocketChannel can be blocking or nonblocking. If it is blocking, then read and write operations on the SocketChannel behave exactly like blocking reads and writes on a Socket, with one vital exception: these blocking reads and writes can be interrupted if another thread closes the channel.

FileChannel

Unlike SocketChannel and ServerSocketChannel, java.nio.channels.FileChannel does not derive from SelectableChannel. As you will see in the next section, that means that FileChannels cannot be used for nonblocking I/O. Nevertheless, FileChannel has a slew of sophisticated features that were previously reserved for C programmers. FileChannels allow locking of file portions and direct file-to-file transfers that use the operating system's file cache. FileChannel can also map file regions into memory.

Memory mapping a file uses the native operating system's memory manager to make a file's contents look like memory locations. For more efficient mapping, the operating system uses its disk paging system. From the application's perspective, the file contents just exist in memory at some range of addresses. When it maps a file region into memory, FileChannel creates a MappedByteBuffer to represent that memory region. MappedByteBuffer is a type of direct byte buffer.

A MappedByteBuffer offers two big advantages. First, reading memory-mapped files is fast. The biggest gains go to sequential access, but random access also speeds up. The operating system can page the file into memory far better than java.io.BufferedInputStream can do its block reads. The second advantage is that using MappedByteBuffers to send files is simple, as shown in the next code fragment, also from ReadWriteThread:
protected void sendFile(String uri, SocketChannel channel)
    throws RequestException, IOException {
  if(Server.verbose)
    System.out.println("ReadWriteThread: Sending " + uri);
  Object obj = fileCache.get(uri);

  if(obj == null) {
    // Cache miss: map the file into memory and remember the buffer.
    Server.statistics.fileMiss();
    try {
      File f = new File(baseDirectory, uri);
      FileInputStream fis = new FileInputStream(f);
      FileChannel fc = fis.getChannel();

      int fileSize = (int)fc.size();
      responseBuffers[1] = fc.map(FileChannel.MapMode.READ_ONLY, 0, fileSize);
      fileCache.put(uri, responseBuffers[1]);
    } catch(FileNotFoundException fnfe) {
      throw RequestException.PAGE_NOT_FOUND;
    }
  } else {
    // Cache hit: reuse the mapped buffer from the start.
    Server.statistics.fileHit();
    responseBuffers[1] = (MappedByteBuffer)obj;
    responseBuffers[1].rewind();
  }
  // Gathering write: header buffer first, then the file contents.
  responseBuffers[0].rewind();
  channel.write(responseBuffers);
}
The sendFile() method sends a file as an HTTP response. The lines inside the try block create the MappedByteBuffer. The rest of the method caches the memory-mapped file buffers in a WeakHashMap. That way, repeated requests for the same file are blindingly fast, yet when memory tightens, the garbage collector eliminates the cached files. (Strictly speaking, a WeakHashMap holds its keys weakly, so an entry becomes collectible as soon as nothing else references its key, whether or not memory is tight.) You could keep the buffers in a normal HashMap, but only if you know that the number of files is small (and fixed).

Notice that the call to channel.write() actually passes an array of two ByteBuffers (one direct, one mapped). Passing two buffers makes the call a gathering write operation. The first buffer is fixed to contain the HTTP response code, headers, and body separator. The second buffer is the memory-mapped file. The channel sends the entire contents of the first buffer (the response header) followed by the entire contents of the second buffer (the file data).
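For readers who want to try memory mapping outside the sample server, here is a minimal sketch that maps a small temporary file read-only and decodes its contents. MapDemo and mapAndRead are made-up names; the temp file is marked for deletion at exit because some platforms refuse to delete a file while it is still mapped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MapDemo {
    /** Writes contents to a temp file, maps the file, and reads it back. */
    static String mapAndRead(String contents) {
        try {
            Path tmp = Files.createTempFile("mapped", ".txt");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, contents.getBytes(StandardCharsets.US_ASCII));
            try (FileChannel fc = FileChannel.open(tmp, StandardOpenOption.READ)) {
                MappedByteBuffer mapped =
                        fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
                // The mapping stays valid even after the channel is closed.
                return StandardCharsets.US_ASCII.decode(mapped).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mapAndRead("<html>hello</html>"));
    }
}
```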

The bridge to the old world



Before moving on to nonblocking operations, you should investigate the class java.nio.channels.Channels. Channels allows new I/O channels to interoperate with old I/O streams and readers. Channels has static methods that can create a channel from a stream or reader or vice versa. It proves most useful when you deal with third-party packages that expect streams, such as XML parsers.
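Both directions of the bridge fit in a few lines. In this sketch, BridgeDemo and its methods are illustrative names; Channels.newChannel() and Channels.newInputStream() are the JDK factory methods being demonstrated.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

final class BridgeDemo {
    /** Old to new: reads an InputStream's bytes through a channel into a buffer. */
    static String streamToBuffer(String text) {
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        try (ReadableByteChannel channel =
                     Channels.newChannel(new ByteArrayInputStream(data))) {
            ByteBuffer buf = ByteBuffer.allocate(data.length);
            while (buf.hasRemaining() && channel.read(buf) != -1) {
                // keep reading until the buffer is full or the stream ends
            }
            buf.flip();
            return StandardCharsets.US_ASCII.decode(buf).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** New to old: presents a channel as an InputStream for stream-based code. */
    static int firstByteViaStream(String text) {
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(data));
        try (InputStream in = Channels.newInputStream(channel)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(streamToBuffer("<doc/>"));
        System.out.println(firstByteViaStream("A"));
    }
}
```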

Selectors

In the old days of blocking I/O, you always knew when you could read or write to a stream, because your call would not return until the stream was ready. Now, with nonblocking channels, you need some other way to tell when a channel is ready. In the new I/O packages, Selectors serve that purpose.

In Pattern-Oriented Software Architecture, Volume 2, by Douglas Schmidt, Michael Stal, Hans Rohnert, and Frank Buschmann (John Wiley & Sons, 2000), the authors present a pattern called Reactor. Reactor allows applications to decouple event arrival from event handling. Events arrive at arbitrary times but are not immediately dispatched. Instead, a Reactor keeps track of the events until the handlers ask for them.
A java.nio.channels.Selector plays the role of a Reactor. A Selector multiplexes events on SelectableChannels. In other words, a Selector provides a rendezvous point between I/O events on channels and client applications. Each SelectableChannel can register interest in certain events. Instead of notifying the application when the events happen, the channels track the events. Later, when the application calls one of the selection methods on the Selector, it polls the registered channels to see if any interesting events have occurred. Figure 10 depicts an example of a selector with two registered channels.

Figure 10. A selector polling its channels
Channels only register for operations they have interest in. Not every channel supports every operation. SelectionKey defines all possible operation bits, which are used twice. First, when the application registers the channel by calling SelectableChannel.register(Selector sel, int operations), it passes the desired operation bits, combined with bitwise OR, as the second argument. Then, once a SelectionKey has been selected, the SelectionKey's readyOps() method returns the combined operation bits that its channel is ready to perform. SelectableChannel.validOps() returns the allowed operations for each channel. Attempting to register a channel for operations it doesn't support results in an IllegalArgumentException. The following table lists the valid operations for each concrete subclass of SelectableChannel:
Channel class          Valid operations
ServerSocketChannel    OP_ACCEPT
SocketChannel          OP_CONNECT, OP_READ, OP_WRITE
DatagramChannel        OP_READ, OP_WRITE
Pipe.SourceChannel     OP_READ
Pipe.SinkChannel       OP_WRITE

Table 2. SelectableChannels and their valid operations
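The first row of Table 2 can be checked directly. The sketch below asks a ServerSocketChannel for its validOps() and then tries to register it for OP_READ, which it does not support. ValidOpsDemo and its methods are hypothetical names used only here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

final class ValidOpsDemo {
    /** True if OP_ACCEPT is the only operation the channel supports. */
    static boolean serverChannelAcceptsOnly() {
        try (ServerSocketChannel ssc = ServerSocketChannel.open()) {
            return ssc.validOps() == SelectionKey.OP_ACCEPT;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if registering for an unsupported operation is rejected. */
    static boolean rejectsUnsupportedOp() {
        try (Selector selector = Selector.open();
             ServerSocketChannel ssc = ServerSocketChannel.open()) {
            ssc.configureBlocking(false); // required before register()
            try {
                ssc.register(selector, SelectionKey.OP_READ);
                return false;
            } catch (IllegalArgumentException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Operation bits combine with bitwise OR and are tested with bitwise AND.
        int ops = SelectionKey.OP_READ | SelectionKey.OP_WRITE;
        System.out.println("write requested: " + ((ops & SelectionKey.OP_WRITE) != 0));
        System.out.println("accept only: " + serverChannelAcceptsOnly());
        System.out.println("OP_READ rejected: " + rejectsUnsupportedOp());
    }
}
```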


A channel can register for different operation sets on different selectors. When the operating system indicates that a channel can perform one of the valid operations that it registered for, the channel is ready.

On each selection call, a selector undergoes a series of actions. First, every key cancelled since the last selection drops from the selector's key set. A key can be cancelled by explicitly calling SelectionKey.cancel(), by closing the key's channel, or by closing the key's selector. Keys can be cancelled asynchronously, even while the selector is blocking. Second, the selector checks each channel to see if it's ready. If it is, then the selector adds the channel's key to the ready set. When a key is in the ready set, the key's readyOps() method always returns a set of operations that the key's channel can perform. If the key was already in the ready set before this call to select(), then the new operations are added to the key's readyOps(), so that the key reflects all the available operations.
Next, if any keys have been cancelled while the operating system checks are underway, they drop from the ready set and the registered key set.
Finally, the selector returns the number of keys in its ready set. The set itself can be obtained with the selectedKeys() method. If you call Selector.selectNow() and no channels are ready, then selectNow() just returns zero. On the other hand, if you call Selector.select() or Selector.select(long timeout), then the selector blocks until at least one channel is ready or the timeout is reached. Selectors should be familiar to Unix or Win32 system programmers, who will recognize them as object-oriented versions of select() or WaitForMultipleObjects(). Before Merlin, asynchronous I/O was the domain of C or C++ programmers; now it is available to Java programmers too. See the sidebar, "Is the New I/O Too Platform-Specific?", for a discussion of why Java is just now acquiring asynchronous I/O.
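The difference between selectNow() and a blocking select can be observed with a Pipe, which needs no network setup. This is a sketch under stated assumptions: SelectDemo and selectOnPipe are invented names, the message is assumed to fit in 64 bytes, and the result string format is arbitrary.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class SelectDemo {
    /** Returns "idleCount/readyCount/data" for one write through a pipe. */
    static String selectOnPipe(String message) {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                int idle = selector.selectNow();   // nothing written yet
                sink.write(StandardCharsets.US_ASCII.encode(message));
                int ready = selector.select(1000); // waits up to one second

                StringBuilder data = new StringBuilder();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove(); // the selector never removes selected keys itself
                    if (key.isReadable()) {
                        ByteBuffer buf = ByteBuffer.allocate(64);
                        ((Pipe.SourceChannel) key.channel()).read(buf);
                        buf.flip();
                        data.append(StandardCharsets.US_ASCII.decode(buf));
                    }
                }
                return idle + "/" + ready + "/" + data;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectOnPipe("ping"));
    }
}
```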

Applying selectors



The sample application uses two selectors. In AcceptThread, the first selector just handles the ServerSocketChannel:
class AcceptThread extends Thread {
  private ServerSocketChannel ssc;
  private Selector connectSelector;
  private ConnectionList acceptedConnections;
  public AcceptThread(Selector connectSelector, 
                      ConnectionList list, 
                      int port) 
      throws Exception 
  {
    super("Acceptor");
    this.connectSelector = connectSelector;
    this.acceptedConnections = list;
    ssc = ServerSocketChannel.open();
    ssc.configureBlocking(false);
    InetSocketAddress address = new InetSocketAddress(port);
    ssc.socket().bind(address);
    ssc.register(this.connectSelector, SelectionKey.OP_ACCEPT);
  }
   
  public void run() {
    while(true) {
      try {
        connectSelector.select();
        acceptPendingConnections();
      } catch(Exception ex) {
        ex.printStackTrace();
      }
    }
  }
  protected void acceptPendingConnections() throws Exception {
    Set readyKeys = connectSelector.selectedKeys();
    for(Iterator i = readyKeys.iterator(); i.hasNext(); ) {
      SelectionKey key = (SelectionKey)i.next();
      i.remove();
      ServerSocketChannel readyChannel = (ServerSocketChannel)key.channel();
      SocketChannel incomingChannel = readyChannel.accept();
      acceptedConnections.push(incomingChannel);
    }
  }
}
AcceptThread uses connectSelector to detect incoming connection attempts. Whenever the selector indicates that the ServerSocketChannel is ready, there must be a connection attempt. AcceptThread.acceptPendingConnections() iterates through the selected keys (there can be only one) and removes each key from the set. Thanks to the selector, we know that the call to ServerSocketChannel.accept() returns immediately. The call returns a SocketChannel representing the client connection. That new channel passes to ReadWriteThread, by way of a FIFO (first in, first out) queue.

ReadWriteThread uses readSelector to find out when a request has been received. Because it is only a sample, our server application assumes that all requests arrive in a single TCP packet. That is not a good assumption for a real Web server. Other code samples have already shown ReadWriteThread's buffer management, file mapping, and response sending, so this listing contains only the selection code:
class ReadWriteThread extends Thread {
  private Selector readSelector;
  private ConnectionList acceptedConnections;
  ...
  public void run() {
    while(true) {
      try {
        registerNewChannels();
        int keysReady = readSelector.select();
        if(keysReady > 0) {
          acceptPendingRequests();
        }
      } catch(Exception ex) {
        ex.printStackTrace();
      }
    }
  }
  protected void registerNewChannels() throws Exception {
    SocketChannel channel;
    while(null != (channel = acceptedConnections.removeFirst())) {
      channel.configureBlocking(false);
      channel.register(readSelector, SelectionKey.OP_READ, new StringBuffer());
    }  
  }
  protected void acceptPendingRequests() throws Exception {
    Set readyKeys = readSelector.selectedKeys();
    for(Iterator i = readyKeys.iterator(); i.hasNext(); ) {
      SelectionKey key = (SelectionKey)i.next();
      i.remove();
      SocketChannel incomingChannel = (SocketChannel)key.channel();
      Socket incomingSocket = incomingChannel.socket();
      ...
            String path = readRequest(key);
            sendFile(path, incomingChannel);
      ...
    }
  }
  ...
}


The main loops of each thread resemble each other, with the main difference being that ReadWriteThread registers for OP_READ, while AcceptThread registers for OP_ACCEPT. Naturally, the ways in which the respective events are handled differ; overall, however, both threads are instances of the Reactor pattern.
The third argument of register() is an attachment to the SelectionKey, which can be any object. The key holds on to the attachment for later use. Here, the attachment is a StringBuffer that readRequest uses to receive the incoming HTTP request. Each time readRequest reads a Buffer from the socket, it decodes the buffer and appends it to the request string. Once the request string is finished, readRequest calls handleCompleteRequest to parse the request and send the response.
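The attachment mechanism can be sketched in isolation. Below, a StringBuilder attached at registration time accumulates a request that arrives in several pieces, much as readRequest is described as doing. AttachmentDemo and accumulate are hypothetical names, a Pipe stands in for the client socket, and each chunk is assumed to fit in the 256-byte read buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class AttachmentDemo {
    /** Sends each chunk separately and appends what arrives to the key's attachment. */
    static String accumulate(String... chunks) {
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                // Third argument: per-channel state carried by the key.
                SelectionKey registered =
                        source.register(selector, SelectionKey.OP_READ, new StringBuilder());
                ByteBuffer buf = ByteBuffer.allocate(256);
                for (String chunk : chunks) {
                    sink.write(StandardCharsets.US_ASCII.encode(chunk));
                    selector.select(1000);
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        buf.clear();
                        ((Pipe.SourceChannel) key.channel()).read(buf);
                        buf.flip();
                        StringBuilder request = (StringBuilder) key.attachment();
                        request.append(StandardCharsets.US_ASCII.decode(buf).toString());
                    }
                }
                return registered.attachment().toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(accumulate("GET /index.html", " HTTP/1.0"));
    }
}
```

Keeping the partial request on the key means the reading thread holds no per-connection state of its own, which is what lets one thread serve many connections.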

A few gotchas

You might encounter a few tricky points when using selectors. First, although selectors are thread safe, their key sets are not. When Selector.selectedKeys() is called, it actually returns the Set that the selector uses internally. That means that any other selection operations (perhaps called by other threads) can change the Set. Second, once the selector puts a key in the selected set, it stays there. The only time a selector removes keys from the selected set is when the key's channel closes. Otherwise, the key stays in the selected set until it is explicitly removed. Third, registering a new channel with a selector that is already blocking does not wake up the selector. Although the selector appears as if it misses events, it will detect the events with the next call to select(). Fourth, in the release described here, a selector can have only 63 channels registered, which is probably not a big deal.

Monday, September 21, 2009

'AmazonBasics' - Amazon expands private label offerings with foray into CE

Amazon announced that it is expanding its private label business with the introduction of AmazonBasics. There are 30 products being offered so far in the consumer electronics accessories category.

Friday, September 18, 2009

Programmers top 10 sentences

1. WTF!

The most repeated sentence in code reviews…



2. It works on my machine!

We all have used this one when blamed for some error…

3. D’oh!

- Hi Homer, have you removed the debug code from production?


4. It will be ready tomorrow.

The problem with this sentence is that we use it again the next day, and the next day, and the next day…

5. Have you tried turning it off and on again?

The TV series “The IT Crowd” has helped to make this one even more popular…

6. Why?

Why do we keep asking why?

7. It’s not a bug, it’s a feature.

- It restarts twice a day?!! Well that makes sure that the temporary files get deleted!



8. That code is crap.

All code is crap except mine.

9. My code is compiling…


10. No, I don’t know how to fix the microwave.

For some reason, nontechnical people tend to think that everything with buttons can be fixed by a programmer…


Thursday, September 17, 2009

MySpace OpenSourcing its Recommendation Engine

Qizmt, an internally developed recommendation framework, was created by the Data Mining team at MySpace. You can see it in action right now with the “People You May Know” feature. MySpace plans to roll it out to other areas of the site for recommendations soon, and to open-source the technology for anyone to use. They made the announcement today at the Computerworld Conference in Chicago.
What makes Qizmt unique is that it was developed using C#.NET specifically for Windows platforms. This extends the rapid development nature of the .NET environment to the world of large scale data crunching and enables .NET developers to easily leverage their skill set to write MapReduce functions. Not only is Qizmt easy to use but based on our internal benchmarks we have shown its processing speeds to be competitive with the leading MapReduce open source projects on a lesser number of cores.
MySpace says it has published the code for Qizmt today. They also note that they have recently open-sourced MSFast, a service they built to help developers track page load performance. Rival Facebook has been doing a bit of its own open-sourcing recently. Last week, they opened up Tornado, the platform that help to power FriendFeed, which they recently acquired.

Hierarchy of caches for high performance and high capacity memcached

This is an idea I've been kicking around for a while and wanted some feedback. Memcached does an amazing job as it is but there's always room for improvement.
Two areas where memcached could be improved are local acceleration and large capacity support (terabyte range). I believe this could be done through a 'hierarchy of caches': a local in-process cache used to buffer the normal memcached, and a disk-based memcached backed by Berkeley DB providing large capacity.

The infrastructure would look like this:
in-process memcached -> normal memcached -> disk memcached

The in-process memcached would not be configured to access a larger memcached cluster. Clients would not use the network to get() objects and it would only take up a small amount of memory on the local machine. Objects would not serialize themselves before they're stored so this should act like an ultrafast LRU cache as if it were a native caching system within the VM.
Since it's local it should be MUCH faster than the current memcached.
Here are some benchmarks of APC-cache vs memcached.
http://www.mysqlperformanceblog.com/2006/09/27/apc-or-memcached/
http://www.mysqlperformanceblog.com/2006/08/09/cache-performance-comparison/

Long story short, a local cache can be 4-8x faster than normal memcached. The local in-process cache would be available on every node within the cluster and act as an L1 cache for ultra-fast access to a small number of objects. I'm not sure all languages would support this type of cache because it would require access to and storage of object pointers. I believe you can do this with Java by hacking JNI pointers directly but I'm not 100% certain.

This cache would be configured to buffer a normal memcached cluster. We're all familiar with this type of behavior so I won't explain it any further.

The third component in the hierarchy is a distributed memcached daemon which uses Berkeley DB (or another type of persistent hashtable) for storage instead of the normal slab allocator. While this might seem like blasphemy to a number of you, I think it could be useful to people with ultra-large caching requirements (hundreds of gigs) who can't afford the required memory.

There's already a prototype implementation of this in Tugela Cache: http://meta.wikimedia.org/wiki/Tugelacache

For optimal performance the memcached driver would have to do parallel and concurrent getMulti requests so that each disk in the system can seek at the same time. There are a number of memcached implementations (including the Java impl which I've contributed to) which fetch in serial. Since memcached is amazingly fast this really hasn't shown up in any of my benchmarks but this would really hinder a disk-backed memcached.
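To make the serial vs. parallel point concrete, here is a rough sketch of a fan-out getMulti in Python. Everything in it is an assumption for illustration: FakeServer stands in for a real memcached connection, server_index is a naive modulo hash rather than any real client's sharding scheme, and parallel_get_multi is a made-up name. The only idea it shows is grouping keys by server and issuing one batch per server at the same time.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class FakeServer:
    """Hypothetical stand-in for one memcached node; a real client would do network I/O."""

    def __init__(self, data):
        self.data = dict(data)

    def get_multi(self, keys):
        # Return only the keys this node actually holds.
        return {k: self.data[k] for k in keys if k in self.data}


def server_index(key, server_count):
    # Naive modulo sharding on a stable hash (an assumption, not a real client's scheme).
    return zlib.crc32(key.encode("utf-8")) % server_count


def parallel_get_multi(servers, keys):
    # Group the requested keys by the server that owns them.
    batches = {}
    for key in keys:
        batches.setdefault(server_index(key, len(servers)), []).append(key)

    # Issue one get_multi per server concurrently instead of one after another,
    # so every disk behind every server can seek at the same time.
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(batches))) as pool:
        futures = [pool.submit(servers[i].get_multi, batch)
                   for i, batch in batches.items()]
        for future in futures:
            results.update(future.result())
    return results
```

With an in-memory backend the threads buy you nothing, which matches what I see in benchmarks today. The difference should only show up once each get_multi call can block on a disk seek.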

A disk-backed memcached would provide ultra-high capacity, and since the disk seeks are distributed over a large number of disks you can just add spindles to get higher throughput. This system would also NOT suffer from disk hotspots since memcached and the local in-process memcached would buffer the disk backend.
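Here is a minimal sketch of how reads and writes could flow through the three tiers. It is not memcached code: LocalLRU, DictTier and TieredCache are hypothetical names, the two slower tiers are plain dicts standing in for the network and disk backends, and write-through plus promote-on-hit are my assumptions about the policy rather than anything settled.

```python
from collections import OrderedDict


class LocalLRU:
    """Small in-process LRU cache (the proposed L1 tier)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


class DictTier:
    """Stand-in for a remote tier (normal memcached or the disk-backed one)."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return self.items.get(key)

    def set(self, key, value):
        self.items[key] = value


class TieredCache:
    def __init__(self, tiers):
        self.tiers = tiers  # ordered fastest to slowest

    def get(self, key):
        for depth, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                # Promote the hit into every faster tier so the next read is cheaper.
                for faster in self.tiers[:depth]:
                    faster.set(key, value)
                return value
        return None

    def set(self, key, value):
        # Write through to all tiers so the slowest tier holds the full data set.
        for tier in self.tiers:
            tier.set(key, value)
```

Something like TieredCache([LocalLRU(1000), DictTier(), DictTier()]) gives the in-process -> normal -> disk chain. A read that misses the first two tiers falls through to the last one and gets copied back up, which is what keeps hot keys off the disks.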

From a non-theoretical perspective the local cache could be skipped or replaced with a native LRU cache. These are problematic though due to memory fragmentation and garbage collection issues. I use a local LRU cache for Java but if I configure it to store too many objects I can run out of memory. It also won't be able to reach the 85% capacity we're seeing with the new memcached patches.

I might also note that since it's now backed by a persistent disk backend one could use Memcached as a large distributed hashtable similar to Bigtable.

Some people have commented that a disk-backed memcached would basically be MySQL. The only thing it would have in common is the use of a disk for storage. A disk-backed memcached would scale much better than MySQL and Berkeley DB simply because you can keep adding more servers and your reads and writes will scale to use the new capacity of the cluster. One disk vs N disks. MySQL replication doesn't help because you can't scale out the writes. You can read my post on MySQL replication vs NDB if you'd like a longer explanation.
This would be closer to Bigtable, S3, or MySQL Cluster (different than normal MySQL) but be here today and much simpler. It wouldn't support SQL of course because it would be a dictionary/map interface but this model is working very well for Bigtable and S3 users. To make it practical it would have to support functionality similar to Bigtable including runtime repartitioning.

The core theory of MySQL replication scaling is bankrupt. The idea works in practice because people are able to cheat (cluster partitioning) and make somewhat large installs by selecting the right hardware and tuning their application. The theory behind a clustered DB is that most of the complexity behind writing a scalable application can be removed if you don't have to out-think your backend database (Google goes into this in their GFS and Bigtable papers).

The scalability of MySQL replication is essentially:
N * Tps * (1 - wf)
where:
N is the number of machines in your cluster
Tps is the raw number of disk transactions per second per machine
wf is the 'write factor', the fraction of your transactions that are writes (INSERT/DELETE/UPDATE), from 0 to 1.0.
You'll note that in the above equation if wf = 1.0 then you've hit your scalability wall and can't execute any more transactions. If wf = 0.0 you're performing only SELECT operations and the cluster actually scales fairly well.
The reason MySQL replication has worked to date is that most real-world write factors are about 0.25 (25%), so most people are able to scale out their reads.
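As a quick sanity check on the numbers, here is a small Python sketch. It assumes the replication capacity is N * Tps * (1 - wf), which matches the two limiting cases described above (nothing left at wf = 1.0, full read scale-out at wf = 0.0). The function name is made up for illustration.

```python
def replication_read_capacity(n, tps, wf):
    """Read transactions per second left across n replicas.

    Assumes every write is replayed on every machine, so each box spends
    tps * wf of its budget on writes and only tps * (1 - wf) on reads.
    """
    if not 0.0 <= wf <= 1.0:
        raise ValueError("wf must be between 0 and 1")
    return n * tps * (1.0 - wf)
```

With 10 machines at 1000 Tps each and a write factor of 0.25, replication_read_capacity(10, 1000, 0.25) gives 7500.0. Push the write factor to 1.0 and it drops to zero no matter how many boxes you add.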

If you're running a cluster DB your scalability order is:
N * Tps * qf + ((N * Tps * wf)/2)

where qf is your query factor, the fraction of your transactions that are queries (SELECTs), so qf = 1 - wf.
This is much more scalable. Basically this means that you can have as many transactions as you have boxes, but writes perform on only 1/2 the boxes due to fault-tolerance requirements. In this situation you can even scale your writes (though at a 50% rate). MySQL Cluster solves this problem, as do Bigtable and essentially S3. Bigtable suffers from the lack of a solid open source implementation. S3 has a high latency problem since it's hosted on Amazon servers (though you can bypass this by running on EC2). My distributed memcached backed by a persistent hashtable would have the advantage of scaling like the big boys (Bigtable, MySQL Cluster, S3) while being simple to maintain and supporting caching internally. MySQL Cluster doesn't do a very good job of caching at present and uses a page cache which is highly inefficient.
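The cluster formula can be checked the same way. This sketch treats qf as the fraction of transactions that are reads, i.e. qf = 1 - wf, which is my assumption for making the units line up. The function name is hypothetical.

```python
def cluster_capacity(n, tps, wf):
    """Transactions per second for a cluster DB: N * Tps * qf + (N * Tps * wf) / 2."""
    if not 0.0 <= wf <= 1.0:
        raise ValueError("wf must be between 0 and 1")
    qf = 1.0 - wf                     # assumed: reads are whatever is not a write
    reads = n * tps * qf              # reads scale across every box
    writes = (n * tps * wf) / 2.0     # each write lands on two boxes for fault tolerance
    return reads + writes
```

For 10 machines at 1000 Tps and wf = 0.25 this gives 8750.0, and even an all-write workload (wf = 1.0) still gets 5000.0, which is the "writes scale at 50%" behavior.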