Tuesday, June 28, 2011

The promise of P2P

I was falling asleep at my cubicle (as I do every day after lunch; I try so hard to stay awake but I can't help it) and I started thinking about the cloud and P2P (as I do every day, because I am a nerd that way), and about how great it is that I get to work on designing P2P systems. I'm currently working on a distributed microblogging project in Python, and it's exciting to think about the dream that (if implemented properly, which is way harder than it sounds) millions of people could use a service run entirely on their own machines.

As we become more reliant on the cloud for all of our needs (and I am guilty of it too; I store everything I have on Google's servers, though I try not to become dependent on any proprietary protocols), we forget about the power of Gnutella (one of the most resilient distributed services out there; major forces have been trying to shut it down for years and it's still millions of users strong to this day), or services like Freenet (which works pretty well, by the way, and has been used to publish WikiLeaks documents). We forget that if we can connect millions of our machines together, we can create services as powerful as Google's. Imagine a P2P search engine: nodes crawl the web and build an index, and when you do a search, you're actually sending the query to other nodes in the P2P system and getting the results back from them.
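A toy version of that search idea can be sketched in Python (the same language I'm using for the microblogging project). Everything here is a made-up illustration, not code from any real system: each node keeps a local keyword index, keywords are assigned to an owner node by hashing (roughly what a DHT would do, minus routing, failures, and all the hard parts), and a query goes straight to the owner.

```python
import hashlib

class Node:
    """A peer that stores part of the global keyword index."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.index = {}  # keyword -> set of URLs known to this node

def owner(keyword, nodes):
    """Pick the node responsible for a keyword via a stable hash."""
    h = int(hashlib.sha1(keyword.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

def publish(url, keywords, nodes):
    """A crawling node pushes each (keyword, url) pair to the keyword's owner."""
    for kw in keywords:
        owner(kw, nodes).index.setdefault(kw, set()).add(url)

def search(keyword, nodes):
    """A query is answered by the single peer that owns the keyword."""
    return owner(keyword, nodes).index.get(keyword, set())

nodes = [Node(i) for i in range(4)]
publish("http://example.com/p2p", ["p2p", "gnutella"], nodes)
publish("http://example.com/dht", ["p2p", "dht"], nodes)
# search("p2p", nodes) now returns the set containing both URLs,
# even though the caller has no idea which peer actually stored them.
```

The point of the sketch is the routing trick: no node holds the whole index, but any node can compute who holds a given keyword.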

The irony in all of this is that the major companies are doing P2P internally. Google, for example, uses hundreds of thousands of commodity computers, not much more powerful than your day-to-day laptop, to crawl the web. They build in redundancy and expect failures, just like P2P designers do. The only difference is that they have more control over the infrastructure: they can assure that all nodes have connectivity to each other (which is hard to guarantee in a real P2P system), and they can make sure nodes are online all the time (which makes things easier, because churn is one of the biggest pains in the ass to deal with in P2P). A lot of the ideas Google and Facebook use to make their services reliable and resilient are lessons learned from P2P research.

The sad thing is that P2P research seems to be dying. Everyone is so excited about social networking and cloud computing that no one is spending time figuring out algorithms that would let people communicate and create services for each other without depending on third-party servers. Every day we complain about privacy, but if we don't learn to use our own resources for our own services, and instead depend on companies to provide them, those companies will have no choice but to sell our information to make money. Nothing in this world is free: Google and Facebook could not survive without making money, and the main way they make money is by giving our information to advertisers. I'm not blaming the companies; it's just business on their end. But if we are not willing to consider truly P2P services, and we still want the great user experiences that Google and Facebook provide, then we have to accept that we are giving up our privacy to get them. There is no free lunch, as they say. Anyways, P2P rules, P2P is everywhere, and if you really want to know how real systems work at scale, learn about P2P.
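To give a feel for how redundancy copes with churn, here's a minimal sketch with invented names (`Cluster`, `put`/`get`, `REPLICAS` are mine, not from any real system): each value is stored on three peers chosen by ranking node IDs with a per-key hash (rendezvous-style), so the data survives when one of its owners drops out.

```python
import hashlib

REPLICAS = 3  # how many peers hold a copy of each value

def replica_nodes(key, node_ids):
    """Rank nodes by a hash of (key, node); the top REPLICAS own the key.
    Removing a node never reorders the survivors, so keys stay findable."""
    ranked = sorted(node_ids,
                    key=lambda n: hashlib.sha1(f"{key}:{n}".encode()).hexdigest())
    return ranked[:REPLICAS]

class Cluster:
    def __init__(self, node_ids):
        self.stores = {n: {} for n in node_ids}  # node -> local key/value store

    def put(self, key, value):
        for n in replica_nodes(key, list(self.stores)):
            self.stores[n][key] = value

    def fail(self, node_id):
        del self.stores[node_id]  # a peer leaves without warning (churn)

    def get(self, key):
        for n in replica_nodes(key, list(self.stores)):
            if key in self.stores.get(n, {}):
                return self.stores[n][key]
        return None

cluster = Cluster(range(5))
cluster.put("status:alice", "hello p2p")
# Kill one of the three replicas; the value is still reachable from the others.
cluster.fail(replica_nodes("status:alice", list(cluster.stores))[0])
```

Real systems then re-replicate onto a new third peer in the background; this sketch only shows why losing one node out of three isn't fatal.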
As always, booooooooooommmmmm.
