Now this is a great white paper... Web Squared: Web 2.0 Five Years On.
We liked it so much that we felt compelled to comment on specific bits of the paper.
All text below marked with “JB” is taken from the Web Squared paper. Each item marked with “W” is a Wowd-oriented response to a specific Web Squared observation.
JB: Chief among our insights was that “the network as platform” means far more than just offering old applications via the network (“software as a service”); it means building applications that literally get better the more people use them, harnessing network effects not only to acquire users, but also to learn from them and build on their contributions.
W: Wowd employs user input to build the index, and to drive ranking calculations too. Wowd results are valuable when the number of users, N, is small, but get ever more valuable as N grows. Old-school architectures are like skyscrapers -- they get weaker as they get larger. Wowd is like a geodesic dome: strong when it’s small, and even stronger when it’s large.
JB: Web 2.0 is all about harnessing collective intelligence.
W: Wowd harnesses collective intelligence for search (when you know what you want), discovery (when you want to know what’s hot), and recommendation (when you want ideas about other stuff you might like).
JB: The Web is no longer a collection of static pages of HTML that describe something in the world.
W: Which means that classic web crawlers miss information. The hidden web, the database-backed web, the AJAX-fronted web… all of these sorts of “pages” defeat classic web crawlers. You simply can’t “crawl an application”, and that’s what the web has become -- a great big, worldwide application!
JB: In 1998, Larry Page and Sergey Brin had a breakthrough, realizing that links were not merely a way of finding new content, but of ranking it and connecting it to a more sophisticated natural language grammar. In essence, every link became a vote, and votes from knowledgeable people (as measured by the number and quality of people who in turn vote for them) count more than others.
W: Well… links were votes. But no longer. Why? There are simply too many links, and people don’t follow them in the way that the PageRank math describes. (The classic PageRank model assumes that a surfer leaving a page picks among its outgoing links with uniform probability. Real readers don’t behave that way.) The mere existence of a link now means almost nothing. Link structure has been thoroughly gamed -- consider the problem of link spam. And what about applications? They generate links dynamically, often where a crawler never sees them. The old PageRank math is wrong for the modern link-happy and link-invisible world.
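To make the uniform-exit point concrete, here is a minimal sketch in Python. It is a textbook-style power iteration, not Wowd's ranking code and not Google's production algorithm. The three-page graph, the click counts, and the names `pagerank`, `links`, `uniform`, and `clicks` are all made up for illustration. The only thing that changes between the two runs is the exit distribution: equal weights per link in one, hypothetical observed clicks in the other.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Power iteration over a weighted link graph.

    weights[src][dst] is the relative chance of leaving src via its link to dst.
    Simplification: every page is assumed to have at least one outgoing link.
    """
    pages = list(weights)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page gets the same small "random jump" share.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for src, out in weights.items():
            total = sum(out.values())
            for dst, w in out.items():
                # Pass on src's rank in proportion to the exit weight.
                new_rank[dst] += damping * rank[src] * w / total
        rank = new_rank
    return rank


# Hypothetical graph: each page links to the other two.
links = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

# Classic assumption: every outgoing link is equally likely to be followed.
uniform = {src: {dst: 1.0 for dst in dsts} for src, dsts in links.items()}

# Made-up click counts: visitors on "a" and "b" mostly choose the link to "c".
clicks = {"a": {"b": 1, "c": 9}, "b": {"a": 1, "c": 9}, "c": {"a": 5, "b": 5}}

uniform_rank = pagerank(uniform)
click_rank = pagerank(clicks)

for page in links:
    print(page, round(uniform_rank[page], 3), round(click_rank[page], 3))
```

With uniform exits, the link structure alone makes the three pages indistinguishable. Once the exit weights reflect where the (invented) visitors actually go, page "c" pulls ahead even though not a single link has changed.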
JB: As it becomes more conversational, search has also gotten faster. Blogging added tens of millions of sites that needed to be crawled daily or even hourly, but microblogging requires instantaneous update – which means a significant shift in both infrastructure and approach. Anyone who searches Twitter on a trending topic has to be struck by the message: "See what’s happening right now" followed, a few moments later by "42 more results since you started searching. Refresh to see them."
W: Real time is required, but hard to do at scale with centralized monolithic architectures. Existing “real time search” providers achieve speed at the expense of coverage. What good is a real-time result if coverage extends to just two or three sites? Wowd delivers information in real time by using a distributed "cloud" architecture. (An instance of what Steve Jurvetson talks about in this YouTube video.)
JB: Real-time search encourages real-time response. Retweeted "information cascades" spread breaking news across Twitter in moments, making it the earliest source for many people to learn about what’s just happened. And again, this is just the beginning. With services like Twitter and Facebook’s status updates, a new data source has been added to the Web – realtime indications of what is on our collective mind.
W: Wowd employs the Attention Frontier of a group of people. So an information cascade -- or an attention cascade -- is exactly what drives Wowd indexing and crawling (it’s a citizen-based crawler). See Boris Agapiev's blog post on this for more information: http://distributedsearch.blogspot.com/2009/05/attention-frontier.html
Would personalization (my specific need) and localization (my area) be built in or layered on top?
Posted by: Ravi | August 05, 2009 at 03:28 AM
Personalization is built-in. That's central to the value prop, and to our entire approach.
We're not currently offering much by way of localization (to geography, or language, etc.) but plan to do more in this area in the future.
Posted by: Mark Drummond | August 05, 2009 at 09:43 AM