A Search Engine Architecture Based on Collection Selection

Google Tech Talks
December 19, 2007

ABSTRACT

We present a distributed architecture for a Web search engine, based on the concept of collection selection. We introduce a novel approach to partitioning the collection of documents, able to greatly improve the effectiveness of standard collection selection techniques (CORI), and a new selection function outperforming the state of the art. Our technique is based on the novel query-vector (QV) document model, built from the analysis of query logs, and on our strategy of co-clustering queries and documents at the same time.

By suitably partitioning the documents in the collection, our system is able to select the subset of servers containing the most relevant documents for each query. Instead of broadcasting the query to every server in the computing platform, only the most relevant servers are polled, which reduces the average computing cost of solving a query.

We introduce a novel strategy that uses the instant load at each server to drive the query routing. We also describe a new approach to caching, able to incrementally improve the quality of the stored results. Our caching strategy is effective both in reducing computing load and in improving result quality.

The proposed architecture, overall, presents a trade-off between computing cost and result quality, and we show how to guarantee very precise results in the face of a dramatic reduction in computing load. This means that, with the same computing infrastructure, our system can serve more users, more queries and more documents.

Speaker: Diego Puppin
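
The routing idea described in the abstract (poll only the servers most likely to hold relevant documents, and take each server's current load into account) can be illustrated with a small sketch. This is not the speaker's selection function and it is not CORI: the scoring here is a plain sum of per-term weights, and the names `score_servers`, `select_servers`, `server_profiles`, `load` and `load_penalty` are all hypothetical, chosen only for illustration.

```python
def score_servers(query_terms, server_profiles):
    """Score each server by summing the weights its profile assigns to the query terms.

    server_profiles maps a server name to a {term: weight} dict. This is an
    assumed, simplified stand-in for a real collection selection function.
    """
    return {
        server: sum(profile.get(term, 0.0) for term in query_terms)
        for server, profile in server_profiles.items()
    }


def select_servers(query_terms, server_profiles, load, k=2, load_penalty=0.5):
    """Return the k servers to poll, preferring relevant and lightly loaded ones.

    load maps a server name to its current load (any non-negative number).
    The linear penalty is an assumption made for this sketch.
    """
    scores = score_servers(query_terms, server_profiles)
    adjusted = {s: scores[s] - load_penalty * load.get(s, 0.0) for s in scores}
    # Sort by adjusted score (highest first); break ties by server name.
    ranked = sorted(adjusted, key=lambda s: (-adjusted[s], s))
    return ranked[:k]


if __name__ == "__main__":
    profiles = {
        "s1": {"venice": 3.0, "hotel": 1.0},
        "s2": {"poker": 4.0},
        "s3": {"venice": 1.0, "beach": 2.0},
    }
    idle = {"s1": 0.0, "s2": 0.0, "s3": 0.0}
    print(select_servers(["venice", "beach"], profiles, idle, k=2))
```

Instead of broadcasting the query to all three servers, only the selected ones would be polled. Raising the load of a server in `load` lowers its adjusted score, so a busy server can be skipped even when its profile matches the query.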

Channel: Science & Technology
Uploaded: January 4, 2008 at 10:12 am
Author: googletechtalks

Length: 33:01
Rating: 4.60
Views: 4543

Tags: education  engedu  google  googletechtalks  talk  talks  techtalk  techtalks  

Video Comments

vicaya (January 7, 2008 at 10:59 am)
Sorry, this strategy doesn't work well with long-tail and personalized search load. The indexing cost (I'd consider cluster selection an indexing phase) is much higher as well. For aggregate performance, a much simpler caching strategy (multiple document partitions, e.g. for different types or languages, plus a pre-computed or trained distributed query cache) can be built that matches or outperforms this complicated solution.
wildchildplasma (January 5, 2008 at 7:43 am)
The cruising capabilities of active data clouds, you mean? One day it'll know the kind of stuff I want and I won't even have to make entries all the time. (Standard unified ratings data.) I'll also be able to talk to a bot which will adapt its data personality so as to know me better.
