Table of contents
Prabhakar Raghavan, Verity
Graphs on the web
Plan for this talk
The web digraph
Distribution of in-degrees
How many links on a page?
Search by link analysis (Kleinberg)
The hope
Root and base sets
Visualization
Assembling the base set
Distilling hubs and authorities
Iterative update
Scaling
How many iterations?
Japan Elementary Schools
Things to note
Co-citation: signature of a community
Insights from hubs
Communities from cores
Random graphs inspiration
Approach
Finding cores
Initial data & preprocessing
Simple iterative pruning
Elimination/generation pruning
Results after pruning
Results for cores
Sample cores
From cores to communities
Using sample hubs/authorities
Costa Rican hotels and travel
Muslim student orgs.
Modeling the web as a random graph
Why study models?
Content-creation hypothesis
Two desirable model features
Model details
Results
Results
Gnutella
What is the Gnutella graph?
Serving search requests
Which hosts to connect to?
The price of P2P
How long do hosts live?
PPT slide
Model
Protocol overview
Protocol
Cache node replacement rule
Preferred neighbors
Main results
Protocol features
Connectivity argument
Connectivity argument
Diameter argument
P2P search issues
Web anatomy
Web snapshots
Algorithms
Challenges from scale
Scale
May 1999 crawl
Tentative picture
Breadth-first search (BFS)
BFS experiment
Reachability
Net of BFS experiments
Interpreting BFS expts
Web anatomy