title


Title: A Survey of Advanced Search in P2P Networks

Author: Alok Prakash
email:  aprakash@cs.kent.edu, homepage: http://www.cs.kent.edu/~aprakash

Prepared for Prof. Javed I. Khan
Department of Computer Science, Kent State University
Date: May 2006



Abstract: As Peer-to-Peer (P2P) networks grow in popularity, the ease, accuracy and efficiency of search for resources stored in the network has become one of the determining factors in the adaptation of any P2P network. Different P2P networks use different overlay schemes and thus a same model of search is not going to be efficient in all the P2P networks. Some of the search techniques are very simple and straightforward while others employ complex techniques, all leading to different levels of efficiency. In this survey paper, I will be focusing on some of the advance search techniques used in P2P network and discuss their working, suitability and possible advantages and disadvantages.

[Keywords]: Peer-to-Peer Search, Range Query, Random Walks, Percolation Search, Hierarchical DHT



Other Surveys on Internetwork-based Applications
Back to Javed I. Khan's Home Page


Table of Contents:




Introduction:

A P2P network is a network of peers where, ideally, there is no central authority. Peers can join and leave the network at any time, at their will. All the peers in the network contribute resources, in form of storage space, bandwidth etc. It is a network based on collaboration, so each peer stores some data. The distribution of data can be random, thus the data stored in a P2P network is usually scattered across a large number of peers. This architecture demands a very efficient search technique for the retrieval of data from the P2P networks.


There are many search techniques available to search for data in P2P networks. These search techniques could be classified in many ways. We will be classifying the search techniques into two groups -




Some Background Information

Here we discuss some of the concepts that would help understanding the techniques discussed in this survey better.

Random Walk

Percolation Theory

Hierarchical DHT

Range Query




Advanced search techniques for P2P networks


1. Random Walk



2. Percolation Search



3. Range Query




Hierarchical DHT




Comparison of Features

Conventional Flooding

Random Walk

Percolation Search

HDHT

Range Query

Performed on unstructured P2P networks

Performed on unstructured P2P networks

Performed on unstructured P2P networks

Performed on DHT P2P networks

Performed on DHT P2P networks

Works same with any network

Works best with multiple queries and peer clustering

Works best with Power Law networks

Works same with all the DHTs

Works best when semantic proximity of keys is maintained

On average search is done in k * N time (k = avg degree of nodes, N = total number of nodes)

On average search is done in log time

On average search is done in log time

On average search is done in log time

On average search is done in log time

Returns single result

Returns single result

Returns single result

Returns single result

Returns a set of results

Very wasteful of resources, as every peer processes each query

Less taxing on resources

Much taxing on resources as it requires lot of pre-search preparation steps and extra storage

Not too taxing on resources

Min-max algorithm uses resources wisely

Is not very fast, as every peer processes each query

Result are reasonably fast

Search should be very fast, as many peers process the query simultaneously

Routing is very fast

Shower algorithm is very fast

Works same for all cases

Not suitable for all cases

Not suitable for all cases

Works same for all cases

Not suitable for all cases




Summary

We surveyed some of the advanced search techniques for P2P networks in this paper. The techniques discussed here are suitable in certain type of networks and work best when certain conditions are met. Outside these conditions these technique might not provide any improvement in search process. So these search techniques should not be only means of search. They should be used along with other more general purpose search techniques for the best results. In summary, the ideal usage should be -

HDHTs are a useful architecture when the search would be done in a big network with a large number of peers and where scalability is an issue.




References

Research Papers for More Information on This Topic

Other Relevant Links




Scope of Survey

Search was conducted on different IEEE digital libraries like the IEEE Computer Society and IEEE Communication Society. Search was also conducted on the ACM and CIDR digital libraries. Papers related to the keywords were the primary reference for this survey. Other papers listed in the reference section were reviewed because of their presentation of the important technical issues related to the keywords.