gzz-commits
[Top][All Lists]

 From: Hermanni Hyytiälä Subject: [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... Date: Thu, 20 Mar 2003 06:51:01 -0500

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Hermanni Hyytiälä <address@hidden>      03/03/20 06:51:01

Modified files:

Log message:

CVSWeb URLs:

Patches:
20 04:06:59 2003
06:51:01 2003
@@ -106,7 +106,7 @@

In this thesis, we evaluate existing Peer-to-Peer approaches and
evaluate them to Fenfire's needs. We start by reviewing existing Peer-to-Peer
approaches,
-algorithms and their key properties. We emphasize that despite the great
amount of proposed
+algorithms and their key properties. Our insight is that despite the great
amount of proposed
Peer-to-Peer systems, we are able to classify \emph{all} systems either to
loosely or
tightly structured approach. We also discuss open problems in
Peer-to-Peer systems and divide problems into three sub-categories: security,
performance, and miscellaneous
@@ -170,8 +170,8 @@
systems fall: the loosely structured approach and the tightly structured
approach. In the loosely
structured approach the construction and the maintenance of the overlay is
controlled
loosely. This approach gives freedom for participating peers
-to perform certain tasks in a Peer-to-Peer network. On the other hand, the
tightly structured
-approach the overlay is constructed determistically, which all participating
peers have to follow.
+to perform certain tasks in a Peer-to-Peer network. On the other hand in the
tightly structured
+approach, the overlay is constructed determistically, which all participating
peers have to follow.

\section{Centralized}
@@ -311,11 +311,11 @@
Pastry \cite{rowston01pastry}, SWAN \cite{bonsma02swan}, Tapestry
\cite{zhao01tapestry}
and Viceroy \cite{malkhi02viceroy} use a circular identifier space of $n$-bit
integers modulo $2^{n}$. The
value of $n$ varies among systems. Again, CAN \cite{ratnasamy01can} uses a
$d$-dimensional Cartesian
-model to implement identifier space.
+model to implement the identifier space.

There are three higher level abstractions which tightly structured overlays
provide
-\cite{zhao03api}. Each of these abstractions fulfill a storage layer in an
overlay, but
-they have semantical differences in the \emph{usage} of overlay. First,
Distributed Hash
+\cite{zhao03api}. Each of these abstractions fulfill a storage layer in the
overlay, but
+have semantical differences in the \emph{usage} of the overlay. First,
Distributed Hash
Table (DHT) (see e.g., \cite{dabek01widearea}, \cite{rowstron01storage}),
implements three operations: \texttt{lookup(key)}, \texttt{remove(key)} and
\texttt{insert(key)}. As the name suggests, DHT implements the same
functionality
@@ -325,14 +325,14 @@
Object Location (DOLR) (see e.g., \cite{kubiatowicz00oceanstore},
\cite{iyer02squirrel}) is a distributed
directory service. DOLR stores \emph{pointers} to data items throughout the
overlay. DOLR's main
operations are \texttt{publish(key)}, \texttt{removePublished(key)} and
\texttt{sendToObject(key)}. The key
-difference between DHT and DOLR abstraction is that DOLR routes overlay's
messages
-to nearest available peer, hosting a specific data item. This form of locality
+difference between the DHT and the DOLR abstraction is that the DOLR
abstraction routes overlay's messages
+to a nearest available peer, hosting a specific data item. This form of
locality
is not supported by DHT. Finally, tightly structured overlay can be used for
-scalable group multicast/any cast operations (CAST) (see e.g.,
\cite{zhuang01bayeux}).
+scalable group multicast or anycast operations (CAST) (see e.g.,
\cite{zhuang01bayeux}).
The basic operations are \texttt{join(groupIdentifier)},
\texttt{leave(groupIdentifier)},
\texttt{multicast(message, groupIdentifier)},  \texttt{anycast(message,
groupIdentifier)}.
Participating peers may join and leave the group and send multicast messages to
-the group, or anycast message to a specific member of the group. DOLR and CAST
abstractions
+the group, or anycast message to a specific member of the group. The DOLR and
the CAST abstractions
have in common that they both use network proximity techniques
to optimize their operations in the overlay. Figure
\ref{fig:Strucutred_lookup_using_DOLR_model}
presents the DOLR abstraction.
@@ -356,20 +356,21 @@
for tightly structured overlays which have to be addressed in order
to perform efficient data lookups in tightly structured overlays.
First, mapping of keys to peers must be done in a load-balanced
-way. Second, the overlay must be able to forward a lookup for a
-specific key to an appropriate peer. Third, overlay must have
-support for a efficient distance function. Finally,  routing tables for each
peer
+way. Second, the overlay must be able to forward a data lookup for a
+specific key to an appropriate peer. Third, overlay must
+support efficient distance function. Finally,  routing tables for each peer
must be constructed and maintained adaptively.

To store data into a tightly structured overlay, each application-specific
unique key (e.g., SHA-1 \cite{fips-sha-1}) is \emph{mapped} uniformly (e.g.,
using consistent
hashing \cite{258660}) by the overlay to an existing peer in the overlay.
Thus, tightly
structured overlay assigns a subset of all possible keys to every
participating peer.
-We say that a peer is \emph{responsible} for the keys which are assigned by
the overlay.
-Also, each peer in tightly structured overlay maintains a \emph{routing
table}, which
+We say that a peer is \emph{responsible} for the keys which are assigned by
the overlay.
+Figure \ref{fig:structured_hashing} illustrates the
+process of data to key mapping in a tightly structured overlay.
+Also, each peer in the tightly structured overlay maintains a \emph{routing
table}, which
consists of identifiers and IP addresses of other peers in the overlay.
Entries of the routing
-table represents peer's neighbors in the overlay network. Figure
\ref{fig:structured_hashing} illustrates the
-process of data to key mapping in a tightly structured overlay.
+table represent peer's neighbors in the overlay network.

\begin{figure}
\centering
@@ -381,8 +382,8 @@
Currently, all proposed tightly structured overlays provide at least
poly--logarithmical data lookup operations. However, there are some key
differences in the data structure that they use as a routing table. For
example, Chord
-\cite{stoica01chord}, Skip graphs \cite{AspnesS2003} and SkipNet
\cite{harvey03skipnet2} maintain a local
-data structure which resembles Skip lists \cite{78977}.
+\cite{stoica01chord}, Skip graphs \cite{AspnesS2003} and SkipNet
\cite{harvey03skipnet2} maintain a
+distributed data structure which resembles Skip lists \cite{78977}.
In figure \ref{fig:structured_query}, we present an overview of Chord's lookup
process.
On the right side of Chord's lookup process, the same data lookup process
is shown as a binary-tree abstraction.  It can be noticed, that in each step,
the distance
@@ -396,7 +397,7 @@
\end{figure}

Tapestry
-\cite{zhao01tapestry} uses balanced $k$-trees as routing table's data
structure. Figure
+\cite{zhao01tapestry} uses balanced $k$-trees to implement the overlay. Figure
data lookup. Viceroy \cite{malkhi02viceroy} maintains a butterfly data
structure (e.g., \cite{226658}),
which requires only a constant number of neighbor peers while providing
$O(\log{n})$ data lookup
@@ -428,16 +429,16 @@
function is both unidirectional and symmetric. Moreover, Kademlia's
XOR-based metric doesn't need stabilization (like in Chord
(like in Pastry \cite{rowston01pastry}) \cite{balakrishanarticle03lookupp2p}.
-However, in all previous schemes each hop in the overlay shortens the distance
between
+However, in all above schemes each hop in the overlay shortens the distance
between
current peer working with the data lookup and the key which was looked up in
the identifier space.

Skip Graphs \cite{AspnesS2003} and SWAN \cite{bonsma02swan} employ a key space
very similar to a tightly structured
overlay, but in which queries are routed  to \emph{keys}. In these systems
a peer occupies several positions in the identifier space, one for each
application-specific key. The indirection of placing close keys in the
-custody of a storing peer is removed at the cost of each peer maintaining one
-''resource peer'' in the overlay network for each data item it publishes.
Provider peer is the peer
-in the overlay which is responsible for the assigned keys
+custody of a provider peer is removed at the cost of each peer maintaining one
+''resource peer'' in the overlay network for each data item it publishes. The
provider peer is a peer
+which has initially published services into the overlay.

PeerNet \cite{eriksson03peernet} differs from other tightly structured
overlays in that it operates
at the \emph{network} layer. PeerNet makes an explicit distinction
@@ -451,7 +452,7 @@
\subsection{Sketch of a formal definition}

In this subsection, we formalize the main features of tightly structured
overlay, i.e.,
-identifiers, identifier space and mapping function.
+identifiers, identifier space and the mapping function.

Let $S$ be the aggregate of all services $s$ in the system. Let $P$ be the
aggregate of
all peers $p$ in system. Let $I$ be the aggregate of all identifiers $i$ in
system.
@@ -483,18 +484,18 @@

The most important difference between approaches is performance and
scalability properties. Generally
-tightly structured systems can perform all internal operations in
poly-logarithmic time\footnote{However, it is unknown
+tightly structured systems can perform all internal operations in a
poly-logarithmic time\footnote{However, it is unknown
whether all proposed algorithms can preserve logarithmic properties in
real-life applications or not.}
-while the performance of loosely structured systems is not always even linear,
.
+while the performance of loosely structured systems is not always even linear.
Moreover, loosely structured systems scale to millions of peers, whereas
tightly structured systems are able
to cope with billions of concurrent peers \cite{osokine02distnetworks},
\cite{kubiatowicz00oceanstore}.

-To end user, biggest difference between these systems is how data lookups are
performed. Loosely
-structured systems provide a more rich and user friendly way of searching data
as they
-have support for keyword search than tightly structured systems. On the other
hand, tightly structured
+To end user, the biggest difference between these systems is how data lookups
are performed. Loosely
+structured systems provide a more rich and user friendly way of searching data
than tightly structured systems
+as they have a support for keyword searches. On the other hand, tightly
structured
systems support only exact key lookups as each data item is identified by
globally unique keys.

-In the end, both systems have open problems and issues. We will discuss these
aspects in more detail in
+In the end, both systems have open problems and issues. We will discuss these
aspects more detail in
chapter 3. Table \ref{table_comparison_approach} lists the key differences
between the loosely structured
approach and the tightly structured approach.