gzz-commits
[Top][All Lists]

 From: Hermanni Hyytiälä Subject: [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... Date: Fri, 14 Mar 2003 04:09:41 -0500

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Hermanni Hyytiälä <address@hidden>      03/03/14 04:09:01

Modified files:

Log message:
Corrections

CVSWeb URLs:

Patches:
13 09:20:53 2003
04:08:50 2003
@@ -1,4 +1,4 @@
-%***********************
+%***********************a
%***********************
@@ -40,15 +40,15 @@

\abstract{
In this thesis, we review existing Peer-to-Peer approaches, algorithms and
their
-key properties. We summarize open problems in Peer-to-Peer networks and divide
+key properties. We summarize open problems in Peer-to-Peer systems and divide
problems into three sub-categories. We observe that there are many
problems, which have no solutions at all, or problems that have proposed
solutions but they are practically unrealizable.

Then, we give an overview of Fenfire system.  We evaluate existing
Peer-to-Peer approaches-- loosely and tightly structured overlays-- with regard
-to Fenfire's needs. Finally, we propose simple algorithms to efficiently find
Fenfire
-related data from Peer-to-Peer network.
+to Fenfire's needs. Finally, we propose simple methods to efficiently find
Fenfire
+data from Peer-to-Peer network.
}
\tiivistelma{
Tässä opinnäytetyössä esittelemme olemassaolevia vertaisverkkoja, algoritmeja
ja
@@ -61,7 +61,7 @@
Arvioimme olemassaolevia vertaisverkkoarkkitehtuureja-- löyhästi ja tiukasti
rakennettuja päällysverkkoja-- Fenfiren vaatimusten valossa. Lopuksi ehdotamme
yksin-kertaisia algoritmeja, joiden avulla voidaan tehokkaasti löytää
-vertaisverkosta Fenfire:n kannalta olennaista tietoa.
+vertaisverkosta Fenfire-tietoa.
}

\begin{document}
@@ -88,11 +88,17 @@

There are many definitions of Peer-to-Peer networks. Schollmeier
\cite{schollmeier01p2pdefinition}
describes Peer-to-Peer system as a system of
-distributed entities that share their own services. Thus, Peer-to-Peer systems
can be characterized as distributed
+distributed entities that share their own services. Indeed, Peer-to-Peer
systems can be characterized as distributed
systems in which all communication is symmetric and all participants entities
have similar
capabilities and responsibilities. Each entity, i.e., \emph{peer}, may
contribute services
to the overall system.

+Fenfire project is an attempt to build hyperstructured, seamlessly
interoperating desktop
+environment. In Fenfire, all data is stored in the same format, i.e., data
blocks.
+All blocks have globally unique identifiers and they can be referred by other
blocks, i.e.,
+pointer blocks. Additional features of Fenfire include innovative user
+interfaces for viewing data and usage of Peer-to-Peer networking for network
transparency.
+
In this thesis, we\footnote{Use of the plural is customary even if research
paper is authored solely.} review existing Peer-to-Peer approaches, algorithms
and their key properties.
We observe that despite of great amount of proposed Peer-to-Peer systems, all
systems fall either to
@@ -100,16 +106,9 @@
Peer-to-Peer systems and divide problems into three sub-categories: security
problems,
performance problems and miscellaneous problems.

-We give an overview of Fenfire project. Fenfire project is an attempt to build
hyperstructured,
-seamlessly interoperating desktop environment. Additional features of Fenfire
include innovative user
-interfaces for viewing data and usage of Peer-to-Peer networking for network
transparency. In Fenfire,
-all data is stored in the same format, i.e., data blocks.  All blocks have
globally unique
-identifiers and they can be referred by other blocks, i.e., pointer blocks.
-
-After overview, we evaluate existing Peer-to-Peer approaches and
-choose the best alternative to Fenfire's needs. Finally, we propose system
model for Fenfire in Peer-to-Peer
-environment and present yet simple but efficient algorithms to be used for
data lookups in
-Peer-to-Peer environment.
+Then, we give an overview of Fenfire project, evaluate existing Peer-to-Peer
approaches and
+choose the best alternative to Fenfire's needs. Finally, we propose yet simple
but efficient methods to
+be used for data lookups in Peer-to-Peer environment.

We have attempted to comprehensively summarize existing algorithms and open
problems in
Peer-to-Peer domain. However, this thesis is not meant to be detailed work.
More detailed
@@ -128,7 +127,7 @@
address open problems in Peer-to-Peer domain and divide problems into three
sub-categories. Chapter 4 gives an overview of Fenfire system. In chapter
5, we evaluate existing Peer-to-Peer approaches with regard to Fenfire system.
-In chapter 6, we present conclusions and future work.
+Finally, in chapter 6 we present conclusions and future work.

\chapter{Peer-to-Peer architectures}
@@ -136,7 +135,7 @@
review most important Peer-to-Peer algorithms and list key differences between
two main approaches.

-\section{Overview}
+\section{Brief history and overview}

The Internet has been originally established in the late 1960s. The objective
of the ARPANET-project was to share computers' resources among military
computers
@@ -156,7 +155,15 @@

Modern Peer-to-Peer system is composed of \emph{application} level overlay
network.
Figure \ref{fig:application_level} illustrates the analogy of Peer-to-Peer
network with
-regard to OSI model.
+regard to OSI model. Compared to ARPANET's Peer-to-Peer functionality, modern
Peer-to-Peer systems
+are ad hoc, i.e., peers join and leave the system constantly in a dynamic
manner. This
+fact constitutes challenging requirements for efficient construction and
maintenance
+of the overlay network. Even more demanding tasks are how to perform efficient
data
+lookup and maintain security in a varying distributed environment. The most
popular
+form of modern Peer-to-Peer computing is file-sharing. In this scenario,
participants
+of Peer-to-Peer network share their file resources to other participants while
obtaining
+more resources from others. This can be seen as a variant of distributed file
system
+(e.g., \cite{levy90distributedfilesystems}).

\begin{figure}
\centering
@@ -165,15 +172,7 @@
\label{fig:application_level}
\end{figure}

-Compared to ARPANET's Peer-to-Peer functionality, modern Peer-to-Peer systems
-are ad hoc, i.e., peers join and leave the system constantly in a dynamic
manner. This
-fact constitutes challenging requirements for efficient construction and
maintenance
-of the overlay network. Even more demanding tasks are how to perform efficient
data
-lookup and maintain security in a varying distributed environment. The most
popular
-form of modern Peer-to-Peer computing is file-sharing. In this scenario,
participants
-of Peer-to-Peer network share their resources to other participants while
obtaining
-more resources from others. This can be seen as a variant of distributed file
system
-(e.g., \cite{levy90distributedfilesystems}).
+

In a development of modern Peer-to-Peer systems, lot of influences has been
attained from
other research areas than computer science. Research has been conducted
regarding
@@ -195,8 +194,7 @@
loosely. This approach gives freedom for participating peers
to perform certain tasks in Peer-to-Peer network. On the other hand, tightly
structured
approach has some rules, which all participating peers have to obey. In the
following
-sections, we will discuss in more detail both approaches, their disadvantages
-key differences between them.
+sections, we will discuss in more detail both approaches and key differences
between them.

\section{Centralized}
@@ -214,7 +212,7 @@
\section{Loosely structured}

Gnutella \cite{gnutellaurl} is a well-known example of loosely structured
overlay network. As in
-other Peer-to-Peer networks, no peer is more important than any other peer in
the network.
+other pure Peer-to-Peer networks, no peer is more important than any other
peer in the network.
The construction and maintenance of Gnutella network is extremely ad hoc,
since participating
peers can form the overlay network based on \emph{local} knowledge. Figure
\ref{fig:gnutella_overlay}
illustrates how peers form an overlay network. Initially, peer 1 creates the
overlay, since
@@ -262,7 +260,7 @@

Lately, there has been done lot of research to improve Gnutella's data lookup
efficiency
-\cite{adamic01powerlawsearch} has studied different random walk methods in
power-law
+\cite{adamic01powerlawsearch} have studied different data lookup methods in
power-law
networks and have found that by
instructing peers forwarding data lookups to select high degree peers, the
performance of data lookup
increases significantly. As a result, some of the most recent loosely
@@ -299,10 +297,8 @@
\label{fig:gnutella_powerlaw}
\end{figure}

-Previously presented improvements are only partial solutions. Obviously, more
-research is required to make data lookup of loosely structured approach more
-scalable and effective. More advanced techniques to improve data lookup of
-loosely structured systems are presented in chapter 3.
+Previously presented improvements are only partial solutions. More advanced
techniques
+to improve data lookup of loosely structured systems are presented in chapter
3.

\subsection{Sketch of formal definition}
@@ -324,7 +320,7 @@

Partly due to scalability problems of loosely structured systems, several
tightly
structured overlays have been proposed.
-This list includes CAN \cite{ratnasamy01can}, Chord \cite{stoica01chord},
+The list includes CAN \cite{ratnasamy01can}, Chord \cite{stoica01chord},
Koorde \cite{kaashoek03koorde}, ODHDHT \cite{naor03simpledht},
Pastry \cite{rowston01pastry}, PeerNet \cite{eriksson03peernet},
@@ -333,10 +329,10 @@
\cite{zhao01tapestry}, Viceroy \cite{malkhi02viceroy} and others
\cite{freedman02trie}.
The biggest difference compared to loosely structured approach is that with
tightly structured systems,
it is now feasible to perform \emph{global} data lookups in the overlay.
-While there are significant differences among proposed systems, they all have
in common
+While there are significant differences among proposed tighty structured
systems, they all have in common
that \emph{peer identifiers} are assigned to participating peers from
-a large \emph{identifier space} by the overlay. Furthermore,
application-specific
-data items are also assigned globally unique identifiers, \emph{keys},
+a large \emph{identifier space} by the overlay. Furthermore, globally unique
identifiers
+are also assigned to application-specific data items, \emph{keys},
which are selected from the same identifier space. The form of identifier
space differs between proposed systems. Circular identifier space (and
variants)
is most widely used. For instance, Chord \cite{stoica01chord}, Koorde
\cite{kaashoek03koorde},
@@ -362,10 +358,10 @@
\label{fig:structured_hashing}
\end{figure}

-All messages are routed across overlay links towards peers, whose
+All messages are routed across the overlay towards peers, whose
peer identifier is gradually ''closer'' to the key's identifier
-in the identifier space. Distance can be measured by numerical
-difference between identifiers (e.g., Chord \cite{stoica01chord}), the number
of
+in the identifier space. The distance can be measured by numerical
+difference between identifiers (e.g., Chord \cite{stoica01chord}), number of
same prefix bits between identifiers (e.g., Pastry \cite{rowston01pastry} and
Tapestry \cite{zhao01tapestry}) or
Because of XOR-metric, Kademlia's distance function is both unidirectional
@@ -388,8 +384,8 @@
peer occupies several positions in the identifier space, one for each
application-specific key. The indirection of placing close keys in the
custody of a storing peer\footnote{Storing peer is the peer in the overlay
which is responsible for the
-assigned keys.} keys is removed at the cost of each peer maintaining one
-''resource peer'' in the overlay network for each resource item pair it
publishes.
+assigned keys.} is removed at the cost of each peer maintaining one
+''resource peer'' in the overlay network for each data item it publishes.

PeerNet \cite{eriksson03peernet} differs from other tightly structured
overlays in that it operates
at the \emph{network} layer. PeerNet makes an explicit distinction
@@ -463,10 +459,10 @@
The basic operations are \texttt{join(groupIdentifier)},
\texttt{leave(groupIdentifier)},
\texttt{multicast(message, groupIdentifier)},  \texttt{anycast(message,
groupIdentifier)}.
Participating peers may join and leave the group and send multicast messages to
-the group, or any-cast message to a specific member of the group. DOLR and
CAST abstraction
-have much in common. For instance, they both use network proximity techniques
-to optimize their operation in the overlay. Figure
\ref{fig:Strucutred_lookup_using_DOLR_model}
- presents basic operation of DOLR abstraction.
+the group, or anycast message to a specific member of the group. DOLR and CAST
abstractions
+have in common that they both use network proximity techniques
+to optimize their operations in the overlay. Figure
\ref{fig:Strucutred_lookup_using_DOLR_model}
+presents the DOLR abstraction.

\begin{figure}
\centering
@@ -486,9 +482,8 @@

\subsection{Sketch of formal definition}

-In this subsection we formalize tightly structured overlay's main features.
The model
-describes basic features of tightly structured overlay, i.e., identifiers,
identifier
-space and mapping function.
+In this subsection we formalize tightly structured overlay's main features,
i.e.,
+identifiers, identifier space and mapping function.

Let $S$ be the aggregate of all services $s$ in system. Let $P$ be the
aggregate of
all peers $p$ in system. Let $I$ be the aggregate of all identifiers $i$ in
system.
@@ -516,22 +511,16 @@
have very little in common. Indeed, the only thing they share is the fact that
no other peer is more
important than an other in the Peer-to-Peer network. Fault tolerance \emph{may}
be an area, in which approaches have similar properties (e.g., no single point
of failure).
-However, fault-tolerance properties of both approaches are currently only
initial calculations, or
-experimented in simulation environments. In real-life, measuring fault
tolerance is much more
+Fault tolerance properties of both approaches are currently only initial
calculations, or
+experimented in simulation environments. In real-life, however, measuring
fault tolerance is much more

The most important difference between approaches is performance and
scalability properties. While
performance of loosely structured approach is not always even linear,
generally tightly structured
approach can perform all internal operations in poly-logarithmic
time\footnote{However, it is unknown
whether all proposed algorithms can preserve logarithmic properties in
real-life applications or not.}.
-Loosely structured systems scale to millions of peers, whereas tightly
structured systems are able
-to cope with billions of concurrent peers.
-
-Another key point is the philosophy how overlay network is constructed and
maintained. While loosely
-structured approach gives much freedom to individual peers to join and leave
the overlay network, tightly
-structured approach has certain features, in which participating peers have no
control at all
-(such as mapping of data items). With DHT abstraction of tightly structured
approach, for instance,
-peer has no power to decide where the data items are mapped in the overlay.
+Moreover, loosely structured systems scale to millions of peers, whereas
tightly structured systems are able
+to cope with billions of concurrent peers \cite{osokine02distnetworks},
\cite{kubiatowicz00oceanstore}.

To end user, biggest difference between these systems is how data lookups are
performed. Loosely
structured systems provide more richer and user friendly way of searching data
as they
@@ -802,7 +791,7 @@
\parbox{37pt}{$O(\log^2{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
-\parbox{85pt}{$2k+2+f$, where k = long range connections, 2 = peer's
neighbors, f = fault-tolerance connections)} &
+\parbox{85pt}{$2k+2+f$, where k = long range connections, 2 = peer's
neighbors, f = fault tolerance connections)} &
\parbox{85pt}{Space can be also $O(1)$. Additional space of can be used as a
lookahead list for better performance, not necessarily fault-tolerant because
of constant degree of neighbors}
\\ \hline

@@ -864,15 +853,15 @@
to public, researchers' main concern has been the scalability problem of
loosely structured
approach. However, people often misunderstand the scalability problem of
loosely structured
approach; \emph{network} of loosely structured systems is scalable, but the
\emph{data lookup model} is not.
-The main concern of tightly structured system is to make overlay's data lookup
-routing more flexible against hostile attacks. Other key problems in tightly
structured
+The main concern of tightly structured system is to make overlay's data lookup
process
+more fault tolerant against hostile attacks. Other key problems in tightly
structured
systems are the lack of keyword searches, support for heterogeneous peers and
\cite{balakrishanarticle03lookupp2p}.

To make Peer-to-Peer systems even more popular (e.g., in industry),
Peer-to-Peer domain
needs better infrastructures to deal with security issues. There has been done
some
research regarding anonymity, access control, data availability and data
integrity but as
-we state in the following sections, much more research work is required to
solve security issues.
+we state in the following sections, much more research work is required to
solve these issues.

\section{Security problems in Peer-to-Peer}

@@ -1764,7 +1753,7 @@
We start by giving a problem overview when considering Fenfire in Peer-to-Peer
environment. We define Fenfire's special needs and evaluate existing
Peer-to-Peer approaches in light of these requirements. After that, we propose
system
-model for Fenfire in Peer-to-Peer environment and present simple algorithms to
perform data
+model for Fenfire in Peer-to-Peer environment and present simple methods to
perform data
lookups in Peer-to-Peer environment. In the end, we discuss possible problems
of using Fenfire
in Peer-to-Peer environment

@@ -1870,13 +1859,13 @@

\section{Fenfire system model in Peer-to-Peer environment}

-In this section we give a proposal of Fenfire Peer-to-Peer system, which
consists
+In this section we give a proposal for Fenfire Peer-to-Peer system, which
consists
of several technologies reviewed in this thesis.  Then, we introduce yet
simple but
-effective algorithms for obtaining Fenfire data from Peer-to-Peer environment.
+effective methods for obtaining Fenfire data from Peer-to-Peer environment.

\subsection{System proposal}

for
locating data efficiently in the Peer-to-Peer overlay. There are two main
reasons for this. First, Kademlia's XOR-based distance function is superior
over the distance functions of other systems. Second, there are already some
@@ -1901,7 +1890,7 @@
technology based
on rateless erasure codes \cite{maymounkov03ratelesscodes} seems very
promising.

-\subsection{Algorithms}
+\subsection{Methods}

We use the DOLR abstraction of tightly structured approach, i.e., each
participating peer hosts
the data and overlay maintains only the \emph{pointers} to the data. We
decided to use the DOLR