TABLE OF CONTENTS
INTRODUCTION .1
CHAP TER 1: PEER TO PEER NETWORKS .3
1.1 Overvie w .3
1.2 Definitions.3
1.3 Comparison with client/server architecture .4
1.4 Benefits and weaknesses of P2P networks .5
1.5 Classifying P2P networks .6
1.5.1 Pure P2P networks .6
1.5.2 Hybrid P2P networks .7
CHAP TER 2: BITTORRENT FILE SHARING SYSTEM .10
2.1 Overvie w .10
2.2 Benefits of the BitTorrent protocol.11
2.3 Limitations .13
2.4 Comparison with other file sharing protocols .13
2.5 Original BitTorrent for wired networks .14
2.5.1 Description .14
2.5.2 Operation .15
2.5.3 Creating and publishing torrents .16
2.5.4 Do wnlo ading torrents and sharing files .17
2.6 BitTorrent variant for wireless ad hoc networks .18
2.6.1 Trackerless BitTorrent .18
2.6.2 Packets exchanged betwee n peers .18
2.6.3 Do wnlo ading mechanism .19
2.6.4 Selecting a neighbor at random .20
2.6.5 Piece selection strategy .21
2.7 Performance metrics .21
CHAP TER 3: AN IMPROVEMENT OF BITTORRENT ADAP TATION TO
MANETS .24
3.1 General ideas .24
3.2 Node presence detection mechanism .24
3.2.1 Soft state bloom filter .25
3.2.2 Bloom filter operations .26
3.2.3 Decoupling information decay and BEACON Intervals .27
3.3 Inte gration of node presence detection mechanism into BitTorrent .28
3.4 Conclusion .30
CHAP TER 4: SIMULATION RESULTS .31
4.1 Network simulator (NS-2) .31
4.2 Main scenario .31
4.2.1 Estimate the average finish time .34
4.2.2 Estimate the average sharing ratio .35
4.2.3 Estimate the network traffic .36
4.2.4 Impact of the number of network nodes .37
CONCLUSIONS AND P ERSPECTIVES .38
REFERENCES .39
47 trang |
Chia sẻ: maiphuongdc | Lượt xem: 1362 | Lượt tải: 1
Bạn đang xem trước 20 trang tài liệu Khóa luận Trackerless bittorrent adapted to mobile ad hoc networks, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
there are many nodes involvement. In return we have a more quickly and more
efficiently searching mechanism. The nature of P2P interaction directly between the
nodes that still offers significant advantages through utilization of network resources.
But here arises a problem concerning the organization of storage the information about
the location of the file shared on network. Usually they organize the information into a
database. Each record includes a file with file name and address of the node currently
hosting the file. This is a popular way to organize applications using P2P file-sharing
hybrid model like Napster or OpenNap. However, limitations of this method is that it
only allows users to search by file name (or general file identifiers). This is suitable
only for sharing music file or photo libraries. With more complex libraries of
documents related to the character (text files, books, articles, ...), demand does not stop
from looking at the name or identity criteria. Currently almost all file-sharing
applications commonly found on the web do not allow to search features of the content
within document files. It can be seen that to put this searching feature function into a
file-sharing applications, we found it necessary to apply the methods and techniques
used in information retrieval . When that information about the documents not only of
9
the name (identification) of documents but also include further information on the
keywords contained in the document. Information on the searching server must be
organized as a system indexing by keyword.
10
CHAPTER 2: BITTORRENT FILE SHARING SYSTEM
In this chapter, we present an overview of the BitTorrent file sharing system and
its implementation over wired networks and mobile ad hoc networks.
2.1 Overview
File sharing is very well known in the entire community due to its popular usage.
A large amount of bandwidth on the Internet is used by different file sharing
applications/protocols. It has been estimated that it may account for as much as 43% of
all Internet traffic (depending on geographical location) as of February 2009[14].
BitTorrent is one of the most popular protocols for transferring large files. There
are many famous applications using BitTorrent protocol such as uTorrent or Podcast.
Besides legal distribution of files, BitTorrent is often used to illegally share and
download copyrighted material.
The technology within the protocol makes it possible to distribute large amounts
of data without the need of a high capacity server, and expensive bandwidth. This is
probably the main reason for using this protocol instead of using traditional
downloading from HTTP or FTP servers. For those sharing and downloading
copyrighted files, BitTorrent is profitable because it is decentralized.
11
Figure 4: An example of the BitTorrent file sharing system
2.2 Benefits of the BitTorrent protocol
The BitTorrent protocol helps to improve the previous generation of P2P
software with a number of significant initiatives, which shall be discussed in next
subsections.
High speed downloading
Simply put, BitTorrent is quite popular due to its parallel file transfer process.
Unlike the typical network behavior, which bottlenecks cause performance to
plummet; BitTorrent actually increases the demand to improve the performance of a
computer network. This behavior increases the stability ratio for the distribution of
large, popular content on the internet, for servers of limited capacity. The protocol
achieves this by splitting large files into smaller blocks, such as 128 kilobytes, by
splitting into blocks, the source components may be required from many sources, but
12
not need to set the complete file to be able to load up the block, which means
downloading the software can quickly start contributing as uploaders for other users
Simplicity, easy to use
Unlike other complex previous P2P protocols, the BitTorrent protocol works on a
web based distribution model built over the HTTP protocol. Users simply download a
client, which to a user act as an extension of their internet browser. By opening the
files ‘.Torrent’, client will allow the users to save files to a location of their choice on
the hard drives in the same fashion as original file transfer applications. Until the user
closes the client, BitTorrent will continue sharing the file to other clients.
Effective use of limited upload bandwidth
Home broadband connections have quite limited upload bandwidth available is a
big problem of P2P systems. To solve this, the BitTorrent protocol uses a few
mechanisms to mitigate for this problem. When a client has received a few initial
blocks, it can then use its upload bandwidth to forward those blocks to other machines.
Another aspect is that transfers are limited to four open sessions, and the number of
files hosted is limited to a few users; by focusing on a small number of activities, the
network can reduces overhead. By reducing the number of uploads per machine, and
having considerably more machines uploading content, the time required by a client
for downloading a large file is considerably less than tr aditional means.
Overcoming the Free-Rider problem
In typical P2P systems, the performance of the overall system downs when users
trying to obtain service without contributing anything to the system. This is an old
problem, and BitTorrent was designed with an approach of local optimization by
allocating the most upload bandwidth to peers, which are supplying the greatest
amount of content. Because of the design of the current BitTorrent Protocol, free riders
can only get 20% of the total system bandwidth.
High content integrity
BitTorrent employs two mechanisms, which make it more reliable. First, to be
accepted on a file tracking server, typically new content must be accepted and
managed by moderators who manually inspect it. To reduce the overload for
moderators, regular suppliers of content can be promoted to immoderate submitters
13
after gaining enough trust. Once the original content is considered trusted, a validation
checksum of file will be presented from the server to guarantee the content integrity.
2.3 Limitations
BitTorrent protocol does not help users to hide their identifiers. Because the
tracker maintains a list of files being shared, it also contains a list of IP addresses of
the computers are downloading the file, and the list of the files that have been
downloaded previously. Based on the BitTorrent protocol also ascertains the address
of the peer in the swarm and of course, the peer can be attacked.
Another disadvantage of the BitTorrent protocol is less encouraging peer into
seeder after file downloading is complete. Consequently, seeders and peers in swarm
will disappear gradually, which means that more older torrents to download the file,
then the probability of success lower. BitTorrent has the advantage in the broadband
environment, such as DSL, cable, satellite ... but for dial-up Internet users using the
BitTorrent protocol will not work, because dial-up connection or disconnection and
speed not load high.
2.4 Comparison with other file sharing protocols
The method used to distribute files between eDonkey2000 and BitTorrent
networks is similar, but the clients in the eDonkey network usually share and
download many files, making the bandwidth per carrier becomes less. In contrast,
transporting in BitTorrent is much faster by focusing on a file or a specific group of
files. Original eDonkey2000 protocol provides very little resistance to fraud machines
(download more, upload very little), the new version of eDonkey2000 client has
installed the system which encourages more uploads. For example, systematic program
eMule (credits system) have a score system to encourage uploaders. One client will
prioritize the shipment to its previous position by moving the client to the top of the
queue to wait for less time. This system proved effective because a queue for each
client using eMule often has hundreds, even thousand entries.
KaZaA is a protocol similar to BitTorrent protocol, but it has another point that
is its distinguishing workstations by dedication level (Participation Level).
Contribution level increases when you upload and decreases when you download.
Once you upload a resource, it devotes to the person with the highest-level first and
then this person devote to the lower contribution level one and so on. This model is
similar to the pyramid model; the most uploader is at the top position of the pyramid,
14
and the less one is on the bottom. KaZaA is only appropriate model distribution of
resources for a large number of users, it has been shown that the file at the bottom of
the pyramid can be downloaded faster than on the case by means of HTTP (in the case
file is large). Nevertheless, KaZaA has a small downside is that it believes the report
of the workstations on the level of dedication, so clients can cheat with a lot of
dedication for the workstation which is not official.
2.5 Original BitTorrent for wired networks
2.5.1 Description
BitTorrent protocol defines a method to disseminate and share files online.
Before having BitTorrent protocol, there were some P2P protocols, which allow a
computer in the network to share files with other computers without having a
centralized server. BitTorrent is an improvement from the previous P2P protocol.
BitTorrent protocol has a principle to work closely with the ability to customize,
reliable and cost of maintaining a list of computer file sharing better than the previous
P2P protocols. Because of communication following standard TCP/ IP, BitTorrent
protocol can operate on regular Internet connection [10].
BitTorrent protocol, which allows users to distribute large amounts of data
without depending much on their computers, would be needed for standard Internet
hosting. This protocol works as an alternative data distribution method that makes
even small devices (e.g. mobile phones) with low bandwidth capable of participating
in large data transfers.
BitTorrent client is a program operating under the BitTorrent protocol. Each
BitTorrent client can compares, requirements, and transports files on the network
using the BitTorrent protocol. Files can contain any information, including text, audio,
video and encrypted content.
First, a user playing the role of file-provider makes a file available to the
network. This first user's file that is called a seed allows other users, called peers, to
connect and start to download the file. When new peers connect to the network and
request the same file, their computer gets a different piece of the data from the seed.
Once multiple peers have multiple pieces of the seed, each of them becomes a source
for that continuously distributes the file. The effect of this is to reduce the dependent
on the initial user and distribute the file download task among the seed and many
peers. With BitTorrent, there is no need to provide a computer data quantities, which
could jeopardize the task by overwhelming all resources, but the same result is still
reached: each peer eventually receiving the entire file.
15
After the file successfully and completely downloaded, the peer can shift roles
and become an additional seed, helping the other peers to get the whole file. This
eventual change from peers to seeds determined the number of times a file is available
in its complete form.
This distributed nature of BitTorrent leads to spread the file throughout peers. As
more peers join the swarm, the likelihood of a successful download increases.
Compared with standard Internet hosting, this reduces the original distributor's
hardware and bandwidth resource costs significantly. It also reduces dependence on
the original distributor and provides a source for the file, which is generally temporary
and therefore more difficult to trace than providing by the long-term availability of a
server in standard file distribution techniques.
2.5.2 Operation
BitTorrent greatly reduces the load on the server, since users usually download
files between them, not the server. As shown by the colored bars below each client, the
file is downloaded in a random order, instead of carrying a sequential order.
Unlike the systems of traditional file sharing, its main objective is to provide an
efficient way to distribute a single file to a large group of people, forcing all those who
download a file to share it with others. First is distributing by conventional means a
small file with a. torrent. This file is static, so often found on websites or distributed
via email. The file 'torrent' contains the address of a "search server", called tracker,
which is responsible for locating potential sources for the file or part thereof.
This server is really centralized and it provides statistics on the number of
transfers, the number of nodes with a full copy of the file and the number of nodes that
have only a portion thereof.
The library file or files you want to download from the sources found by the
tracker, and at the same time as the download, it starts to climb parts of the file
available to other sources, using the bandwidth allocated to it. Since the action of
sharing begins even before completing the download of a file, each peer will inevitably
contribute to the distribution of this file. The system is responsible for rewarding those
who share more, the higher the bandwidth the greater the number of connections to
discharge peers to be established.
When a peer starts downloading a file, BitTorrent does not necessarily come at
the beginning of the file, but is low in parts at random, after peers are connected to
16
download the file. If the peers are available online each part of the whole file (even if
they are scattered), eventually every peers will get a full copy of it. Of course, initially
one must have the complete file to begin the process. This method produces significant
improvements in transfer rate when many users connect to download the same file.
Where there is not any node with having complete file ("seeder" or "seed")
connected to the tracker, the possibility exists that the file cannot be completed.
2.5.3 Creating and publishing torrents
The peers that are distributed over the network treat the file as a separation of it
in a number of identically sized pieces, typically between 32 KB and 4 MB each. Each
peer performs a checksum (checksum) for each part, using the SHA-1 algorithm, and
storing it in the torrent file. Parts greater than 512 KB will reduce the size of a torrent
file for each payload, but this would reduce the efficiency of the protocol. When
another peer later receives a particular piece, the hash of the piece is compared to the
recorded hash to check that is free from error. Peers that provide complete files be
called seeds (seeders) and the peer that provides the initial copy of the file is called
initial seed (initial seeder).
The exact information that is contained in the torrent file depends on the
BitTorrent protocol’s version. By convention, the name of a torrent file has the
extension ".torrent". Torrent files have a section called "announce", which specifies
the URL of the tracker, and "Info" section, which contains the names of the files, their
sizes, length of piece used, and SHA-1 hash code for each of the pieces, all this
information is used by clients to verify the integrity of the data received.
Once completed torrents files are posted on a website or elsewhere, and are
registered with an origin server, which is called a tracker, it keeps the list of clients
who are currently participating on the torrent file. Alternatively, in a decentralized
system, each peer acts as a tracker. BitTorrent clients such as μTorrent, BitComet,
KTorrent and Deluge, implement this, using methods of Distributed hash table (DHT).
Azureus also supports tracing method is incompatible nodes (April 2007) with the
DHT that offers its customers [2].
Once DHT has passed, a "private" flag like broadcast flag was unofficially
announced, telling clients to restrict the use of decentralized monitoring whether the
user desires. Flag is intentionally placed in info section so it cannot be disabled or
17
removed without changing the identity of the torrent. The purpose of the flag is to
prevent torrents from being shared with clients without access to tracker. The flag was
requested to be included in the formal specification in August, in 2008, but it has not
been accepted.
2.5.4 Downloading torrents and sharing files
Using any Internet browser, like Firefox, browse the site has a list of torrents,
download it then use BitTorrent client open out there. After opening the torrent file,
BitTorrent will connect to the trackers, the trackers will give it a list of the peers are
downloading this file. A group of members (or peers) of a BitTorrent network which
download the same file is called the swarm. When the client connects to the swarm, it
will start sharing files with each other. The clients will share the pieces together
instead of sharing direct with the trackers, so the number of clients in the swarm can
develop very quickly.
In BitTorrent network, each of clients shares some of its resources such as data,
upload bandwidth with other peers interested in the same content in order to go up the
global system capacity. Peers participating to download the same content form a
torrent. A peer discovers other peers by contacting a central rendezvous node called
tracker. The latter store IP addresses of all peers in the torrent and manage statistics of
uploader and downloader on network. To make the replication of the content in the
network easier and to guarantee multi-sourcing, a file is subdivided into a set of pieces.
Then, each piece is also subdivided into blocks. Seed (seeder) is a peer, which has all
pieces of the file. When the peer is still downloading pieces, it is called leecher. The
neighbors of a node are contained in a peer list which it can open a TCP-connection to
exchange data and information. Only four simultaneous outgoing active TCP
connections are accepted by the protocol. These neighbors are called effective
neighbors. They are chose by the BitTorrent’s choking algorithm. This algorithm is
performed periodically. When the choking period expires, a peer chooses to unchoke
the 3 uploaders to him at the highest rate. It is a best slot unchoking. This strategy,
called tit-for-tat, ensures reciprocity and enforces collaboration among peers. Now to
discover new upload capacities, a peer chooses randomly the fourth peer to unchoke.
This slot is called optimistic slot. All other neighbors are left choked. Once unchoked,
a peer chooses a piece to download using a specific piece selection strategy that is
called local rarest first. Indeed, each peer maintains an update-to-date list of pieces
owned by all its neighbors. When choosing a piece, a peer selects the piece with the
least redundancy in its neighbors. One of the rarest pieces is selected randomly in case
of equality. Rarest first is supposed to grow up the entropy of pieces in the network
that enforces cooperation and hence improves global performance [14].
18
2.6 BitTorrent variant for wireless ad hoc networks
Some works have tried to adapt to the BitTorrent network to MANETs (eg, [1]
and [13]). They focus only improved phase detection node without resolve to improve
the efficiency of content sharing. Michiardi et al [7] study in the performance of a
cooperative mechanism for distributing content from a source to a destination of great
potential. They propose to implement a small change BitTorrent that allows detection
neighbors and local traffic. This is done by selecting only close neighbors as effective
neighbors. The result is a decrease in download time and total energy consumption.
However, this below solution goes further by focusing not only on download time, but
also on the sharing between peers.
2.6.1 Trackerless BitTorrent
According to the feature of MANETs that peers themselves linking into a
network, a centralized tracker cannot install. To adapt BitTorrent over wireless ad hoc
network, we need a new mechanism to replace tracker's role: discovering and
identifying the other peers. M.K Sbai gave a method using HELLO message to solve
such problem [8]. To discover new peers, a peer floods periodically a HELLO
message to the network and waits for HELLO REPLY messages. These HELLO
messages are forwarded to the other neighbors with some initial TTL (Time-To-Live)
to control the scope of the flood and hence the visibility of a peer. Receiving a HELLO
message, a peer decreases the TTL and forwards it to its wireless neighbors, and so on
until its TTL reaches zero.
2.6.2 Packets exchanged between peers
There are 2 types of packets exchanged in the network: Data packets and control
packets. We choose to send data packets via TCP connections because TCP provides
reliability and congestion controls needed to transfer blocks of file. Control packets
such as HELLO message (or BEACON message in new version) and piece updates are
better when using UDP. Here are the control packets exchanged between peers:
- HELLO: see section 2.6.1
- HELLO REPLY: see section 2.6.1
- UPDATE PIECE LIST: when a peer receives a new piece, it sends an
UPDATE PIECE LIST to all peers with whom it can exchange data.
19
- PIECE OFFER REQUEST: when a peer i unchokes a peer j, it sends a
PIECE OFFER REQUEST packet to j. This packet contains the list of pieces
that i has already downloaded.
- PIECE OFFER REPLY: receiving a PIECE OFFER REQUEST, a peer sends
a PIECE OFFER REPLY packet to the sender. A flag included in this packet
indicates the offer being accepted or not.
2.6.3 Downloading mechanism
To profit from the advantages of the limit neighborhood, M. Sbai gave an idea
that modified the original BitTorrent to adapt to MANETs [8]. A few TCP connections
are created to distant peers in addition to those with close peers. Pieces can then spread
over the network and they will be propagated in different directions, which improve
the sharing ratio and the download completion time. Therefore, several zones of the
network can be active simultaneously, which makes the network becoming more
dynamic. The new choking algorithm is used by utilizing routing information. It
distributes optimistic unchokes between distant and close peers and adds a specific
neighbor selection mechanism to choose a remote peer. It also applies a new piece
selection strategy when the distant peer is offered. Unlike regular BitTorrent with
limited neighborhood, this method requires a global knowledge about the identifiers of
peers in the network.
In the modification BitTorrent, each peer maintains 2 tables: NEARBY
NEIGHBOR TABLE (NNT) and FAR NEIGHBOR TABLE (FNT). One and two
hops neighbors will are added to NNT, the others belong to FNT. The PIECE
UPDATE packets are sent only to neighbors in NNT. Due to the piece selection
strategy operates only on the NNTs; peers don’t need to know about all pieces in the
network. Indeed, in wireless networks, the copy of pieces is more efficient when it is
based on statistics in the nearby neighborhood since this guarantees a faster local
replication compared to a large neighborhood. As in BitTorrent, the choking algorithm
selects three best uploaders as effective neighbors. These three neighbors are chosen
from both nearby and far neighbor tables. The peer then serves these three neighbors
during the next choking period. But in addition to these effective neighbors, the fourth
random neighbor is selected from one of the two tables (optimistic slot) using a round
robin policy that guarantees an optimal balance between the random unchokes locally
and the transmission of pieces to remote neighbors in order to improve diversity. In
detail, the peer selects a peer one time from FNT, q times from NNT and so on. In this
protocol, the ratio of the number of time slots spent on serving nearby neighbors and
those for serving far neighbors bases on the quantum q. It is also the number of slots
20
that a peer should wait before unchoking a distant neighbor again. Sbai’s experiments
shown that the optimal choice of the quantum q depend on the number of nodes in
network, the maximum length of a path between 2 nodes (hm) and the number
Các file đính kèm theo tài liệu này:
- LUANVANDoKhanhToan.pdf