Khóa luận Trackerless bittorrent adapted to mobile ad hoc networks

TABLE OF CONTENTS

INTRODUCTION .1

CHAP TER 1: PEER TO PEER NETWORKS .3

1.1 Overvie w .3

1.2 Definitions.3

1.3 Comparison with client/server architecture .4

1.4 Benefits and weaknesses of P2P networks .5

1.5 Classifying P2P networks .6

1.5.1 Pure P2P networks .6

1.5.2 Hybrid P2P networks .7

CHAP TER 2: BITTORRENT FILE SHARING SYSTEM .10

2.1 Overvie w .10

2.2 Benefits of the BitTorrent protocol.11

2.3 Limitations .13

2.4 Comparison with other file sharing protocols .13

2.5 Original BitTorrent for wired networks .14

2.5.1 Description .14

2.5.2 Operation .15

2.5.3 Creating and publishing torrents .16

2.5.4 Do wnlo ading torrents and sharing files .17

2.6 BitTorrent variant for wireless ad hoc networks .18

2.6.1 Trackerless BitTorrent .18

2.6.2 Packets exchanged betwee n peers .18

2.6.3 Do wnlo ading mechanism .19

2.6.4 Selecting a neighbor at random .20

2.6.5 Piece selection strategy .21

2.7 Performance metrics .21

CHAP TER 3: AN IMPROVEMENT OF BITTORRENT ADAP TATION TO

MANETS .24

3.1 General ideas .24

3.2 Node presence detection mechanism .24

3.2.1 Soft state bloom filter .25

3.2.2 Bloom filter operations .26

3.2.3 Decoupling information decay and BEACON Intervals .27

3.3 Inte gration of node presence detection mechanism into BitTorrent .28

3.4 Conclusion .30

CHAP TER 4: SIMULATION RESULTS .31

4.1 Network simulator (NS-2) .31

4.2 Main scenario .31

4.2.1 Estimate the average finish time .34

4.2.2 Estimate the average sharing ratio .35

4.2.3 Estimate the network traffic .36

4.2.4 Impact of the number of network nodes .37

CONCLUSIONS AND P ERSPECTIVES .38

REFERENCES .39

pdf47 trang | Chia sẻ: maiphuongdc | Lượt xem: 1278 | Lượt tải: 1download
Bạn đang xem trước 20 trang tài liệu Khóa luận Trackerless bittorrent adapted to mobile ad hoc networks, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
there are many nodes involvement. In return we have a more quickly and more efficiently searching mechanism. The nature of P2P interaction directly between the nodes that still offers significant advantages through utilization of network resources. But here arises a problem concerning the organization of storage the information about the location of the file shared on network. Usually they organize the information into a database. Each record includes a file with file name and address of the node currently hosting the file. This is a popular way to organize applications using P2P file-sharing hybrid model like Napster or OpenNap. However, limitations of this method is that it only allows users to search by file name (or general file identifiers). This is suitable only for sharing music file or photo libraries. With more complex libraries of documents related to the character (text files, books, articles, ...), demand does not stop from looking at the name or identity criteria. Currently almost all file-sharing applications commonly found on the web do not allow to search features of the content within document files. It can be seen that to put this searching feature function into a file-sharing applications, we found it necessary to apply the methods and techniques used in information retrieval . When that information about the documents not only of 9 the name (identification) of documents but also include further information on the keywords contained in the document. Information on the searching server must be organized as a system indexing by keyword. 10 CHAPTER 2: BITTORRENT FILE SHARING SYSTEM In this chapter, we present an overview of the BitTorrent file sharing system and its implementation over wired networks and mobile ad hoc networks. 2.1 Overview File sharing is very well known in the entire community due to its popular usage. A large amount of bandwidth on the Internet is used by different file sharing applications/protocols. It has been estimated that it may account for as much as 43% of all Internet traffic (depending on geographical location) as of February 2009[14]. BitTorrent is one of the most popular protocols for transferring large files. There are many famous applications using BitTorrent protocol such as uTorrent or Podcast. Besides legal distribution of files, BitTorrent is often used to illegally share and download copyrighted material. The technology within the protocol makes it possible to distribute large amounts of data without the need of a high capacity server, and expensive bandwidth. This is probably the main reason for using this protocol instead of using traditional downloading from HTTP or FTP servers. For those sharing and downloading copyrighted files, BitTorrent is profitable because it is decentralized. 11 Figure 4: An example of the BitTorrent file sharing system 2.2 Benefits of the BitTorrent protocol The BitTorrent protocol helps to improve the previous generation of P2P software with a number of significant initiatives, which shall be discussed in next subsections. High speed downloading Simply put, BitTorrent is quite popular due to its parallel file transfer process. Unlike the typical network behavior, which bottlenecks cause performance to plummet; BitTorrent actually increases the demand to improve the performance of a computer network. This behavior increases the stability ratio for the distribution of large, popular content on the internet, for servers of limited capacity. The protocol achieves this by splitting large files into smaller blocks, such as 128 kilobytes, by splitting into blocks, the source components may be required from many sources, but 12 not need to set the complete file to be able to load up the block, which means downloading the software can quickly start contributing as uploaders for other users Simplicity, easy to use Unlike other complex previous P2P protocols, the BitTorrent protocol works on a web based distribution model built over the HTTP protocol. Users simply download a client, which to a user act as an extension of their internet browser. By opening the files ‘.Torrent’, client will allow the users to save files to a location of their choice on the hard drives in the same fashion as original file transfer applications. Until the user closes the client, BitTorrent will continue sharing the file to other clients. Effective use of limited upload bandwidth Home broadband connections have quite limited upload bandwidth available is a big problem of P2P systems. To solve this, the BitTorrent protocol uses a few mechanisms to mitigate for this problem. When a client has received a few initial blocks, it can then use its upload bandwidth to forward those blocks to other machines. Another aspect is that transfers are limited to four open sessions, and the number of files hosted is limited to a few users; by focusing on a small number of activities, the network can reduces overhead. By reducing the number of uploads per machine, and having considerably more machines uploading content, the time required by a client for downloading a large file is considerably less than tr aditional means. Overcoming the Free-Rider problem In typical P2P systems, the performance of the overall system downs when users trying to obtain service without contributing anything to the system. This is an old problem, and BitTorrent was designed with an approach of local optimization by allocating the most upload bandwidth to peers, which are supplying the greatest amount of content. Because of the design of the current BitTorrent Protocol, free riders can only get 20% of the total system bandwidth. High content integrity BitTorrent employs two mechanisms, which make it more reliable. First, to be accepted on a file tracking server, typically new content must be accepted and managed by moderators who manually inspect it. To reduce the overload for moderators, regular suppliers of content can be promoted to immoderate submitters 13 after gaining enough trust. Once the original content is considered trusted, a validation checksum of file will be presented from the server to guarantee the content integrity. 2.3 Limitations BitTorrent protocol does not help users to hide their identifiers. Because the tracker maintains a list of files being shared, it also contains a list of IP addresses of the computers are downloading the file, and the list of the files that have been downloaded previously. Based on the BitTorrent protocol also ascertains the address of the peer in the swarm and of course, the peer can be attacked. Another disadvantage of the BitTorrent protocol is less encouraging peer into seeder after file downloading is complete. Consequently, seeders and peers in swarm will disappear gradually, which means that more older torrents to download the file, then the probability of success lower. BitTorrent has the advantage in the broadband environment, such as DSL, cable, satellite ... but for dial-up Internet users using the BitTorrent protocol will not work, because dial-up connection or disconnection and speed not load high. 2.4 Comparison with other file sharing protocols The method used to distribute files between eDonkey2000 and BitTorrent networks is similar, but the clients in the eDonkey network usually share and download many files, making the bandwidth per carrier becomes less. In contrast, transporting in BitTorrent is much faster by focusing on a file or a specific group of files. Original eDonkey2000 protocol provides very little resistance to fraud machines (download more, upload very little), the new version of eDonkey2000 client has installed the system which encourages more uploads. For example, systematic program eMule (credits system) have a score system to encourage uploaders. One client will prioritize the shipment to its previous position by moving the client to the top of the queue to wait for less time. This system proved effective because a queue for each client using eMule often has hundreds, even thousand entries. KaZaA is a protocol similar to BitTorrent protocol, but it has another point that is its distinguishing workstations by dedication level (Participation Level). Contribution level increases when you upload and decreases when you download. Once you upload a resource, it devotes to the person with the highest-level first and then this person devote to the lower contribution level one and so on. This model is similar to the pyramid model; the most uploader is at the top position of the pyramid, 14 and the less one is on the bottom. KaZaA is only appropriate model distribution of resources for a large number of users, it has been shown that the file at the bottom of the pyramid can be downloaded faster than on the case by means of HTTP (in the case file is large). Nevertheless, KaZaA has a small downside is that it believes the report of the workstations on the level of dedication, so clients can cheat with a lot of dedication for the workstation which is not official. 2.5 Original BitTorrent for wired networks 2.5.1 Description BitTorrent protocol defines a method to disseminate and share files online. Before having BitTorrent protocol, there were some P2P protocols, which allow a computer in the network to share files with other computers without having a centralized server. BitTorrent is an improvement from the previous P2P protocol. BitTorrent protocol has a principle to work closely with the ability to customize, reliable and cost of maintaining a list of computer file sharing better than the previous P2P protocols. Because of communication following standard TCP/ IP, BitTorrent protocol can operate on regular Internet connection [10]. BitTorrent protocol, which allows users to distribute large amounts of data without depending much on their computers, would be needed for standard Internet hosting. This protocol works as an alternative data distribution method that makes even small devices (e.g. mobile phones) with low bandwidth capable of participating in large data transfers. BitTorrent client is a program operating under the BitTorrent protocol. Each BitTorrent client can compares, requirements, and transports files on the network using the BitTorrent protocol. Files can contain any information, including text, audio, video and encrypted content. First, a user playing the role of file-provider makes a file available to the network. This first user's file that is called a seed allows other users, called peers, to connect and start to download the file. When new peers connect to the network and request the same file, their computer gets a different piece of the data from the seed. Once multiple peers have multiple pieces of the seed, each of them becomes a source for that continuously distributes the file. The effect of this is to reduce the dependent on the initial user and distribute the file download task among the seed and many peers. With BitTorrent, there is no need to provide a computer data quantities, which could jeopardize the task by overwhelming all resources, but the same result is still reached: each peer eventually receiving the entire file. 15 After the file successfully and completely downloaded, the peer can shift roles and become an additional seed, helping the other peers to get the whole file. This eventual change from peers to seeds determined the number of times a file is available in its complete form. This distributed nature of BitTorrent leads to spread the file throughout peers. As more peers join the swarm, the likelihood of a successful download increases. Compared with standard Internet hosting, this reduces the original distributor's hardware and bandwidth resource costs significantly. It also reduces dependence on the original distributor and provides a source for the file, which is generally temporary and therefore more difficult to trace than providing by the long-term availability of a server in standard file distribution techniques. 2.5.2 Operation BitTorrent greatly reduces the load on the server, since users usually download files between them, not the server. As shown by the colored bars below each client, the file is downloaded in a random order, instead of carrying a sequential order. Unlike the systems of traditional file sharing, its main objective is to provide an efficient way to distribute a single file to a large group of people, forcing all those who download a file to share it with others. First is distributing by conventional means a small file with a. torrent. This file is static, so often found on websites or distributed via email. The file 'torrent' contains the address of a "search server", called tracker, which is responsible for locating potential sources for the file or part thereof. This server is really centralized and it provides statistics on the number of transfers, the number of nodes with a full copy of the file and the number of nodes that have only a portion thereof. The library file or files you want to download from the sources found by the tracker, and at the same time as the download, it starts to climb parts of the file available to other sources, using the bandwidth allocated to it. Since the action of sharing begins even before completing the download of a file, each peer will inevitably contribute to the distribution of this file. The system is responsible for rewarding those who share more, the higher the bandwidth the greater the number of connections to discharge peers to be established. When a peer starts downloading a file, BitTorrent does not necessarily come at the beginning of the file, but is low in parts at random, after peers are connected to 16 download the file. If the peers are available online each part of the whole file (even if they are scattered), eventually every peers will get a full copy of it. Of course, initially one must have the complete file to begin the process. This method produces significant improvements in transfer rate when many users connect to download the same file. Where there is not any node with having complete file ("seeder" or "seed") connected to the tracker, the possibility exists that the file cannot be completed. 2.5.3 Creating and publishing torrents The peers that are distributed over the network treat the file as a separation of it in a number of identically sized pieces, typically between 32 KB and 4 MB each. Each peer performs a checksum (checksum) for each part, using the SHA-1 algorithm, and storing it in the torrent file. Parts greater than 512 KB will reduce the size of a torrent file for each payload, but this would reduce the efficiency of the protocol. When another peer later receives a particular piece, the hash of the piece is compared to the recorded hash to check that is free from error. Peers that provide complete files be called seeds (seeders) and the peer that provides the initial copy of the file is called initial seed (initial seeder). The exact information that is contained in the torrent file depends on the BitTorrent protocol’s version. By convention, the name of a torrent file has the extension ".torrent". Torrent files have a section called "announce", which specifies the URL of the tracker, and "Info" section, which contains the names of the files, their sizes, length of piece used, and SHA-1 hash code for each of the pieces, all this information is used by clients to verify the integrity of the data received. Once completed torrents files are posted on a website or elsewhere, and are registered with an origin server, which is called a tracker, it keeps the list of clients who are currently participating on the torrent file. Alternatively, in a decentralized system, each peer acts as a tracker. BitTorrent clients such as μTorrent, BitComet, KTorrent and Deluge, implement this, using methods of Distributed hash table (DHT). Azureus also supports tracing method is incompatible nodes (April 2007) with the DHT that offers its customers [2]. Once DHT has passed, a "private" flag like broadcast flag was unofficially announced, telling clients to restrict the use of decentralized monitoring whether the user desires. Flag is intentionally placed in info section so it cannot be disabled or 17 removed without changing the identity of the torrent. The purpose of the flag is to prevent torrents from being shared with clients without access to tracker. The flag was requested to be included in the formal specification in August, in 2008, but it has not been accepted. 2.5.4 Downloading torrents and sharing files Using any Internet browser, like Firefox, browse the site has a list of torrents, download it then use BitTorrent client open out there. After opening the torrent file, BitTorrent will connect to the trackers, the trackers will give it a list of the peers are downloading this file. A group of members (or peers) of a BitTorrent network which download the same file is called the swarm. When the client connects to the swarm, it will start sharing files with each other. The clients will share the pieces together instead of sharing direct with the trackers, so the number of clients in the swarm can develop very quickly. In BitTorrent network, each of clients shares some of its resources such as data, upload bandwidth with other peers interested in the same content in order to go up the global system capacity. Peers participating to download the same content form a torrent. A peer discovers other peers by contacting a central rendezvous node called tracker. The latter store IP addresses of all peers in the torrent and manage statistics of uploader and downloader on network. To make the replication of the content in the network easier and to guarantee multi-sourcing, a file is subdivided into a set of pieces. Then, each piece is also subdivided into blocks. Seed (seeder) is a peer, which has all pieces of the file. When the peer is still downloading pieces, it is called leecher. The neighbors of a node are contained in a peer list which it can open a TCP-connection to exchange data and information. Only four simultaneous outgoing active TCP connections are accepted by the protocol. These neighbors are called effective neighbors. They are chose by the BitTorrent’s choking algorithm. This algorithm is performed periodically. When the choking period expires, a peer chooses to unchoke the 3 uploaders to him at the highest rate. It is a best slot unchoking. This strategy, called tit-for-tat, ensures reciprocity and enforces collaboration among peers. Now to discover new upload capacities, a peer chooses randomly the fourth peer to unchoke. This slot is called optimistic slot. All other neighbors are left choked. Once unchoked, a peer chooses a piece to download using a specific piece selection strategy that is called local rarest first. Indeed, each peer maintains an update-to-date list of pieces owned by all its neighbors. When choosing a piece, a peer selects the piece with the least redundancy in its neighbors. One of the rarest pieces is selected randomly in case of equality. Rarest first is supposed to grow up the entropy of pieces in the network that enforces cooperation and hence improves global performance [14]. 18 2.6 BitTorrent variant for wireless ad hoc networks Some works have tried to adapt to the BitTorrent network to MANETs (eg, [1] and [13]). They focus only improved phase detection node without resolve to improve the efficiency of content sharing. Michiardi et al [7] study in the performance of a cooperative mechanism for distributing content from a source to a destination of great potential. They propose to implement a small change BitTorrent that allows detection neighbors and local traffic. This is done by selecting only close neighbors as effective neighbors. The result is a decrease in download time and total energy consumption. However, this below solution goes further by focusing not only on download time, but also on the sharing between peers. 2.6.1 Trackerless BitTorrent According to the feature of MANETs that peers themselves linking into a network, a centralized tracker cannot install. To adapt BitTorrent over wireless ad hoc network, we need a new mechanism to replace tracker's role: discovering and identifying the other peers. M.K Sbai gave a method using HELLO message to solve such problem [8]. To discover new peers, a peer floods periodically a HELLO message to the network and waits for HELLO REPLY messages. These HELLO messages are forwarded to the other neighbors with some initial TTL (Time-To-Live) to control the scope of the flood and hence the visibility of a peer. Receiving a HELLO message, a peer decreases the TTL and forwards it to its wireless neighbors, and so on until its TTL reaches zero. 2.6.2 Packets exchanged between peers There are 2 types of packets exchanged in the network: Data packets and control packets. We choose to send data packets via TCP connections because TCP provides reliability and congestion controls needed to transfer blocks of file. Control packets such as HELLO message (or BEACON message in new version) and piece updates are better when using UDP. Here are the control packets exchanged between peers: - HELLO: see section 2.6.1 - HELLO REPLY: see section 2.6.1 - UPDATE PIECE LIST: when a peer receives a new piece, it sends an UPDATE PIECE LIST to all peers with whom it can exchange data. 19 - PIECE OFFER REQUEST: when a peer i unchokes a peer j, it sends a PIECE OFFER REQUEST packet to j. This packet contains the list of pieces that i has already downloaded. - PIECE OFFER REPLY: receiving a PIECE OFFER REQUEST, a peer sends a PIECE OFFER REPLY packet to the sender. A flag included in this packet indicates the offer being accepted or not. 2.6.3 Downloading mechanism To profit from the advantages of the limit neighborhood, M. Sbai gave an idea that modified the original BitTorrent to adapt to MANETs [8]. A few TCP connections are created to distant peers in addition to those with close peers. Pieces can then spread over the network and they will be propagated in different directions, which improve the sharing ratio and the download completion time. Therefore, several zones of the network can be active simultaneously, which makes the network becoming more dynamic. The new choking algorithm is used by utilizing routing information. It distributes optimistic unchokes between distant and close peers and adds a specific neighbor selection mechanism to choose a remote peer. It also applies a new piece selection strategy when the distant peer is offered. Unlike regular BitTorrent with limited neighborhood, this method requires a global knowledge about the identifiers of peers in the network. In the modification BitTorrent, each peer maintains 2 tables: NEARBY NEIGHBOR TABLE (NNT) and FAR NEIGHBOR TABLE (FNT). One and two hops neighbors will are added to NNT, the others belong to FNT. The PIECE UPDATE packets are sent only to neighbors in NNT. Due to the piece selection strategy operates only on the NNTs; peers don’t need to know about all pieces in the network. Indeed, in wireless networks, the copy of pieces is more efficient when it is based on statistics in the nearby neighborhood since this guarantees a faster local replication compared to a large neighborhood. As in BitTorrent, the choking algorithm selects three best uploaders as effective neighbors. These three neighbors are chosen from both nearby and far neighbor tables. The peer then serves these three neighbors during the next choking period. But in addition to these effective neighbors, the fourth random neighbor is selected from one of the two tables (optimistic slot) using a round robin policy that guarantees an optimal balance between the random unchokes locally and the transmission of pieces to remote neighbors in order to improve diversity. In detail, the peer selects a peer one time from FNT, q times from NNT and so on. In this protocol, the ratio of the number of time slots spent on serving nearby neighbors and those for serving far neighbors bases on the quantum q. It is also the number of slots 20 that a peer should wait before unchoking a distant neighbor again. Sbai’s experiments shown that the optimal choice of the quantum q depend on the number of nodes in network, the maximum length of a path between 2 nodes (hm) and the number

Các file đính kèm theo tài liệu này:

  • pdfLUANVANDoKhanhToan.pdf