Progress Report: NSF grant NCR-9416101
July 1996 to July 1997
In previous reports we described the development of several increasingly sophisticated demonstrations of the benefits of this approach. In particular, we integrated PET into a version of the video-conferencing tool VIC to protect the real-time transmission of MPEG streams, and this worked very well.
The PET patent [2] issued in April of this year.
Over the past year, Michael Luby and Yulius Tjahjadi completely overhauled the PET API software library. The two main objectives were to define a clean and usable interface between calling applications and the PET library, and to make it easy to plug different forward error correction (FEC) codes into PET. Both objectives were met, and there is now a clean version of the PET library written in C. We now have a description of the new API [5] for PET based on this revision.
A standard solution to the problem of packet loss is to request retransmission of lost packets. For multicast applications with one sender and many receivers, however, this solution introduces serious difficulties. Consider, for example, a video server distributing a movie over the Internet to thousands of users. If the users lose different sets of packets, and each user requests retransmission of its lost packets from the server, the server will quickly become overwhelmed by these requests. More sophisticated solutions along these lines have been considered, including sending retransmission requests only to nearby session participants, but these solutions as yet appear inadequate.
To exemplify the advantages of FEC, consider the following simple setting. Suppose that 10,000 users are receiving a file sent via a multicast transmission protocol, and the average rate of packet loss is 20%. Suppose the file consists of 64K packets of 512 bytes each, and thus the entire file is 32M. One way to ensure that all users receive all of the packets is to keep sending each packet until all users acknowledge its receipt. If the losses are independent for the different users, a simple calculation shows that on average each packet needs to be transmitted approximately six times.
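The six-transmissions figure can be checked directly. The number of sends one user needs is geometric, and for the maximum of n independent geometric variables, E[max] = sum over k of P(max > k). A short computational sketch (our own, not part of the reported software):

```python
# Expected number of times a single packet must be sent so that all n users,
# each losing it independently with probability q, receive it at least once.
# P(some user still needs the packet after k sends) = 1 - (1 - q**k)**n,
# and summing these tail probabilities over k >= 0 gives the expectation.

def expected_transmissions(n_users, loss_rate, max_k=200):
    return sum(1 - (1 - loss_rate**k) ** n_users for k in range(max_k))

avg = expected_transmissions(10_000, 0.20)
print(f"{avg:.2f}")  # roughly 6.6, consistent with "approximately six"
```

The answer is dominated by the few slowest receivers: the maximum of 10,000 geometrics grows logarithmically in the number of users, which is why even a modest 20% loss rate forces each packet to be sent many times.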
Suppose instead an FEC code with a 3% length overhead (see the next subsection for a definition of length overhead) is used to add 64K redundant packets to each message before transmission. Suppose no user loses more than 62K packets out of the 128K packets associated with a message, i.e., the maximum loss rate is at most approximately two and a half times the average loss rate. Then, all users will receive enough of the encoding to decode the message. The total bandwidth used by the FEC solution is a factor of three smaller than the retransmission solution, and moreover the overall protocol is much simpler.
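The bandwidth comparison in this example is simple arithmetic, sketched below (the 3% decoding threshold and 62K loss bound are the figures from the text):

```python
# Retransmission: each of the 64K message packets is sent ~6 times on average.
k = 64 * 1024                 # message packets (512 bytes each, 32M total)
avg_transmissions = 6         # per-packet average from the retransmission analysis
retransmit_total = k * avg_transmissions   # ~384K packet transmissions

# FEC: send each of the 128K encoding packets exactly once.
n_redundant = 64 * 1024
fec_total = k + n_redundant   # 128K packet transmissions

print(retransmit_total // fec_total)  # prints 3: FEC uses a third of the bandwidth

# Sanity check on the loss bound: losing at most 62K of 128K packets leaves
# at least 66K received, which exceeds the ~3% decoding threshold over 64K.
received = fec_total - 62 * 1024
assert received >= int(k * 1.03)
```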
In the past year, Malik Kalfane has finished his Master's thesis [3] under the supervision of Luby. This thesis consists of a practical study of the implementations of the various erasure codes we have considered, both those designed by others and those designed by us, including variants of the codes described in [4].
Over the past two years, we have developed a completely new class of codes, which we call Tornado codes, and which promise to make the FEC solution practical. We have written a preliminary paper [6] about this work, and have also given presentations [7] of this work around the world. Tornado codes have the property that slightly more than the ideal number of encoding packets must be received to decode the message; we call the percentage needed beyond the ideal number the length overhead (recall that the ideal number is the number of original message packets). By allowing some length overhead, we can develop much faster coding schemes. Tornado codes can be encoded and decoded in time proportional to the length of the encoding times a small constant that is independent of the number of redundant packets. This small constant, which we call the time overhead, is typically around six.
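To illustrate the style of computation involved, the sketch below shows a minimal single-layer XOR code with iterative "peeling" decoding, the core mechanism behind codes of this kind. It is purely illustrative and entirely our own naming: actual Tornado codes use a carefully designed cascade of irregular sparse graphs, not a single layer of fixed degree.

```python
import random

def encode(message, n_redundant, degree=3, seed=1):
    """Each redundant packet is the XOR of `degree` randomly chosen
    message packets; its neighbour list is carried alongside it."""
    rng = random.Random(seed)
    redundant = []
    for _ in range(n_redundant):
        nbrs = rng.sample(range(len(message)), degree)
        pkt = message[nbrs[0]]
        for j in nbrs[1:]:
            pkt = bytes(a ^ b for a, b in zip(pkt, message[j]))
        redundant.append((nbrs, pkt))
    return redundant

def peel_decode(k, received_msg, received_red):
    """Recover the k message packets. received_msg maps index -> packet;
    received_red is a list of (neighbour list, packet) pairs. Repeatedly
    find a redundant packet with exactly one missing neighbour, XOR out
    the known neighbours, and recover the missing packet ("peeling")."""
    recovered = dict(received_msg)
    progress = True
    while progress and len(recovered) < k:
        progress = False
        for nbrs, pkt in received_red:
            unknown = [j for j in nbrs if j not in recovered]
            if len(unknown) == 1:
                val = pkt
                for j in nbrs:
                    if j in recovered:
                        val = bytes(a ^ b for a, b in zip(val, recovered[j]))
                recovered[unknown[0]] = val
                progress = True
    return recovered if len(recovered) == k else None
```

Because every operation is an XOR of packets along a sparse graph, the total work is proportional to the number of graph edges, i.e., to the length of the encoding times a small constant; a production decoder would peel with a work queue rather than repeated passes.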
We have implemented these codes in software, and the results are very encouraging. As an example, we took a 32M byte file and partitioned it into 64K message packets of 512 bytes each. Using Tornado codes, we produced 128K encoding packets from these 64K message packets. We then ran 10,000 trials, where each trial consisted of randomly choosing and receiving encoding packets until enough had arrived to decode the original 32M byte file. In these 10,000 trials, the average length overhead was only 3.2%; in other words, about 66K encoding packets were enough to decode. Furthermore, in less than 1% of the trials was the length overhead more than 4.0%.
Although there is a modest length overhead using Tornado codes, the advantage is that the encoding and decoding times are very small. On a DEC Alpha machine, the time to produce the 128K encoding packets from the 64K message packets is just under two seconds, and the time to decode the 32M byte file from 66K encoding packets is also just under two seconds. The encoding and decoding times for a message and encoding of this size using standard FEC codes would be at least 10,000 times slower, i.e., tens of hours. Thus, the advent of Tornado codes makes software solutions possible for problems that are orders of magnitude larger than were previously conceivable.
We emphasize that Tornado codes have a number of important features:
We advocate the modular decomposition of data transmission protocols into a time scheduling policy, which determines when packets are to be sent, and a data selection policy, which determines what data is to be placed in each sent packet.
We concentrate on the data selection policy and require that the protocol achieve high bandwidth utilization in transmitting any prefix of the transmitted message. We introduce a simple and universal data selection policy which we prove is close to optimal in the following sense: for any time scheduling policy and any network behavior, in the worst-case prefix measure, our data selection policy provably performs as well as any other data selection policy, up to a constant additive term.
Our explicit modular decomposition of a transmission protocol into two policies should be contrasted with existing network protocols such as TCP/IP, in whose implementations the data selection policy is intertwined with the time scheduling policy. It is not hard to see that any data transmission protocol, including protocols such as TCP/IP that were not designed with two explicit separate policies, can be decomposed into a time scheduling policy and a data selection policy. Our result shows that the overall transmission protocol would not degrade in performance (and could improve dramatically) if it used our universal data selection policy in place of its own. The most important consequence is that this reduces the problem of designing a data transmission protocol to the possibly easier task of designing the time scheduling policy.
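As a concrete, and entirely hypothetical, illustration of the decomposition, the sketch below separates a transport loop into a time scheduling object and a data selection object; the class and method names are ours for illustration, not an API from this work.

```python
class FixedRateScheduler:
    """Time scheduling policy: decides WHEN packets are sent.
    Here, one packet every `interval` seconds; a real policy would
    adapt its rate to congestion feedback, as TCP does."""
    def __init__(self, interval):
        self.interval = interval
    def next_send_time(self, now):
        return now + self.interval

class RoundRobinSelector:
    """Data selection policy: decides WHAT goes in each packet.
    Here, cycle through the message packets in order; a universal
    policy would instead emit erasure-encoded packets so that any
    received prefix decodes a proportionally long message prefix."""
    def __init__(self, packets):
        self.packets = packets
        self.i = 0
    def next_payload(self):
        p = self.packets[self.i % len(self.packets)]
        self.i += 1
        return p

def transmit(n_sends, scheduler, selector, send):
    """Transport loop: consults the two policies independently,
    so either one can be swapped without touching the other."""
    now = 0.0
    for _ in range(n_sends):
        now = scheduler.next_send_time(now)
        send(now, selector.next_payload())
```

The point of the decomposition is visible in `transmit`: replacing `RoundRobinSelector` with our universal data selection policy requires no change to the time scheduling policy, which is exactly the substitution whose safety our result establishes.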