Tuesday, September 30, 2008

A Comparison of Mechanisms for Improving TCP Performance over Wireless Links

"Wireless TCP" is probably a popular prelim question; it also reappears from time to time in conferences; I've thought about working on it myself. The basic problem is at least clearly stated: TCP believes that all packet losses are caused by congestion, but in wireless this is likely to be incorrect. Thus, it does not perform very well over wireless links. The result of this paper is "yes, but local repair fixes the problem."

The authors evaluate a number of possible solutions: local retransmission and a series of end-to-end modifications. Intuitively, we should be predisposed to local repair for several reasons: since it is closest to the link, it should have the best information about link quality, and it should be able to react the fastest to loss. Split connections are a rather heavyweight solution to a not-that-deep problem, and furthermore will use significantly more network resources. SACKs are a worthy modification to TCP, but are best thought of as solving the bursty loss problem and helping throughput on links with a high bandwidth-delay product.

What this problem essentially comes down to is this: if a link has a packet reception rate of "p", then with "r" transmission attempts (the original send plus any retransmissions), the probability of packet delivery when using link-level acknowledgments is 1-((1-p)^r), assuming losses are independent. What this says is that we can make a poor link look like a good link by estimating the value of p on the fly, and tuning r so as to guarantee a certain level of reliability. Of course, this can be complicated by bursty losses, but the intuition is correct.
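
To make the arithmetic concrete, here is a minimal sketch in Python. It reads r as the total number of transmission attempts and assumes independent losses, which is what the formula 1-((1-p)^r) requires. The function names, the moving-average estimator for p, and its smoothing factor are my own illustrative choices, not anything taken from the paper.

```python
import math


def delivery_probability(p: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries gets through."""
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p: float, target: float) -> int:
    """Smallest r such that 1 - (1 - p)^r >= target."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    if p == 1.0:
        return 1
    # Solve (1 - p)^r <= 1 - target for r, then round up.
    r = max(1, math.ceil(math.log(1.0 - target) / math.log(1.0 - p)))
    # Guard against floating-point rounding in the logarithms.
    while delivery_probability(p, r) < target:
        r += 1
    return r


def update_estimate(p_est: float, acked: bool, alpha: float = 0.1) -> float:
    """Hypothetical on-the-fly estimate of p: an exponentially weighted
    moving average over link-level ACK outcomes (alpha is an assumption)."""
    return (1.0 - alpha) * p_est + alpha * (1.0 if acked else 0.0)


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        r = attempts_needed(p, 0.99)
        print(f"p={p}: {r} attempts give {delivery_probability(p, r):.4f}")
```

Even a link that drops half its packets needs only a handful of attempts to look 99% reliable under these assumptions. The cost is the extra airtime and delay those attempts consume, and bursty losses will make the real number worse than this estimate.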

The results of the paper bear out this intuition: local repair works as well as anything, and end-to-end protocols are hurt by the fact that the controllers are far from where the losses are taking place. Split also works almost as well as local, but there is no gain over just running the local protocol directly on the TCP packets.

The upshot of all of this is that when the original internet papers say that the maximum drop rate of a link should be 1% TCP and IP headers which are either destined for an endpoint or else apply to each hop (hop-by-hop options).

1 comment:

Randy H. Katz said...

Your analysis is correct. It is interesting to look at these things from the perspective of the mid-1990s. TCP over wireless yields poor performance for several reasons: inability to recover from multiple losses in the congestion window as well as high variability in RTT on wireless links yielding large RTO values. The pushback against fixing this at the transport layer was to build a better link layer, as in cellular links ... but this has some costs too, mainly in terms of available bandwidth. This work was HIGHLY controversial at the time.