Tuesday, October 28, 2008

Resilient Overlay Networks

I think the central insight of this paper is that some things, like multi-path routing and optimizing for different application metrics, are not handled well at the IP layer. The reason is that the Internet is addressed in a flat address space, so it's generally frowned upon to make assumptions about which traffic is important or which destinations are being used. Therefore, to achieve the paper's performance inside the network itself, a great deal of routing state would need to be exchanged by BGP, most of which would remain unused.

The authors' solution is to create an overlay network: essentially, just another network layer with its own routing protocol, which uses the routing provided by the underlying layer to create virtual links between the overlay routers. This design is reminiscent of a lot of other work, like Application Layer Multicast, and maybe even Ken Birman's Isis (hi Ari...). In fact, routing is even easier with virtual links than with physical links, because each router is directly connected to every other router through a virtual link (i.e., IP hosts can talk to anyone). Therefore, flooding routing state is just sending N-1 Internet packets to the other routers; we don't have to worry much about complicated flooding mechanisms.
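To make the "flooding is just N-1 packets" point concrete, here is a minimal sketch in Python. It is not the paper's implementation: the function name `flood_link_state`, the message layout, and the latency numbers are all made up for illustration, and nothing is actually sent over a network.

```python
def flood_link_state(sender, nodes, link_state):
    """Build the messages one overlay node sends for a routing update.

    Because every overlay node can reach every other node directly over
    IP, "flooding" is just one unicast message per peer: N-1 in total.
    The message layout here is a hypothetical placeholder.
    """
    return [
        (peer, {"from": sender, "links": dict(link_state)})
        for peer in nodes
        if peer != sender
    ]


# Hypothetical 4-node overlay; values are measured latencies in ms.
nodes = ["a", "b", "c", "d"]
messages = flood_link_state("a", nodes, {"b": 12.0, "c": 40.5, "d": 8.1})
```

With N overlay nodes each node sends N-1 messages per update, so total control traffic grows as N(N-1), which is fine for a few dozen nodes but is also why this approach only works for small overlays.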

One thing I really like about this design is that it optimizes both state and control traffic for the endpoints which are in use, while ignoring the rest of them. On an Internet scale, a 50-node subnet is pretty insignificant, and optimizing the same metrics for every 50-node subset would probably be impossible. I do dislike the generally arrogant tone of this paper, but it's interesting that it works.

Their result that at most one intermediate node is sufficient to overcome faults is actually pretty obvious in hindsight; they even have a nice explanation of why in the paper that bears repeating. The underlying problem is that (1) BGP converges slowly, since it's essentially a distance-vector protocol, and (2) multihoming is kind of broken. Essentially, all they do with one backup intermediate router is make the endpoints' ASes "virtually multihomed." From this angle, you might get the impression that their whole argument really comes down to fixing BGP, which might not even be that hard; it's mainly a deployment issue. Nonetheless, an enjoyable read.
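The one-intermediate-node idea can be sketched as a very small path selection routine. This is only an illustration under assumptions of my own: the function name `best_path` and the latency table are hypothetical, latency is the only metric considered, and an unreachable direct path is modeled as infinite latency.

```python
import math


def best_path(src, dst, latency):
    """Return (path, cost) for the cheapest direct or one-hop overlay path.

    `latency[x][y]` is the measured latency of the virtual link x -> y,
    with math.inf standing in for a link that is currently down.
    """
    best = ([src, dst], latency[src].get(dst, math.inf))
    for mid in latency:
        if mid in (src, dst):
            continue
        cost = latency[src].get(mid, math.inf) + latency[mid].get(dst, math.inf)
        if cost < best[1]:
            best = ([src, mid, dst], cost)
    return best


# Hypothetical measurements (ms); the direct a <-> b path has failed.
latency = {
    "a": {"b": math.inf, "c": 20.0, "d": 50.0},
    "b": {"a": math.inf, "c": 30.0, "d": 15.0},
    "c": {"a": 20.0, "b": 30.0, "d": 10.0},
    "d": {"a": 50.0, "b": 15.0, "c": 10.0},
}
path, cost = best_path("a", "b", latency)
```

Here the routine skips the dead direct link and relays through `c`, which is the "virtual multihoming" effect: the intermediate node gives the endpoints a second way out that does not depend on BGP reconverging.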
