Thursday, October 16, 2008

TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks

It's important to read this paper in context, so I'll try to provide some of that here, along with a summary of "what's changed", because a lot is different from when this was published in OSDI. In 2002, "sensor networks" were (I believe) widely thought to require a new and different model of networking. The idea of "motes strewn from airplanes" which would get distributed on the ground was a [somewhat] commonly cited example of how motes would be deployed; after landing, the motes would form a sort of "intelligent cloud" which would crunch the data and report the "answer" to someone outside the network. The goal of allowing non-computer scientists to create a sensor network application was a popular motivation for this kind of work. Of course, this is partially hyperbole, but it's also somewhat clear how TAG fits into this picture.

What's changed since 2002 is hard experience in application deployment and software development. Numerous early deployments were hobbled by badly performing hardware and software. Furthermore, contact with various domain specialists indicated that they generally did not want a summary of the data; they wanted the data. That restricts the realm of on-line, fully distributed networks like TAG, TinyDB, ADMR, etc. to a special subset of applications. For many applications, the two most critical network models are collection (many-to-one) and dissemination (one-to-many). The importance of these primitives is clear even in this TAG paper, as collection, or tree routing, is used to report data to a centralized controller, and dissemination is used to distribute queries through the network.
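To make the trade-off concrete, here is a minimal sketch comparing plain collection (every reading forwarded hop by hop to the root) with TAG-style in-network aggregation of an AVERAGE, where each node sends a single (sum, count) partial state to its parent. The tree, the readings, and the function names are all made up for illustration; this is not the paper's implementation, and it ignores packet loss, epochs, and scheduling entirely.

```python
# Hypothetical routing tree: child -> parent; node 0 is the root (base station).
PARENT = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3}
# Hypothetical sensor readings, one per non-root node.
READINGS = {1: 20.0, 2: 22.0, 3: 19.0, 4: 21.0, 5: 23.0}


def depth(node):
    """Number of hops from node to the root."""
    hops = 0
    while node != 0:
        node = PARENT[node]
        hops += 1
    return hops


def raw_collection():
    """Forward every reading hop by hop; the root computes the average."""
    messages = sum(depth(n) for n in READINGS)
    average = sum(READINGS.values()) / len(READINGS)
    return average, messages


def in_network_average():
    """Each node merges its children's (sum, count) and sends one message up."""
    partial = {n: (v, 1) for n, v in READINGS.items()}
    partial[0] = (0.0, 0)  # assumption: the root contributes no reading
    messages = 0
    # Visit the deepest nodes first so children are merged before a parent sends.
    for node in sorted(READINGS, key=depth, reverse=True):
        s, c = partial[node]
        ps, pc = partial[PARENT[node]]
        partial[PARENT[node]] = (ps + s, pc + c)
        messages += 1
    total, count = partial[0]
    return total / count, messages


if __name__ == "__main__":
    print("raw collection:", raw_collection())
    print("in-network:    ", in_network_average())
```

Both approaches arrive at the same average, but aggregation sends one message per node while raw collection sends one message per hop per reading. The catch, as noted above, is that after aggregation the root only has the summary, not the data.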

A major problem with early work like this is that it jumped ahead to designing complicated systems, like this in-network aggregation system, before ever mastering the basics of collection and dissemination. A good example of this is the performance of their multi-hop collection protocol: they cite 45% as the yield; however, given the network size and layout parameters they list, that low yield is more a reflection of poor algorithms and implementation than anything else. Part of this is not their fault, as modern IEEE 802.15.4 radios operate at 250 kbps, which increases potential throughput, and use the more advanced QPSK encoding, which significantly reduces the packet error rate compared to the older FSK radios. However, overall this paper overreaches because no service decomposition existed to allow the authors to focus on the central element of their work (the aggregation); they had to consider the entire stack. It's kind of like trying to write a paper about BitTorrent, but needing to design IP+Linux to do so.

One thing this paper gets right that is still topical is the incorporation of storage into the network architecture; most motes have significant amounts of non-volatile storage (flash) compared to their RAM and radio bandwidth. It is often advantageous from an energy standpoint to buffer data and then stream a number of packets sequentially, to amortize the fixed cost of waking up a receiver and acquiring the channel.
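The amortization argument is just arithmetic, and a small sketch shows its shape. The cost values below are made-up illustrative units, not measurements from any radio, and `energy_per_packet` is a hypothetical helper, not something from the paper.

```python
def energy_per_packet(batch_size, wakeup_cost=10.0, packet_cost=1.0):
    """Average energy per packet when batch_size packets share one wake-up.

    wakeup_cost covers the fixed cost of waking the receiver and acquiring
    the channel; packet_cost is the per-packet transmission cost. Both are
    illustrative units, not measured values.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return (wakeup_cost + batch_size * packet_cost) / batch_size


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, energy_per_packet(n))
```

As the batch grows, the per-packet cost approaches the bare transmission cost, which is why buffering readings in flash and sending them in a burst can pay off.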

2 comments:

Ari Rabkin said...

We in CS seem to have this irresistible urge to just assert that the important problem is X, without actually checking that we understood it right.

Interestingly, I've had this experience with monitoring systems. Ruckus was designed for in-network processing. But the thing I learned at Yahoo! is that the professionals want the data. Or at least, they want to totally separate the compression and summarization from the rest of the system, and have it totally under their control.

Matei Zaharia said...

I think two big advantages of collecting all the data in one place are first that you can run queries retroactively on old data, and second that you can run multiple queries for free (no extra power or bandwidth cost). Often you don't know the questions you'd like to ask until after some interesting things have happened. Equally importantly, you sometimes think of several different questions to ask. With TAG, each of them would use a (small) amount of power. With centralized collection, you'd just run some queries on the same database.