Wednesday, January 9, 2013

CAPping routed and SDN networks

In my last blog entry I discussed two differences between SDN and conventional networking. The second was that SDN returns to centralized control as compared to a distributed control plane (I’ll return to the first difference in just a moment). And this means that there is a single point of failure, even if the controller itself has built-in redundancy.

There has been discussion in the OpenFlow community about failover (and even load balancing) between multiple controllers. The OF reference implementation includes a simple mechanism wherein an OF switch can be configured with a list of controllers, and if the master controller fails, the OF switch selects the next one on the list. Open vSwitch allows sending messages from OF switches to multiple controllers, and enables these controllers to elect a master.
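
To give a feel for what such a failover scheme looks like, here is a minimal sketch of the switch side of the idea: an ordered list of configured controllers, with the switch falling back to the next entry when the current master stops answering. This is my own illustration in Python, with an invented FailoverSwitch class and made-up addresses; it is not the actual OF reference implementation or Open vSwitch code.

```python
# Minimal sketch of switch-side controller failover (my own illustration;
# not the actual OpenFlow reference implementation or Open vSwitch code).

import socket

OFP_PORT = 6633  # the classic OpenFlow controller TCP port

class FailoverSwitch:
    def __init__(self, controllers):
        # Ordered list of (host, port) pairs; the first one that answers becomes master.
        self.controllers = controllers
        self.master = None

    def connect(self, timeout=2.0):
        """Try each configured controller in turn and keep the first that answers."""
        for host, port in self.controllers:
            try:
                sock = socket.create_connection((host, port), timeout=timeout)
                self.master = (host, port)
                return sock
            except OSError:
                continue  # this controller is unreachable; fall back to the next one
        raise RuntimeError("no controller reachable")

# A switch configured with a primary controller and a backup (addresses are invented).
switch = FailoverSwitch([("10.0.0.1", OFP_PORT), ("10.0.0.2", OFP_PORT)])
```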

What I want to elucidate here is why such solutions can never provide a complete resolution to the problem. My explanation is based on the CAP theorem for distributed computing systems.

A distributed computing system is composed of multiple computational resources (CPU, memory, storage) that are connected via a communications network and together perform a task. To make things more concrete, consider a database that can be queried, and into which new data can be entered, via a large number of servers around the world connected over the public Internet.

There are three characteristics of distributed systems that are universally desirable:
Consistency, meaning that the system responds identically to a query no matter which node receives the request (or does not respond at all);
Availability, i.e., that the system always responds to a request (although the response may not be consistent or correct); and
Partition tolerance, namely that the system continues to function even when nodes or the communications network fail.

In 2000 Eric Brewer conjectured that a distributed system can satisfy any two of these guarantees at the same time, but not all three. This conjecture was later proven by Gilbert and Lynch and is now called Brewer’s theorem, or the CAP theorem.

Without going into a formal proof, you can readily convince yourself of the correctness of the CAP theorem by requiring one of the characteristics and then showing that the other two can’t co-exist. For example, let's require consistency. There are basically two ways to handle data: we can have a single database that all nodes need to update and retrieve from, or maintain local copies of the database at each node. If there is a single database, then consistency demands that once any node starts updating the database, all the other nodes must wait until the update is complete. If there are multiple copies, then each time one node updates its local copy, consistency demands that it send messages to all the others instructing them to update their copies, and that it wait for acknowledgements from all of them. In the first case, a communications network failure that separates a node from the database immediately leads to lack of availability. In the second case, a network failure that causes even a single node’s acknowledgement not to arrive also causes lack of availability. So availability and partition tolerance can’t be simultaneously achieved.
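
If you prefer to see the argument run rather than read it, the following toy model (entirely my own invention, including the Replica class) captures the “multiple copies” case: a write is only reported successful once every replica has acknowledged it, so a single partitioned replica is enough to block the write and destroy availability.

```python
# Toy model of consistency-first replication: a write succeeds only if
# every replica acknowledges it, so one partitioned replica blocks the system.

class Replica:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.data = {}

    def apply(self, key, value):
        """Return True (an acknowledgement) only if the replica is reachable."""
        if not self.reachable:
            return False
        self.data[key] = value
        return True

def consistent_write(replicas, key, value):
    # Consistency demands acknowledgements from *all* replicas before
    # the write may be reported as successful.
    acks = [r.apply(key, value) for r in replicas]
    return all(acks)

replicas = [Replica("A"), Replica("B"), Replica("C", reachable=False)]  # C is partitioned away
print(consistent_write(replicas, "route", "next-hop-7"))  # False: the write blocks, availability is lost
```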

Now what does this have to do with networking? Well, SDN teaches us that packet forwarding is simply a computational problem. That was the first difference between SDN and conventional networks that I discussed in my last blog entry. And since the task of forwarding a packet from network ingress to network egress is obviously carried out by a large number of forwarding elements, the network of packet forwarding devices is a distributed computational system. And ergo the CAP theorem applies.

So which two of the three desirable characteristics of distributed systems do we want to achieve, and more importantly, which one are we willing to forego?

In conventional routed networks we forego consistency. Each router has its own local Forwarding Information Base (FIB), which is locally created and only weakly tied to the FIBs of the other routers. Lack of forwarding consistency means that packets can sometimes be misrouted, loop (go around in circles), or be “black-holed” (silently dropped). That is the reason the IP packet header has a TTL field, and the reason TCP is layered over IP to retransmit lost packets. On the other hand, IP service unavailability is almost always due to a local break that physically disconnects the host from the IP network, and, as previously discussed, IP networks are extremely tolerant to partition failures.
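
To illustrate the TTL safety valve, here is a toy forwarding loop in Python. The three-router topology and its inconsistent FIB entries are invented for the example, but the mechanism is the real one: each hop decrements the TTL, and when it reaches zero the packet is discarded rather than circulating forever.

```python
# Toy illustration of inconsistent local FIBs creating a forwarding loop,
# and of the TTL field limiting the damage. The topology and FIBs are invented.

FIBS = {
    "R1": {"10.1.0.0/16": "R2"},
    "R2": {"10.1.0.0/16": "R3"},
    "R3": {"10.1.0.0/16": "R1"},  # inconsistent entry: points back toward R1
}

def forward(packet, router):
    while packet["ttl"] > 0:
        next_hop = FIBS[router].get(packet["prefix"])
        if next_hop is None:
            return "no route (packet dropped)"
        packet["ttl"] -= 1           # each hop decrements the TTL
        router = next_hop
    return "dropped: TTL expired"    # the loop is broken by the TTL safety valve

print(forward({"prefix": "10.1.0.0/16", "ttl": 8}, "R1"))  # dropped: TTL expired
```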

At the other extreme is OpenFlow, which stresses consistency. The controller acts as a global database, and specific OF mechanisms ensure that a packet entering the network is handled consistently by all switches. OpenFlow also attempts to provide high availability, with mechanisms for handling the first packet of a new flow, and popular test suites examining scalability and availability. So, what must be sacrificed? The only remaining characteristic is partition tolerance.
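
Schematically (and only schematically; the classes and method names below are my own, not a real controller API), the first-packet mechanism looks like this: a switch with no matching flow entry punts the packet to the controller, and the controller, consulting its single global view, installs the same decision on every switch along the path. That is exactly where the consistency comes from.

```python
# Schematic sketch of reactive flow setup in an OF-style network.
# The classes, method names, and path table are illustrative, not a real controller API.

class Switch:
    def __init__(self, name, controller):
        self.name = name
        self.controller = controller
        self.flow_table = {}  # match -> action

    def handle_packet(self, match):
        if match in self.flow_table:
            return self.flow_table[match]              # fast path: a rule the controller already installed
        return self.controller.packet_in(self, match)  # table miss: punt the first packet to the controller

class Controller:
    """The single global view: it alone computes the path and pushes it to every switch."""
    def __init__(self, paths):
        self.paths = paths  # flow -> ordered list of switch names (invented for the example)

    def packet_in(self, switch, match):
        path = self.paths[match]
        for name in path:
            # Install the *same* decision on every switch along the path.
            switches[name].flow_table[match] = "forward along " + "->".join(path)
        return switch.flow_table[match]

ctrl = Controller({"flow-X": ["S1", "S2", "S3"]})
switches = {name: Switch(name, ctrl) for name in ("S1", "S2", "S3")}
print(switches["S1"].handle_packet("flow-X"))  # first packet: table miss, punted, rules installed everywhere
print(switches["S2"].handle_packet("flow-X"))  # subsequent packets: hit the pre-installed entry directly
```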

So what is wrong with using multiple OF controllers? If you have followed the argument up to now, you should understand why such schemes fall short: when the network connecting the controllers fails, they cannot be kept consistent without hanging indefinitely, that is, without sacrificing availability.

Since you can't have your CAP and eat it too, every network architecture must decide which of consistency, availability, and partition tolerance it is willing to do without. On this issue conventional IP networks and SDN have taken different routes.

Y(J)S