Wednesday, September 29, 2010

OAM for FM and PM

The Operations, Administration, and Maintenance (OAM) functionality provided in all modern communications systems supports two distinguishable functions, namely Fault Management (FM) and Performance Management (PM).
It is important to remember that despite the use of the word “management” here, OAM is a user-plane function. OAM may trigger control plane procedures (e.g., protection switching) or management plane actions (such as alarms), but the OAM itself is data that runs along with the user data.

FM deals with the detection and reporting of malfunctions. ITU-T Recommendation G.806 defines a scale of such malfunctions:
  • anomaly (n): smallest observable discrepancy between desired and actual characteristics
  • defect (d): a density of anomalies sufficient to interrupt some required function (see the sketch after this list)
  • fault cause (c): root cause behind multiple defects
  • failure (f): a fault cause that persists long enough that the ability to perform the function is considered terminated
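
This scale can be read as a thresholding process: isolated anomalies are tolerated, a sufficient density of anomalies within a window constitutes a defect, and a defect that persists becomes a failure. Here is a minimal sketch of a monitor along these lines; the class and all of the window and threshold values are illustrative assumptions of mine, not anything G.806 specifies:

```python
from collections import deque

class DefectMonitor:
    """Escalates observations along the anomaly -> defect -> failure scale."""

    def __init__(self, window: int = 100, density: int = 10, persistence: int = 50):
        self.recent = deque(maxlen=window)  # sliding window of recent observations
        self.density = density              # anomalies per window that constitute a defect
        self.persistence = persistence      # intervals a defect must persist to become a failure
        self.defect_intervals = 0

    def observe(self, anomaly: bool) -> str:
        self.recent.append(anomaly)
        if sum(self.recent) >= self.density:
            self.defect_intervals += 1      # anomaly density still too high
        else:
            self.defect_intervals = 0       # function recovered
        if self.defect_intervals >= self.persistence:
            return "failure"                # ability to perform the function is terminated
        if self.defect_intervals > 0:
            return "defect"
        return "anomaly" if anomaly else "ok"
```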

The main FM functions include:

  • Continuity Check (CC): checking that data sent from A to B indeed arrives at B (CC and CV are sketched after this list)
  • Connectivity Verification (CV): checking that data sent from A to B does not incorrectly arrive at C
  • Loopback (LB): checking that data sent from A to B can be returned from B and received at A
  • Forward Defect Indication (FDI), also called Alarm Indication Signal (AIS): when data sent from A to B is destined for C, B reports to C that it did not receive data from A
  • Backward Defect Indication (BDI), also called Remote Defect Indication (RDI): when data is sent from A to B, B reports to A that it did not receive the data
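
To make the first two functions concrete, here is a minimal sketch of the receiver side of CC and CV, assuming periodic CC messages that carry a source identifier (message format and transport are abstracted away). The 3.5-interval loss threshold follows the convention used by Y.1731/802.1ag CCMs; the class and method names are mine:

```python
import time

class CCReceiver:
    """B's side of a Continuity Check on a flow expected from A."""

    def __init__(self, expected_source: str, interval_s: float = 1.0):
        self.expected_source = expected_source
        self.interval_s = interval_s       # CC transmission period agreed with A
        self.last_seen = time.monotonic()

    def on_cc_message(self, source_id: str) -> None:
        # Connectivity Verification: a CC message from an unexpected
        # source means traffic is arriving where it should not.
        if source_id != self.expected_source:
            raise RuntimeError(f"misconnectivity: CC from {source_id}")
        self.last_seen = time.monotonic()  # continuity confirmed

    def loss_of_continuity(self) -> bool:
        # Declare loss of continuity if no CC message has arrived
        # within 3.5 transmission intervals.
        return time.monotonic() - self.last_seen > 3.5 * self.interval_s
```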

PM deals with monitoring of parameters such as end-to-end delay, Bit Error Rate (BER), and Packet Loss Ratio (PLR). While basic connectivity may not be lost when performance parameters stray outside their desired ranges, the ability to provide specific services may be compromised, even to the extent that there is a loss of service. For example, excessive round-trip delay makes it difficult to hold interactive audio conferences, and excessive PLR may lead to loss of an IPTV service. For this reason, Service Providers (SPs) commit to Service Level Agreements (SLAs) that specify the acceptable PM parameters.

A partial list of PM parameters that may appear in an SLA is:

  • BER or PLR (for packet oriented networks)
  • 1-way delay (1DM) also called latency: the amount of time it takes for data to go between two points of interest (this measurement requires clock synchronization between endpoints)
  • 2-way delay also called roundtrip delay (RTD): the amount of time it takes for data to go to a point of interest and return (does not require clock synchronization)
  • Packet Delay Variation (PDV): the variation of delay (may be 1-way or 2-way; even 1-way PDV does not require time synchronization, although frequency synchronization may be required for highly accurate measurements); the timestamp arithmetic behind these delay measurements is sketched after this list
  • Availability: percentage of time that the service can be provided
  • Throughput or Bandwidth profile (for packet oriented networks): methods of quantifying the sustainable data rate (will generally be needed for each direction separately)
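
The relationship between these delay measurements and clock synchronization comes down to simple timestamp arithmetic. Below is a minimal sketch; the timestamp convention (t1: A sends, t2: B receives, t3: B sends its reply, t4: A receives it) matches the usual two-way measurement exchange, and the function names are mine:

```python
def round_trip_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    # (t4 - t1) uses only A's clock and (t3 - t2) only B's clock,
    # so B's processing time is removed and no synchronization of
    # the two clocks is needed.
    return (t4 - t1) - (t3 - t2)

def one_way_delay(t1: float, t2: float) -> float:
    # t1 is stamped by A's clock and t2 by B's clock, so this is
    # only meaningful if the two clocks are time-synchronized.
    return t2 - t1

def pdv(departures: list[float], arrivals: list[float]) -> list[float]:
    # 1-way PDV without time synchronization: each (arrival - departure)
    # contains the unknown clock offset, but the offset is constant and
    # cancels when we subtract the minimum. Frequency drift between the
    # clocks is ignored here, which is why frequency synchronization
    # matters for highly accurate measurements.
    raw = [a - d for a, d in zip(arrivals, departures)]
    base = min(raw)
    return [r - base for r in raw]
```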

While certain FM functions, in particular Continuity Check (CC), are usually run periodically, PM functions are frequently invoked on an ad hoc basis. However, with an SLA in effect, the SP needs to monitor the PM parameters periodically, and the customer may want to do so as well. In fact, while customers typically trust legacy SPs to provide the promised service level (after all, a 2.048 Mbps leased line is never going to deliver only 1.9 Mbps!), they have much less trust in newer services (it is relatively easy for an SP to cheat and provide 8 Mbps of Ethernet throughput instead of the promised 10 Mbps).

In future entries I will deal with questions such as what parameter levels are needed for particular applications, how PM impacts user experience, and how SPs and customers should monitor performance.

Y(J)S

Wednesday, September 8, 2010

Deployment, R&D, and protocols

In my last entry I discussed why the last mile is a bandwidth bottleneck while the backhaul network is a utilization bottleneck. Since I was discussing the access network I did not delve into the core, but it is clear that the core is where the rates are highest, and where the traffic is the most diverse in nature.

Based on these facts, we can enumerate the critical issues for deployment and R&D investment in each of these segments. For the last mile the most important deployment issue is maximizing the data-rate over existing infrastructures, and the area for technology improvement is data-rate enhancement for these infrastructures.

For the backhaul network the deployment imperative is congestion control, while development focuses on OAM and control plane protocols to minimize congestion and manage performance and faults.

For the core network the most costly deployment issue is providing large-capacity, fast, and redundant forwarding elements, along with rich connectivity. Future developments involve a huge range of topics, from optimized packet formats (MPLS) through routing protocols, to management plane functionality.

A further consequence of these different critical issues is the choice of protocols preferred in each of these segments. In the last mile efficiency is critical, but there is little need for complex connectivity. So physical-layer framing protocols rule. As there may be a need for multiplexing or inverse multiplexing, one sometimes sees non-trivial use of higher-layer protocols; however, these are usually avoided. For example, Ethernet has long had an inefficient inverse multiplexing mechanism (LAG), but this is being replaced with the more efficient sub-Ethernet PAF (EFM bonding) alongside physical layer (m-pair) bonding for DSL links.

In the backhaul network carrier-grade Ethernet has replaced ATM as the dominant protocol, although MPLS-TP advocates are proposing it for this segment. Carrier-grade Ethernet acquired all the required fault and performance mechanisms with the adoption of Y.1731, while the MEF has worked hard in developing the needed shaping, policing, and scheduling mechanisms.

In the core the IP suite is sovereign. MPLS was originally developed to accelerate IP forwarding, but advances in algorithms and hardware have made IPv4 forwarding remarkably fast. IP caters to a diverse set of traffic types, and the large number of RFCs attests to the richness of available functionality.

Of course it is sometimes useful to use different protocols. A service provider that requires out-of-footprint connectivity might prefer IP backhaul to Ethernet. An operator with regulatory constraints might prefer a pure Ethernet (PBBN) core to an IP one. Yet, understanding the nature and constraints of each of the segments helps us weigh the possibilities.

Y(J)S