Wi-Fi Overhead, Part 2: Solutions to Overhead
By CWNP On 05/02/2011
This is the second article in a two-part discussion about WLAN overhead. Part 1 (Sources of Overhead) demonstrated that there are too many sources of overhead on Wi-Fi networks. Much of the overhead is required for successful protocol operation, but that reality doesn’t make it suck less. In fact, protocol overhead usually causes at least a 50% decrease in actual network throughput when compared with theoretical signaling rates. Ouch.
Despite the painful reality of network overhead, there are a handful of important network design steps and configuration settings that can reduce overhead and optimize network performance. When you’re forced to concede over 50% of the capacity to protocol overhead, you should fight to keep everything else. Here’s how.
Interference is an inevitable part of a half-duplex wireless medium. However, there are two primary ways to reduce overhead caused by interference:
- Many sources of non-802.11 interference degrade network performance, but may not halt it altogether. You may have interferers that you don’t know about. Remove non-Wi-Fi transmitters whenever possible (I feel the need to say “duh!”).
- Reduce WLAN interference by planning and controlling your contention domains. This is a really HUGE topic that is fully explored elsewhere. The best way to decrease “busy medium” time (i.e., overhead) caused by other Wi-Fi devices is to increase the number of contention domains and create better separation between existing contention domains. You can increase the number of contention domains (within limits!) by being smart about AP placement, channel reuse, transmit power settings, antenna selection and RF shaping practices, and client device use. Remember that it’s not just about adding more APs. Find an acceptable balance that allows RF separation and provides enough capacity for each user/application. In addition to normal contention overhead, WLAN interference also causes CRC errors and retries. We’ll talk about those in a minute.
Controlling the Necessary Functions
Most sources of overhead can’t be eliminated. For example, interframe spaces, random backoff, PHY signaling, and MAC headers will always be there. These sources of overhead are a necessary part of the protocol. Accept their existence.
But, don’t fall into helpless resignation; here are a few ways to reduce the impact of those necessary functions:
- An extended interframe space (EIFS) is observed when STAs receive frames with CRC errors. Better separation of contention domains (less bad interference) will minimize the number of EIFS periods.
- In addition to using slow PHY rates for the MAC header and payload, 802.11b stations—AND backward compatible BSSs with 802.11b rates—also use considerably longer PHY signaling processes (PHY preamble and header). We’ll talk about compatibility issues later when we address protection mechanisms, but the primary solution is to get rid of 802.11b. :)
- One easy way to reduce MAC (and PHY) overhead is to eliminate unnecessary beacon streams. Instead of separating WLAN services with separate SSIDs, use dynamic user policy assignment practices.
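To put a rough number on beacon overhead, here is a minimal Python sketch. It assumes a 250-byte beacon (an illustrative figure; real beacons vary with the information elements they carry) and the common 100 TU beacon interval, and it ignores contention time, so treat the result as a lower bound rather than a measurement. The function names are made up for this example.

```python
import math

BEACON_INTERVAL_US = 102_400  # 100 TU of 1024 us each, the common default


def dsss_airtime_us(frame_bytes, rate_mbps, preamble_us=192):
    """Airtime of one DSSS/CCK frame: long PLCP preamble and header, then the frame bits."""
    return preamble_us + frame_bytes * 8 / rate_mbps


def ofdm_airtime_us(frame_bytes, rate_mbps):
    """Airtime of one legacy OFDM frame: 20 us preamble + SIGNAL, then 4 us symbols."""
    bits = 16 + frame_bytes * 8 + 6      # SERVICE bits + frame + tail bits
    bits_per_symbol = rate_mbps * 4      # e.g. 6 Mbps carries 24 bits per 4 us symbol
    return 20 + 4 * math.ceil(bits / bits_per_symbol)


def beacon_airtime_share(ssids, airtime_us):
    """Fraction of one AP's channel airtime spent on beacons for `ssids` SSIDs."""
    return ssids * airtime_us / BEACON_INTERVAL_US


BEACON_BYTES = 250  # assumed beacon size, for illustration only

for ssids in (1, 3, 5):
    slow = beacon_airtime_share(ssids, dsss_airtime_us(BEACON_BYTES, 1))
    fast = beacon_airtime_share(ssids, ofdm_airtime_us(BEACON_BYTES, 6))
    print(f"{ssids} SSID(s): {slow:.1%} at 1 Mbps, {fast:.1%} at 6 Mbps")
```

Under these assumptions, five SSIDs beaconing at 1 Mbps use about 10.7% of the airtime on a single AP's channel, versus about 1.8% at 6 Mbps. Every additional AP heard on the same channel adds its own share on top.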
Short Guard Intervals
802.11n offered us a number of additional overhead-reducing technologies. In Part 1 of this blog series, I mentioned the default 800 ns guard interval. 802.11n allows optional 400 ns guard intervals, which shorten each OFDM symbol from 4.0 µs to 3.6 µs and boost theoretical rates by roughly 11%. However, avoid using short guard intervals in environments with high reflectivity (e.g. warehouses, manufacturing, industrial environments, etc.), where long multipath delays can outlast the shorter guard interval and cause intersymbol interference.
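The guard interval arithmetic is easy to check. This small Python sketch assumes the usual 3.2 µs data-carrying portion of each OFDM symbol; the function name is hypothetical.

```python
USEFUL_SYMBOL_US = 3.2  # data-carrying part of each OFDM symbol
LONG_GI_US = 0.8        # default 800 ns guard interval
SHORT_GI_US = 0.4       # optional 400 ns guard interval


def rate_with_gi(long_gi_rate_mbps, gi_us):
    """Scale a rate quoted for the 800 ns guard interval to another guard interval."""
    long_symbol_us = USEFUL_SYMBOL_US + LONG_GI_US
    symbol_us = USEFUL_SYMBOL_US + gi_us
    return long_gi_rate_mbps * long_symbol_us / symbol_us


# 65 Mbps is the single-stream, 20 MHz, MCS 7 rate with the long guard interval.
short_gi_rate = rate_with_gi(65, SHORT_GI_US)
print(f"{short_gi_rate:.1f} Mbps, a gain of {short_gi_rate / 65 - 1:.1%}")
```

The same symbols simply arrive a little faster: 65 Mbps becomes about 72.2 Mbps. That is a gain in signaling rate only, and it disappears quickly if the shorter guard interval causes more retries.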
Frame Aggregation and Block Acknowledgments
802.11n also makes much better use of frame aggregation and block acknowledgments (block acks were introduced with 802.11e, but not really used). Where early WLAN operators were making frames smaller (fragmentation) to avoid collisions, the higher PHY rates of modern networks allow for much larger wireless frames, which drastically improve efficiency. Packing more upper layer data in each frame is the quintessential example of overhead reduction. If each aggregated frame were to be transmitted independently, we’d see much higher overhead from interframe spaces, backoffs, PHY signaling, and MAC headers.
When aggregation is used, block acknowledgments are as well. Block acks add to the efficiency improvement by using ack bitmaps to indicate successful reception of multiple frames instead of transmitting individual acks for each received frame. Suck on that, overhead!
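In most cases, enabling frame aggregation will produce significant capacity improvements. If there’s a configuration option, opt for A-MPDU instead of A-MSDU. It’s generally more efficient in practice: each A-MPDU subframe carries its own CRC, so only the subframes that fail need to be retransmitted, whereas an A-MSDU is covered by a single CRC and must be resent whole.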
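Here is a back-of-the-envelope Python model of why aggregation helps. It lumps everything that is paid once per channel access (interframe spaces, average backoff, PHY preamble, and the acknowledgment) into a single assumed figure of 150 µs, and assumes 40 bytes of per-subframe header. Both numbers are round illustrative values, not measurements, and the function name is invented for this sketch.

```python
def payload_efficiency(payload_bytes, frames_per_access, phy_rate_mbps,
                       fixed_overhead_us=150.0, header_bytes=40):
    """Share of airtime that carries upper-layer payload.

    fixed_overhead_us: assumed cost paid once per channel access
                       (IFS + backoff + preamble + ack), illustrative only.
    header_bytes:      assumed MAC header and delimiter cost per subframe.
    """
    payload_us = frames_per_access * payload_bytes * 8 / phy_rate_mbps
    data_us = frames_per_access * (payload_bytes + header_bytes) * 8 / phy_rate_mbps
    return payload_us / (fixed_overhead_us + data_us)


for n in (1, 4, 16):
    eff = payload_efficiency(1500, n, 130)
    print(f"{n:2d} frame(s) per access at 130 Mbps: {eff:.0%} payload airtime")
```

With these assumptions, a lone 1500-byte frame at 130 Mbps spends only about 38% of its airtime on payload, while a 16-frame aggregate reaches roughly 89%. The faster the PHY rate, the more the fixed cost dominates, which is why aggregation matters so much on modern networks.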
I mentioned previously that older technologies require more time for PHY signaling. In addition to being slow themselves, legacy stations also hold back the more efficient stations by requiring them to protect their own data transmissions. If you analyze the frame formats of protection frames (like RTS and CTS), you’ll notice that they are actually very small at the MAC layer. However, if you also look at the PHY layer formats (something you should do if you’re working towards CWAP), you’ll notice that a legacy RTS/CTS exchange actually takes a considerable amount of time by virtue of the legacy PHY preamble and PLCP header, which take much too long. You also have to factor in an additional one or two SIFS. That’s why protection mechanisms are not cool. As before, the solution is to get rid of legacy (particularly 802.11b and earlier) clients if your business case allows for it. If you can’t get rid of them, the next best thing may be to use airtime fairness mechanisms to slant the odds in favor of newer technologies.
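To see how much time a legacy protection exchange really takes, this Python sketch adds up an RTS/CTS exchange sent with the 802.11b long preamble (192 µs of preamble and PLCP header per frame) and a 10 µs SIFS after each frame. The frame sizes are the standard 20-byte RTS and 14-byte CTS; the function names are hypothetical, and the model ignores the contention time before the RTS.

```python
RTS_BYTES = 20
CTS_BYTES = 14
SIFS_US = 10             # 2.4 GHz SIFS
LONG_PREAMBLE_US = 192   # 144 us preamble + 48 us PLCP header


def dsss_frame_us(frame_bytes, rate_mbps, preamble_us=LONG_PREAMBLE_US):
    """Airtime of one DSSS/CCK frame, PHY signaling included."""
    return preamble_us + frame_bytes * 8 / rate_mbps


def rts_cts_us(rate_mbps=1):
    """RTS + SIFS + CTS + SIFS, i.e. the time spent before the data frame starts."""
    return (dsss_frame_us(RTS_BYTES, rate_mbps) + SIFS_US
            + dsss_frame_us(CTS_BYTES, rate_mbps) + SIFS_US)


def cts_to_self_us(rate_mbps=1):
    """CTS + SIFS, the cheaper protection option."""
    return dsss_frame_us(CTS_BYTES, rate_mbps) + SIFS_US


print(f"RTS/CTS at 1 Mbps:     {rts_cts_us(1):.0f} us")
print(f"CTS-to-Self at 1 Mbps: {cts_to_self_us(1):.0f} us")
```

At 1 Mbps the full exchange costs 676 µs, and 384 µs of that is nothing but PHY signaling. For comparison, a 1500-byte frame at 54 Mbps OFDM is on the air for roughly 250 µs, so the protection can take well over twice as long as the data it protects.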
A caveat worth considering: If you are seeing a high number of collisions on your WLAN and it is causing a noticeable performance impact, it may actually be worthwhile to enable RTS/CTS or CTS-to-Self, even if you don’t have legacy clients. The shorter protection frames will reserve the medium, allowing the larger data frame to follow with a lower likelihood of collision. I know, adding overhead to reduce overhead sounds crazy.
CRCs and Retries
Speaking of collisions, retries are a major source of overhead on many networks. Retries generally result from reception errors caused by interference, but there are a number of other causes. The rotten thing about retransmissions is that the first (failed) attempt already used up some airtime, the second attempt requires a longer backoff period, and retries often cause rate shifting (switching from a higher to lower data rate) to improve reliability. After deploying a network and verifying its performance, you should identify a retry baseline. The goal for retries is (loosely) less than 10%, but as always, the environment and applications should dictate what is acceptable. Retries can be reduced by improving the signal-to-noise ratio (SNR) and reducing interference. Those two design goals bring us full circle back to controlling our RF contention domains with proper AP placement, channel reuse, antenna selection, and power output settings.
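A retry baseline can be as simple as two counters from a capture or from AP statistics: total data frames and data frames with the Retry bit set. The Python sketch below is one way to track it. The 10% goal comes from the loose guideline above, while the 3-point drift threshold and the function names are assumptions for illustration.

```python
def retry_rate(data_frames, retried_frames):
    """Fraction of observed data frames that had the Retry bit set."""
    if data_frames <= 0:
        raise ValueError("need at least one observed data frame")
    return retried_frames / data_frames


def check_against_baseline(rate, baseline, goal=0.10, drift=0.03):
    """Return a list of warnings; `goal` and `drift` are illustrative thresholds."""
    warnings = []
    if rate > goal:
        warnings.append(f"retry rate {rate:.1%} is above the {goal:.0%} goal")
    if rate - baseline > drift:
        warnings.append(f"retry rate is up {rate - baseline:.1%} from the baseline")
    return warnings


baseline = retry_rate(20_000, 1_200)   # measured right after deployment
today = retry_rate(20_000, 2_800)      # measured during a later check
for warning in check_against_baseline(today, baseline):
    print(warning)
```

The absolute goal catches networks that were never healthy, and the drift check catches networks that were fine at deployment and have since degraded. Adjust both thresholds to what your applications can tolerate.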
Finally, data rate support is a hot topic in WLAN design. We’ve already discussed 802.11b, and we know that it is bad for our networks. If you must keep 802.11b stations, consider disabling support for 1 and 2 Mbps. When a low data rate is mandatory for the BSS, a lot of airtime is used up by management traffic sent at low rates, because these frames must be “receivable” by all stations. Disabling 1 and 2 Mbps is very common. If you don’t support 802.11b at all, you may even be able to disable support for 6 and 9 Mbps (maybe 12 as well), leaving 12 (or 18) Mbps as the lowest rate. I would only do this in a very high density application. In theory, you hope that your stations are never using low rates, because you designed for 24 Mbps and better, remember. In practice, you just can’t control the RF domain with the exactness you’d like. Lower rates are useful for reliability. Most environments will be just fine with all OFDM rates enabled. Removing legacy rates and their accompanying PHY signaling is the most important step.
Final Comments and Suggestions (FCS)
When you know that overhead typically accounts for more than a 50% capacity loss, protecting the remaining capacity seems much more important. There are a lot of sources of overhead, and many of those sources can be kept at bay by designing your network properly and enabling the right features for your environment. Of course, we could talk the overhead topic to death, but not everyone needs to squeeze out every last drop of capacity. Let the applications dictate your design priorities, but don’t let overhead take a big bite out of your wireless capacity. After all, capacity is limited.
At a high level, you can identify network overhead problems by comparing your signaling rates with your actual performance. Look for an unusually high amount of 802.11 management and control frames when compared with data frames. In the same way, look at your utilization statistics to see if your lowest rates are using a disproportionate amount of airtime. Also, keep an eye on your retries and CRC errors.
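As one way to check whether low rates are hogging the medium, the Python sketch below estimates airtime share per data rate from a list of (rate, frame size) pairs, such as you might export from a capture. It counts only the time to send the frame bits and ignores PHY preambles and contention, so it understates the cost of low-rate frames if anything. The function name and input shape are assumptions for this example.

```python
from collections import defaultdict


def airtime_share_by_rate(frames):
    """frames: iterable of (rate_mbps, frame_bytes) pairs.

    Returns {rate_mbps: share of total frame-bit airtime}.
    """
    airtime_us = defaultdict(float)
    for rate_mbps, frame_bytes in frames:
        airtime_us[rate_mbps] += frame_bytes * 8 / rate_mbps
    total_us = sum(airtime_us.values())
    if total_us == 0:
        return {}
    return {rate: used / total_us for rate, used in airtime_us.items()}


# Illustrative mix: 10% of frames at 1 Mbps, 90% at 54 Mbps, all 1500 bytes.
sample = [(1, 1500)] * 10 + [(54, 1500)] * 90
for rate, share in sorted(airtime_share_by_rate(sample).items()):
    print(f"{rate:>2} Mbps: {share:.0%} of airtime")
```

In this made-up mix, the 10% of frames sent at 1 Mbps account for about 86% of the airtime. Frame counts alone hide that kind of imbalance, which is why utilization by rate is worth a look.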
As always, thanks for reading! Feel free to share more tips about identifying and controlling WLAN overhead.