Meraki Auto RF Explained

Meraki loves to chalk up the secret sauce in their products to “Meraki Magic” and boasts “anyone can do it”. Yet our inner engineering geek wants to open the curtain and see the real show. An example of that is Auto RF, which is a form of Radio Resource Management (RRM) that allows Meraki Wi-Fi access points to dynamically plan WLAN channels and radio transmit (TX) power. The following sections will break down what Auto RF is and how it works.

Auto RF is made up of two major components: Auto Channel and Auto TX Power. The goal is to provide an initial channel plan, and then adjust dynamically over time based on the environment. Both features are enabled by default, reducing the number of steps required to deploy Meraki access points effectively.

All currently shipping Meraki access points are built with a dedicated 2.4GHz/5GHz scanning radio, which constantly scans the entire usable spectrum. This radio, among other things such as location analytics and WIPS, is used to detect neighboring BSS’s and make off-channel scans without consuming airtime on client-serving radios. The scanning radio dwells on every channel to monitor duty cycle and detect levels of non-802.11 interference. It also sends probes on non-DFS channels to detect neighboring BSS’s and listens for beacons on all channels.

AUTO CHANNEL

The current iteration of Auto Channel comes from an algorithm called TurboCA. Auto Channel is designed to react to degrading conditions while balancing client performance against the disruptiveness of changing channels. Fortunately, with the 802.11-2012 standard we have better adoption of 802.11h, which defines standardized Channel Switch Announcements (CSAs) that reduce the impact of moving to a new channel by notifying clients when they will change channels and what channel they are moving to, so that clients can follow. Auto Channel relies on this heavily where possible, but also takes into account that many clients do not support CSAs, so it tries not to change channels frequently unless necessary.

Channel Switch Announcements

How does it work?

You can refer to the above linked TurboCA article for the full mathematical detail, but this section will summarize the process.

The goal of Auto Channel is to build a channel plan that minimizes channel overlap, optimizes cell sizes for better roaming, and maximizes channel efficiency by picking the best channel available for each AP. Then, it regularly rebuilds the plan in search of an optimization. The computation for the channel plan happens in the Meraki Cloud, where all Meraki access points report their logging data.

The key metrics for the algorithm are:

  • Node (AP) Performance
  • Network (AP Set) Performance
  • Channel Quality (noise floor, non-802.11 interference, neighboring BSS’s, etc.)
  • Channel Width
  • AP Load (number of associated clients)
  • Channel Switch Penalty
  • Hop Limit

Node performance is a calculation of how well an access point should perform on a given channel and channel width. Network performance is the product of the performance of all nodes in a Meraki Network, which is important as an individual node score close to zero will bring down the Network performance score, ensuring that a channel plan will not create issues for one area while optimizing another.

One bad node performance can rule out a channel plan
(arbitrary numbers used for examples)
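As a minimal sketch of why a product (rather than an average) is used, the snippet below compares two made-up plans. The scores are arbitrary illustration values between 0 and 1, not output from Meraki's algorithm, and the function name is hypothetical.

```python
import math

def network_performance(node_scores):
    """Network score as the product of per-AP node scores (as described above)."""
    return math.prod(node_scores)

# Arbitrary example scores: plan_a is better on average but starves one AP.
plan_a = [0.95, 0.95, 0.95, 0.05]
plan_b = [0.70, 0.70, 0.70, 0.70]

print(sum(plan_a) / len(plan_a), network_performance(plan_a))
print(sum(plan_b) / len(plan_b), network_performance(plan_b))
```

Plan A has the higher average score, but its single near-zero node drags the product well below Plan B's, so Plan B would be preferred.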

Channel quality measures non-802.11 interference, duty cycle, and channel width. Channel switch penalty is a metric designed to discourage channel changes that offer negligible benefit, limiting the disruption to clients. It is weighted more heavily on 2.4GHz, where fewer clients support CSAs.

Hop Limit is used to determine how many neighboring APs we will consider when planning an AP’s channel. This basically determines the “aggressiveness” of the calculation. Meraki runs this calculation at 3 different intervals with different hop limits:

  • Every 15 minutes with a hop limit of “0”.
  • Every 3 hours with a hop limit of “1”, then “0”.
  • Every 24 hours with a hop limit of “2”, then “1”, then “0”.

With a hop limit of “0” an AP only considers itself and directly neighboring APs when planning its channel. By running this more frequently, an AP can react to significant events quickly (such as a jammed channel) but won’t change channels too frequently. The more aggressive plans are run less frequently to balance creating a more globally optimal plan vs. changing channels too frequently.
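The schedule can be written down as a small lookup. This is only a sketch: the function name is hypothetical, and the rule that the most aggressive pass replaces the lighter ones when intervals coincide is an assumption, since the source does not say how overlapping runs are handled.

```python
def hop_limits_for_run(minutes_since_start):
    """Return the sequence of hop limits to run at this point in time.

    Assumption: when two intervals coincide, only the most aggressive
    sequence runs (it already ends with the lighter passes).
    """
    if minutes_since_start % (24 * 60) == 0:
        return [2, 1, 0]
    if minutes_since_start % (3 * 60) == 0:
        return [1, 0]
    if minutes_since_start % 15 == 0:
        return [0]
    return []

for minute in (15, 180, 1440):
    print(minute, hop_limits_for_run(minute))
```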

The Auto RF Process

The process starts by inputting the current channel plan (if one exists), and collecting the scanning results and load information from each AP. It then picks a pseudo-random AP and identifies the channel that will render the best “node performance” for that AP. This selection favors picking heavier loaded APs first, as more clients actively connected to an AP signals that it is more important. By doing this, more actively used APs will have a better chance at picking the best channel available rather than running last and taking whatever channels are left.

After all APs have been assigned a channel and the cloud has calculated the predicted node performance for each, the network performance of the plan is determined. If the network performance of the new plan is better than the current plan, the new plan becomes the proposed plan. The algorithm is run ten times to compare multiple possible configurations. Once all iterations are run, the proposed plan becomes the current plan, and updated APs will change their channels accordingly.
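The loop described above can be sketched roughly as follows. This is not TurboCA: the node score (channel quality divided by one plus the number of co-channel neighbors) is an invented placeholder, the channel switch penalty and hop limit are left out, and all names are hypothetical. It only illustrates load-weighted AP ordering, greedy per-AP channel selection, and keeping the best of ten candidate plans.

```python
import math
import random

def node_performance(ap, channel, plan, quality, neighbors):
    # Placeholder score: channel quality shared with co-channel neighbors.
    co_channel = sum(1 for n in neighbors[ap] if plan.get(n) == channel)
    return quality[ap][channel] / (1 + co_channel)

def network_performance(plan, quality, neighbors):
    # Product of all node scores, so one bad node sinks the whole plan.
    return math.prod(
        node_performance(ap, ch, plan, quality, neighbors)
        for ap, ch in plan.items()
    )

def propose_plan(load, quality, neighbors, channels, rng):
    remaining = dict(load)
    plan = {}
    while remaining:
        names = list(remaining)
        # Pseudo-random pick that favors heavily loaded APs (+1 avoids zero weight).
        weights = [remaining[name] + 1 for name in names]
        ap = rng.choices(names, weights=weights, k=1)[0]
        del remaining[ap]
        plan[ap] = max(
            channels,
            key=lambda ch: node_performance(ap, ch, plan, quality, neighbors),
        )
    return plan

def auto_channel(current, load, quality, neighbors, channels, iterations=10, seed=0):
    rng = random.Random(seed)
    best = dict(current)
    best_score = (
        network_performance(best, quality, neighbors) if best else float("-inf")
    )
    for _ in range(iterations):
        candidate = propose_plan(load, quality, neighbors, channels, rng)
        score = network_performance(candidate, quality, neighbors)
        if score > best_score:  # a better plan becomes the proposed plan
            best, best_score = candidate, score
    return best, best_score

# Two neighboring APs that currently share channel 1.
quality = {"A": {1: 1.0, 6: 1.0}, "B": {1: 1.0, 6: 1.0}}
neighbors = {"A": ["B"], "B": ["A"]}
load = {"A": 30, "B": 5}
plan, score = auto_channel({"A": 1, "B": 1}, load, quality, neighbors, [1, 6])
print(plan, score)
```

In this toy case the current plan scores 0.25 (both APs halve each other's score), so any candidate that separates the two APs replaces it.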

Note that previous iterations of Auto Channel (before 802.11h) would not switch channels if a client was currently associated. Because Meraki now changes channels while clients are associated, it can lead to disruptions with clients using real-time applications that don’t support CSAs. If this is causing a negative impact, Meraki Support can revert this behavior upon request.

Exceptions

As with all things RF-related, an automatic algorithm will not fit every environment. In challenging RF environments, or high density deployments, Auto Channel can fall short. In these scenarios, manual channel assignment may be a better option, but Auto Channel can still be used as a starting point to reduce the amount of manual configuration required.

Static channel assignment

APs with a static channel assignment will be used in the plan to identify a used channel, but will not be used in the algorithm to generate a plan or calculate network performance.

DFS events will always override a static or auto channel plan and trigger an immediate channel change, as required by the FCC.

If a jammed channel is detected, meaning that levels of non-802.11 interference exceed 65% for longer than one minute, a channel change will occur without waiting for the next run of Auto Channel.
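A rough sketch of that jammed-channel test is shown below. The 65% threshold and one-minute duration come from the text above; the sample format (timestamp in seconds, interference percentage) and the function name are assumptions made for illustration.

```python
JAM_THRESHOLD_PCT = 65.0   # non-802.11 interference level from the text
JAM_DURATION_S = 60.0      # must be exceeded for longer than one minute

def is_jammed(samples):
    """samples: list of (timestamp_s, interference_pct), oldest first."""
    run_start = None
    for timestamp, pct in samples:
        if pct > JAM_THRESHOLD_PCT:
            if run_start is None:
                run_start = timestamp
            if timestamp - run_start > JAM_DURATION_S:
                return True
        else:
            run_start = None  # interference dropped, restart the timer
    return False

sustained = [(t, 80.0) for t in range(0, 80, 10)]
interrupted = [(t, 40.0 if t == 30 else 80.0) for t in range(0, 80, 10)]
print(is_jammed(sustained), is_jammed(interrupted))
```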

If an AP is being used for wireless mesh, it will not change channels as this will have a significant impact on all APs and clients using that mesh route.

AUTO TX POWER

As with Auto Channel, Auto TX Power calculations are done in the Cloud, and the process is run every twenty minutes. A neighbor report is collected from each AP in the network, which contains the Signal-to-Noise Ratio (SNR) for all neighboring APs in the same Meraki network. The AP also reports its currently connected clients along with their SNR.

Using these lists, the Cloud compiles a list of “direct neighbors” for each AP (defined as any AP in the Meraki network with an SNR of 8dB or greater), and calculates what the ideal TX power should be. For each AP, the Cloud attempts to keep the SNR for its strongest direct neighbor at 30dB and always higher than 17dB for every direct neighbor.

An AP will never reduce its transmit power if a client is connected with SNR <10dB. Generally, if a client is connected with SNR <10dB it is looking for a better AP to roam to. If it hasn’t roamed, it can be assumed that a better AP is not available, so reducing the transmit power will only worsen that client’s performance.

To prevent dramatic changes in TX power which could have unintended results, at each twenty minute run an AP can increase its transmit power by 1dB or lower it by 1-3dB. When a new Meraki AP is deployed, it starts at the highest transmit power supported by the AP within the regulatory domain of which it is a member, unless overridden by an RF Profile or otherwise statically configured. This means that it could take several iterations before an AP reaches its optimal transmit power level.
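The rules above can be combined into a single per-run decision. The sketch below is one possible reading, not Meraki's implementation: the function name is hypothetical, it assumes neighbor SNR moves dB-for-dB with this AP's transmit power, and it assumes the 17dB floor takes priority over the 30dB target when the two conflict.

```python
DIRECT_NEIGHBOR_MIN_SNR_DB = 8   # threshold for a "direct neighbor"
TARGET_STRONGEST_SNR_DB = 30     # target for the strongest direct neighbor
MIN_NEIGHBOR_SNR_DB = 17         # floor for every direct neighbor
WEAK_CLIENT_SNR_DB = 10          # never lower power below this client SNR
MAX_STEP_UP_DB = 1
MAX_STEP_DOWN_DB = 3

def tx_power_step_db(neighbor_snrs_db, client_snrs_db):
    """Return the power change (in dB) for one twenty-minute run."""
    direct = [s for s in neighbor_snrs_db if s >= DIRECT_NEIGHBOR_MIN_SNR_DB]
    if not direct:
        return 0  # assumption: with no direct neighbors, leave power alone
    strongest, weakest = max(direct), min(direct)
    if strongest < TARGET_STRONGEST_SNR_DB or weakest < MIN_NEIGHBOR_SNR_DB:
        return MAX_STEP_UP_DB
    if any(snr < WEAK_CLIENT_SNR_DB for snr in client_snrs_db):
        return 0  # a weak client is still attached, so do not reduce
    # Only step down as far as both the target and the floor allow.
    headroom = min(strongest - TARGET_STRONGEST_SNR_DB,
                   weakest - MIN_NEIGHBOR_SNR_DB)
    return -min(MAX_STEP_DOWN_DB, int(headroom))

print(tx_power_step_db([35, 20], [25]))
print(tx_power_step_db([35, 20], [9]))
print(tx_power_step_db([25, 18], []))
```

Because each run moves at most a few dB, an AP that starts at maximum power needs several runs to settle, which matches the behavior described above.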

RF Profiles can be used to define operating parameters for Auto RF

EXCEPTIONS

Auto TX Power will never set the transmit power lower than 5dBm on the 2.4GHz radios or 8dBm on the 5GHz radios, to avoid setting a value which is unusably low where there is a high density of APs. There are valid use cases, such as when using directional antennas or in challenging RF environments, where such a low value is warranted. These environments usually require manual tuning anyway, in which case static values can be set in the Dashboard.

Static transmit power assignment

If an AP has an active mesh neighbor, it will not increase or decrease its transmit power. When using mesh, if an AP has no client serving SSIDs enabled it will always use its maximum available transmit power.

Active mesh prevents transmit power changes

If an AP only has one direct neighbor, it’s considered risky to reduce transmit power so it’s not done as often.

Monitoring

The Meraki Dashboard provides several tools to monitor the current channel plan and any changes that have been made by Auto RF.

The Wireless > Radio Settings page allows you to identify the current channel and transmit power being used by each AP, as well as the target power range that Auto TX Power is using:

Radio Settings

Clicking on any AP takes you to the Status Page, where the RF tab displays a lot of information about client count, channel utilization, and any changes made to the access point by Auto RF. In the below screenshot, we can see that Auto TX power has adjusted the transmit power, and clicking Details will show exactly what was changed:

RF Tab in the Status Page
Transmit power was increased from 8dBm to 9dBm on the 5GHz radio

Summary

As you can see, there’s a lot more to Auto RF than is evident at first glance. Meraki leverages the analytics of the Dashboard and the metadata from millions of access points to create and refine these algorithms so that less time and effort needs to be spent tuning and tweaking configuration during deployment.

Designing Wi-Fi for High Density

In technical interviews, I often ask (and am often asked):
How would you design a Wi-Fi network to support a large room with 1000 devices?

The question is purposely vague to identify how someone thinks through a problem that doesn’t have a single answer, and to observe how thoroughly they respond. Below I’ll take my own stab at a response.

Step 1: Requirements Gathering

Starting off by talking about antenna types or software tuning is the wrong first step, every time. No matter how much information the question provides, it’s never enough. Wi-Fi is a fickle beast, and collecting requirements is the most important step. I would start by asking qualifying questions such as:

  • What types of devices will be associating?
  • What types of applications are we expecting to support, and/or how much bandwidth is needed per client?
  • What is the construction and layout of the room?
  • What are the restrictions on AP location, such as cabling, mounting, or aesthetic requirements, etc.?

Step 2: Hardware

The correct hardware choice is usually determined by the answers to the questions in the previous section. Some environments, such as stadiums, allow for access points to be mounted under seats, where integrated omnidirectional antennas are adequate. In other areas, such as conference centers where chairs and tables may be moved, access points need to be mounted on walls or high ceilings.

Above 25ft, omnidirectional antennas lose a lot of their performance, as much of the signal is radiated into space where there are no clients. In these cases, downtilt omnidirectional antennas can provide a similar horizontal range, with the signal better directed toward the floor. In cases where limiting the propagation is desired, semi-directional or directional antennas will limit the horizontal propagation while also improving the vertical reach.

Step 3: Software Tuning

While every environment is different and requires its own specific configuration, a high density environment almost certainly requires a high density of APs, and with that comes a set of options that are best practice for almost all such deployments.

Data Rates

In a well-designed Wi-Fi environment, it’s a best practice to increase the minimum data rate above the default. 12-18Mbps is a common setting, as it prevents 802.11b devices from joining the BSSID and bringing other clients down, and it reduces the airtime required for management frames, leaving more space for meaningful traffic. It can also reduce effective cell sizes by not supporting clients that have too weak an RSSI to transmit at the increased minimum rate. However, caution is needed, as setting the minimum data rate too high can lead to high amounts of corruption.

Channel Planning

More APs means more chance for co-channel contention, which negatively impacts all clients on that channel. Where possible, enabling the use of 5GHz UNII-2 extended channels allows for more non-overlapping channels, as long as clients support them. On the 2.4GHz spectrum, with only 3 non-overlapping channels available in the US, disabling the 2.4GHz radio on select APs will reduce the number of APs in an area fighting for the same frequency.

In addition to enabling more 5GHz channels, it’s important to reduce the channel width to allow for more channels to be used concurrently. A high density environment configured for 80MHz-wide channels may only have six non-overlapping channels available, while the same environment configured for 20MHz-wide channels will have 25 non-overlapping channels. There is a tradeoff in throughput by reducing channel width, but that’s usually less important than having more channels available.
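The channel counts above follow from simple arithmetic on the US 5GHz channel list. The sketch below groups the 20MHz channels into contiguous blocks and counts how many bonded channels of a given width fit in each; the block labels and function name are only illustrative.

```python
# US 5GHz 20MHz channels, grouped into contiguous blocks.
US_5GHZ_BLOCKS = {
    "UNII-1 and UNII-2": [36, 40, 44, 48, 52, 56, 60, 64],
    "UNII-2 extended": [100, 104, 108, 112, 116, 120, 124, 128,
                        132, 136, 140, 144],
    "UNII-3": [149, 153, 157, 161, 165],
}

def non_overlapping_channels(width_mhz, blocks=US_5GHZ_BLOCKS):
    """Count non-overlapping channels of the given width (20, 40, 80 or 160)."""
    channels_per_bond = width_mhz // 20
    return sum(len(chs) // channels_per_bond for chs in blocks.values())

for width in (20, 40, 80):
    print(width, non_overlapping_channels(width))
```

Dropping the "UNII-2 extended" block from the dictionary shows the cost of leaving those channels disabled: 13 channels at 20MHz and only 3 at 80MHz.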

With a reduced number of 2.4GHz radios compared to 5GHz, band steering can also be effective at encouraging dual-band clients to connect on the 5GHz channels where there is less congestion.

Power Levels

With high client density, access points are generally placed to cover a chosen number of client devices. Because those clients are in a smaller area than lower density deployments, the AP doesn’t need to cover as large a physical area. Lowering the transmit (TX) power of the APs will reduce the cell size, and thus reduce the amount of co-channel contention.

Advanced Options

Some vendor-specific options, such as Cisco’s RX-SOP, can also impact client connectivity and roaming. While RX-SOP is marketed as helping to “reduce cell size ensuring clients are connected using the highest possible data rate”, this is not what it’s designed for, and improperly configuring these options can negatively impact connectivity. RX-SOP is used to lower the possible contention between APs on the same and adjacent channels by reducing the AP’s “sensitivity” to packets when determining transmit opportunity. When tuned correctly, it can increase the overall available airtime.

Most vendors offer some type of Radio Resource Management (RRM) capability to automatically tune the settings above, providing features such as coverage hole detection and correction, dynamic channel assignment, dynamic transmit power control, and client balancing. However, many RRM solutions don’t do a great job of tuning for high-density environments out of the box, and almost always need tweaking and tuning.


As with any Wi-Fi deployment, there is no “one-size fits all” answer. Site surveys, both pre- and post-installation, are vital in ensuring success.

Multicast over Wireless

Multicast has brought a lot of efficiencies to IP networks. But multicast wasn’t designed for wireless, and especially isn’t well suited for high-bandwidth multicast applications like video. I’ll cover the challenges of multicast over wireless and design considerations.

But first, an overview of multicast:

To level set, I’ll briefly cover IP multicast. For the purposes of this article, I’ll focus specifically on Layer 2. If you’re already familiar with multicast over Ethernet, feel free to skip this section.

What is multicast?

In short, multicast is a means of sending the same data to multiple recipients at the same time without the source having to generate copies for each recipient. Whereas broadcast traffic is sent to every device whether they want it or not, multicast allows recipients to subscribe to the traffic they want. As a result, efficiency is improved (traffic is only sent once) and overhead is reduced (unintended recipients don’t receive the traffic).

How does it work?

With multicast, the sender doesn’t know who the recipients are, or even how many there are. In order for a given set of traffic to reach its intended recipients, we send traffic to multicast groups. IANA has reserved 224.0.0.0 – 239.255.255.255 for multicast groups, with 239.0.0.0/8 commonly used within private organizations. Traffic is sent with the unicast source IP of the sender, and a destination IP of the chosen multicast group.

On the receiving side, recipients subscribe to a multicast group using Internet Group Management Protocol (IGMP). A station that wishes to join a multicast group sends an IGMP Membership Report / Join message for that given group. Most enterprise switches, WLCs, or APs use IGMP snooping to inspect IGMP packets and populate their multicast table, which matches ports/devices to multicast groups. Then, when a multicast packet is received, the network device can forward that packet to the intended receivers. Network devices that don’t support IGMP snooping will forward the packet the same as it would a broadcast, to every port except the port the packet came in on. Here’s an example of an IGMP Join request:
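From the receiver's side, the join is normally triggered by the application rather than built by hand. As a minimal sketch using Python's standard socket module, setting IP_ADD_MEMBERSHIP on a UDP socket asks the operating system to send the IGMP Membership Report for that group. The helper names are hypothetical, and the group and port used in the comment are just example values.

```python
import socket
import struct

def build_membership_request(group_ip, interface_ip="0.0.0.0"):
    """Pack the ip_mreq structure: group address, then local interface address."""
    return struct.pack("4s4s",
                       socket.inet_aton(group_ip),
                       socket.inet_aton(interface_ip))

def join_group(group_ip, port, interface_ip="0.0.0.0"):
    """Open a UDP socket and subscribe it to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # The OS sends the IGMP join on our behalf when this option is set.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group_ip, interface_ip))
    return sock

# Example (not executed here): sock = join_group("239.255.1.2", 5001)
print(build_membership_request("239.255.1.2").hex())
```

A switch or AP performing IGMP snooping would see the resulting report and add the receiver's port or station to its table for that group.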

The problems with multicast over WiFi vs wired

In a switched wired network, all traffic is sent at the same data rate (generally 1Gbps today), and with each port being its own collision domain, collisions are rare. In addition, wired traffic uses a bounded medium, so interference and frame corruption are also rare. Because of this, there is no network impact to sending large amounts of wired traffic as multicast. WiFi does not share either of these characteristics, which makes multicast more complicated. Below are some of the issues with multicast over WiFi:

  1. Multicast traffic is sent at a mandatory data rate. As mentioned, WiFi clients share a collision domain. Because multicast is a single transmission that must be received by all intended receivers, access points are forced to send that frame at the lowest-common-denominator settings, to give the receivers the best chance of hearing the transmission uncorrupted. While this is fine for small broadcast traffic like beacons, it’s unsustainable for high-bandwidth applications.
  2. Low data-rate traffic consumes more air time. Because multicast traffic is sent at a low data rate, it takes longer for each of those transmissions to complete. A 1MB file sent at a data rate of 1 Mbps will take significantly longer than the same file at a data rate of 54Mbps. This means that all other stations must spend more time waiting for their turn to transmit.
  3. Battery-powered clients have reduced battery life. Multicast and broadcast traffic are sent at the DTIM interval, which all stations keep track of. When a multicast frame is sent, all stations must wake up to listen to the frame, and discard it if they don’t need it. This results in battery-powered devices staying awake for a lot longer than needed. If the DTIM interval is too high, the increased latency can impact real-time applications like video. But the lower the DTIM interval, the more often stations need to wake up.
  4. Multicast senders will not resend corrupt frames. Frame corruption and retransmissions are a standard part of any WiFi transaction. Every unicast frame, even if unacknowledged at upper OSI layers such as when using UDP, is acknowledged at Layer 2, and retransmitted by the sending station if necessary. This may not seem like a big deal at first, as unacknowledged traffic on a wired network works fine most of the time. But in an area of interference or poor RSSI level, it’s not unusual to see 10% of wireless frames retransmitted. 10% loss would be considered extremely high on a wired network, and most applications are unable to handle this level of loss.
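To put numbers on the airtime point in item 2, here is a back-of-the-envelope calculation. It treats 1MB as 1,000,000 bytes and ignores preambles, headers, contention, and retransmissions, so real airtime would be higher at both rates; the function name is just for illustration.

```python
def airtime_seconds(payload_bytes, rate_mbps):
    """Idealized time on air: payload bits divided by the PHY data rate."""
    return payload_bytes * 8 / (rate_mbps * 1_000_000)

one_megabyte = 1_000_000
slow = airtime_seconds(one_megabyte, 1)
fast = airtime_seconds(one_megabyte, 54)
print(f"1 Mbps:  {slow:.3f} s")
print(f"54 Mbps: {fast:.3f} s")
print(f"ratio:   {slow / fast:.0f}x")
```

Even in this idealized form, the low-rate transfer occupies the channel 54 times longer, and every other station on the channel has to wait for it.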

So how do we fix it?

There’s no silver bullet to “fixing” multicast over wireless, but there are a few ways to design around the shortcomings.

  1. Increasing the minimum data rate. An increase to the minimum data rate means that broadcast and multicast frames must be sent at the higher rate. Unicast traffic is acknowledged at Layer 2, reducing loss experienced by the upper layers. As mentioned earlier, higher data rates reduce the time spent transmitting, and increase throughput for the multicast traffic. It also reduces the amount of time a battery powered device must spend listening to the frames. However, other design and configuration considerations must be made to ensure the wireless network can support this, as changing the minimum data rate can impact roaming, as well as connectivity for low-powered devices.
  2. Multicast-to-Unicast Conversion (M-to-U). Many vendors of wireless APs support multicast-to-unicast conversion, which sends a unicast copy of the frame to each intended receiver, using IGMP snooping to determine those stations. This means that the frame can be sent at the receiving station’s best data rate, which should almost always be above the minimum. Several unicast transmissions at 54Mbps would still use less channel time than the same multicast transmission at 1Mbps. In addition, stations which aren’t the intended receivers don’t need to wake up to listen to the frame, reducing their battery consumption.

The pudding

Let’s take a look at the same multicast frame sent with and without Multicast-to-Unicast Conversion. Using iperf2 (since iperf3 doesn’t support multicast), we’ll generate multicast traffic at a rate of 20Mbps from a wired client and send it to a wireless client, using multicast address 239.255.1.2.

Parameters for this test:
Receiver: MacBook Pro (2015 edition). 3 spatial stream 802.11ac airport card.
Access Point: Cisco Meraki MR42E (802.11ac Wave 2, 3×3:3) with omni-directional dipole antennas.

Wired Multicast Source (10.1.1.216):

Mcast Source.png

Wireless Multicast Recipient (M-to-U enabled): 

Mcast MtoU.jpg

Wireless Multicast Recipient (M-to-U disabled):

Mcast No MtoU.jpg

The first thing to notice is the loss rate. With M-to-U enabled, my 20Mbps stream was successfully being transmitted with almost no loss. With M-to-U disabled, throughput was reduced by roughly 95%, with an average of 1Mbps throughput. There are two reasons for this: first, the mandatory data rate used for the multicast transmission was 6Mbps, of which ~40% is attributed to protocol overhead. In addition, with a unicast transmission the AP can buffer frames to a receiver, whereas a multicast transmission is best effort: it has no Layer 2 acknowledgement or communication from the receivers. This can be improved with application-level handling, such as the application deciding to transmit at a lower quality, but there are no guarantees that the application is set up to handle that. iperf has no such throttling/accommodation.

To dive in further, let’s take a look at the differences in the frames transmitted:

Frame Capture (M-to-U enabled):

Unicast Multicast.png

Frame Capture (M-to-U disabled):

Demulticast Multicast.png

We can verify that the second frame is using multicast by the MAC address in the Destination Address field, since all IPv4 multicast MACs begin with 01-00-5E. Notice also that the source address of the unicast frame is set to the MAC address of the access point as the AP had to generate that frame, whereas the multicast frame’s source is that of the sending station since there was no frame modification needed.
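The destination MAC of an IPv4 multicast frame is derived from the group address: the fixed 01-00-5E prefix followed by the low 23 bits of the group IP. A short sketch of that mapping (the function name is hypothetical) shows what to look for in the capture for the group used in this test.

```python
def multicast_mac(group_ip):
    """Map an IPv4 multicast group to its Ethernet/802.11 destination MAC."""
    octets = [int(part) for part in group_ip.split(".")]
    if len(octets) != 4 or not 224 <= octets[0] <= 239:
        raise ValueError(f"{group_ip} is not an IPv4 multicast address")
    # Only the low 23 bits are copied, so the top bit of the second octet is dropped.
    return "01:00:5e:%02x:%02x:%02x" % (octets[1] & 0x7F, octets[2], octets[3])

print(multicast_mac("239.255.1.2"))
```

Because only 23 bits are carried over, several group addresses share the same MAC, which is one reason to choose group addresses deliberately.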

Next, we’ll look at the data rate. Multicast is always sent at the basic rate, which was 6 Mbps for this BSSID, resulting in a transmission time of 2072μs. Compare that to M-to-U, with a data rate of 540Mbps and a transmission time of 46μs. That means that the multicast transmission held the channel 45 times longer than the unicast, and still only sent half as much data.

Also, since multicast must use the lowest-common-denominator parameters, it cannot take advantage of efficiency improvements such as A-MPDU and multiple spatial streams offered by this AP.

So wouldn’t M-to-U be the silver bullet solution?

As is often the case, the answer is “it depends”. In a lab where my 3 spatial stream MacBook Pro can connect at MCS 8, it may appear so. But if the majority of clients are connected at a low data rate, and the content only consumes a small amount of bandwidth, the overhead of sending a separate copy of each small frame to a large number of receivers could add delay and consume more aggregate airtime than simply transmitting once at a low data rate.
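The trade-off can be framed as a simple break-even check. The sketch below reuses the two transmission times from the capture above as example inputs; it ignores ACKs, contention, and the fact that the two captured frames carried different amounts of data, so treat it as a rough illustration with a hypothetical function name.

```python
def mtou_uses_less_airtime(receivers, multicast_frame_us, unicast_frame_us):
    """True if one unicast copy per receiver beats a single multicast frame."""
    return receivers * unicast_frame_us < multicast_frame_us

MULTICAST_US = 2072  # 6 Mbps basic-rate frame from the capture
UNICAST_US = 46      # 540 Mbps unicast frame from the capture

for receivers in (10, 45, 50):
    print(receivers, mtou_uses_less_airtime(receivers, MULTICAST_US, UNICAST_US))
```

With these numbers the advantage disappears at around 45 receivers, and it disappears far sooner if the receivers can only sustain low unicast data rates.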

Deploying Wi-Fi for Location Analytics

Many Wi-Fi vendors on the market now include the capability to leverage access points for location analytics in addition to serving clients. However, deploying location analytics has its own set of requirements, and attempting to simply leverage the same APs for location analytics may have suboptimal results if not planned out correctly. The following sections will detail some of these design considerations to optimize location accuracy and performance.

How do APs determine a device’s location?

Wi-Fi geolocation is done primarily by collecting the RSSI of frames sent from a client seen by multiple access points in an area, then applying trilateration algorithms to that data to approximate the location of a device. This requires careful placement of access points, as well as accurate placement on a floor plan or other location system within the access point controller.

AP Placement Considerations

First and foremost, for trilateration to work properly a client needs to be heard by at least three APs at any given time, and four would be ideal. On the flip side, more than five or six APs could limit the effectiveness by adding unnecessary noise and interference in the environment. A client that is only seen by two APs will be accurate in one dimension (the distance between the APs), but won’t be able to accurately detect the location in the second dimension.

Contrary to designing for coverage, location detection works best when the service area is completely encapsulated by the access points, meaning that APs are placed on the outer edge of the zone where devices will be located.

Because trilateration happens in the latitude and longitude planes and signal strength is used to determine the distance of a client between APs, placing APs in a perfect grid or line actually inhibits the APs from detecting the offset from each other. It’s recommended to place APs in an imperfect shape, which is especially important in long narrow spaces such as corridors or alleys.
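A minimal sketch of RSSI-based trilateration is shown below. It is not any vendor's algorithm: the log-distance path loss model, the reference RSSI at 1m, and the path loss exponent are all assumptions, and real systems use more APs and noise-tolerant fitting. It does show why three APs in a straight line fail: the linear system has no unique solution.

```python
def rssi_to_distance_m(rssi_dbm, rssi_at_1m_dbm=-40.0, path_loss_exponent=3.0):
    """Log-distance path loss model (reference RSSI and exponent are assumed)."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))

def trilaterate(aps, distances):
    """Solve for (x, y) from three AP positions and three distance estimates."""
    (x1, y1), (x2, y2), (x3, y3) = aps
    d1, d2, d3 = distances
    # Subtracting circle 1 from circles 2 and 3 gives two linear equations.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-9:
        raise ValueError("APs are collinear; position is ambiguous")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

aps = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
# Exact distances from a client standing at (3, 4).
distances = [5.0, 65 ** 0.5, 45 ** 0.5]
print(trilaterate(aps, distances))
print(rssi_to_distance_m(-70.0))
```

In practice the distance estimates are noisy, which is why a fourth AP and unobstructed paths between APs and clients improve the result.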

Finally, minimize any major line-of-sight obstructions between APs, especially in areas of heavy traffic. Shelves and walls between APs will impact the RSSI received by the AP, which will place the client further away from the AP than it actually is.

Factors Impacting Accuracy

Traffic Frequency

It’s important to note that the accuracy will be limited by how often the access points see frames from a client. For a mobile phone with the screen turned off, such as in someone’s pocket, APs will rely on the periodic probes that a device will send out, which may be as few as a couple of times per minute, meaning our location detection will only be current to the last probe.

Trilateration Frequency

Because it can take a lot of processing power to constantly detect and locate a large number of clients, many Wi-Fi vendors will aggregate the received data and perform the trilateration at regular intervals, such as once per minute. It’s important to review the vendor’s documentation and set expectations accordingly.

MAC Randomization

Both iOS and Android support MAC randomization, which masks the device’s true MAC address in many management frames. This can make triangulating a device, or keeping track of subsequent visits, significantly more difficult. iOS has this feature enabled by default, whereas most Android phones default to disabled. There are ways to de-anonymize these devices, but it’s usually more hassle than it’s worth. The easiest way to overcome MAC randomization is to encourage devices to join the Wi-Fi network, as the real MAC address is used for association.
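Randomized addresses can usually be recognized because they set the locally administered bit (the second-least-significant bit of the first octet). The check below is a small sketch with a hypothetical function name and made-up example addresses. Note that the reverse is not guaranteed: a locally administered address is not necessarily a randomized one.

```python
def is_locally_administered(mac):
    """True if the MAC has the locally administered bit set."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)

for mac in ("da:a1:19:00:00:01", "00:18:0a:12:34:56"):
    print(mac, is_locally_administered(mac))
```

Filtering these addresses out of visitor counts avoids treating every new random MAC from the same phone as a new device.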

Beyond Wi-Fi

Because most Wi-Fi clients are mobile and probe frequency is sparse, sub-meter accuracy will be difficult-to-impossible to achieve. Other technologies, such as BLE, RFID, and RTLS may be used in place of, or in addition to, relying on Wi-Fi for location analytics. Some vendors, such as Meraki, include BLE scanning radios in their access points. While BLE can be more accurate than Wi-Fi, a larger percentage of devices are either not BLE-enabled, or users are disabling the BLE radio in their client device.