Meraki Auto RF Explained

Meraki loves to chalk up the secret sauce in their products to “Meraki Magic” and boasts “anyone can do it”. Yet our inner engineering geek wants to open the curtain and see the real show. An example of that is Auto RF, which is a form of Radio Resource Management (RRM) that allows Meraki Wi-Fi access points to dynamically plan WLAN channels and radio transmit (TX) power. The following sections will break down what Auto RF is and how it works.

Auto RF is made up of two major components: Auto Channel and Auto TX Power. The goal is to provide an initial channel plan, and then adjust dynamically over time based on the environment. Both features are enabled by default, reducing the number of steps required to deploy Meraki access points effectively.

All currently shipping Meraki access points are built with a dedicated 2.4GHz/5GHz scanning radio, which constantly scans the entire usable spectrum. In addition to powering features such as location analytics and WIPS, this radio is used to detect neighboring BSS’s and perform off-channel scans without consuming airtime on the client-serving radios. The scanning radio dwells on every channel to monitor duty cycle and detect levels of non-802.11 interference. It also sends probes on non-DFS channels to detect neighboring BSS’s and listens for beacons on all channels.


The current iteration of Auto Channel comes from an algorithm called TurboCA. Auto Channel is designed to react to degrading conditions while balancing client performance against the disruptiveness of changing channels. Fortunately, the 802.11-2012 standard brought better adoption of 802.11h, which defines standardized Channel Switch Announcements (CSAs). CSAs reduce the impact of moving to a new channel by notifying clients when the AP will switch and which channel it is moving to, so that clients can follow. Auto Channel relies on this heavily where possible, but because many clients still do not support CSAs, it avoids changing channels unless necessary.

Channel Switch Announcements

How does it work?

You can refer to the above linked TurboCA article for the full mathematical detail, but this section will summarize the process.

The goal of Auto Channel is to build a channel plan that minimizes channel overlap, optimizes cell sizes for better roaming, and maximizes channel efficiency by picking the best channel available for each AP. Then, it regularly rebuilds the plan in search of an optimization. The computation for the channel plan happens in the Meraki Cloud, where all Meraki access points report their logging data.

The key metrics for the algorithm are:

  • Node (AP) Performance
  • Network (AP Set) Performance
  • Channel Quality (noise floor, non-802.11 interference, neighboring BSS’s, etc.)
  • Channel Width
  • AP Load (number of associated clients)
  • Channel Switch Penalty
  • Hop Limit

Node performance is a calculation of how well an access point should perform on a given channel and channel width. Network performance is the product of the performance of all nodes in a Meraki Network, which is important as an individual node score close to zero will bring down the Network performance score, ensuring that a channel plan will not create issues for one area while optimizing another.

One bad node performance can rule out a channel plan
(arbitrary numbers used for examples)
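The multiplicative scoring above can be sketched in a few lines. This is an illustration of the math, not Meraki's code, and the scores here are invented:

```python
from math import prod

def network_performance(node_scores):
    """Network performance as the product of per-AP scores (0.0-1.0)."""
    return prod(node_scores)

plan_a = [0.9, 0.85, 0.9, 0.88]    # uniformly decent everywhere
plan_b = [0.99, 0.99, 0.99, 0.05]  # three great APs, one starved AP

# Plan A wins despite Plan B's three near-perfect nodes.
assert network_performance(plan_a) > network_performance(plan_b)
```

Because the scores multiply rather than add, a single node near zero sinks the whole plan, which is exactly the property described above.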

Channel quality measures non-802.11 interference, duty cycle, and channel width. Channel switch penalty is a metric designed to discourage channel changes that offer only negligible benefit, and is weighted heavier on 2.4GHz, where fewer clients support CSAs.

Hop Limit is used to determine how many neighboring APs we will consider when planning an AP’s channel. This basically determines the “aggressiveness” of the calculation. Meraki runs this calculation at 3 different intervals with different hop limits:

  • Every 15 minutes with a hop limit of “0”.
  • Every 3 hours with a hop limit of “1”, then “0”.
  • Every 24 hours with a hop limit of “2”, then “1”, then “0”

With a hop limit of “0” an AP only considers itself and directly neighboring APs when planning its channel. By running this more frequently, an AP can react to significant events quickly (such as a jammed channel) but won’t change channels too frequently. The more aggressive plans are run less frequently to balance creating a more globally optimal plan vs. changing channels too frequently.
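One way to picture the hop limit is as a search depth over the AP neighbor graph. The sketch below is purely illustrative (the topology and the exact depth mapping are my assumptions based on the description above):

```python
from collections import deque

def considered_aps(neighbors, root, hop_limit):
    """APs considered when planning `root`'s channel: hop limit 0 covers
    the AP and its direct neighbors; each extra hop adds one more ring."""
    max_depth = hop_limit + 1
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        ap, depth = queue.popleft()
        if depth == max_depth:
            continue  # don't expand past the allowed depth
        for nbr in neighbors.get(ap, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, depth + 1))
    return seen

# Invented topology: four APs in a line, A - B - C - D.
topo = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
assert considered_aps(topo, "A", 0) == {"A", "B"}
assert considered_aps(topo, "A", 2) == {"A", "B", "C", "D"}
```

Raising the hop limit widens the scope of the plan, which is why the larger-scope runs are both more globally optimal and more disruptive.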

The Auto RF Process

The process starts by inputting the current channel plan (if one exists), and collecting the scanning results and load information from each AP. It then picks a pseudo-random AP and identifies the channel that will render the best “node performance” for that AP. This selection favors picking heavier loaded APs first, as more clients actively connected to an AP signals that it is more important. By doing this, more actively used APs will have a better chance at picking the best channel available rather than running last and taking whatever channels are left.

After all APs have been assigned a channel and the cloud has calculated the predicted node performance for each, the network performance of the plan is determined. If the network performance of the new plan is better than the current plan, the new plan becomes the proposed plan. The algorithm is run ten times to compare multiple possible configurations. Once all iterations are run, the proposed plan becomes the current plan, and updated APs will change their channels accordingly.
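Putting the pieces together, the search loop described above might look something like this toy sketch. The scoring function and channel list are invented; it only mirrors the shape of the process (load-weighted greedy assignment, ten iterations, keep the best-scoring plan), not TurboCA itself:

```python
import random
from math import prod

CHANNELS = [36, 40, 44, 48]  # invented channel set for the example

def node_perf(ap, channel, plan, neighbors):
    # Invented metric: performance drops for each neighbor on the same channel.
    overlaps = sum(1 for n in neighbors[ap] if plan.get(n) == channel)
    return 1.0 / (1 + overlaps)

def build_plan(aps, load, neighbors, rng):
    # Heavier-loaded APs pick their channel first, breaking ties randomly.
    order = sorted(aps, key=lambda ap: (-load[ap], rng.random()))
    plan = {}
    for ap in order:
        plan[ap] = max(CHANNELS, key=lambda ch: node_perf(ap, ch, plan, neighbors))
    return plan

def auto_channel(aps, load, neighbors, iterations=10, seed=0):
    rng = random.Random(seed)
    best, best_score = None, -1.0
    for _ in range(iterations):
        plan = build_plan(aps, load, neighbors, rng)
        # Network performance: product of the per-node scores.
        score = prod(node_perf(ap, plan[ap], plan, neighbors) for ap in aps)
        if score > best_score:
            best, best_score = plan, score
    return best
```

With three mutually neighboring APs and four channels, the best plan found puts every AP on a distinct channel, as expected.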

Note that previous iterations of Auto Channel (before 802.11h) would not switch channels if a client was currently associated. Because Meraki now changes channels while clients are associated, it can lead to disruptions with clients using real-time applications that don’t support CSAs. If this is causing a negative impact, Meraki Support can revert this behavior upon request.


As with all things RF-related, an automatic algorithm will not fit every environment. In challenging RF environments, or high density deployments, Auto Channel can fall short. In these scenarios, manual channel assignment may be a better option, but Auto Channel can still be used as a starting point to reduce the amount of manual configuration required.

Static channel assignment

APs with a static channel assignment will be used in the plan to identify a used channel, but will not be used in the algorithm to generate a plan or calculate network performance.

DFS events will always override a static or auto channel plan and trigger an immediate channel change, as required by the FCC.

If a jammed channel is detected, meaning that levels of non-802.11 interference exceed 65% for longer than one minute, a channel change will occur without waiting for the next run of Auto Channel.

If an AP is being used for wireless mesh, it will not change channels as this will have a significant impact on all APs and clients using that mesh route.


As with Auto Channel, Auto TX Power calculations are done in the Cloud, and the process is run every twenty minutes. A neighbor report is collected from each AP in the network, which contains the Signal-to-Noise Ratio (SNR) for all neighboring APs in the same Meraki network. The AP also reports its currently connected clients along with their SNR.

Using these lists, the Cloud compiles a list of “direct neighbors” for each AP (defined as any AP in the Meraki network with an SNR of 8dB or greater), and calculates what the ideal TX power should be. For each AP, the Cloud attempts to keep the SNR for its strongest direct neighbor at 30dB and always higher than 17dB for every direct neighbor.

An AP will never reduce its transmit power if a client is connected with SNR <10dB. Generally, if a client is connected with SNR <10dB it is looking for a better AP to roam to. If it hasn’t roamed, it can be assumed that a better AP is not available, so reducing the transmit power will only worsen that client’s performance.

To prevent dramatic changes in TX power, which could have unintended results, at each twenty-minute run an AP can increase its transmit power by 1dB or lower it by 1-3dB. When a new Meraki AP is deployed, it starts at the highest transmit power the AP supports within its regulatory domain, unless overridden by an RF Profile or otherwise statically configured. This means that it can take several iterations before an AP reaches its optimal transmit power level.
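The rules above can be combined into a small decision function. The SNR targets (30dB and 17dB), the 10dB client guard, the step limits, and the per-band floors all come from the text; how they are combined into one function is my own illustrative guess, not Meraki's implementation:

```python
MIN_POWER = {"2.4GHz": 5, "5GHz": 8}  # dBm floors, per the text

def next_tx_power(current_dbm, band, neighbor_snrs, client_snrs):
    """One twenty-minute iteration of the power decision (illustrative)."""
    strongest = max(neighbor_snrs)
    weakest = min(neighbor_snrs)
    if strongest > 30 and min(client_snrs, default=99) >= 10:
        # Strongest neighbor hears us too hot: step down 1-3 dB,
        # but never below the band's floor.
        step = min(3, strongest - 30)
        return max(current_dbm - step, MIN_POWER[band])
    if weakest < 17:
        # A direct neighbor hears us too weakly: step up at most 1 dB per run.
        return current_dbm + 1
    return current_dbm

assert next_tx_power(14, "5GHz", neighbor_snrs=[36, 20], client_snrs=[25]) == 11
assert next_tx_power(14, "5GHz", neighbor_snrs=[36, 20], client_snrs=[8]) == 14  # weak client blocks reduction
```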

RF Profiles can be used to define operating parameters for Auto RF


Auto TX Power will never set the transmit power lower than 5dBm on the 2.4GHz radios or 8dBm on the 5GHz radios, to avoid settling on an unusably low value where there is a high density of APs. There are valid use cases, such as when using directional antennas or in challenging RF environments, where such a low value is warranted. These environments usually require manual tuning anyway, in which case static values can be set in the Dashboard.

Static transmit power assignment

If an AP has an active mesh neighbor, it will not increase or decrease its transmit power. When using mesh, if an AP has no client serving SSIDs enabled it will always use its maximum available transmit power.

Active mesh prevents transmit power changes

If an AP has only one direct neighbor, reducing transmit power is considered risky, so it is done less often.


The Meraki Dashboard allows for several tools to monitor the current channel plan and any changes that have been made by Auto RF.

The Wireless > Radio Settings page allows you to identify the current channel and transmit power being used by each AP, as well as the target power range that Auto TX Power is using:

Radio Settings

Clicking on any AP takes you to the Status Page, where the RF tab displays a lot of information about client count, channel utilization, and any changes made to the access point by Auto RF. In the below screenshot, we can see that Auto TX power has adjusted the transmit power, and clicking Details will show exactly what was changed:

RF Tab in the Status Page
Transmit power was increased from 8dBm to 9dBm on the 5GHz radio


As you can see, there’s a lot more to Auto RF than is evident at first glance. Meraki leverages the analytics of the Dashboard and the metadata from millions of access points to create and refine these algorithms so that less time and effort needs to be spent tuning and tweaking configuration during deployment.

Designing Wi-Fi for High Density

In technical interviews, I often ask (and am often asked):
How would you design a Wi-Fi network to support a large room with 1000 devices?

The question is purposely vague to identify how someone thinks through a problem that doesn’t have a single answer, and to observe how thoroughly they respond. Below I’ll take my own stab at a response.

Step 1: Requirements Gathering

Starting off by talking about antenna types or software tuning is the wrong first step, every time. However much information the question provides, it’s never enough. Wi-Fi is a fickle beast, and collecting requirements is certainly the most important step. I would start by asking qualifying questions such as:

  • What types of devices will be associating?
  • What types of applications are we expecting to support, and/or how much bandwidth is needed per client?
  • What is the construction and layout of the room?
  • What are the restrictions on AP location, such as cabling, mounting, or aesthetic requirements, etc.?

Step 2: Hardware

The correct hardware choice is usually determined by the answers to the questions in the previous section. Some environments, such as stadiums, allow for access points to be mounted under seats, where integrated omnidirectional antennas are adequate. In other areas, such as conference centers where chairs and tables may be moved, access points need to be mounted on walls or high ceilings.

Above 25ft, omnidirectional antennas lose a lot of their effectiveness, as much of the signal radiates into space where there are no clients. In these cases, downtilt omni-directional antennas can provide a similar horizontal range, but with better propagation toward the floor. In cases where limiting the propagation is desired, semi-directional or directional antennas will limit the horizontal propagation while also improving the vertical reach.

Step 3: Software Tuning

While every environment is different and requires its own exact configuration, a high density environment almost certainly requires a high density of APs, and with that comes a set of options that are best practice for almost all such deployments.

Data Rates

In a well designed Wi-Fi environment, it’s a best practice to increase the minimum data rate above the default. 12-18Mbps is a common setting, as it prevents 802.11b devices from joining the BSS and dragging other clients down, and it reduces the airtime required for management frames, leaving more space for meaningful traffic. It can also reduce effective cell sizes by not supporting clients whose RSSI is too weak to transmit at the increased minimum rate. However, caution is needed, as setting the minimum data rate too high can lead to high amounts of corruption.

Channel Planning

More APs means more chance for co-channel contention, which negatively impacts all clients on that channel. Where possible, enabling the use of 5GHz UNII-2 extended channels allows for more non-overlapping channels, as long as clients support them. On the 2.4GHz spectrum, with only 3 non-overlapping channels available in the US, disabling the 2.4GHz radio on select APs will reduce the number of APs in an area fighting for the same frequency.

In addition to enabling more 5GHz channels, it’s important to reduce the channel width to allow for more channels to be used concurrently. A high density environment configured for 80MHz-wide channels may only have six non-overlapping channels available, while the same environment configured for 20MHz-wide channels will have 25 non-overlapping channels. There is a tradeoff in throughput by reducing channel width, but that’s usually less important than having more channels available.

With a reduced number of 2.4GHz radios compared to 5GHz, band steering can also be effective at encouraging dual-band clients to connect on the 5GHz channels where there is less congestion.

Power Levels

With high client density, access points are generally placed to cover a chosen number of client devices. Because those clients are in a smaller area than lower density deployments, the AP doesn’t need to cover as large a physical area. Lowering the transmit (TX) power of the APs will reduce the cell size, and thus reduce the amount of co-channel contention.

Advanced Options

Some vendor-specific options, such as Cisco’s RX-SOP, can also impact client connectivity and roaming. While RX-SOP is marketed as helping to “reduce cell size ensuring clients are connected using the highest possible data rate”, this is not what it’s designed for, and improperly configuring these options can negatively impact connectivity. RX-SOP is used to lower the possible contention between APs on the same and adjacent channels by reducing the AP’s “sensitivity” to packets when determining transmit opportunity. When tuned correctly, it can increase the overall available airtime.

Most vendors offer some type of Radio Resource Management (RRM) capability to automatically tune the settings above, providing features such as coverage hole detection and correction, dynamic channel assignment, dynamic transmit power control, and client balancing. However, many RRM solutions don’t do a great job of tuning for high-density environments out of the box, and almost always need tweaking and tuning.

As with any Wi-Fi deployment, there is no “one-size fits all” answer. Site surveys, both pre- and post-installation, are vital in ensuring success.

Multicast over Wireless

Multicast has brought a lot of efficiencies to IP networks. But multicast wasn’t designed for wireless, and especially isn’t well suited for high-bandwidth multicast applications like video. I’ll cover the challenges of multicast over wireless and design considerations.

But first, an overview of multicast:

To level set, I’ll briefly cover IP multicast. For the purposes of this article, I’ll focus specifically on Layer 2. If you’re already familiar with multicast over ethernet, feel free to skip this section.

What is multicast?

In short, multicast is a means of sending the same data to multiple recipients at the same time without the source having to generate copies for each recipient. Whereas broadcast traffic is sent to every device whether they want it or not, multicast allows recipients to subscribe to the traffic they want. As a result, efficiency is improved (traffic is only sent once) and overhead is reduced (unintended recipients don’t receive the traffic).

How does it work?

With multicast, the sender doesn’t know who the recipients are, or even how many there are. In order for a given set of traffic to reach its intended recipients, we send traffic to multicast groups. IANA has reserved 224.0.0.0 through 239.255.255.255 for multicast groups, with the administratively scoped 239.0.0.0/8 block commonly used within private organizations. Traffic is sent with the unicast source IP of the sender, and a destination IP of the chosen multicast group.

On the receiving side, recipients subscribe to a multicast group using Internet Group Management Protocol (IGMP). A station that wishes to join a multicast group sends an IGMP Membership Report / Join message for that group. Most enterprise switches, WLCs, and APs use IGMP snooping to inspect IGMP packets and populate their multicast table, which maps ports/devices to multicast groups. Then, when a multicast packet is received, the network device can forward that packet to the intended receivers. Network devices that don’t support IGMP snooping will forward the packet the same as they would a broadcast: to every port except the port the packet came in on.
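A minimal sketch of the snooping behavior just described (hypothetical code for illustration): joins populate a group-to-ports table, known groups are forwarded only to subscribers, and unknown groups are flooded like broadcast:

```python
from collections import defaultdict

class IgmpSnoopingSwitch:
    def __init__(self, ports):
        self.ports = set(ports)
        self.groups = defaultdict(set)  # multicast group -> subscribed ports

    def igmp_join(self, port, group):
        self.groups[group].add(port)

    def igmp_leave(self, port, group):
        self.groups[group].discard(port)

    def forward(self, ingress_port, group):
        """Return the set of egress ports for a multicast frame."""
        if group in self.groups:
            return self.groups[group] - {ingress_port}
        # No snooping state for this group: flood like a broadcast.
        return self.ports - {ingress_port}

sw = IgmpSnoopingSwitch(ports=[1, 2, 3, 4])
sw.igmp_join(2, "239.1.1.1")
sw.igmp_join(3, "239.1.1.1")
assert sw.forward(1, "239.1.1.1") == {2, 3}      # only subscribers
assert sw.forward(1, "239.9.9.9") == {2, 3, 4}   # unknown group: flooded
```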

The problems with multicast over WiFi vs wired

In a switched wired network, all traffic is sent at the same data rate (generally 1Gbps today), and with each port being its own collision domain, collisions are rare. In addition, wired traffic uses a bounded medium, so interference and frame corruption are also rare. Because of this, there is no network impact to sending large amounts of wired traffic as multicast. WiFi shares neither of these characteristics, which makes multicast more complicated. Below are some of the issues with multicast over WiFi:

  1. Multicast traffic is sent at a mandatory data rate. As mentioned, WiFi clients share a collision domain. Because multicast is a single transmission that must be received by all intended receivers, access points are forced to send that frame at the lowest-common-denominator settings, to give the receivers the best chance of hearing the transmission uncorrupted. While this is fine for small broadcast traffic like beacons, it’s unsustainable for high-bandwidth applications.
  2. Low data-rate traffic consumes more air time. Because multicast traffic is sent at a low data rate, it takes longer for each of those transmissions to complete. A 1MB file sent at a data rate of 1 Mbps will take significantly longer than the same file at a data rate of 54Mbps. This means that all other stations must spend more time waiting for their turn to transmit.
  3. Battery-powered clients have reduced battery life. Multicast and broadcast traffic are sent at the DTIM interval, which all stations keep track of. When a multicast frame is sent, all stations must wake up to listen to the frame, and discard it if they don’t need it. This results in battery-powered devices staying awake for a lot longer than needed. If the DTIM interval is too high, the increased latency can impact real-time applications like video. But the lower the DTIM interval, the more often stations need to wake up.
  4. Multicast senders will not resend corrupt frames. Frame corruption and retransmissions are a standard part of any WiFi transaction. Every unicast frame, even if unacknowledged at upper OSI layers such as when using UDP, is acknowledged at Layer 2, and retransmitted by the sending station if necessary. Multicast frames get no such Layer 2 acknowledgement or retransmission. This may not seem like a big deal at first, as unacknowledged traffic on a wired network works fine most of the time. But in an area of interference or poor RSSI, it’s not unusual to see 10% of wireless frames retransmitted. 10% loss would be considered extremely high on a wired network, and most applications are unable to handle this level of loss.
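The airtime cost in point 2 is easy to quantify. Ignoring all protocol overhead (preambles, contention, acknowledgements), so real numbers are worse, a rough calculation looks like this:

```python
def airtime_seconds(payload_bytes, data_rate_mbps):
    """Time on the air for a payload at a given data rate (overhead ignored)."""
    return (payload_bytes * 8) / (data_rate_mbps * 1_000_000)

one_mb = 1_000_000  # 1 MB payload

assert airtime_seconds(one_mb, 1) == 8.0                  # 8 s on the air at 1 Mbps
assert abs(airtime_seconds(one_mb, 54) - 0.148) < 0.001   # ~0.15 s at 54 Mbps
```

The same megabyte occupies the channel roughly 54 times longer at the 1Mbps mandatory rate, airtime during which no other station can transmit.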

So how do we fix it?

There’s no silver bullet to “fixing” multicast over wireless, but there are a few ways to design around the shortcomings.

  1. Increasing the minimum data rate. An increase to the minimum data rate means that broadcast and multicast frames must be sent at the higher rate. Unicast traffic is acknowledged at Layer 2, reducing loss experienced by the upper layers. As mentioned earlier, higher data rates reduce the time spent transmitting, and increase throughput for the multicast traffic. It also reduces the amount of time a battery powered device must spend listening to the frames. However, other design and configuration considerations must be made to ensure the wireless network can support this, as changing the minimum data rate can impact roaming, as well as connectivity for low-powered devices.
  2. Multicast-to-Unicast Conversion (M-to-U). Many wireless AP vendors support multicast-to-unicast conversion, which sends a unicast copy of the frame to each intended receiver, using IGMP snooping to determine those stations. This means that the frame can be sent at the receiving station’s best data rate, which should almost always be above the minimum. Several unicast transmissions at 54Mbps would still use less channel time than the same multicast transmission at 1Mbps. In addition, stations which aren’t the intended receivers don’t need to wake up to listen to the frame, reducing their battery consumption.

The pudding

Let’s take a look at the same multicast frame sent with and without Multicast-to-Unicast Conversion. Using iperf2 (since iperf3 doesn’t support multicast), we’ll generate multicast traffic at a rate of 20Mbps from a wired client and send it to a wireless client subscribed to a multicast group.

Parameters for this test:
Receiver: MacBook Pro (2015 edition). 3 spatial stream 802.11ac airport card.
Access Point: Cisco Meraki MR42E (802.11ac Wave 2, 3×3:3) with omni-directional dipole antennas.

Wired Multicast Source:

Mcast Source.png

Wireless Multicast Recipient (M-to-U enabled): 

Mcast MtoU.jpg

Wireless Multicast Recipient (M-to-U disabled):

Mcast No MtoU.jpg

The first thing to notice is the loss rate. With M-to-U enabled, my 20Mbps stream was successfully transmitted with almost no loss. With M-to-U disabled, throughput was reduced by roughly 95%, averaging 1Mbps. There are two reasons for this: first, the mandatory data rate used for the multicast transmission was 6Mbps, of which ~40% is attributed to protocol overhead. In addition, with a unicast transmission the AP can buffer frames to a receiver, whereas a multicast transmission is best effort: it has no Layer 2 acknowledgement or communication from the receivers. This can be improved with application-level handling, such as the application deciding to transmit at a lower quality, but there are no guarantees that the application is set up to handle that. iperf has no such throttling/accommodation.

To dive in further, let’s take a look at the differences in the frames transmitted:

Frame Capture (M-to-U enabled):

Unicast Multicast.png

Frame Capture (M-to-U disabled):

Demulticast Multicast.png

We can verify that the second frame is using multicast by the MAC address in the Destination Address field, since all multicast MACs begin with 01-00-5E. Notice also that the source address of the unicast frame is set to the MAC address of the access point as the AP had to generate that frame, whereas the multicast frame’s source is that of the sending station since there was no frame modification needed.

Next, we’ll look at the data rate. Multicast is always sent at the basic rate, which was 6Mbps for this BSSID, giving a transmission time of 2072μs. Compare that to M-to-U, with a data rate of 540Mbps and a transmission time of 46μs. The multicast transmission held the channel 45 times longer than the unicast, yet carried only half as much data.

Also, since multicast must use the lowest-common-denominator parameters, it cannot take advantage of efficiency improvements such as A-MPDU and multiple spatial streams offered by this AP.

So wouldn’t M-to-U be the silver bullet solution?

As is often the case, the answer is “it depends”. In a lab where my 3 spatial stream MacBook Pro can connect at MCS 8, it may appear so. But, if the majority of clients are connected at a low data rate, and the content only consumes a small amount of bandwidth, the overhead caused by retransmitting small frames for a large number of receivers could add delay and consume more aggregate airtime than simply transmitting once at a low data rate.
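That tradeoff can be modeled roughly. Ignoring protocol overhead, compare the airtime of one multicast copy at the basic rate against per-receiver unicast copies at each receiver's own rate (all rates and receiver counts below are invented for illustration):

```python
def multicast_airtime(bits, basic_rate_mbps):
    """One transmission at the basic rate, heard by everyone."""
    return bits / (basic_rate_mbps * 1e6)

def mtou_airtime(bits, receiver_rates_mbps):
    """One unicast copy per receiver, each at that receiver's rate."""
    return sum(bits / (r * 1e6) for r in receiver_rates_mbps)

bits = 12_000  # a 1500-byte frame

# Five fast receivers: M-to-U wins easily.
assert mtou_airtime(bits, [540] * 5) < multicast_airtime(bits, 6)

# Forty receivers stuck at 12 Mbps: one multicast copy uses less airtime.
assert mtou_airtime(bits, [12] * 40) > multicast_airtime(bits, 6)
```

The crossover point depends entirely on receiver count and receiver rates, which is why "it depends" is the honest answer.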

Deploying Wi-Fi for Location Analytics

Many Wi-Fi vendors on the market now include the capability to leverage access points for location analytics in addition to serving clients. However, deploying location analytics has its own set of requirements, and attempting to simply leverage the same APs for location analytics may have suboptimal results if not planned out correctly. The following sections will detail some of these design considerations to optimize location accuracy and performance.

How do APs determine a device’s location?

Wi-Fi geolocation is done primarily by collecting the RSSI of frames sent from a client seen by multiple access points in an area, then applying trilateration algorithms to that data to approximate the location of a device. This requires careful placement of access points, as well as accurate placement on a floor plan or other location system within the access point controller.

AP Placement Considerations

First and foremost, for trilateration to work properly a client needs to be heard by at least three APs at any given time, and four would be ideal. On the flip side, more than five or six APs could limit the effectiveness by adding unnecessary noise and interference in the environment. A client that is only seen by two APs will be accurate in one dimension (the distance between the APs), but won’t be able to accurately detect the location in the second dimension.
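For the curious, here is what a bare-bones version of this might look like. This is an illustration, not any vendor's algorithm: the path-loss constants are invented, and real systems must cope with noisy RSSI rather than the exact distances used here.

```python
def rssi_to_distance(rssi_dbm, tx_power_at_1m=-40, path_loss_exp=2.0):
    """Log-distance path-loss model: estimate meters from RSSI (constants invented)."""
    return 10 ** ((tx_power_at_1m - rssi_dbm) / (10 * path_loss_exp))

def trilaterate(p1, p2, p3, d1, d2, d3):
    """Intersect three distance circles by subtracting circle equations,
    which yields two linear equations in (x, y)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# -60 dBm with these constants works out to roughly 10 m.
assert abs(rssi_to_distance(-60) - 10.0) < 1e-9

# Three APs on the corners of a 20 m zone; a client at (10, 5).
x, y = trilaterate((0, 0), (20, 0), (0, 20),
                   d1=125**0.5, d2=125**0.5, d3=325**0.5)
assert abs(x - 10) < 1e-6 and abs(y - 5) < 1e-6
```

With only two APs, the equivalent system is underdetermined, which is the algebraic version of the "accurate in only one dimension" problem above.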

Contrary to designing for coverage, location detection works best when the service area is completely encapsulated by the access points, meaning that APs are placed on the outer edge of the zone where devices will be located.

Because trilateration happens in the horizontal (x/y) plane, and signal strength is used to estimate a client’s distance from each AP, placing APs in a perfect grid or line actually inhibits the APs from detecting the offset from each other. It’s recommended to place APs in an imperfect shape, which is especially important in long narrow spaces such as corridors or alleys.

Finally, minimize any major line-of-sight obstructions between APs, especially in areas of heavy traffic. Shelves and walls between APs will impact the RSSI received by the AP, which will place the client further away from the AP than it actually is.

Factors Impacting Accuracy

Traffic Frequency

It’s important to note that the accuracy will be limited by how often the access points see frames from a client. For a mobile phone with the screen turned off, such as in someone’s pocket, APs will rely on the periodic probes that a device will send out, which may be as few as a couple of times per minute, meaning our location detection will only be current to the last probe.

Trilateration Frequency

Because it can take a lot of processing power to constantly detect and triangulate a large number of clients, many Wi-Fi vendors will aggregate the received data and perform the trilateration at regular intervals, such as once per minute. It’s important to review the vendor’s documentation and set expectations accordingly.

MAC Randomization

Both iOS and Android support MAC randomization, which masks the device’s true MAC address in many management frames. This can make triangulating a device, or keeping track of subsequent visits, significantly more difficult. iOS has this feature enabled by default, whereas most Android phones default to disabled. There are ways to de-anonymize these devices, but it’s usually more hassle than it’s worth. The easiest way to overcome MAC randomization is to encourage devices to join the Wi-Fi network, as the real MAC address is used for association.
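Randomized MACs are easy to spot, even if they can't be reversed: they are drawn from the locally administered address space, meaning the 0x02 bit of the first octet is set. A quick check (the example addresses are arbitrary):

```python
def is_locally_administered(mac: str) -> bool:
    """True if the MAC's locally-administered bit is set, as randomized MACs are."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)

assert is_locally_administered("da:a1:19:00:11:22")      # 0xda has the 0x02 bit set
assert not is_locally_administered("f4:5c:89:00:11:22")  # globally unique address
```

Note the converse doesn't hold: a locally administered address isn't necessarily randomized, so this is a heuristic for analytics, not proof.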

Beyond Wi-Fi

Because most Wi-Fi clients are mobile and probe frequency is sparse, sub-meter accuracy will be difficult-to-impossible to achieve. Other technologies, such as BLE, RFID, and RTLS may be used in place of, or in addition to, relying on Wi-Fi for location analytics. Some vendors, such as Meraki, include BLE scanning radios in their access points. While BLE can be more accurate than Wi-Fi, a larger percentage of devices are either not BLE-enabled, or users are disabling the BLE radio in their client device.

Wifi and Meraki Widgets for Mac and Windows

I recently decided to try to learn a little Python. I’m still not very good at it, but I did create something recently that I feel should be shared! Meraki local status pages can provide some very useful information for troubleshooting, but having to browse to the page is not always desirable, nor does it update quickly if you are walking around troubleshooting and connecting to different devices. So I figured, hey, let’s create a widget or skin for some common overlay tools out there (Ubersicht for Mac, and Rainmeter for Windows) and populate some useful information. So without further droning on, I want to introduce the tools I created!

Meraki Skin for Rainmeter (Windows)

This skin requires the use of Rainmeter for Windows. For those not familiar, Rainmeter is a free tool that allows you to do anything from display useful data about your computer, to writing entire user interfaces to perform just about any function.
My Rainmeter skin combines a number of data points: Wi-Fi stats from netsh, IP info from netsh, and several data points from the first MR, MS, and/or MX that you are connected behind. There is also a hard requirement for Python 3 to be installed in your PATH, so that Rainmeter can execute the associated script without having to derive the correct path from the installation.
For more information please check out the github repo at: Meraki Rainmeter

Meraki Widget for Ubersicht (Mac)

Colorized based on connection quality

This widget requires the use of Ubersicht as a widget overlay tool. For those not familiar, Ubersicht is extremely lightweight and has a number of really cool widgets you can install. 

This widget has a hard requirement for Python 3 to be installed as well. One of the neat features of Ubersicht is that I was able to color-code some of the values: RSSI, noise floor, and, if connected to an MR, the SNR from the AP’s perspective. These will change color based on connection quality, from green to yellow to red. (Special thanks to Nathan Wiens @nwiens for helping me with the HTML)

To install please either browse to the Ubersicht widgets repo 
Or to my GitHub Repo

As always, thanks for reading, and if you have any feedback please leave it in the comments section below. Thanks!

MX Dual VPN Hub OSPF to EIGRP Redistribution

Disclaimer: It is a highly recommended practice to employ a system of peer review for any changes you make that affect data plane traffic. This practice is especially important on systems managed via CLI, as CLI is not always consistent between software versions or device types. Reviewing documentation and getting a second set of eyes always helps. CLI configuration had been the de facto method for configuring network equipment up until a few years ago, and the only way to do it accurately and consistently is to ensure multiple experienced engineers sign off on the candidate configurations. Without peer review, configurations tend to be riddled with typos, artifacts from cut-and-paste, and inconsistent conventions. Always have someone check your work.


In this blog post I will review how to integrate dual-hub Cisco Meraki MXs into an existing Cisco infrastructure that is running EIGRP as the dynamic routing protocol.

As of June 2018, MXs allow OSPF peering when in VPN concentrator mode or in NAT mode with a single VLAN (there is also a BGP beta, but that is for another day). This OSPF peering, however, only injects routes; it does not learn them. The purpose is to let upstream/downstream devices learn the VPN peer subnets, and to give us redundant dynamic routes to VPN peers in the scenario where we have dual-hub MXs in a hub-and-spoke topology.

For a breakdown of MX one-armed concentrator mode, please refer to the following documents:

VPN Concentrator Deployment Guide


SD WAN Deployment Guide (CVD)

Now for the fun stuff.

Both of these guides outline in pretty decent detail how to deploy the technologies from a Meraki perspective. The one piece missing is integration into existing topologies running EIGRP.

Summary of the problem

In the graphic below, we have two MXs operating as hubs in VPN concentrator mode. Both peer with an upstream L3 routing appliance via OSPF, and the L3 appliances then use EIGRP for dynamic routing within the organization.

The spoke MX is configured to use DC1 as the primary VPN path and DC2 as the secondary. If we were to just redistribute the spoke subnet into EIGRP, both MXs would show a more or less equal cost to the destination network. This could cause asymmetric routing: the spoke MX sends traffic to DC1 for an upstream service, but the upstream router chooses to send the return traffic via DC2. This is typically not the desired behavior, as the return path may be less desirable (higher latency, loss, etc.). When you add dual hubs to a spoke MX, there is no way on the hubs to prune the routes to the spoke; the hubs will always advertise all connected spoke networks (unless the spoke VPN is down).

To fix this potential problem, we need to deploy some configuration to make sure that a spoke's traffic that goes to a primary hub returns on the same path.


As you can see in the diagram, the configuration is not terribly complex, but we should break down each piece.


Without turning this into an entire blog post on how EIGRP works, remember that EIGRP does not use a simple cost value like OSPF does to weight a route. Instead, EIGRP uses a composite metric of {bandwidth}, {delay}, {reliability}, {load}, and {MTU}. We will take advantage of modifying the {delay} attribute when redistributing, to make one set of redistributed routes appear more desirable than the other.

A little light reading: Introduction to EIGRP

Prefix Lists

When redistributing routes you can take a number of different approaches. My least favorite is crossing your fingers and just redistributing the entire protocol. This can be useful in some circumstances, but it can cause more harm than good if you are not 100% sure what routes you could be injecting; redistributing the entire protocol can also cause the asymmetric routing we are trying to avoid, since we want traffic to take the same return path. That is why I recommend using prefix lists to create filters for your redistribution statements. An example of a prefix list would be:

“ip prefix-list {Name} seq 10 permit {subnet in CIDR notation}”

E.g. “ip prefix-list PRIMARY seq 10 permit {spoke subnet}”

As you can see, we can use the sequence number to place a statement before or after another. If your list is not going to be too large, I recommend skipping a few numbers in between, e.g. seq 10, seq 20, seq 30. This can be useful when you are building large prefix lists and need to slide a prefix in between two others.

You can also append “le XX” or “ge XX” to match prefix lengths less than or equal to, or greater than or equal to, XX. This can be useful if you are trying to match a number of smaller prefixes.

Example: “ip prefix-list PRIMARY seq 20 permit 10.0.0.0/8 ge 16 le 30”

This prefix list entry would match any prefix inside 10.0.0.0/8 whose length is between 16 and 30 bits, inclusive. So any prefix from 10.X.X.X/16 through 10.X.X.X/30 would be matched, while 10.X.X.X/31 and 10.X.X.X/12 would not, being too long and too short respectively.
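
To make the ge/le semantics concrete, here is a rough Python sketch of a single prefix-list entry check using the standard ipaddress module, assuming the 10.0.0.0/8 base prefix from the example above:

```python
import ipaddress

def prefix_list_matches(entry_prefix, candidate, ge=None, le=None):
    """Check a route against one prefix-list entry, IOS-style:
    the candidate must sit inside the entry's address block, and its
    prefix length must satisfy the ge/le bounds. With no ge/le,
    the prefix lengths must match exactly."""
    entry = ipaddress.ip_network(entry_prefix)
    cand = ipaddress.ip_network(candidate)
    if not cand.subnet_of(entry):
        return False
    if ge is None and le is None:
        return cand.prefixlen == entry.prefixlen
    low = ge if ge is not None else entry.prefixlen
    high = le if le is not None else cand.max_prefixlen
    return low <= cand.prefixlen <= high

# "permit 10.0.0.0/8 ge 16 le 30":
print(prefix_list_matches("10.0.0.0/8", "10.1.2.0/24", ge=16, le=30))  # True
print(prefix_list_matches("10.0.0.0/8", "10.1.0.0/31", ge=16, le=30))  # False: /31 too long
print(prefix_list_matches("10.0.0.0/8", "10.0.0.0/12", ge=16, le=30))  # False: /12 too short
```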

How could we use prefix lists though?

In this case we want to make the DC1 advertisement look much better than DC2's, but only for the spoke MXs that are using DC1 as their primary. To accomplish this, we use prefix lists to match only those spoke sites, and set a desirable metric on DC1 and a less desirable one on DC2.

This configuration on the DC1 IOS L3 appliance would be:

“ip prefix-list PRIMARY seq 10 permit {spoke subnet}”

On the DC2 IOS L3 appliance it would be:

“ip prefix-list SECONDARY seq 10 permit {spoke subnet}”

For more on prefix lists please read the following great blog article: PacketLife – Understanding IP Prefix Lists

Route Maps

Now, to put the prefix lists we created to work, we need route maps. Route maps are extremely versatile and can perform anything from route filtering to policy-based routing and beyond. In this case we will just use them for route matching during redistribution.

On DC1 the route map would be:

route-map HUB-Primary permit 10

 match ip address prefix-list PRIMARY

On DC2 the route map would be:

route-map HUB-Secondary permit 10

 match ip address prefix-list SECONDARY

Route Map light reading: Route-Maps for IP Routing Protocol Redistribution Configuration

OSPF Configuration

One thing I always try to do is avoid making changes to a routing process while I am in the middle of a multi-step configuration. This helps preserve the existing routing table and avoids changes that could potentially cause an outage. To get OSPF up and running there are a number of configs to keep in mind. In this case the OSPF configuration is contained between the first-hop L3 device and the MX, so it is not as critical to add all the little configuration tidbits you would in a larger-scale OSPF deployment. That said, if you are a big OSPF fan and want to build out your configuration with router IDs and other fun things, have at it. In my example I will not be tuning the OSPF config.

DC1 & 2 OSPF configuration:

router ospf 10

 network X.X.X.X X.X.X.X area 0 # where the Xs represent the P2P network and wildcard mask between your MX and L3

E.g. “network {P2P network} {wildcard mask} area 0”

 passive-interface default # don't form OSPF adjacencies on any interface

 no passive-interface {interface name} # except this interface facing the MX

E.g. “no passive-interface GigabitEthernet1/0/48”

Redistribution into EIGRP

Now for what we have all been waiting for… let's get some routes into EIGRP!

To redistribute the routes from our VPN topology into EIGRP, we tie our previous configurations together in one nice, lengthy statement under the EIGRP process (router eigrp {AS number}).

For DC1:

redistribute ospf {process} route-map {route-map name} metric {bandwidth} {delay} {reliability} {load} {MTU}

E.g. “redistribute ospf 10 route-map HUB-Primary metric 10000 10 255 5 1500”

For DC2:

E.g. “redistribute ospf 10 route-map HUB-Secondary metric 10000 1000 255 5 1500”

Notice that in the DC2 example the delay value (the second metric parameter, 1000 vs. 10) is greatly increased compared to the primary hub redistribution. This makes DC2's route effectively a backup path to the VPN subnet.
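
To see why bumping the delay works, here is a quick sketch of the classic EIGRP composite metric with the default K-values (K1 = K3 = 1, the rest 0), applied to the two seed metrics above:

```python
def eigrp_metric(bandwidth_kbps, delay_tens_of_usec):
    """Classic EIGRP composite metric with default K-values
    (K1 = K3 = 1, K2 = K4 = K5 = 0):
    metric = 256 * (10^7 / bandwidth + delay)."""
    return 256 * (10**7 // bandwidth_kbps + delay_tens_of_usec)

# Seed metrics from the two redistribute statements:
primary = eigrp_metric(10000, 10)      # DC1: delay 10
secondary = eigrp_metric(10000, 1000)  # DC2: delay 1000
print(primary, secondary)  # 258560 512000
```

The secondary route's metric comes out roughly double the primary's, so EIGRP prefers the primary hub's advertisement for as long as it is present.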

What if I have different spokes using DC2 as primary and DC1 as secondary?

In a lot of deployments, some spokes terminate on DC1 as their primary hub and others on DC2, which is to be expected as deployments grow and bandwidth isn't always easily upgraded. To handle this we can take advantage of what we have already built and add a little more config. So, in the event that we have one spoke subnet terminating on DC1 as primary and another terminating on DC2 as primary, we would do the following:

For DC1:

ip prefix-list PRIMARY seq 10 permit {subnet of the spoke using DC1 as its primary hub}

ip prefix-list SECONDARY seq 10 permit {subnet of the spoke using DC2 as its primary hub}

route-map HUB-Primary permit 10

 match ip address prefix-list PRIMARY

route-map HUB-Secondary permit 10

 match ip address prefix-list SECONDARY

router eigrp {AS number}

 redistribute ospf 10 route-map HUB-Primary metric 10000 10 255 5 1500

 redistribute ospf 10 route-map HUB-Secondary metric 10000 1000 255 5 1500

For DC2:

ip prefix-list PRIMARY seq 10 permit {subnet of the spoke using DC2 as its primary hub}

ip prefix-list SECONDARY seq 10 permit {subnet of the spoke using DC1 as its primary hub}

route-map HUB-Primary permit 10

 match ip address prefix-list PRIMARY

route-map HUB-Secondary permit 10

 match ip address prefix-list SECONDARY

router eigrp {AS number}

 redistribute ospf 10 route-map HUB-Primary metric 10000 10 255 5 1500

 redistribute ospf 10 route-map HUB-Secondary metric 10000 1000 255 5 1500

In the above examples, we made sure each spoke's routes are preferred via its primary hub, while the other hub's advertisements are heavily weighted so they only serve as backup routes in the event the primary hub goes down.

In Closing

This entire post came about after a number of situations where customers needed redistribution and did not have a clear path on how to do it. I included some links throughout the document that I urge you to read as you configure, whether in a lab or in production. If you find any errors in my configurations or recommendations, please let me know either in the comments or via DM on Twitter/LinkedIn. Thank you for reading, and I hope this was informative for you!

Deconstructing the RADIUS CoA process

If you need to brush up on the RADIUS process, please read my previous post:
Following the 802.1X AAA process with Packet Captures

Everyone talks about it, yet I rarely meet folks who really understand what CoA (Change of Authorization) means for RADIUS authentication and client access. I recently spent a few hours troubleshooting RADIUS CoA, and since it is fresh in my mind, I figured I would share and hopefully help others out in the field.


In summary: RADIUS Change of Authorization (RFC 5176, which obsoletes RFC 3576) allows a RADIUS server to send unsolicited messages to the Network Access Server (aka Network Access Device/Authenticator in Cisco terminology, e.g. AP/WLC/switch/firewall) to change a connected client's authorized state. This can mean anything from disconnecting the client to sending different attribute-value pairs to the authenticator to change the device's VLAN/ACL and more. It is fairly robust in what it can do, so I will not go too deep, as I want this to be consumable.

What RADIUS CoA is NOT: Magic!

I will be walking through CoA use cases, what CoA looks like from a PCAP perspective, and how to gather data for troubleshooting.

RADIUS CoA Typical Use Cases:

Central captive portal (open SSID with MAC filtering) – Especially with Cisco ISE, RADIUS CoA is the core feature required for a central captive portal. In the example below, we redirect a client to a splash page for either authentication or acceptable use policy review. As you can see below, the process is fairly simple:

  1. The client connects to the network (wired/wireless)
  2. Client MAC address is sent to RADIUS server as a username and password (Access-Request)
  3. RADIUS server responds with an Access-Accept and a URL redirect. (could also include a VLAN assignment)
  4. The client is redirected to the splash portal
  5. User logs in using the credentials required
  6. RADIUS server then sends a CoA with a request to reauthenticate
  7. Authenticator (AP/Switch/WLC) sends a CoA-ACK
  8. Authenticator sends an Access-Request with the existing Session-Id and authentication data.
  9. RADIUS server then responds back with Access-Accept and any extra functions e.g. a Filter-ID for group policy assignment in Meraki Wireless.

Wireless and Wired CoA-Reauthenticate Process


The above process is also used for secure device registration and URL redirects for blacklisting, etc., but would involve a complete client authentication/reauthentication via EAP instead of MAC authentication. For an example, check the shared captures labeled 1-of-2 and 2-of-2; these contain the EAPoL side and the RADIUS side.

Client Posturing – In some cases you may want to perform posturing on the end client. More often than not this requires software on the end machine, whether a dissolvable agent via Java or a thick client like Cisco AnyConnect. The whole goal of posturing is making sure the clients that have access to your internal resources are properly secured from threats. A common scenario is a user removing or disabling antivirus. When this event occurs, it may be desirable to limit that client's access to the network until AV is reinstalled or re-enabled. This could be done through an ACL or VLAN change.

One difficulty that arises when changing VLANs is that the client may not release its IP address. In 802.11 this is easily handled by sending a Disconnect-Request instead of a reauthentication. In wired authentication scenarios this is typically not recommended, as it requires a port bounce and can take some tweaking to work well, if at all. Instead of a VLAN change, it is recommended to perform ACL changes on wired clients; on a Catalyst switch this could be a dACL (downloadable ACL), for instance.

Dynamic Network Restrictions – Closely following the use case above, a client’s access may need to be dynamically changed if they are not adhering to the network policy. Using products such as Cisco’s Stealthwatch in tandem with Cisco ISE, we could monitor a client for data dumping thresholds and change the VLAN/ACL applied to them or shut down the port to minimize the impact. This is just one example of the many possibilities.

Wireless Disconnect-Request Flow


Now on to the fun stuff…

To capture CoA packets:

The CoA packets are only exchanged between the authenticator and the authentication server. Therefore we need to capture between those two devices, as depicted below.


In most environments this means using a SPAN/RSPAN port to capture traffic. Some vendors provide the ability to run tcpdumps/pcaps on-box, which can be a little easier, especially if you are offsite. For a capture application I tend to lean towards Wireshark, as it is free and powerful; you can download it from the Wireshark website.

CoA messages are sent on two different UDP ports depending on the platform: Cisco standardizes on UDP port 1700, while the RFC itself calls for UDP port 3799. Either way, these messages are all included in Wireshark's "radius" display filter.

Just in case you don’t have a test network please feel free to use the pcaps in this share:

CoA PCAP Examples

RADIUS CoA Packet Types

There are two different RADIUS CoA packets that are sent from the RADIUS Server (Authentication Server):

  • Disconnect-Request – Requests to terminate the session of the client.
  • CoA-Request – Requests to do a number of things from reauthenticate to port-bounce, shutdown, and more.

And there are four that are sent from the NAS/NAD/Authenticator:

  • Disconnect-ACK – Acknowledgment of successful disconnect
  • Disconnect-NAK – Failed session disconnect
  • CoA-ACK – Acknowledgment of successful CoA action
  • CoA-NAK – Failed CoA action
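
For quick reference, these six message types map to RADIUS code values defined in RFC 5176, which is exactly what the Wireshark filters in the following sections match on. A small Python sketch:

```python
# RADIUS packet codes for CoA/Disconnect messages (RFC 5176).
COA_CODES = {
    40: "Disconnect-Request",
    41: "Disconnect-ACK",
    42: "Disconnect-NAK",
    43: "CoA-Request",
    44: "CoA-ACK",
    45: "CoA-NAK",
}

SERVER_SOURCED = {40, 43}     # sent by the RADIUS server
COA_UDP_PORTS = (1700, 3799)  # Cisco's default port vs. the RFC 5176 port

def classify(code):
    """Return (message name, sender) for a RADIUS packet code."""
    name = COA_CODES.get(code)
    if name is None:
        return (None, None)
    return (name, "server" if code in SERVER_SOURCED else "authenticator")

print(classify(43))  # ('CoA-Request', 'server')
```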

RADIUS Server Sourced Packets

In this section we will review the two CoA messages that are sent from the RADIUS server and the useful material in the packet.

Disconnect-Request Message

Wireshark Filter: radius.code == 40

This packet is sent from the RADIUS server and is used simply to disconnect the client from the current session, typically followed by an immediate re-authentication by the client. Disconnect-Requests can and should be used in 802.11 situations where a VLAN change needs to occur: if we simply used a CoA-Request (as we'll see later), the client might be moved to a new VLAN while keeping the IP address it obtained on the former VLAN, clearly causing problems.


A few useful attributes in this message are:


Acct-Terminate-Cause

The Acct-Terminate-Cause attribute will let you know the reason for the request. This can vary, but from a Cisco ISE perspective it is typically classified as an Admin-Reset.


Audit-Session-Id & Calling-Station-Id

These fields can be used to filter information on your RADIUS server by client MAC address (Calling-Station-Id) and session ID. When you need to hunt down a particular failure in the logs, you can correlate entries via these two attributes.
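
As a trivial illustration of that correlation, here is a Python sketch that filters log lines on those attributes. The log format is invented for the example; real RADIUS-server logs vary by vendor:

```python
def correlate(log_lines, calling_station_id=None, session_id=None):
    """Return the log lines that mention the given Calling-Station-Id
    (client MAC) and/or Audit-Session-Id. MAC comparison is
    case-insensitive since formats vary between devices."""
    hits = []
    for line in log_lines:
        if calling_station_id and calling_station_id.lower() not in line.lower():
            continue
        if session_id and session_id not in line:
            continue
        hits.append(line)
    return hits

# Hypothetical log lines for demonstration only:
logs = [
    "Access-Accept Calling-Station-Id=AA-BB-CC-11-22-33 Audit-Session-Id=0A0A0A01",
    "Disconnect-Request Calling-Station-Id=AA-BB-CC-11-22-33 Audit-Session-Id=0A0A0A01",
    "Access-Accept Calling-Station-Id=DD-EE-FF-44-55-66 Audit-Session-Id=0A0A0A02",
]
print(len(correlate(logs, calling_station_id="aa-bb-cc-11-22-33")))  # 2
```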


NAS Response Link

Wireshark helpfully gives a link to the frame containing the NAS's response to the RADIUS server. This Disconnect-ACK packet will be reviewed in the authenticator-sourced packets section later in this post.



CoA-Request

Wireshark Filter: radius.code == 43

Unlike the Disconnect-Request above, a CoA-Request can carry a number of actions, anything from reauthentication to bouncing or shutting down a port. Many of these are vendor-specific, so in this instance I will use Cisco ISE's CoA-Request info. Note that the Audit-Session-Id, Calling-Station-Id, and the response link are still useful here, along with the attributes below.


Useful Info:

Cisco-AVPair: subscriber:command = XXXXXX

This is where we are able to request that the authenticator perform a function. With Cisco ISE it is rolled into a Cisco-AVPair: subscriber:command.

For instance:

  • subscriber:command=reauthenticate

This request will cause a reauthentication, either of the client via EAP, or, in the event of a MAC-authenticated session, by the authenticator resending the MAC address and session ID.



  • subscriber:command=bounce-host-port

This is a wired-only CoA request. A request to bounce the host port results in a link-down/link-up event on the switchport, which can be useful for trying to move a client to a new VLAN. I do not recommend defaulting to this for guest portals, however, as it takes some tweaking of the core CoA configurations; in ISE it would mean rewriting the Network Device Profile and CoA reauth requests to include a port bounce, which I do not believe is a recommended practice.



  • subscriber:command=disable-host-port

This is another wired-only CoA request. It will disable the switchport if the switch supports it; I have seen cases where the switch does not support a port shutdown and bounces the port instead. This is not a recommended CoA request for most situations, as it takes manual intervention to recover. A VLAN or ACL change is far more effective, even a change to a VLAN that doesn't exist (a blackhole).


Authenticator Sourced Packets

Now we will review the packets sent in response to a CoA-Request or Disconnect-Request from the server. These are fairly simple, and usually just indicate an ACK for success or a NAK for failure.


Disconnect-ACK

Wireshark Filter: radius.code == 41

This is an acknowledgment, from the authenticator to the RADIUS server, of a successful Disconnect-Request. This packet can contain attributes such as the session that was disconnected and the Calling-Station-Id, or simply the Message-Authenticator.



Disconnect-NAK

Wireshark Filter: radius.code == 42

This is a negative acknowledgement, indicating a failed Disconnect-Request. It might happen if the client is already disconnected or the session ended before the Disconnect-Request arrived. In the example screenshot we can see a useful bit of information in the Error-Cause attribute.



CoA-ACK

Wireshark Filter: radius.code == 44

As with the Disconnect-ACK, the CoA-ACK is just an acknowledgement that the requested CoA action succeeded. This packet can contain attributes such as the affected session and the Calling-Station-Id, or simply the Message-Authenticator.



CoA-NAK

Wireshark Filter: radius.code == 45

Once again, just like the Disconnect-NAK, the CoA-NAK is a negative acknowledgement indicating a failed CoA action. This could be due to lack of support on the authenticator, or because the session ended before the CoA-Request arrived. Just like the Disconnect-NAK, we get a nice Error-Cause attribute for further troubleshooting.



In closing

One thing to remember is that CoA can be used to create some very complex if-this-then-that scenarios. In the end, however, it is not a complex feature, and it is definitely not magic! I hope this post was informative for you. If you find anything incorrect, please let me know. Thanks and good luck!

Single SSID BYOD Onboarding

**This video builds on the previous video, BYOD with Device Registration and Native Supplicant Provisioning, so please be sure to watch it for the certificate template configuration and some of the SSID configuration.**

In this video we configure ISE and wireless with a single SSID for WPA2-Enterprise to perform device registration and EAP-TLS provisioning.

BYOD with Device Registration and Native Supplicant Provisioning

Aside from standard RADIUS authentication and guest access, ISE is also useful for secure BYOD access. In this video I walk through building an onboarding SSID and a secure SSID in Dashboard. Then, in ISE, we configure the guest portal, certificate template, native supplicant provisioning profile, and the rule sets that put it all in play. Once that is done, we test and verify access.