Guidelines for High Availability

Model Support

  • Firepower 1010:

    • You should not use the switch port functionality when using High Availability. Because the switch ports operate in hardware, they continue to pass traffic on both the active and the standby units. High Availability is designed to prevent traffic from passing through the standby unit, but this feature does not extend to switch ports. In a normal High Availability network setup, active switch ports on both units will lead to network loops. We suggest that you use external switches for any switching capability. Note that VLAN interfaces can be monitored by failover, while switch ports cannot. Theoretically, you can put a single switch port on a VLAN and successfully use High Availability, but a simpler setup is to use physical firewall interfaces instead.

    • You can only use a firewall interface as the failover link.

  • Firepower 9300—Intra-chassis High Availability is not supported.

  • The threat defense virtual on public cloud networks such as Microsoft Azure and Amazon Web Services is not supported with High Availability because Layer 2 connectivity is required.

Additional Guidelines

  • When the active unit fails over to the standby unit, the connected switch port running Spanning Tree Protocol (STP) can go into a blocking state for 30 to 50 seconds when it senses the topology change. To avoid traffic loss while the port is in a blocking state, you can enable the STP PortFast feature on the switch:

    interface interface_id
    spanning-tree portfast

    This workaround applies to switches connected to both routed mode and bridge group interfaces. The PortFast feature immediately transitions the port into STP forwarding mode upon linkup. The port still participates in STP, so if the port is part of a loop, it eventually transitions into STP blocking mode.

  • Configuring port security on the switches connected to the threat defense device failover pair can cause communication problems when a failover event occurs. The problem occurs because, when a secure MAC address configured or learned on one secure port moves to another secure port, the switch port security feature flags a violation.

  • For Active/Standby High Availability and a VPN IPsec tunnel, you cannot monitor both the active and standby units using SNMP over the VPN tunnel. The standby unit does not have an active VPN tunnel, and will drop traffic destined for the NMS. You can instead use SNMPv3 with encryption so the IPsec tunnel is not required.

  • If you run clish on either peer device while creating a High Availability pair, both peer devices go into an unknown state and the High Availability configuration fails.

  • Immediately after failover, the source address of syslog messages will be the failover interface address for a few seconds.

  • For better convergence during a failover, you must shut down the interfaces on an HA pair that are not associated with any configuration or instance.

  • If you configure failover encryption in evaluation mode, the systems use DES for the encryption. If you then register the devices using an export-compliant account, the devices will use AES after a reboot. Thus, if a system reboots for any reason, including after installing an upgrade, the peers will be unable to communicate and both units will become the active unit. We recommend that you do not configure encryption until after you register the devices. If you do configure this in evaluation mode, we recommend you remove the encryption before registering the devices.
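
The mismatch described above can be reasoned through with a toy model. The following Python sketch is only an illustration of the stated behavior, not device code; the function names and the two boolean inputs are hypothetical.

```python
def failover_cipher(registered: bool, rebooted_since_registration: bool) -> str:
    """Toy model of the behavior described above (not device code)."""
    # Evaluation mode uses DES. AES takes effect only after the device is
    # registered with an export-compliant account and then rebooted.
    if registered and rebooted_since_registration:
        return "AES"
    return "DES"


def peers_can_communicate(cipher_a: str, cipher_b: str) -> bool:
    """Peers can talk over the failover link only if the ciphers match."""
    return cipher_a == cipher_b


# Encryption was configured in evaluation mode, both units were then
# registered, and only unit A has rebooted since.
unit_a = failover_cipher(registered=True, rebooted_since_registration=True)
unit_b = failover_cipher(registered=True, rebooted_since_registration=False)
print(unit_a, unit_b, peers_can_communicate(unit_a, unit_b))
```

In this model the units end up on different ciphers after a single reboot, which corresponds to the case where the peers cannot communicate and both become active.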

  • When using SNMPv3 with failover, if you replace a failover unit, then SNMPv3 users are not replicated to the new unit. You must remove the users, re-add them, and then redeploy your configuration to force the users to replicate to the new unit.

  • The device does not share SNMP client engine data with its peer.

  • If you have a very large number of access control and NAT rules, the size of the configuration can prevent efficient configuration replication, resulting in the standby unit taking an excessively long time to reach standby ready state. This can also impact your ability to connect to the standby unit during replication through the console or SSH session. To enhance configuration replication performance, enable transactional commit for both access rules and NAT, using the asp rule-engine transactional-commit access-group and asp rule-engine transactional-commit nat commands.

  • A unit in a High Availability pair transitioning to the standby role synchronizes its clock with the active unit.

    Example:

    firepower#show clock
    01:00:52 UTC Mar 1 2022
    
    ...
    01:01:18 UTC Mar 1 2022 <======= Incorrect (previous) clock
    Cold Standby               Sync Config                Detected an Active mate
    
    19:38:21 UTC Apr 9 2022 <======= Updated clock
    Sync Config                Sync File System           Detected an Active mate
    ...
    firepower/sec/stby#show clock
    19:38:40 UTC Apr 9 2022

  • The units in High Availability do not dynamically synchronize the clock. Here are some examples of events when synchronization takes place:

    • A new High Availability pair is created.

    • High Availability is broken and re-created.

    • Communication over the failover link was disrupted and reestablished.

    • Failover status was manually changed at the CLI using the no failover/failover or configure high-availability suspend/resume (threat defense) commands.
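
To gauge how far the clock moved during the transition in the example above, the two timestamps can be compared offline. The following is a minimal Python sketch, not a Cisco tool. The helper name parse_show_clock is hypothetical, and the sketch assumes the "HH:MM:SS UTC Mon D YYYY" layout seen in the example output.

```python
from datetime import datetime, timedelta

# Layout assumed from the example output above, e.g. "01:01:18 UTC Mar 1 2022".
CLOCK_FORMAT = "%H:%M:%S UTC %b %d %Y"


def parse_show_clock(line: str) -> datetime:
    """Parse one timestamp line (hypothetical helper, UTC only)."""
    # Collapse repeated whitespace before parsing.
    return datetime.strptime(" ".join(line.split()), CLOCK_FORMAT)


before = parse_show_clock("01:01:18 UTC Mar 1 2022")  # previous clock
after = parse_show_clock("19:38:21 UTC Apr 9 2022")   # clock after sync
jump = after - before
print(jump)
```

A large jump such as this one is worth keeping in mind when correlating logs from before and after the unit became standby.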

  • Enabling High Availability forces all routes to be deleted. The routes are re-added after the High Availability progression changes to the Active state. You could experience connection loss during this phase.

  • If you replace the primary unit, then when you re-create high-availability, you should set the replacement unit as the secondary unit so that the configurations are replicated from the former secondary unit to the replacement unit. If you set the replacement unit as primary, you will accidentally overwrite the configuration that is present on the operational unit.

  • Deploying Firepower 1100 devices in high availability with hundreds of interfaces configured on them can increase the failover time (on the order of seconds).

  • In a High Availability configuration, short-lived connections, usually using port 53, are closed quickly and never transferred or synchronized from the active unit to the standby unit, so the number of connections can differ between the two High Availability devices. This is expected behavior for short-lived connections. For a meaningful comparison, compare long-lived connections (for example, those lasting more than 30 to 60 seconds).

  • In the High Availability configuration, embryonic connections—connection requests that have not yet completed the three-way handshake process—are closed quickly and not synchronized between the active and standby devices. This design ensures HA system efficiency and security. For this reason, there might be a difference in the number of connections on both High Availability devices, which is to be expected.
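
Because short-lived and embryonic connections are not synchronized, a raw count comparison between the two units can be misleading. One way to make the comparison meaningful is to count only established connections older than a threshold. The Python sketch below assumes the connection data has already been collected from each unit into simple records. The Conn type, its fields, the sample addresses, and the 60-second threshold are illustrative assumptions, not a device data format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conn:
    """Hypothetical connection record; not a device data format."""
    src: str
    dst: str
    dport: int
    age_seconds: int
    established: bool  # False for embryonic (handshake not completed)


def long_lived(conns, min_age=60):
    """Return keys of established connections at least min_age seconds old."""
    return {(c.src, c.dst, c.dport) for c in conns
            if c.established and c.age_seconds >= min_age}


active = [
    Conn("10.0.0.5", "192.0.2.10", 443, 300, True),
    Conn("10.0.0.6", "192.0.2.53", 53, 1, True),    # short-lived DNS lookup
    Conn("10.0.0.7", "192.0.2.20", 22, 2, False),   # embryonic
]
standby = [
    Conn("10.0.0.5", "192.0.2.10", 443, 300, True),
]

# Raw counts differ (3 versus 1), but the long-lived sets match.
missing_on_standby = long_lived(active) - long_lived(standby)
print(len(active), len(standby), missing_on_standby)
```

With this sample data the raw counts differ while no long-lived connection is missing on the standby unit, which is the expected situation described above.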

  • If the failover LAN link is not connected back-to-back and is instead connected through one or more switches, a failure within the intermediate path can cause the active unit to lose connectivity with the standby unit, resulting in inconsistent active/standby states. Although this does not impact High Availability functionality, we recommend that you check and recover the failover-link path between the active and standby units.

    When the failover LAN link is down, we recommend that you do not deploy any configuration, because it might not be replicated to the peer unit.

  • See the Cisco Secure Firewall Threat Defense Virtual Getting Started Guide and review your threat defense virtual device configurations for high availability.