Friday, August 25, 2017

Fail Closed, Fail Open, Fail Safe and Failover: ABCs of Network Visibility

One of the important issues in network operations is how the potential failure of a component will affect overall network performance. Physical and virtual devices deployed on the network can be configured to fail open or fail closed. These conditions impact the delivery of secure, reliable, and highly-responsive IT services.
FAIL CLOSED
Simply stated, failing closed is when a device or system is set, either physically or via software, to shut down and prevent further operation when failure conditions are detected.
This strategy is common in situations where security concerns override the need for access. We encounter this every day when we forget the password to a seldom-used personal account and are denied entry. A physical example is the failure of a metal detector at the entrance to a federal courthouse, which leads to a long line of people waiting to get in at a second door, while a technician tries to repair the first door. In these situations, access is a second priority to security.
Benefits of Fail Closed
To prioritize security: In an IP network, security appliances like firewalls can be configured to fail closed, to prevent incoming Internet traffic from being passed into your internal network when the firewall is unable to confirm that the packet is allowed. The network outage that results from a firewall outage can be minimal if a backup firewall quickly takes over processing duties (like the second door at the courthouse). The fail closed condition generally provides greater confidence that a cyber threat or attack will not sneak in while a firewall is offline.
It’s important to note that the fail closed strategy, even for a device like a firewall, has not always been the rule. In some environments, network interruption can be a greater concern than security, leading to the choice to fail open. This was more frequently the case in the early days of firewall deployment, when organizations were learning how to balance the need for security inspection with network availability.
FAIL OPEN
A system set to fail open does not shut down when failure conditions are present. Instead, the system remains “open” and operations continue as if the system were not even in place.
This strategy is used when access is deemed more important than authentication. Healthcare systems are sometimes operated on a fail open basis, such as when emergency care is provided even without authentication of insurance coverage or the ability to pay. The risk (of non-payment in this case) is essentially mitigated by performing authentication after the fact. Another example often cited is a door with an electronic locking mechanism that unlocks automatically when the system fails and is unable to authenticate access credentials. This ensures an exit is made available, particularly in the event of a fire or natural disaster that disables electronic systems.
Benefits of Fail Open
To protect access: Historically, some organizations considered inline deployment of a network firewall to be a “nice-to-have,” rather than an essential element of IT security. When a firewall failed, they preferred to have it fail open and let Internet traffic proceed on into the internal network without authentication. The thinking was that the majority of traffic was safe and the risk of a network breach was low, so it did not make good business sense to interrupt network operations. The business risk was minimized by prioritizing firewall restoration to limit potential exposure and by analyzing copies of network traffic (using out-of-band tools) to detect suspicious activity after the fact. The fail open condition prevailed in situations where access was deemed more important than security.
To supplement other security appliances: There are other security solutions that organizations may want to operate in a fail open condition to supplement the function of existing security appliances. One example is an advanced malware protection (AMP) sandbox, which is used to execute unknown files in a safe environment and provide the results to anti-malware solutions. Since the sandbox is supplementing the main device, its failure may not require a complete shutdown of processing.
For deployment and testing: Another practical use for fail open is during the initial deployment and testing period of a new security appliance. Configuring a new device to fail open allows the team to become comfortable with the operation and learn how to respond to alert situations without becoming overwhelmed. Once the team feels confident, the device can be switched over to a fail closed condition, for greater risk management.
FAIL SAFE
Another relevant term is fail safe, which refers to a device that is configured to protect all other components in the system from failure in the event the device itself fails. Practically, this can have the same result as failing open, but fail safe is often achieved through the addition of a separate device, known as a bypass switch.
Bypass switches are deployed “in front of” network devices and work by establishing a direct connection to the device and monitoring its ability to receive and process traffic. This is achieved by sending a very small network packet, called a heartbeat packet, to the device at very fast intervals—generally one every couple microseconds. If the packet is returned, the bypass remains open; if the packet is not returned, traffic is bypassed around the device and moved along to the next switch in the network.
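The heartbeat check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not vendor firmware: the class name BypassSwitch, the missed_limit threshold (how many consecutive lost heartbeats trigger the bypass), and the simulated heartbeat results are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BypassSwitch:
    """Toy model of an external bypass switch (all names are hypothetical)."""
    missed_limit: int = 3   # assumed: consecutive lost heartbeats before bypassing
    missed: int = 0
    bypassing: bool = False

    def on_heartbeat(self, returned: bool) -> str:
        """Record one heartbeat result and report where traffic goes next."""
        if returned:
            # The inline device answered: reset the counter and use it again.
            self.missed = 0
            self.bypassing = False
        else:
            self.missed += 1
            if self.missed >= self.missed_limit:
                # The device is unresponsive: route traffic around it.
                self.bypassing = True
        return "bypass" if self.bypassing else "inline"


switch = BypassSwitch()
# Simulated heartbeat results: one good, three lost, then the device recovers.
results = [switch.on_heartbeat(ok) for ok in (True, False, False, False, True)]
print(results)
```

Setting missed_limit to 1 reproduces the simplest behavior, where a single unreturned heartbeat moves traffic around the device.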
Many network security appliances, such as next generation firewalls and IPS solutions, now include an internal bypass function. However, internal bypasses do not provide all of the functionality of an external bypass switch.
An external bypass switch deployed in front of a network device can be activated proactively by the IT staff, to take a device offline for regular maintenance, periodic troubleshooting, or repositioning in the network. The external bypass essentially removes a particular device temporarily from the active network, eliminating the need to wait for a network maintenance window to perform upgrades or respond to support issues.
FAILOVER
A final concept to consider is failover, the ability to recover the functionality of network devices that fail. This is a broader concept than fail safe, which specifies only that there is no adverse impact to other components. Failover implies recovery of functionality, achieved through redundancy. External bypass switches are now available with the ability to designate an alternative path for traffic in the event of a network device failure. For example, should the primary IPS appliance fail, when the external bypass switch detects the failure (within microseconds of the event), the switch can automatically begin sending traffic to a secondary, backup appliance. This can be a cost-effective solution for achieving resiliency.
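One way to see how failover relates to fail open and fail closed is to write the path choice as a small function. The sketch below is a hypothetical model, assuming a switch that knows whether a primary and a secondary appliance are healthy; the function name select_path and the fail_mode parameter are invented for the example.

```python
from typing import Optional


def select_path(primary_up: bool, secondary_up: bool,
                fail_mode: str = "open") -> Optional[str]:
    """Return the path traffic takes, or None when traffic is halted.

    fail_mode only matters when no appliance is available:
    "open" forwards traffic uninspected, "closed" stops it.
    """
    if primary_up:
        return "primary"
    if secondary_up:
        return "secondary"   # failover: inspection continues on the backup
    if fail_mode == "open":
        return "bypass"      # fail open / fail safe: go around the appliances
    if fail_mode == "closed":
        return None          # fail closed: halt the flow of traffic
    raise ValueError(f"unknown fail_mode: {fail_mode!r}")


print(select_path(primary_up=False, secondary_up=True))
print(select_path(primary_up=False, secondary_up=False, fail_mode="closed"))
```

The ordering makes the point of the section: failover is tried first, and the open or closed policy is only the last resort when redundancy is exhausted.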
Summary
Depending on an organization’s priorities, the failure of a security appliance or other network device can be handled by halting the flow of network traffic (configuring to fail closed) or moving the traffic around the offline device (fail open or fail safe), or directing to a backup appliance (failover). These choices enable an enterprise to deliver secure, reliable, and highly-responsive IT services.

Fortigate Optimize AV (Fail-Open and Fail-Open session)

1 Antivirus failopen

1.1 - Introduction

Dealing with high traffic volume may cause the following two problems:
  • Running in conserve mode due to low system memory
  • Proxy connection pool has no free connections
    The first problem deals with low memory situations. The antivirus system operates in one of two modes, conserve mode and non-conserve mode, depending on available memory for the whole FortiGate unit. If the free memory is greater than 30% of the total memory, then the system is in non-conserve mode. If the free memory drops to less than 20% of the total memory, then the system enters conserve mode. The system will not go back to non-conserve mode until the free memory once again reaches 30% or greater of the total memory.
    The second problem deals with connection pools and has the av-failopen feature working on a localized level and affecting a single proxy. If a FortiGate unit is receiving large volumes of traffic on a specific proxy, it is possible that the unit will exceed the connection pool limit.  If the number of free connections within a proxy connection pool reaches zero, the av-failopen will be applied to that specific proxy only. Each proxy calculates the size of its connection pool at start up, based on the available memory of the FortiGate. On the FGT5001SX product, for example, when 2G of memory is installed and available, theoretically, each proxy can handle around 9500 connections. But in fact, the installed 2G memory will be shared with the OS and other programs. So, when the proxy starts, the available memory is always less than 2G.
    If either situation occurs, or if both conditions co-exist, the problem will be resolved by the antivirus failopen feature.
    Antivirus failopen is a safeguard feature that determines the behavior of the FortiGate antivirus system if it becomes overloaded in high traffic. The feature is configurable in the CLI only. The command set av-failopen has the following three options.
    off: If the FortiGate unit enters conserve mode, the antivirus system will stop accepting new AV sessions but will continue to process current active sessions.
    one-shot: If the FortiGate unit enters conserve mode, all subsequent connections bypass the antivirus system but current active sessions will continue to be processed. One-shot is similar to pass but will not automatically turn off once the condition causing av-failopen has stopped.
    WARNING: With the one-shot option, no content filtering of the traffic is done (except perhaps IPS). The data stream could contain malicious content.
    pass: Default setting. If the system enters conserve mode, connections bypass the antivirus system until the system enters non-conserve mode again. Current active sessions will continue to be processed.
    WARNING: With the pass option, no content filtering of the traffic is done (except perhaps IPS). The data stream could contain malicious content.
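The conserve-mode thresholds given in the introduction (enter below 20% free memory, leave only at 30% or more) form a hysteresis band. The following Python sketch models just that state change; it is not FortiOS code, and the class name ConserveModeTracker and the sample readings are assumptions for illustration.

```python
class ConserveModeTracker:
    """Toy model of the conserve-mode hysteresis (hypothetical class)."""

    ENTER_BELOW = 20.0   # percent free memory, from the description above
    EXIT_AT = 30.0       # percent free memory, from the description above

    def __init__(self) -> None:
        self.conserve = False

    def update(self, free_percent: float) -> bool:
        """Feed one free-memory reading; return True while in conserve mode."""
        if not self.conserve and free_percent < self.ENTER_BELOW:
            self.conserve = True
        elif self.conserve and free_percent >= self.EXIT_AT:
            self.conserve = False
        # Readings between the two thresholds leave the mode unchanged.
        return self.conserve


tracker = ConserveModeTracker()
samples = [35.0, 25.0, 19.0, 25.0, 29.0, 30.0]
states = [tracker.update(s) for s in samples]
print(states)
```

Note that the reading of 25% appears twice with different results: it does not trigger conserve mode on the way down, and it does not end conserve mode on the way back up.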

    1.2 - How antivirus failopen works

    There are currently 2 conditions that can cause the FortiGate unit to operate in failopen mode:
    • The system is low on memory and has entered conserve mode.
    • The individual proxy pool is full (no free connections are available).
    In the tables, B = connection blocked, P = connection passed.
    With the first condition, low memory, the av-failopen setting will be applied; see table one. The default for this setting is Pass.
    Table 1: av-failopen
    off    one-shot    pass
    B      P           P
    With the second condition (the individual proxy pool is full), the action will depend on the av-failopen-session settings. There are two settings, enabled and disabled (default).
    • If the av-failopen-session is enabled and the free connections in the proxy connection pool reach zero, the protocol reverts back to the av-failopen settings as in table one.
    • If the av-failopen-session is disabled, then all sessions will be blocked for the proxy, regardless of the av-failopen settings. See table two.
    Table 2: av-failopen-session
               off    one-shot    pass
    disable    B      B           B
    In the event that both conditions exist at the same time, the av-failopen settings will override the av-failopen-session settings. For example:
    The HTTPS connection pool reaches capacity and the av-failopen-session setting is enabled. The HTTPS proxy will revert to the av-failopen settings and will behave according to table one. No other proxies will be affected and the FortiGate unit will not enter conserve mode. The traffic to the FortiGate unit continues to increase and the free memory drops below the 20% threshold. The FortiGate unit automatically enters conserve mode and the av-failopen-session settings are overridden. All proxies are now affected by the av-failopen settings (see table 1) regardless of the av-failopen-session settings.
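The interaction between the two settings can be summarized as a small decision function. The Python below is a sketch of the rules stated in this section (tables one and two plus the override rule), not an excerpt from FortiOS; the function name failopen_action and its parameters are hypothetical.

```python
from typing import Optional

# Table 1: action applied under av-failopen (B = blocked, P = passed).
TABLE_1 = {"off": "B", "one-shot": "P", "pass": "P"}


def failopen_action(conserve_mode: bool, pool_full: bool,
                    av_failopen: str = "pass",
                    av_failopen_session: bool = False) -> Optional[str]:
    """Return "B" or "P" for a new connection, or None if no overload applies."""
    if av_failopen not in TABLE_1:
        raise ValueError(f"unknown av-failopen option: {av_failopen!r}")
    if conserve_mode:
        # Low memory: av-failopen applies and overrides av-failopen-session.
        return TABLE_1[av_failopen]
    if pool_full:
        # Proxy pool exhausted: follow table 1 only if the session option is on.
        return TABLE_1[av_failopen] if av_failopen_session else "B"
    return None   # neither condition: normal antivirus processing


# The HTTPS pool fills up first, then the unit drops into conserve mode.
print(failopen_action(False, True, "pass", av_failopen_session=True))
print(failopen_action(True, True, "off", av_failopen_session=False))
```

The defaults in the signature mirror the defaults named in the text: pass for av-failopen and disabled for av-failopen-session.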

    1.3 - How to configure antivirus failopen

    Antivirus failopen is only available through the command line interface (CLI).
    To enable antivirus failopen
    1. Log in to the FortiGate unit CLI.
    2. Enter the following command with the desired option.
      config system global
          set av-failopen {off | one-shot | pass}
      end
    3. Enter get system global to confirm the settings.

    1.4 - How to configure antivirus failopen session


    Antivirus failopen session is only available through the command line interface (CLI).
    To enable antivirus failopen session
    1. Log in to the FortiGate unit CLI.
    2. Enter the following command with the desired option.
    config system global
        set av-failopen-session {enable | disable}
    end
    3. Enter get system global to confirm the settings.

      2 - Optimize antivirus

      The optimize feature configures CPU settings to ensure efficient operation of the FortiGate unit for either antivirus scanning or straight throughput traffic. When optimize is set to antivirus, the FortiGate unit uses symmetric multiprocessing to spread the antivirus tasks to several CPUs, making scanning faster.
      Note: These procedures are only available for the FortiGate-1000 and higher.
      There are two options for optimize.
      antivirus: The FortiGate unit spreads the antivirus scanning tasks across several CPUs (symmetric multiprocessing).
      throughput: Default setting. The FortiGate unit uses a single CPU to process traffic.

      2.1 - When to use optimize antivirus

      Use optimize antivirus in conjunction with antivirus failopen to ensure maximum efficiency and safeguard against system crashes if the system does become overloaded because of high traffic.

      2.2 - How to configure optimize antivirus

      Optimize is only available through the command line interface (CLI).
      To enable optimize antivirus
      1. Log in to the FortiGate unit CLI.
      2. Enter
        config system global
            set optimize {antivirus | throughput}
        The following warning appears:
        This change will reboot the system.
        If you don't want it to be changed, type "abort"
      3. Type end
        The system reboots.
      4. Log back in to the CLI and enter get system global to confirm the settings.
      Note: If you get the following message when you enter the optimize command, then this command is not available on the FortiGate unit:
        command parse error before 'optimize' command fail. return code -61
      To restore a configuration including optimize antivirus
      If you are restoring a backed up configuration to the FortiGate unit, you must manually enable optimize antivirus through the CLI, even if the backup already includes this command.
      After restoring the configuration, follow steps 1 through 4 above to enable optimize antivirus.
Tuesday, August 15, 2017

      Fortigate Cheat and Tricks

      General Tips

      • You can use the grep utility to filter output from the commands below.
        • Use grep -f to show the context of the grepped item.

      External support (Fortinet)

      • Generate a TAC report: exec tac report
      • Get crash log: diag debug crashlog read shows the crashlog in a readable format.

      System

      Status

      • Show system status: get system status

      Open Network Connections

      • List open networking ports: diagnose sys tcpsock

      Performance

      • Show performance usage: get system performance status
      • Show top: get system performance top, use SHIFT+M to sort on memory usage.
      • Show top with grouped processes: diagnose sys top-summary
        • Use diagnose sys top-summary -h to show the help message for top-summary
      • Show shared memory information: diagnose hardware sysinfo shm
        • Check whether conservemode is 1.

      Processes

      LDAP / RADIUS Authentication

      • Use the following commands to debug LDAP or RADIUS:
      diagnose debug enable
      diagnose debug application fnbamd -1
      

      High Availability

      • Show HA status: get system ha status
      • Show HA checksum: get system checksum status
      • Manage other cluster member through HA interface: exec ha manage 0/1
      • Show a HA diff: diagnose sys ha hadiff status
      • Execute a failover: diagnose sys ha reset-uptime

      Object Management

      • Find object dependencies for object (example): diag sys checkused system.interface.name port1

      Log

      • Set a log filter: execute log filter
      • Show log: exec log show

      Layer 1 (Physical Layer)

      Network Interface Card

      • Show all NICs: config system interface
      • Show hardware info for NIC: diagnose hardware deviceinfo nic
      • Show device information for specific NIC: diagnose hardware deviceinfo nic <nic>

      Layer 2 (Data Link Layer)

      Address Resolution Protocol (ARP)

      • Show ARP table: get system arp
      • View ARP cache: diag ip arp list
      • Clear ARP cache: execute clear system arp table
      • Remove a single ARP table entry: diag ip arp delete <interface name> <IP address>
      • Add static ARP entries: config system arp-table

      Layer 3 (Network Layer)

      Internet Protocol

      • Execute a ping: exec ping <dst>
      • Set specific ping options: exec ping-options
        • Set specific source IP: exec ping-options source
      • Execute a telnet: exec telnet ip:port

      Routing

      • Show routing table: get router info routing-table all
      • Show routing database: get router info routing-table database
      • Get routing information for specific <host>: get router info routing-table details <host>
      • Execute a traceroute: exec traceroute
      • Poor man's traceroute
        If you would like to test a traceroute from a different source IP than the one assigned to your outbound interface, you can use this poor man's traceroute.
        Use this procedure:
        1. Open a second SSH session and run the sniffer on the outbound interface, filtered for ICMP.
        2. Set the execute ping-options ttl to 1.
        3. Set the execute ping-options source to your source IP.
        4. Ping the target host.
        5. Observe the ICMP time to live exceeded message you get from the first router.
        6. Increase the TTL to 2 and repeat from step 4.
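The procedure above is repetitive enough that it can help to generate the command sequence before pasting it into the CLI. The Python helper below is a hypothetical convenience, not a Fortinet tool. It assumes the value stepped per hop is the TTL (execute ping-options ttl), since that is what makes each successive router answer with a time to live exceeded message; the addresses in the example are placeholders.

```python
from typing import List


def poor_mans_traceroute_commands(source_ip: str, target: str,
                                  max_hops: int = 5) -> List[str]:
    """Build the FortiGate CLI commands for a manual, TTL-stepped traceroute."""
    if max_hops < 1:
        raise ValueError("max_hops must be at least 1")
    # The source address only has to be set once.
    commands = [f"execute ping-options source {source_ip}"]
    for ttl in range(1, max_hops + 1):
        # Each round raises the TTL by one so the next router replies.
        commands.append(f"execute ping-options ttl {ttl}")
        commands.append(f"execute ping {target}")
    return commands


# Placeholder addresses, for illustration only.
for line in poor_mans_traceroute_commands("10.10.0.1", "8.8.8.8", max_hops=3):
    print(line)
```

The time to live exceeded replies still have to be read from the sniffer running in the second session; the helper only saves typing.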

      OSPF

      Use Fortinet's recommended procedure to debug OSPF: http://kb.fortinet.com/kb/viewContent.do?externalId=FD31207
      • Show OSPF neighbor status: get router info ospf neighbor all
      • Delete all OSPF entries: execute router clear ospf process
      • Show OSPF router status: get router info ospf status
      • Dump OSPF packets on any interface: diagnose sniffer packet any 'proto 89' 4 0
      • Show OSPF interface: get router info ospf interface
      • Show OSPF database: get router info ospf database brief

      IPSEC

      • Show list of IPSEC VPN tunnels: get vpn ipsec tunnel summary
      • Show details for IPSEC VPN tunnel: get vpn ipsec tunnel detail
      • Debug IKE:
      diag debug application ike 63 
      diagnose vpn ike log-filter clear
      diagnose vpn ike log-filter dst-addr 1.2.3.4
      diagnose debug app ike 255
      diagnose debug enable
      
      Look for:
      • SNMP tunnel UP / Down traps
      • Own and remote proposal

      Geo IP Information

      • Show Geo IP IP address list: diagnose firewall ipgeo ip-list
      • Show Geo IP countries: diagnose firewall ipgeo country-list
      • Update Geo IP addresses: execute update-geo-ip

      Layer 4 (Transport Layer)

      Firewall

      • Show session table: diagnose sys session list
      • Show session table with statistics: diagnose firewall statistics show
      • Short list for session table: get system session list

      Session List Filters

      It is possible to set filters for the session list.
      • Clear session list filter: diagnose sys session filter clear
      • Show possible session list filters: diagnose sys session filter ?
      • Set session filter for destination IP: diagnose sys session filter dst 8.8.8.8
      • Set session filter for destination port: diagnose sys session filter dport 53

      Traffic Flow through FortiGate

      • Use traffic flow to debug FortiGate policy problems such as NAT.
      diagnose debug enable
      diagnose debug flow show console enable
      diagnose debug flow show function enable
      diagnose debug flow filter addr 10.10.0.1
      diagnose debug flow trace start 100
      

      Sniffer

      • Dump packets on interface: diagnose sniffer packet <interface> '<tcpdump filter>'
      Packets with TCP RST flag set:
      diagnose sniffer packet internal 'tcp[13] & 4 != 0'
      
      Packets with TCP SYN flag set:
      diagnose sniffer packet internal 'tcp[13] & 2 != 0'
      
      Packets with TCP SYN ACK flag set:
      diagnose sniffer packet internal 'tcp[13]=18'
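The numbers in the filters above come from the flags byte of the TCP header, which sits at offset 13. The short Python sketch below shows the arithmetic; the flag values are the standard TCP bit positions, while the helper names flag_byte and has_flag are made up for this example.

```python
# Bit values of the flags in byte 13 of the TCP header.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def flag_byte(*names: str) -> int:
    """Combine flag names into the value byte 13 would hold."""
    value = 0
    for name in names:
        value |= TCP_FLAGS[name]
    return value


def has_flag(byte13: int, name: str) -> bool:
    """Mirror a filter such as 'tcp[13] & 4 != 0' (flag set, others ignored)."""
    return (byte13 & TCP_FLAGS[name]) != 0


syn_ack = flag_byte("SYN", "ACK")
print(syn_ack)                    # value used in the exact-match filter
print(has_flag(syn_ack, "SYN"))   # a SYN mask filter also matches SYN ACK
print(has_flag(syn_ack, "RST"))
```

This is why the SYN filter with a mask also captures SYN ACK packets, while the equality filter with 18 captures only packets whose flags are exactly SYN plus ACK.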
      
      

      Layer 5 (Session Layer)

      SSL-Inspection

      • Show possible diagnose commands: diagnose test application ssl 0
      • Show SSL proxy usage: diagnose test application ssl 4
      • Show info per connection: diagnose test application ssl 44

      Fortinet Single Sign-On (FSSO)

      • Debug FSSO:
      diag debug enable
      diag debug authd fsso list
      diag debug authd fsso server-status
      diag debug authd fsso-summary
      

      Layer 7 (Application Layer)

      Proxy

      • Show user list: diagnose wad user list
      • Test HTTP proxy: diagnose test application http
      • Enable console log for proxy:
      execute log filter dump
      execute log filter category 0
      execute log filter field hostname www.google.ch
      execute log display
      

      FortiGuard

      • Show list of FortiGuard servers: diag debug rating

      Antivirus

      • Update Antivirus Database: execute update-now

      IPS

      • Use diagnose test application ipsmonitor ? to get a menu for the IPS monitor.
      • Show DoS anomaly list: diagnose ips anomaly list