Cloud Infrastructure Classic

Topic

    David Nicholas
    Topic posted July 2, 2018 by David Nicholas, tagged Tip
    Title:
    How to troubleshoot VPN connectivity issues and faults in VPNaaS
    Summary:
    General Troubleshooting and Configuration Considerations When Creating or Modifying VPNaaS
    Content:

    1.      Make sure the Internet Key Exchange (IKE) and Internet Protocol Security (IPsec) settings and timeouts on the VPNaaS and the third-party device agree

    • Authentication: pre-shared keys
    • Encryption: 3DES, AES 128, AES 192, AES 256
    • Hash: MD5, SHA1, SHA2
    • Policy Group: Diffie-Hellman groups supported are 2, 5, 14, 22, 23, 24
    • Using Perfect Forward Secrecy (PFS) is recommended from a security viewpoint.

    Phase1 / IKE

    • IKE ID must match (proposed vs. expected)
    • Lifetime must match
    • Double Check Pre-Shared Key (PSK)

    Phase2 / IPSec

    • NAT-T is a requirement of OCI Classic (OCI-C)
    • Life size: unlimited
    • Lifetime must match
    • Remove idle timeout

    2.  Configure the on-premise VPN device to be responder-only, as the VPNaaS will always make sure the tunnel is up (barring networking issues between OCI-C and the on-premise VPN).

    3. Make sure the on-premise VPN device has the following configuration on all security associations (SAs); some settings may be inherited from global settings

         a. Idle timeouts are OK in general and should be set to reasonable lengths. If you have hundreds of SAs, consider disabling the idle timeout. The default idle timeout on Cisco is 30 minutes. If enabled, an idle timeout expires the SA and causes a rekey.
         b. If traffic limits are required, set reasonable traffic volume limits, sometimes referred to as life size. Some devices default to 4.6 gigabytes. If enabled, this causes a phase 2 rekey whenever that volume of traffic passes over the SA between lifetime rekeys.

    4. If some subnets work and others do not, make sure both sides agree on the subnets participating in the VPN. If more subnets are defined on the on-premise VPN device than on the VPNaaS, the VPNaaS will not report a problem, since it does not know about the other subnets.

    5. Since OCI Classic is a Network Address Translation (NAT) environment, the VPN to the cloud must use NAT Traversal (NAT-T). NAT-T requires UDP port 4500 to be open.

    6. If using High Availability (HA), then Dead Peer Detection (DPD) is required to be enabled.
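
    The checks in items 1, 3b, and 4 above can be sketched in a few lines of Python. This is only an illustration: the setting names, the example values, and the assumption that 4.6 gigabytes means 4,600,000,000 bytes are all hypothetical, and real devices label and export these settings differently.

```python
import ipaddress

# Hypothetical setting names; real devices label these differently.
MUST_MATCH = ("encryption", "hash", "dh_group", "ike_lifetime_s",
              "ipsec_lifetime_s", "pfs")


def setting_mismatches(vpnaas, on_prem):
    """Item 1: return {setting: (vpnaas_value, on_prem_value)} where the sides differ."""
    return {
        key: (vpnaas.get(key), on_prem.get(key))
        for key in MUST_MATCH
        if vpnaas.get(key) != on_prem.get(key)
    }


def subnet_differences(vpnaas_cidrs, on_prem_cidrs):
    """Item 4: return (only on VPNaaS, only on-premise) as sorted CIDR strings."""
    cloud = {ipaddress.ip_network(c) for c in vpnaas_cidrs}
    local = {ipaddress.ip_network(c) for c in on_prem_cidrs}
    return (sorted(str(n) for n in cloud - local),
            sorted(str(n) for n in local - cloud))


def seconds_until_volume_rekey(limit_bytes, throughput_bits_per_s):
    """Item 3b: time for sustained traffic to use up a life-size limit."""
    return limit_bytes * 8 / throughput_bits_per_s


# Illustrative values only, not recommendations.
vpnaas = {"encryption": "AES 256", "hash": "SHA2", "dh_group": 14,
          "ike_lifetime_s": 28800, "ipsec_lifetime_s": 3600, "pfs": True}
on_prem = dict(vpnaas, ike_lifetime_s=86400)

print(setting_mismatches(vpnaas, on_prem))
# {'ike_lifetime_s': (28800, 86400)}

print(subnet_differences(["10.0.0.0/24"], ["10.0.0.0/24", "10.0.1.0/24"]))
# ([], ['10.0.1.0/24'])

# 4.6 GB (taken as 4,600,000,000 bytes) at a sustained 100 Mbit/s:
print(seconds_until_volume_rekey(4_600_000_000, 100_000_000))
# 368.0
```

    In the last example the volume limit would be reached roughly every six minutes under sustained load, far more often than a typical lifetime rekey, which is why the life size setting deserves a look on busy tunnels.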

     

    Vendor Specific Considerations


    Checkpoint

         1. Check Point 1490 Appliance: A Partner ID of type FQDN does not work; the cause is unknown, so use an IP address. This version does not support DPD. NAT-T and PFS work with this device and software.
              a. Software version: R77.20.31 (990170952)
         2. Checkpoint 3200 (tested by OCI Classic)
         3. NAT-T is not always negotiated when a NAT is involved for some Checkpoint devices. The cause is that some software versions do not support NAT-T with fixed-location VPN.

    Fortinet

         1. FortiGate 200D: There is no way to set a hostname, FQDN, or string on the FortiGate for the remote side, so for now it only works by setting the IKE ID to the IP address (WAN) on the Corente side and administering that IP on the FortiGate side. NAT-T, PFS, and DPD are OK.
              a. Software: v5.4.1,build1064 (GA)

    Dell Sonicwall

         1. TZ-600 (tested, no known issues)
              a. Software: SonicOS Enhanced 6.2.7.1-23n
         2. TZ-190 (tested, no known issues)
              a. Software: SonicOS Enhanced 4.2.1.9-20e

    Cisco

         1. Cisco ASA 5505 (tested, no known issues)
              a. Software: Version 9.1(6)
         2. Cisco 2921 (tested, no known issues)
              a. Software: Cisco IOS Version 15.4(3)M3, RELEASE SOFTWARE (fc2)
         3. Cisco ASR1001-X (not tested)
              a. Software:
         4. Cisco ISR 4331 (tested by OCI Classic)
         5. Cisco ASA 5515 (seen working in field)
              a. Verifying the timeout is important.
         6. Cisco ISR (reported from field):
              a. The ISR applies the phase 1 IKE ID in phase 2, so an IKE ID problem is a phase 1 failure even though the VPNaaS may report it as a phase 2 failure. Confirm the IKE ID is correct.

    Juniper

         1. SRX300 (works, no known issues)
              a. Software Version: JunoOS 15.1X49-D45
         2. From field reports:
              a. SRX
                    Software Version: JunoOS12
                    Known issue: "Packet Lost When There Are More Than 2 Subnet Set For 3rd-party VPN Device (JunOS12 or higher) In App Network Manager" (Doc ID 2213615.1)

              b. MX480: does not support NAT-T
                    Junos: 14.1R7.4 (this may be fixed in newer software versions)

    Palo Alto Networks

         1. PA-200 (Not tested)
         2. PA-3020 (Tested by OCI Classic)
              a. AES-256, SHA-256

    Watchguard

         1. Won’t accept a Fully Qualified Domain Name (FQDN) if there is a dot in it. Reported from field
              a. Model: unknown
         2. If there are multiple public IPs on the WAN interface, it will default to the first IP address for outbound IPsec. Reported from field
              a. Model: unknown

    Details and Common Negotiation Issues Seen in the VPNaaS Event Logs

    There are two phases in the IPsec VPN negotiation process. Phase 1 is the creation of the IKE security association (SA), and phase 2 is the creation of the IPsec SA.

    During Phase 1 negotiation, six messages are sent between the two devices:

    • MAIN_I1 (This is the first packet sent by the negotiation initiator.)
    • MAIN_R1 (This is the response to the first initiation packet, sent by the responder.)
    • MAIN_I2 (This is the second piece of the negotiation the initiator sends, after receiving the R1 packet from the responder.)
    • MAIN_R2 (This is the response to the second piece of the negotiation that the responder will send, after receiving the I2 packet from the initiator.)
    • MAIN_I3 (This is the third piece of the negotiation the initiator sends, after receiving the R2 packet from the responder.)
    • MAIN_R3 (This is the response to the third piece of the negotiation that the responder will send, after receiving the I3 packet from the initiator.)
      • Note: MAIN refers to "Main Mode", which is the negotiation mode supported by VPNaaS.
      • Note: The I prior to the packet number denotes "Initiator", whereas the R denotes "Responder".
      • Note: Per the Internet Security Association and Key Management Protocol (ISAKMP), the first VPN endpoint that sends MAIN_I1 and receives MAIN_R1 generally becomes the initiator, barring networking or saturation issues (such as high latency) that may cause a race condition. We highly recommend configuring the on-premise VPN endpoint as responder-only to mitigate potential race conditions.

    If MAIN_I1 is sent but no MAIN_R1 is received, it typically means there is a proposal mismatch, the remote side is not configured to accept the connection, or a network issue is blocking the connection. A proposal mismatch can happen when the encryption formats, authentication (IKE identifiers), or Diffie-Hellman (DH) group don't match between the VPNaaS and the on-premise VPN.

    Note: You can view the VPNaaS gateway logs in the OCI-C user interface by selecting 'View Event Log' for the connection.

    If the remote side is not configured or a firewall is blocking the connection, the log output on the VPNaaS may show:
        : max number of retransmissions (2) reached STATE_MAIN_I1. No response (or no acceptable response) to our first IKE message

    • This message means that the VPNaaS either received no response from the on-premise VPN endpoint or did not receive the expected response (the MAIN_R1 packet) to the MAIN_I1 packet it sent.

    If there is a proposal mismatch, the log output on the VPNaaS may show:
        : sending notification NO_PROPOSAL_CHOSEN to <<CUSTOMER VPN PUBLIC IP>>:500

    • This message means that the on-premise VPN endpoint rejected the configured VPNaaS proposals.

    If MAIN_I3 is sent but no MAIN_R3 is received, the Pre-Shared Key (PSK) or the encryption proposals may not match.

    If there is a PSK mismatch, the VPNaaS logs may show:
        : STATE_MAIN_I3: sent MI3, expecting MR3
        : discarding duplicate packet; already STATE_MAIN_I3
        : discarding duplicate packet; already STATE_MAIN_I3
        : max number of retransmissions (2) reached STATE_MAIN_I3. Possible authentication failure: no acceptable response to our first encrypted message

    If the VPNaaS is the responder and does not send MAIN_R3, the IKE ID may not match.

    If the remote IKE ID on the VPNaaS gateway is incorrect, the VPNaaS logs may show:
        : Main mode peer ID is ID_IPV4_ADDR: '<<CUSTOMER VPN PUBLIC IP>>'
        :no suitable connection for peer '<<CUSTOMER VPN PUBLIC IP>>'
        : sending encrypted notification INVALID_ID_INFORMATION to <<CUSTOMER VPN PUBLIC IP>>:500

    If the VPNaaS is the initiator and does not receive MAIN_R3, the VPNaaS IKE ID may not be correct.

    If the local IKE ID on the VPNaaS gateway is incorrect, the VPNaaS logs may show:
        : STATE_MAIN_I3: sent MI3, expecting MR3
        : ignoring informational payload, type INVALID_ID_INFORMATION msgid=00000000
        : received and ignored informational message

    If the VPNaaS is the initiator and does receive MAIN_R3, but the response shows that the on-premise VPN IKE ID is not correct, the VPNaaS logs may show:

        : we require peer to have ID '<<VALUE>>', but peer declares '<<VALUE>>'
        : sending encrypted notification INVALID_ID_INFORMATION to <<CUSTOMER VPN PUBLIC IP>>:4500
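
    The phase 1 log signatures above lend themselves to a simple lookup. The following is a minimal sketch, not an official tool: the function name and the wording of each hint are made up for illustration, the substrings are copied from the excerpts in this post, and it assumes the event log has been saved as plain text.

```python
# Substrings copied from the log excerpts in this post, paired with the
# cause the post associates with each one. Hint wording is illustrative.
PHASE1_HINTS = [
    ("reached STATE_MAIN_I1",
     "No acceptable MAIN_R1: remote side not configured, firewall blocking, "
     "or proposal mismatch"),
    ("NO_PROPOSAL_CHOSEN", "Phase 1 proposal mismatch"),
    ("reached STATE_MAIN_I3", "Possible PSK mismatch"),
    ("no suitable connection for peer",
     "Remote IKE ID on the VPNaaS gateway may be incorrect"),
    ("ignoring informational payload, type INVALID_ID_INFORMATION",
     "Local IKE ID on the VPNaaS gateway may be incorrect"),
    ("we require peer to have ID",
     "On-premise IKE ID differs from the one the VPNaaS expects"),
]


def phase1_hints(log_text):
    """Return the distinct hints found in the log, in order of first appearance."""
    found = []
    for line in log_text.splitlines():
        for needle, hint in PHASE1_HINTS:
            if needle in line and hint not in found:
                found.append(hint)
    return found


sample = """\
: STATE_MAIN_I3: sent MI3, expecting MR3
: discarding duplicate packet; already STATE_MAIN_I3
: max number of retransmissions (2) reached STATE_MAIN_I3. Possible authentication failure: no acceptable response to our first encrypted message
"""
print(phase1_hints(sample))  # ['Possible PSK mismatch']
```

    A hint is only a starting point; as noted for the Cisco ISR above, the phase reported in the log does not always match the phase where the misconfiguration lives.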


    Phase 2 (creation of the IPsec SA) should include four messages:

    • QUICK_I1 (The first packet negotiating phase 2 settings. Each security association negotiates these separately.)
    • QUICK_R1 (This is the response to the negotiation of QUICK_I1.)
    • QUICK_I2 (This is an acknowledgement by the initiator that the negotiation parameters have been accepted.)
    • QUICK_R2 (This is an acknowledgement by the responder that the negotiation parameters have been accepted.)


    If QUICK_I1 is sent but no QUICK_R1 is received, the phase 2 proposal may not match or the network settings may be mismatched.

    If there is an encryption, authentication, or Perfect Forward Secrecy (PFS) mismatch, the VPNaaS logs may show:

        : initiating Quick Mode PSK+ENCRYPT+TUNNEL+UP to replace #14893 {using isakmp#14886 msgid:c28acce1 proposal=AES(12)_128-SHA1(2)_160 pfsgroup=no-pfs}
        : max number of retransmissions (2) reached STATE_QUICK_I1. No acceptable response to our first Quick Mode message: perhaps peer likes no proposal
        : starting keying attempt 3 of at most 999999

    If PFS is configured on the VPNaaS but not enabled on the remote side, the VPNaaS logs may show:
        : we require PFS but Quick I1 SA specifies no GROUP_DESCRIPTION 

    If PFS is not configured on the VPNaaS gateway but is enabled on the on-premise VPN, the VPNaaS logs may show:
        : initiating Quick Mode PSK+ENCRYPT+TUNNEL+UP {using isakmp#2765 msgid:cf1649bc proposal=AES(12)_256-SHA2_256(5)_256 pfsgroup=no-pfs}
        : received Delete SA payload: deleting ISAKMP State #2765
        : packet from <<CUSTOMER VPN PUBLIC IP>>:4500: received and ignored informational message
        : max number of retransmissions (2) reached STATE_QUICK_I1 
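
    The "initiating Quick Mode" lines in the excerpts above state which phase 2 proposal and PFS group the VPNaaS is offering, which is useful to compare against the on-premise configuration. A minimal sketch for pulling those two fields out of such a line follows; the function name and returned keys are hypothetical, and the pattern is based only on the two sample lines in this post.

```python
import re

# Pattern derived from the two "initiating Quick Mode" samples in this post.
QUICK_MODE_RE = re.compile(
    r"initiating Quick Mode .*proposal=(\S+) pfsgroup=([^\s}]+)"
)


def quick_mode_offer(log_line):
    """Return the offered proposal and PFS group, or None if the line does not match."""
    match = QUICK_MODE_RE.search(log_line)
    if match is None:
        return None
    return {"proposal": match.group(1), "pfs_group": match.group(2)}


line = (": initiating Quick Mode PSK+ENCRYPT+TUNNEL+UP to replace #14893 "
        "{using isakmp#14886 msgid:c28acce1 "
        "proposal=AES(12)_128-SHA1(2)_160 pfsgroup=no-pfs}")
print(quick_mode_offer(line))
# {'proposal': 'AES(12)_128-SHA1(2)_160', 'pfs_group': 'no-pfs'}
```

    A pfs_group of no-pfs next to repeated STATE_QUICK_I1 retransmissions is a cue to check whether the on-premise device requires PFS.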

    Phase 2 network proposal mismatch:

    The VPNaaS gateway is policy-based and does not accept a partially working IPsec configuration: it tears down viable IPsec SAs and the IKE SA until all IPsec SAs are up. If the customer device proposes a subnet pair that the VPNaaS is not configured for, the VPNaaS logs may show:

        : cannot respond to IPsec SA request because no connection is known for <<CIDR OF OCI-C IP NETWORK>>()===<<VPNAAS ETH0 IP>>[<<VPNAAS VISIBLE IP>>]...<<CUSTOMER VPN PUBLIC IP>>===<<CIDR OF CUSTOMER SUBNET>>()
        : sending encrypted notification INVALID_ID_INFORMATION to <<CUSTOMER VPN PUBLIC IP>>:4500
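
    To see which subnet pair was rejected, the two CIDRs can be extracted from the "no connection is known for" line. This sketch assumes the real log line follows exactly the placeholder layout shown above (local CIDR, "()===", gateway addresses, "===", remote CIDR, "()"); the function name and all addresses in the example are made up for illustration.

```python
import re

# Assumed layout, taken from the placeholder line in this post:
#   <local CIDR>()===<eth0 IP>[<visible IP>]...<peer IP>===<remote CIDR>()
PAIR_RE = re.compile(r"no connection is known for (\S+?)\(\)===.*===(\S+?)\(\)")


def rejected_subnet_pair(log_line):
    """Return (OCI-C CIDR, customer CIDR) from the line, or None if it does not match."""
    match = PAIR_RE.search(log_line)
    if match is None:
        return None
    return (match.group(1), match.group(2))


# Made-up addresses that follow the assumed layout.
line = (": cannot respond to IPsec SA request because no connection is known "
        "for 10.0.0.0/24()===10.1.1.2[203.0.113.5]...198.51.100.7===192.168.1.0/24()")
print(rejected_subnet_pair(line))
# ('10.0.0.0/24', '192.168.1.0/24')
```

    The pair returned here is the one to add to, or remove from, one side so that both ends describe the same set of subnets.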