
VPN tunnel not working

  • mooneya9
  • Mar 1, 2024
  • 2 min read

Updated: Jun 12

VPN tunnels are often among the most difficult infrastructure components to troubleshoot. This is primarily because they require coordinated configuration and analysis between two independent parties - each of whom may believe the issue lies with the other side. Miscommunication and blame-shifting can easily extend resolution times from hours to weeks.


An often-overlooked but critical factor in resolving VPN issues is the ability to communicate clearly and diplomatically. Establishing a cooperative mindset across both parties greatly improves the speed and quality of resolution. In many cases, progress depends not just on technical troubleshooting, but on building mutual trust and ownership of the issue.


In one such case, a client reported that traffic from their AWS environment could not reach a third-party endpoint over a VPN tunnel. Initial tests with telnet and netcat showed that TCP connections to the remote server on port 443 were not completing. To rule out issues on the AWS side, VPC Flow Logs were checked to confirm that neither security groups nor NACLs were blocking the traffic. The logs showed that the traffic was permitted and was leaving the AWS environment successfully.
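
The two checks described above can be sketched in Python. This is a minimal illustration, not the tooling used in the case: `tcp_port_open` and `parse_flow_log_record` are hypothetical helper names, and the parser assumes the default (version 2) flow log record format, with the AWS field names rewritten to use underscores.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP handshake to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

# Field order of a default (version 2) VPC Flow Logs record.
FLOW_LOG_FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)

def parse_flow_log_record(line: str) -> dict:
    """Split one space-separated flow log record into named fields."""
    values = line.split()
    if len(values) != len(FLOW_LOG_FIELDS):
        raise ValueError("unexpected number of fields in flow log record")
    return dict(zip(FLOW_LOG_FIELDS, values))

# Made-up record for illustration only (addresses and IDs are placeholders).
sample = ("2 123456789012 eni-0abc 10.0.1.5 192.0.2.10 "
          "49152 443 6 5 300 1700000000 1700000060 ACCEPT OK")
record = parse_flow_log_record(sample)
print(record["dstport"], record["action"])
```

An `ACCEPT` action on the outbound record, combined with `tcp_port_open` returning `False` for the remote address, is the pattern that points the investigation at the far side of the tunnel.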


To proceed further, a screen-sharing session was arranged with the network team on the third-party side. This allowed for live analysis of the traffic as it arrived at their perimeter. Their firewall logs confirmed receipt of the packets - but the traffic was being denied due to the use of high-numbered source ports.


High-numbered source ports are expected behaviour. In any client-server communication, two ports are involved:

  • The destination port (e.g. 443 for HTTPS), and

  • A source port, which is randomly selected by the client from a range of ephemeral ports.
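
The two ports can be observed directly with a loopback connection. The sketch below is illustrative only; `observe_source_port` is a hypothetical name, and the listening port is chosen by the OS purely so that the example does not need a fixed free port.

```python
import socket

def observe_source_port() -> tuple:
    """Open a loopback TCP connection and return (source_port, destination_port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free listening port
        server.listen(1)
        dest_port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", dest_port), timeout=5) as client:
            # The client never called bind(), so the OS assigned its source port.
            src_port = client.getsockname()[1]
    return src_port, dest_port

src, dst = observe_source_port()
print(f"source port {src} -> destination port {dst}")
```

Running it several times typically prints a different source port each time, while the destination is whatever port the server is listening on. The exact ephemeral range depends on the operating system.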


Ephemeral ports are fundamental to how operating systems establish outbound connections. Blocking them on inbound firewall rules - based on the source port - is a misconfiguration. Firewalls should match on destination ports (e.g. allow inbound connections to port 443), not reject packets based on the ephemeral source port used by the client.
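
The difference between matching on the source port and matching on the destination port can be shown with a toy rule evaluator. This is a simplified model under stated assumptions: `Packet`, `Rule` and `evaluate` are hypothetical names, the first matching rule wins, and the "misconfigured" rule is a guess at the general shape of the problem, not the third party's actual configuration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Packet:
    src_port: int
    dst_port: int

@dataclass(frozen=True)
class Rule:
    action: str                          # "allow" or "deny"
    dst_port: Optional[int] = None       # None matches any destination port
    max_src_port: Optional[int] = None   # None places no limit on the source port

    def matches(self, packet: Packet) -> bool:
        if self.dst_port is not None and packet.dst_port != self.dst_port:
            return False
        if self.max_src_port is not None and packet.src_port > self.max_src_port:
            return False
        return True

def evaluate(rules: Sequence[Rule], packet: Packet, default: str = "deny") -> str:
    """Return the action of the first matching rule, or the default action."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default

# Assumed misconfiguration: HTTPS is only allowed from low-numbered source ports.
misconfigured = [Rule("allow", dst_port=443, max_src_port=1023)]
# Corrected: match on the destination port and ignore the source port.
corrected = [Rule("allow", dst_port=443)]

client_packet = Packet(src_port=51514, dst_port=443)
print(evaluate(misconfigured, client_packet))  # deny
print(evaluate(corrected, client_packet))      # allow
```

Because a client's source port is chosen by its operating system, the first rule set rejects ordinary HTTPS clients, while the second still limits exposure to the one service that is meant to be reachable.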


After this was explained, the third-party team updated their firewall rules to accept inbound connections to port 443 regardless of the client's ephemeral source port. Once the correction was made, the connection from AWS succeeded immediately.


The total time to resolution was short because of early agreement to collaborate live, rather than relying on asynchronous back-and-forth over email. Direct engagement between the engineering teams allowed the problem to be understood and resolved in a matter of hours.


This case demonstrates that effective resolution of VPN issues relies as much on coordination and communication as on technical analysis. By fostering a shared sense of responsibility, delays and misunderstandings can be avoided - even in traditionally slow-moving cross-organisation scenarios.

 
 
