Firewalls are widely deployed by companies to protect the perimeter with business partners and the Internet, including VPN connections. For the best security, a firewall denies all connections unless the business has explicitly allowed them.
It’s not unusual for a business connection to stop working after new rules are implemented on the firewall. This causes financial loss and frustration for the business, to the point that some people even question the benefits of the firewall.
When business transactions are interrupted, it is important to troubleshoot the connection issue methodically so that it is resolved quickly. The outcome of the troubleshooting is either proof that the firewall is not guilty as charged, or an acknowledgment of guilt followed by a fix.
This post focuses on troubleshooting the Checkpoint platform, based on my years of hands-on experience. Troubleshooting other firewall platforms will follow in other posts, but the concepts should be the same.
Below are the steps that I typically use to troubleshoot connectivity issues on the Checkpoint platform:
1. Collect device info for the impacted business hosts
First and most important, get the complete list of IPs for the impacted business hosts. The business can typically provide this, but there are the following caveats to watch out for:
- Hosts may be multi-homed and have multiple interfaces, so the IP the business gives you may be just one of several, and it may not be the one registered in the firewall rules. Be sure to get all interface IPs for the host.
- Checkpoint hosts are defined as objects. Each host object has one IP explicitly registered and visible, but other IPs may be registered in the topology property of the object, and those are not easily visible unless you click the “topology” tab for the host object. When searching the rules, be sure to search for all of the business host’s interface IPs.
- Sometimes the business can only provide the IPs that they think are in scope, but because of the particular network or application design, it may be other related device IPs that are having the issue. This is typically the case when connections are proxied or NAT is involved. It helps to have the business explain how the connections are set up, so that the troubleshooting heads in the right direction.
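As a rough illustration of the "search for all interface IPs" caveat, here is a minimal Python sketch. It assumes you have exported the rule base objects into a simple name-to-address mapping; the object names, the addresses, and the `find_matching_objects` helper are all hypothetical and not part of any Checkpoint tool.

```python
import ipaddress

def find_matching_objects(host_ips, rule_objects):
    """Map each host IP to the rule objects (hosts or networks) that contain it."""
    matches = {}
    for ip_text in host_ips:
        ip = ipaddress.ip_address(ip_text)
        hits = []
        for name, spec in rule_objects.items():
            # A bare IP parses as a /32 network; strict=False tolerates host bits.
            if ip in ipaddress.ip_network(spec, strict=False):
                hits.append(name)
        matches[ip_text] = hits
    return matches

# Hypothetical data: every interface IP of the impacted host, and the
# objects referenced by the firewall rules.
host_ips = ["10.1.2.5", "192.168.50.5"]
rule_objects = {
    "app-server-01": "10.1.2.5",
    "partner-net": "172.16.0.0/16",
}

result = find_matching_objects(host_ips, rule_objects)
for ip, names in result.items():
    print(ip, "->", names if names else "NOT in any rule object")
```

An interface IP that matches no object is a good first suspect: traffic sourced from that interface will not hit the rule the business asked for.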
2. Verify the routing path for the business connection
Once the business-side IP information is collected, the next step is to find out whether the firewall could be guilty at all, i.e., whether a firewall is in the routing path of the connection. If it is, find out which firewall to investigate (note that companies typically have many firewalls and complex routing).
This can typically be done using the “traceroute” tool from the source host. The actual command depends on the operating system: on Unix it is “traceroute”, on Windows it is “tracert”. There are the following caveats to watch out for with traceroute:
- Traceroute needs to be allowed by the firewalls and routers in the path. If a hop is not traceable, or the probes are blocked by in-path firewalls and routers, the traceroute output will show timeouts (“request timed out” on Windows, asterisks on Unix). In such cases, if no network topology diagram is available, you have to log in to the last responding hop and continue the traceroute from there until all hops are mapped out.
- There may be a firewall behind a firewall for some connections, so be sure to find out how many firewalls are involved in the issue.
Once the firewalls in the routing path are identified, run traceroute from each firewall to the business hosts to make sure there is no asymmetric routing, since the firewall may drop asymmetrically routed connections depending on its SmartDefense settings.
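The path checks above can be sketched in Python. This is only an illustration: it assumes Unix `traceroute -n` style output captured as text, and the sample outputs, the `inventory` mapping of interface IPs to firewall names, and the helper functions are all made up for the example.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def parse_hops(traceroute_text):
    """Return (hop_number, ip_or_None) for each hop line; None means no reply."""
    hops = []
    for line in traceroute_text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header or blank line
        ip = IPV4_RE.search(match.group(2))
        hops.append((int(match.group(1)), ip.group(0) if ip else None))
    return hops

def firewalls_in_path(hops, inventory):
    """List the firewalls whose interface IPs appear in the path, in hop order."""
    seen = []
    for _, ip in hops:
        name = inventory.get(ip)
        if name and name not in seen:
            seen.append(name)
    return seen

# Hypothetical inventory: firewall interface IP -> firewall name.
inventory = {
    "10.1.0.254": "fw-a",
    "10.9.0.254": "fw-a",
    "10.9.0.253": "fw-b",
}

forward_text = """
traceroute to 10.9.9.9 (10.9.9.9), 30 hops max, 60 byte packets
 1  10.1.2.1  0.412 ms  0.398 ms  0.377 ms
 2  10.1.0.254  1.102 ms  1.087 ms  1.064 ms
 3  * * *
 4  10.9.9.9  2.311 ms  2.290 ms  2.275 ms
"""
reverse_text = """
traceroute to 10.1.2.5 (10.1.2.5), 30 hops max, 60 byte packets
 1  10.9.9.1  0.391 ms  0.370 ms  0.352 ms
 2  10.9.0.253  1.214 ms  1.190 ms  1.171 ms
 3  10.1.2.5  2.402 ms  2.377 ms  2.359 ms
"""

forward_hops = parse_hops(forward_text)
reverse_hops = parse_hops(reverse_text)
silent_hops = [hop for hop, ip in forward_hops if ip is None]
forward_fw = firewalls_in_path(forward_hops, inventory)
reverse_fw = firewalls_in_path(reverse_hops, inventory)

print("hops with no reply:", silent_hops)
print("forward firewalls:", forward_fw, "reverse firewalls:", reverse_fw)
if set(forward_fw) != set(reverse_fw):
    print("possible asymmetric routing")
```

Comparing firewall names rather than raw hop IPs is deliberate: the same device answers with different interface IPs in each direction, so a raw IP comparison would flag every path as asymmetric.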
3. Verify the business host and application are up and reachable
When people complain about a firewall issue, sometimes the host or application is in fact down, or unreachable because of a routing issue. Checking that the host and application are up may save quite some time working on the firewall. If ping and traceroute are allowed outbound, pinging and tracerouting the host will help identify the root cause faster.
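Ping only shows that the host answers ICMP; it says nothing about the application port. A minimal Python sketch of a TCP-level check follows. The `probe_tcp` helper is a name I made up for illustration, and the demo probes a throwaway listener on localhost rather than a real business host.

```python
import socket

def probe_tcp(host, port, timeout=3.0):
    """Attempt a TCP handshake and report how it ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"   # something answered with a reset
    except socket.timeout:
        return "timeout"   # no answer at all within the timeout
    except OSError as exc:
        return f"error: {exc}"

# Demo against a temporary local listener so the example is self-contained.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

open_result = probe_tcp("127.0.0.1", port)
listener.close()
closed_result = probe_tcp("127.0.0.1", port, timeout=1.0)
print(open_result, closed_result)
```

The distinction between the results is a useful hint: "refused" usually means a device actively rejected the connection (often the host itself, with no application listening), while "timeout" is what a silent drop or a routing problem tends to look like from the client.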
4. Check the drop logs to see if traffic is denied
If the host and application are up, routing appears to be working, and the firewalls are confirmed to be in the routing path, the next step is to focus on the firewall rules. Checking the firewall logs is typically the most effective way to find out where the problem is. Since Checkpoint can store firewall logs locally, forward them to SmartTracker, or send them to other remote hosts, it is critical to know where the firewall rule logs are stored. Checking the logging property of the firewall object will tell you where the logs SHOULD be. Note that logs are sometimes stored locally because of a connection issue to the remote log server, even though the firewall is set to forward them. If the logs are not on the server where they should be, check the logging connection on the firewall: Checkpoint uses TCP port 257 for logging, so check whether a connection on this port is established on the firewall.
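For the port 257 check, a small Python sketch that filters captured `netstat -an` output is shown here. It assumes the common Linux column layout (protocol, queues, local address, foreign address, state); the sample output and the `log_connections` helper are illustrative only, so adjust the parsing to whatever your firewall actually prints.

```python
def log_connections(netstat_text, port=257):
    """Return (local, remote, state) for TCP lines that involve the given port."""
    wanted = str(port)
    found = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and non-TCP lines
        local, remote, state = fields[3], fields[4], fields[5]
        if local.rsplit(":", 1)[-1] == wanted or remote.rsplit(":", 1)[-1] == wanted:
            found.append((local, remote, state))
    return found

# Made-up sample in Linux `netstat -an` layout.
sample = """
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.1.0.254:43210        10.1.5.10:257           ESTABLISHED
tcp        0      0 10.1.0.254:22           10.1.7.20:51514         ESTABLISHED
"""

connections = log_connections(sample)
established = [c for c in connections if c[2] == "ESTABLISHED"]
print(established if established else "no established log connection on tcp/257")
```

If nothing is established on port 257, the logs are most likely sitting locally on the firewall rather than on the log server.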
The luckiest scenario is that the connection is logged, and the log indicates whether the connection was dropped or allowed. If the logs show the connections as allowed, it is safe to say that the problem likely lies outside of the firewall. Otherwise, the drop log should provide the information needed to add the correct rules for the connection. For example, the business may have provided the wrong IP or the wrong service ports, or additional IPs and ports may be needed in the rules.
If you are under the gun to find out right now whether the firewall is guilty, as a last resort, temporarily add an allow-all rule at the top of the firewall rules and ask the business to test. If the business still complains, you know that the problem is somewhere else (warning: remove the allow-all rule immediately after the test).
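When there are many log entries, it helps to summarize the drops that involve the impacted hosts. The Python sketch below assumes the logs have been exported into simple records with `src`, `dst`, `port`, and `action` fields. That record layout is my own simplification for illustration, not the actual Checkpoint log format.

```python
from collections import Counter

def summarize_drops(records, host_ips):
    """Count dropped connections per (src, dst, port) that touch an impacted host."""
    drops = Counter()
    for rec in records:
        if rec["action"] != "drop":
            continue
        if rec["src"] in host_ips or rec["dst"] in host_ips:
            drops[(rec["src"], rec["dst"], rec["port"])] += 1
    return drops

# Hypothetical exported records.
records = [
    {"src": "10.1.2.5", "dst": "172.16.4.9", "port": 443, "action": "accept"},
    {"src": "10.1.2.5", "dst": "172.16.4.9", "port": 8443, "action": "drop"},
    {"src": "10.1.2.5", "dst": "172.16.4.9", "port": 8443, "action": "drop"},
    {"src": "10.7.7.7", "dst": "172.16.4.9", "port": 25, "action": "drop"},
]
host_ips = {"10.1.2.5", "192.168.50.5"}

drops = summarize_drops(records, host_ips)
for (src, dst, port), count in drops.most_common():
    print(f"{src} -> {dst} port {port}: dropped {count} times")
```

In this made-up data the rule covers port 443, but the application also talks on port 8443, which is the "additional ports are needed" case described above.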
5. Additional troubleshooting tools
The procedure above should resolve typical firewall connection issues. In some cases, however, you may need to dig into how the firewall daemon is behaving on the firewall. This calls for debug tools such as tcpdump to inspect the traffic packets. Checkpoint also provides a utility called “fw monitor” and many debug options. Since these tools deserve a full explanation of their own, I will cover them in separate posts.