trace·warrior
last verified · 2026-06-10

How to debug API connectivity issues

Trace failing API calls layer by layer (DNS, TCP, TLS, then HTTP) and fix the real causes: timeouts, rate limits, certificates, and proxies.

Tags: api, http, troubleshooting, networking
Trace Warrior Team
6 min read

An API call that fails gives you one error string (ECONNREFUSED, ETIMEDOUT, certificate verify failed, or just a 5xx), and that string is rarely the whole story. The productive way to debug it is the same way the request travels: resolve the name, open the socket, complete the handshake, exchange HTTP. Each layer has distinct failure signatures, and curl can expose all of them in a single command.

Step 0. Reproduce outside your application

First, take your code out of the equation. If the call fails in your app, run the equivalent request with curl from the same machine:

curl -sv https://api.example.com/v1/status \
  -H "Authorization: Bearer $TOKEN" \
  -o /dev/null

The -v output walks through every layer in order (DNS resolution, TCP connect, TLS handshake, request, response) and stops at the layer that fails. Two outcomes:

  • curl fails the same way: it's a network/server problem. Continue with this guide.
  • curl succeeds: the problem is in your application. The usual causes are a wrong base URL in config, an HTTP client that ignores system proxy settings (or uses a proxy curl doesn't), a stale connection pool, or different DNS behaviour in your runtime. Diff the curl request against what your code actually sends.
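
For the "curl succeeds" case, it can help to print what the client library would actually send and which proxies it would pick up. The sketch below assumes a Python application built on the standard library's urllib; describe_request is a hypothetical helper name, and other HTTP clients expose the same information differently.

```python
import urllib.request


def describe_request(url, headers):
    """Summarise what urllib would send for this call (hypothetical helper)."""
    req = urllib.request.Request(url, headers=headers)
    return {
        "method": req.get_method(),
        "url": req.full_url,
        # urllib normalises header names, e.g. "authorization" -> "Authorization"
        "headers": dict(req.header_items()),
        # Proxies urllib would use, read from variables such as HTTPS_PROXY
        # (or from OS settings on some platforms)
        "proxies": urllib.request.getproxies(),
    }


# Usage: compare this with the request lines in curl's -v output.
# print(describe_request("https://api.example.com/v1/status",
#                        {"Authorization": "Bearer <token>"}))
```

If the proxies entry lists a proxy that curl did not use (or the reverse), that difference is the first thing to rule out.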

Step 1. Check DNS

dig +short api.example.com

No answer, or SERVFAIL? The API hostname isn't resolving and nothing else matters yet. Check against a public resolver to separate "their DNS is broken" from "my resolver is broken":

dig +short api.example.com @1.1.1.1

If the public resolver answers but your local one doesn't, the problem is your resolver or a corporate DNS filter. If neither answers, the provider's DNS is down; check their status page. The DNS Lookup tool gives you the same answer from a neutral vantage point outside your network.

One subtle case: the name resolves, but to different IPs over time (CDN, geo-DNS), and your application caches the first answer forever. Some runtimes (notably the JVM with default settings in older versions) cache DNS aggressively. If calls fail only after the provider rotates infrastructure, suspect stale DNS caching in your client.
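
To check for a stale answer, compare the IP your client is currently connected to against a fresh lookup made from the same runtime. This is a minimal sketch using Python's standard resolver call; resolve and is_stale are hypothetical helper names, and how you obtain the cached IP depends on your client.

```python
import socket


def resolve(host, port=443):
    """Return the sorted set of IPs the runtime resolves for host right now."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return sorted({info[4][0] for info in infos})


def is_stale(cached_ip, host):
    """True if the IP the client still uses is no longer in the current answer."""
    return cached_ip not in resolve(host)


# Usage: is_stale("203.0.113.10", "api.example.com")
```

A True result only says the answer has changed; with CDNs and geo-DNS the set can legitimately differ between lookups, so treat it as a hint rather than proof.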

Step 2. Check TCP reachability

nc -vz api.example.com 443
  • Refused: the host answered with a RST, meaning nothing is listening on that port. Rare for a production API; it usually means a wrong port in your config or a provider mid-incident.
  • Timeout: packets dropped silently. This is the signature of a firewall. Test the same port from outside your network with the Port Checker: if it's open externally but times out from your server, your egress firewall or security group is the culprit. Cloud environments are the usual offenders: an egress rule that allows 443 only to a CIDR the API has since moved out of, or a NAT gateway issue.

For more on isolating port-level problems, see how to check open ports.
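
The refused-versus-timeout distinction can also be checked from inside a runtime, which is useful when nc isn't installed on the server. A minimal sketch with Python's socket module; probe is a hypothetical helper name and the 3-second default is an assumption, not a recommendation from any library.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connect attempt as open, refused, timeout or dns-failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"       # RST received: nothing listening on that port
    except socket.timeout:
        return "timeout"       # no answer at all: typical of a firewall drop
    except socket.gaierror:
        return "dns-failure"   # the name did not resolve; go back to step 1
    except OSError as exc:
        return f"error: {exc}"  # e.g. network unreachable


# Usage: probe("api.example.com", 443)
```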

Step 3. Check TLS

The handshake is where corporate environments break things. In curl's -v output, failures here appear after TLS handshake lines.

echo | openssl s_client -connect api.example.com:443 -servername api.example.com 2>/dev/null | openssl x509 -noout -issuer -dates

Common signatures:

  • certificate verify failed only from your network: look at the issuer. If it's your company's name rather than a public CA, a TLS-intercepting proxy is rewriting the connection and your runtime doesn't trust the corporate root CA. Add it to your runtime's trust store (NODE_EXTRA_CA_CERTS, REQUESTS_CA_BUNDLE, JVM keystore); don't disable verification.
  • Expired or mismatched certificate from everywhere: provider-side problem. Confirm with the SSL Certificate Checker, then report it. If you operate the API yourself, an SSL expiry monitor is how you stop shipping this failure to your own consumers.
  • Protocol version errors: old runtimes capped at TLS 1.1 talking to endpoints that require 1.2+. Upgrade the client; the server is right.
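
The same check can be run from Python to see what your runtime's own trust store makes of the certificate. This is a sketch under assumptions: summarize_cert and check_tls are hypothetical names, and when verification fails Python reports the reason rather than the issuer, so the openssl command above is still the way to read an intercepting proxy's issuer.

```python
import socket
import ssl
import time


def summarize_cert(cert, now=None):
    """Extract issuer and days until expiry from a getpeercert() dict."""
    now = time.time() if now is None else now
    # issuer is a tuple of RDNs, each a tuple of (name, value) pairs
    issuer = {key: value for rdn in cert["issuer"] for key, value in rdn}
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    return {
        "issuer": issuer.get("organizationName") or issuer.get("commonName"),
        "days_left": int((expires - now) // 86400),
    }


def check_tls(host, port=443, timeout=5.0):
    """Handshake with default verification and report the outcome."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                summary = summarize_cert(tls.getpeercert())
                return {"ok": True, "protocol": tls.version(), **summary}
    except ssl.SSLCertVerificationError as exc:
        # e.g. "unable to get local issuer certificate" behind an intercepting proxy
        return {"ok": False, "reason": exc.verify_message}


# Usage: check_tls("api.example.com")
```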

Step 4. Read the HTTP response properly

Connection succeeds but the call still "fails"? Now the status code and headers carry the diagnosis. Pull them with curl or the HTTP Header Checker:

curl -si https://api.example.com/v1/status -H "Authorization: Bearer $TOKEN" | head -30
  • 401: the token is missing, expired, or sent in the wrong shape. Check the exact header format the API expects (Bearer prefix, header name, casing of the scheme).
  • 403: authenticated but not allowed: scope/permission gaps, or an IP allowlist that doesn't include your server's egress IP. The egress IP is frequently not what you think it is once a NAT gateway is involved; curl https://api.ipify.org from the server tells you the truth.
  • 404 on every endpoint: wrong base URL or API version path. Compare against the docs character by character.
  • 429: rate limited. Look for Retry-After and X-RateLimit-Remaining headers. The fix is client-side throttling, not retrying harder (see below).
  • 5xx: the provider's problem, usually. But a 502/504 from their gateway can also mean your request is malformed in a way that crashes upstream; check whether a minimal request succeeds.
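
For the 429 case, note that Retry-After comes in two shapes: a number of seconds or an HTTP date. A minimal sketch for turning either into a wait time, using only the standard library; retry_after_seconds is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value to seconds, or None if unparseable."""
    value = value.strip()
    if value.isdigit():
        return float(value)                 # delay-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # assumption: treat naive dates as UTC
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())  # never wait a negative time


# Usage: retry_after_seconds(response_headers["Retry-After"])
```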

Step 5. Diagnose timeouts with timing data

"Slow or hanging" is its own category. curl can attribute time to each phase:

curl -s -o /dev/null -w \
  "dns=%{time_namelookup}s connect=%{time_connect}s tls=%{time_appconnect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n" \
  https://api.example.com/v1/status

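Read it left to right, keeping in mind that each value is cumulative from the start of the request, so the jumps between neighbours matter more than the absolute numbers: a large dns means resolver trouble; a large jump from dns to connect means network path or packet loss (corroborate with the Ping Test tool); a large gap between tls and ttfb means the server accepted the request and then sat on it: server-side slowness, not network.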
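
Because curl reports each timer as time elapsed since the start, subtracting neighbours gives the time spent in each phase. A small sketch that does this for a line in the exact format produced by the command above; phase_durations is a hypothetical helper name, it assumes an HTTPS request (so tls is non-zero), and the sample numbers are made up for illustration.

```python
import re


def phase_durations(line):
    """Turn cumulative curl timers into per-phase durations in seconds."""
    t = {key: float(val) for key, val in re.findall(r"(\w+)=([\d.]+)s", line)}
    return {
        "dns": round(t["dns"], 3),
        "tcp_connect": round(t["connect"] - t["dns"], 3),
        "tls_handshake": round(t["tls"] - t["connect"], 3),
        "server_wait": round(t["ttfb"] - t["tls"], 3),
        "transfer": round(t["total"] - t["ttfb"], 3),
    }


# Illustrative input, not a real measurement
sample = "dns=0.012s connect=0.045s tls=0.130s ttfb=0.890s total=0.895s"
print(phase_durations(sample))
```

In this made-up sample, server_wait dominates at 0.76 seconds, which points at the server rather than the network.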

Step 6. Set timeouts and retries deliberately

Most "API connectivity issues" in production are really the absence of a timeout/retry policy meeting an imperfect network. Defaults are dangerous: some HTTP clients default to no timeout at all, which means a hung connection holds a thread or socket indefinitely.

Sensible baseline:

  • Connect timeout: short, 3-5 seconds. If a SYN isn't answered in that window, more waiting won't help.
  • Read/request timeout: based on the endpoint's real latency, not a guess. If p99 is 2 seconds, a 10-second timeout is generous; 120 seconds just delays the failure.
  • Retries: only for idempotent requests (GET, PUT, DELETE) and transient failures (connect errors, 429, 502/503/504). Never blind-retry POSTs that aren't idempotency-keyed; you'll create duplicate resources.
  • Backoff with jitter: exponential delays (e.g., 1s, 2s, 4s) with randomisation, so a fleet of clients doesn't retry in lockstep and re-hammer a recovering service. Respect Retry-After when the server sends it.
# requests + urllib3: the canonical Python setup
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry idempotent methods only, and only on transient status codes
retry = Retry(total=3, backoff_factor=1,
              status_forcelist=[429, 502, 503, 504],
              allowed_methods=["GET", "PUT", "DELETE"])
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
# timeout=(connect, read), both in seconds
resp = session.get("https://api.example.com/v1/status", timeout=(5, 10))

The timeout=(5, 10) tuple sets the connect and read timeouts separately: exactly the two numbers steps 2 and 5 taught you to distinguish.
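
If you write the retry loop by hand, the backoff-with-jitter item from the baseline can be sketched as below. This uses the "full jitter" variant (a random delay between zero and the exponential ceiling), which is one common choice rather than the only one; backoff_delays and the 30-second cap are assumptions for illustration.

```python
import random


def backoff_delays(retries, base=1.0, cap=30.0, rng=random):
    """Return one randomised delay per retry, with ceilings base, 2*base, 4*base..."""
    delays = []
    for attempt in range(retries):
        ceiling = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s... capped at cap
        delays.append(rng.uniform(0, ceiling))   # full jitter: anywhere below the ceiling
    return delays


# Usage: sleep for each delay in turn between attempts,
# and prefer the server's Retry-After value when it is larger.
# for delay in backoff_delays(3): time.sleep(delay)
```

Randomising over the whole interval spreads a fleet's retries out instead of clustering them at 1s, 2s and 4s.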

TL;DR

  1. Reproduce with curl -sv first; if curl works, debug your app, not the network.
  2. DNS: dig +short, cross-check against 1.1.1.1 (DNS Lookup).
  3. TCP: nc -vz host 443; timeout = firewall, refused = nothing listening (Port Checker).
  4. TLS: check issuer and dates; corporate proxies and expired certs are the usual suspects.
  5. HTTP: 401 = auth shape, 403 = permissions/IP allowlist, 429 = back off, 5xx = usually them.
  6. Use curl -w timing to attribute slowness to DNS, connect, TLS, or server.
  7. Set explicit connect and read timeouts, retry only idempotent requests, back off with jitter.

Related

  • HTTP Header Checker - inspect status codes and headers from a neutral vantage point
  • How to check open ports - the TCP layer in depth
  • How to test webhook endpoints - the same debugging, from the receiving side