KahWee

Thoughts on web development, programming, and technology

Fixing Cloudflare 525 Errors When Pointing a Domain to Fly.io

This is a breakdown of why my Fly.io app fronted by Cloudflare was returning Error 525 (SSL handshake failed) and how it was fixed.

1. Architecture: Who Does What

Layer                      | Responsibility
Browser → Cloudflare Edge  | Terminates client TLS first (if proxy enabled / orange cloud)
Cloudflare → Fly.io Origin | Establishes a second TLS connection to Fly.io (Full/Full Strict modes)
Fly.io                     | Serves the app + must present a valid certificate for the custom domain
Let’s Encrypt              | Issues the cert Fly.io requests after domain control validation (ACME)

So I effectively have dual TLS when Cloudflare proxying is on. If the origin (Fly.io) does not present a valid cert for the hostname, Cloudflare can’t complete the handshake → 525.

2. Symptom

525 SSL handshake failed

Seen only when Cloudflare proxy (orange cloud) was enabled. With proxy off (grey cloud / DNS only), the site either failed in a different way or showed a mismatched or missing certificate, depending on timing.

3. Root Cause

  1. DNS records at Cloudflare were inconsistent (stale A/AAAA or CNAME remnants).
  2. I tried to "bring" or upload a custom certificate (Cloudflare-issued) to Fly.io, which is not supported. Fly.io expects to manage cert issuance itself.
  3. While DNS didn’t cleanly point to Fly.io’s IPv6, Fly.io couldn’t finish ACME (Let’s Encrypt) validation → no valid origin cert → Cloudflare handshake failed → 525.

4. The Fix

Step 1: DNS Hygiene

Removed every unrelated/legacy record for the apex (example.com) and www that could conflict:

  • Deleted old A, AAAA, CNAME records not belonging to the Fly.io deployment.
  • Left only what Fly.io + ACME needed.

Why: ACME validation (whichever challenge type Fly.io’s internal automation uses) and routing both depend on unambiguous records.
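The cleanup rule above can be sketched as a small filter over an exported record list. This is only an illustration: the `Record` shape, the `stale_records` helper, and the sample values are hypothetical, and it assumes Fly.io issued a single IPv6 address and no IPv4.

```python
import ipaddress
from typing import NamedTuple


class Record(NamedTuple):
    name: str   # fully qualified name, e.g. "example.com"
    rtype: str  # "A", "AAAA", "CNAME", ...
    value: str  # record content


def _same_ip(a, b):
    """Compare two IP literals, tolerating different textual forms."""
    try:
        return ipaddress.ip_address(a) == ipaddress.ip_address(b)
    except ValueError:
        return False


def stale_records(records, hosts, fly_ipv6):
    """Return A/AAAA/CNAME records on `hosts` that do not point at Fly.io."""
    stale = []
    for rec in records:
        if rec.name not in hosts:
            continue  # other names (including _acme-challenge.*) are left alone
        if rec.rtype not in {"A", "AAAA", "CNAME"}:
            continue  # MX, TXT, etc. do not affect routing to the app
        if rec.rtype == "AAAA" and _same_ip(rec.value, fly_ipv6):
            continue  # this is the record we want to keep
        stale.append(rec)
    return stale
```

Anything this returns is a candidate for deletion, but it is worth reading the list before removing records rather than deleting blindly.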

Step 2: Add Correct AAAA Records

Fly.io provided an IPv6 that routes to the app. Added (proxied):

example.com          AAAA   <fly-io-ipv6-address>  (Proxied)
www.example.com      AAAA   <fly-io-ipv6-address>  (Proxied)

(Actual IPv6 redacted in this post; substitute the value shown in your Fly.io dashboard.)

No A record needed if Fly.io only gave IPv6 — Cloudflare will still proxy fine.

Why IPv6 only is fine: Cloudflare terminates client connections at its dual-stack edge, then connects to the origin over IPv6.

Step 3: Preserve _acme-challenge Records

Existing _acme-challenge CNAMEs (added earlier by Fly.io automation) were left intact. These allow Let’s Encrypt to validate via DNS-01.
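A quick way to confirm nothing was deleted by accident is to check that each hostname still has its `_acme-challenge` CNAME. The sketch below assumes records are available as plain `(name, type, value)` tuples; the helper name and the sample data are made up for illustration.

```python
def missing_acme_cnames(records, hosts):
    """Return hosts that lack a `_acme-challenge.<host>` CNAME.

    records: iterable of (name, rtype, value) tuples.
    """
    present = {
        name.rstrip(".").lower()
        for name, rtype, _value in records
        if rtype == "CNAME"
    }
    return sorted(
        host for host in hosts
        if f"_acme-challenge.{host}".lower() not in present
    )
```

An empty result means every host still has its validation record in place.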

Step 4: Propagation Wait

TTL + resolver caches required a short wait (a few minutes). Verified via dig from several public resolvers.
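The multi-resolver check can be scripted roughly as follows. This is a sketch that shells out to `dig` (assumed to be installed); the resolver list and function names are my own choices. Note that with the proxy enabled, public resolvers return Cloudflare edge addresses rather than the Fly.io address, so the script only checks that resolvers agree with each other.

```python
import subprocess

# Well-known public resolvers; any set of resolvers would do.
RESOLVERS = ["1.1.1.1", "8.8.8.8", "9.9.9.9"]


def parse_dig_short(output):
    """Turn `dig +short` output into a set of lowercase answers."""
    answers = set()
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith(";"):  # skip blanks and dig diagnostics
            answers.add(line.lower())
    return answers


def dig_short(name, rtype, resolver):
    """Query one resolver and return its parsed answers."""
    proc = subprocess.run(
        ["dig", "+short", rtype, name, f"@{resolver}"],
        capture_output=True, text=True, timeout=10, check=False,
    )
    return parse_dig_short(proc.stdout)


def resolvers_agree(answers_by_resolver):
    """True when every resolver returned the same non-empty answer set."""
    sets = list(answers_by_resolver.values())
    return bool(sets) and bool(sets[0]) and all(s == sets[0] for s in sets)
```

Typical use would be `resolvers_agree({r: dig_short("example.com", "AAAA", r) for r in RESOLVERS})`, repeated until it returns true. Answers can occasionally differ between resolvers for legitimate reasons, so treat a mismatch as a hint, not proof.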

Step 5: Let Fly.io Re-Issue Certificates

Once DNS pointed correctly, Fly.io’s dashboard showed:

  • Domain status: Verified
  • Certificates: RSA + ECDSA issued

I then set the Cloudflare SSL/TLS mode to Full (Strict). Plain Full is the minimum; Flexible is not an option.

Why Strict matters: Cloudflare validates the origin cert chain. Flexible never makes a TLS connection to the origin, and plain Full encrypts but accepts any certificate, so either can hide a misconfiguration until later.
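The difference between the modes is easier to see as a toy model. This is a deliberate simplification and not Cloudflare's actual logic: the function and mode names are invented, and it assumes the origin is reachable and answers plain HTTP when Flexible is used.

```python
def edge_status(mode, origin_tls_ok, origin_cert_valid):
    """Rough model of the status Cloudflare returns for each SSL/TLS mode.

    mode: "flexible", "full", or "full_strict" (names chosen for this sketch).
    origin_tls_ok: the origin completes a TLS handshake at all.
    origin_cert_valid: the origin's certificate is trusted and matches the host.
    """
    if mode == "flexible":
        return 200  # origin is contacted over plain HTTP; no cert involved
    if not origin_tls_ok:
        return 525  # SSL handshake failed (Full and Full (Strict))
    if mode == "full_strict" and not origin_cert_valid:
        return 526  # invalid SSL certificate (Full (Strict) only)
    return 200
```

In this model, Flexible reports success even when the origin has no working certificate, which is exactly how it hides the underlying problem.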

5. What Actually Mattered

I only needed to confirm three things:

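  1. The Cloudflare DNS records for apex + www were just the Fly.io AAAA (with proxy on, public resolvers return Cloudflare edge addresses, so the Cloudflare dashboard is the place to check).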
  2. Fly.io marked the domain verified and showed issued certs.
  3. Cloudflare SSL mode was Full (Strict) with proxy on.

Everything else (manual probing, deep TLS inspection) was noise once those were correct.

6. Why Custom Cert Upload Failed

Fly.io issues and renews its own certs via ACME. Cloudflare Origin Certificates are only meant for Cloudflare↔origin trust and aren’t public‑PKI certs. Uploading was never the path.

7. Common Pitfalls (Condensed)

  • Stale A/AAAA/CNAME records lingering.
  • Mixing previous host IPs with Fly.io AAAA.
  • Using “Flexible” SSL (masks the real problem).
  • Trying to import certificates instead of letting automation run.
  • Over-debugging before checking DNS + Fly.io status + Cloudflare mode.

8. Mental Model

Browser --TLS--> Cloudflare --TLS--> Fly.io
                      ^
            Needs valid cert at origin

If Cloudflare can’t finish that second TLS hop, you get 525.
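That second hop can be approximated from a laptop with Python's standard `ssl` module, by connecting straight to the origin address while sending the custom hostname as SNI. This is a sketch, not a reproduction of what Cloudflare does; `probe_origin` and `classify_failure` are names I made up, and the result labels are arbitrary.

```python
import socket
import ssl


def classify_failure(exc):
    """Map a connection error to a coarse label."""
    # Order matters: SSLCertVerificationError is an SSLError, which is an OSError.
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "certificate rejected"
    if isinstance(exc, ssl.SSLError):
        return "handshake failed"
    return "unreachable"


def probe_origin(origin_ip, hostname, port=443, timeout=5.0):
    """Attempt a verified TLS handshake with the origin for `hostname`."""
    ctx = ssl.create_default_context()  # verifies chain and hostname
    try:
        with socket.create_connection((origin_ip, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=hostname):
                return "ok"
    except OSError as exc:
        return classify_failure(exc)
```

Calling `probe_origin("<fly-io-ipv6-address>", "example.com")` with the real address should return "ok" once Fly.io has issued the certificate. It needs IPv6 connectivity from wherever it runs.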

9. Minimal Checklist

  1. Add domain in Fly.io.
  2. Publish ONLY the Fly.io AAAA (and A if given) in Cloudflare; proxy on.
  3. Leave _acme-challenge CNAMEs untouched.
  4. Set SSL/TLS = Full (Strict).
  5. Wait a few minutes; confirm Fly.io shows “Verified / Issued”.

10. Key Points

  • 525 = Cloudflare↔origin TLS failure.
  • DNS cleanliness gates cert issuance.
  • Let Fly.io handle issuance; don’t upload certs.
  • Fewer DNS records = fewer surprises.

Final Summary

Broken: Proxy pointed at an origin without a valid cert due to messy DNS.
Fix: Clean DNS → correct AAAA only → wait for Fly.io issuance → Full (Strict).
Result: Stable dual TLS with automated renewals.

If you see 525: DNS → Fly.io cert status → Cloudflare mode. Stop when all three are green.
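That triage order fits in a few lines. The sketch below is just the sentence above written as code; the function name, arguments, and messages are illustrative and the three inputs still have to be checked by hand.

```python
def next_check(dns_clean, fly_cert_issued, ssl_mode):
    """Return the first item to fix, in the order DNS, Fly.io cert, Cloudflare mode."""
    if not dns_clean:
        return "clean up DNS records"
    if not fly_cert_issued:
        return "wait for Fly.io to verify the domain and issue certs"
    if ssl_mode != "full_strict":
        return "set Cloudflare SSL/TLS mode to Full (Strict)"
    return "all green"
```

For example, `next_check(True, False, "full_strict")` points at the Fly.io certificate status, since there is no point touching the Cloudflare mode before the origin has a cert.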
