mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
fix(referral): stop /api/referral/me 503s on prod homepage (#3186)
* fix(referral): make /api/referral/me non-blocking to stop prod 503s Reported in prod: every PRO homepage load was logging 'GET /api/referral/me 503' to Sentry. Root cause: a prior review required the Convex binding to block the response (rationale: don't hand users a dead share link). That turned any flaky relay call into a homepage-wide 503 for the 5-minute client cache window — every PRO user, every page reload. Fix: dispatch registerReferralCodeInConvex via ctx.waitUntil. Response returns 200 + code + shareUrl unconditionally. Binding failures log a warning but never surface as 503. The mutation is idempotent; the next /api/referral/me fetch retries. The /pro?ref=<code> signup side reads userReferralCodes at conversion time, so a missed binding degrades to missed attribution (partial), never to blocked homepage (total). The BRIEF_URL_SIGNING_SECRET-missing 503 path is unchanged — that's a genuine misconfig, not a flake. Handler signature now takes ctx with waitUntil, matching api/notification-channels.ts and api/discord/oauth/callback.ts. Regression test flipped: brief-referral-code.test.mjs previously enforced the blocking shape; now enforces the non-blocking shape + handler signature + explicit does-not-503-on-binding-failure assertion. 14/14 referral tests pass. Typecheck clean, 5706/5706 test:data, lint exit 0. * fix(referral): narrow err in non-blocking catch instead of unsafe cast Greptile P2 on #3186. The (err as Error).message cast was safe today (registerReferralCodeInConvex only throws Error instances) but would silently log 'undefined' if a future path ever threw a non-Error value. Swapped to instanceof narrow + String(err) fallback.
This commit is contained in:
@@ -3,21 +3,26 @@
|
||||
*
|
||||
* GET /api/referral/me
|
||||
* Bearer-auth via Clerk JWT.
|
||||
* -> 200 { code, shareUrl, invitedCount, convertedCount }
|
||||
* -> 200 { code, shareUrl }
|
||||
* -> 401 on missing/invalid bearer
|
||||
* -> 503 if BRIEF_URL_SIGNING_SECRET is not configured (we reuse
|
||||
* it as the HMAC secret for referral codes — see handler body).
|
||||
*
|
||||
* `code` is a deterministic 8-char hash of the Clerk userId (stable
|
||||
* for the life of the account). `invitedCount` is the number of
|
||||
* `registrations` rows that used this user's code as `referredBy`.
|
||||
* `convertedCount` is the subset of those that later produced a PRO
|
||||
* subscription — omitted in the MVP because subscription tracking
|
||||
* lives in a different table and the join isn't needed for the
|
||||
* share-button UX.
|
||||
* for the life of the account).
|
||||
*
|
||||
* Stats are privacy-safe: the route returns counts only, never the
|
||||
* referred users' emails or identities.
|
||||
*
|
||||
* Convex binding is fire-and-forget via ctx.waitUntil (see handler).
|
||||
* An earlier iteration blocked on the binding and returned 503 on
|
||||
* any failure — that turned a single flaky Convex call into a
|
||||
* homepage-wide 503 outage for every PRO user (all homepage loads
|
||||
* fetch this within the 5-minute client cache window). The mutation
|
||||
* is idempotent; the next fetch re-attempts, and a receiver's
|
||||
* signup at /pro?ref=<code> only needs the binding to have landed
|
||||
* SOMETIME before that receiver completes signup, not on every
|
||||
* share-button mount. Missed attribution beats homepage 503.
|
||||
*/
|
||||
|
||||
export const config = { runtime: 'edge' };
|
||||
@@ -36,13 +41,16 @@ const PUBLIC_BASE =
|
||||
* Bind the Clerk-derived share code to the userId in Convex so that
|
||||
* future /pro?ref=<code> signups can actually credit the sharer.
|
||||
*
|
||||
* BLOCKING (no longer fire-and-forget). If the binding can't be
|
||||
* persisted the endpoint returns 503 instead of a dead share link —
|
||||
* handing a user a code the waitlist mutation will never resolve is
|
||||
* worse than a retryable error. The Convex mutation is idempotent,
|
||||
* so client retries are safe.
|
||||
* Fire-and-forget via the caller's ctx.waitUntil — never blocks the
|
||||
* 200 response on this path. The mutation is idempotent, so the next
|
||||
* /api/referral/me fetch (or signup-side lookup) re-attempts. A
|
||||
* missed binding degrades to "receiver's signup isn't attributed"
|
||||
* which is strictly less bad than the prior behaviour of 503'ing
|
||||
* every PRO homepage load while Convex is slow or misconfigured.
|
||||
*
|
||||
* Throws on any failure; caller translates to 503.
|
||||
* Resolves on success. Does NOT throw on failure — the caller relies
|
||||
* on waitUntil to catch + log so a background failure can't surface
|
||||
* as an unhandled rejection.
|
||||
*/
|
||||
async function registerReferralCodeInConvex(userId: string, code: string): Promise<void> {
|
||||
const convexSite =
|
||||
@@ -67,7 +75,10 @@ async function registerReferralCodeInConvex(userId: string, code: string): Promi
|
||||
}
|
||||
}
|
||||
|
||||
export default async function handler(req: Request): Promise<Response> {
|
||||
export default async function handler(
|
||||
req: Request,
|
||||
ctx: { waitUntil: (p: Promise<unknown>) => void },
|
||||
): Promise<Response> {
|
||||
if (isDisallowedOrigin(req)) {
|
||||
return jsonResponse({ error: 'Origin not allowed' }, 403);
|
||||
}
|
||||
@@ -109,18 +120,26 @@ export default async function handler(req: Request): Promise<Response> {
|
||||
return jsonResponse({ error: 'service_unavailable' }, 503, cors);
|
||||
}
|
||||
|
||||
// Bind the code to the userId in Convex BEFORE returning the share
|
||||
// URL. If the binding can't be persisted (Convex outage, missing
|
||||
// env, non-2xx) we must NOT hand the user a link that the waitlist
|
||||
// mutation can never resolve — a dead share link is worse than a
|
||||
// retryable error. The mutation is idempotent so client retries
|
||||
// are safe.
|
||||
try {
|
||||
await registerReferralCodeInConvex(session.userId, code);
|
||||
} catch (err) {
|
||||
console.error('[api/referral/me] binding failed:', (err as Error).message);
|
||||
return jsonResponse({ error: 'service_unavailable' }, 503, cors);
|
||||
}
|
||||
// Bind the code to the userId in Convex in the background so future
|
||||
// /pro?ref=<code> signups can credit the sharer. FIRE-AND-FORGET
|
||||
// via ctx.waitUntil — the response doesn't wait, and a binding
|
||||
// failure (Convex outage, bad env, non-2xx, timeout) logs a warning
|
||||
// but never turns into a 503. See module docstring for the
|
||||
// rationale; an earlier blocking design caused homepage-wide
|
||||
// outages on every flake. The mutation is idempotent so the next
|
||||
// request retries.
|
||||
ctx.waitUntil(
|
||||
registerReferralCodeInConvex(session.userId, code).catch((err: unknown) => {
|
||||
// Narrow rather than cast — a future path that throws a
|
||||
// non-Error value must not turn this warning into "failed:
|
||||
// undefined". The helper today only throws Error instances,
|
||||
// so the instanceof branch is the common path.
|
||||
console.warn(
|
||||
'[api/referral/me] binding failed (non-blocking):',
|
||||
err instanceof Error ? err.message : String(err),
|
||||
);
|
||||
}),
|
||||
);
|
||||
|
||||
// No invite/conversion count is returned on the response. The
|
||||
// waitlist path (userReferralCredits) now credits correctly, but
|
||||
|
||||
@@ -114,21 +114,51 @@ describe('referral attribution resolves Clerk codes (waitlist path)', () => {
|
||||
assert.match(src, /\.index\("by_referrer_email",\s*\["referrerUserId",\s*"refereeEmail"\]\)/);
|
||||
});
|
||||
|
||||
it('/api/referral/me blocks on the binding + returns 503 on failure (not a dead share link)', async () => {
|
||||
it('/api/referral/me fires the Convex binding non-blocking and never 503s on binding failure', async () => {
|
||||
const { readFileSync } = await import('node:fs');
|
||||
const { fileURLToPath } = await import('node:url');
|
||||
const { dirname, resolve } = await import('node:path');
|
||||
const __d = dirname(fileURLToPath(import.meta.url));
|
||||
const src = readFileSync(resolve(__d, '../api/referral/me.ts'), 'utf-8');
|
||||
assert.match(src, /registerReferralCodeInConvex/, 'helper must exist');
|
||||
// REGRESSION: the earlier head used ctx.waitUntil(...) which
|
||||
// returned a 200 + share link to the user even if the binding
|
||||
// silently failed (missing env, non-2xx, network). That handed
|
||||
// users dead links. Binding must block the response.
|
||||
assert.match(src, /await registerReferralCodeInConvex/, 'binding must be awaited, not fire-and-forget');
|
||||
assert.doesNotMatch(src, /ctx\.waitUntil\(registerReferralCodeInConvex/, 'must NOT use waitUntil for the binding');
|
||||
// Failure path must return 503 rather than a 200 with a dead link.
|
||||
assert.match(src, /binding failed[\s\S]{0,200}service_unavailable[\s\S]{0,50}503/, 'binding failure returns 503');
|
||||
// CURRENT CONTRACT: binding is fire-and-forget via ctx.waitUntil.
|
||||
//
|
||||
// An earlier iteration (the "await + 503 on failure" shape this
|
||||
// test used to enforce) turned a flaky Convex relay call into a
|
||||
// homepage-wide 503 for every PRO user — the homepage fetches
|
||||
// this endpoint on mount, so one bad Convex response broke the
|
||||
// 5-minute cache window for everyone. The mutation is idempotent
|
||||
// and the /pro?ref=<code> signup side re-reads at conversion
|
||||
// time, so a missed binding degrades to missed attribution
|
||||
// rather than outright breakage.
|
||||
assert.match(
|
||||
src,
|
||||
/ctx\.waitUntil\(\s*registerReferralCodeInConvex/,
|
||||
'binding must be dispatched via ctx.waitUntil (non-blocking)',
|
||||
);
|
||||
// Handler must accept a second ctx arg with waitUntil — matches
|
||||
// the notification-channels + discord-oauth handler shapes.
|
||||
assert.match(
|
||||
src,
|
||||
/export\s+default\s+async\s+function\s+handler\s*\(\s*req:\s*Request,\s*ctx:\s*\{\s*waitUntil:/,
|
||||
'handler signature must take ctx with waitUntil',
|
||||
);
|
||||
// MUST NOT 503 on a binding failure — the whole point of the
|
||||
// non-blocking shape. Handler must not mention "binding failed"
|
||||
// anywhere near a 503 response code.
|
||||
assert.doesNotMatch(
|
||||
src,
|
||||
/binding failed[\s\S]{0,200}service_unavailable[\s\S]{0,50}503/,
|
||||
'binding failure must not return 503',
|
||||
);
|
||||
// BRIEF_URL_SIGNING_SECRET missing still legitimately 503s
|
||||
// (different codepath; we can't mint a code without the secret).
|
||||
// That's intentional, and unrelated to the Convex binding.
|
||||
assert.match(
|
||||
src,
|
||||
/BRIEF_URL_SIGNING_SECRET is not configured[\s\S]{0,200}503/,
|
||||
'missing signing secret still 503s',
|
||||
);
|
||||
assert.match(src, /\/relay\/register-referral-code/, 'must POST to the Convex HTTP action');
|
||||
});
|
||||
|
||||
|
||||
Reference in New Issue
Block a user