
Case Study 02 · B2B SaaS

Building a Three-Domain
Marketplace Billing System
from the Ground Up

How I engineered a production-grade payment platform handling SaaS subscriptions, Stripe Connect marketplace payments, and PayPal — with webhook idempotency, atomic transactions, and zero billing errors across thousands of coach-client transactions.

Stripe Connect · Stripe Webhooks · PayPal · Ruby on Rails · PostgreSQL · B2B SaaS · Production
3×
Payment Domains Unified
$0
Duplicate Charges in Production
99%+
Platform Uptime Maintained
Webhook Events, Zero Duplicates

01 · The Problem

Three entirely different
payment problems, one system

The product is a B2B SaaS platform serving professional coaches worldwide. When I joined, the billing architecture had three separate, largely disconnected payment concerns that needed to be unified, hardened, and made production-reliable.

The fundamental challenge was that these three domains had completely different requirements, different Stripe accounts, different failure modes, and different compliance concerns — yet all lived inside one Rails application and needed to coexist without bleeding into each other.

🏢
Domain 01
Platform SaaS Billing
Coaches paying the platform for their monthly or annual subscription. Standard Stripe checkout, subscription management, failed payment recovery, and plan upgrades with proration.
🔗
Domain 02
Stripe Connect Marketplace
Coaches connecting their own Stripe accounts so their clients can pay for invoices, coaching programs, group sessions, and courses. The money flows from client to coach, not to the platform.
🅿️
Domain 03
PayPal Marketplace
Coaches connecting PayPal by email. Clients paying invoices and event bookings via PayPal Express Checkout. Completely separate from Stripe, different failure modes, different reconciliation.

What the system needed to solve — all at once

  • Stripe webhooks firing multiple times for the same event, causing duplicate subscription activations, plan changes, and invoice records
  • No clean separation between platform Stripe credentials and connected coach Stripe accounts — risk of cross-coach payment leakage
  • Race conditions during subscription checkout where two simultaneous browser tabs could create two subscriptions for one coach
  • Failed SaaS payments needed graceful degradation — keep access alive during retries, terminate only after all retries exhausted
  • Plan upgrades with proration were billing incorrectly when interval changed from monthly to annual
  • PayPal execute callbacks had no idempotency — network retries could mark the same invoice paid twice

02 · Architecture

Clean isolation across
three billing domains

The first architectural decision was absolute: platform billing and marketplace billing must never share credentials or state. A bug in one domain must never affect the other. Here is how the system is structured:

billing-architecture.txt — platform overview
PLATFORM SAAS BILLING  (coach pays platform)
  SubscriptionsController
    → Stripe Platform Account  (credentials.stripe_secret_key)
    → business_plans           (local subscription state)
    → stripe_invoices          (local invoice cache)
    → payment_methods          (card metadata only, never full card)
  StripeWebhooksController     (POST /stripe_webhook)
    → processed_stripe_events  (idempotency — unique stripe_event_id)
    → StripeBusinessPlanSyncService

STRIPE CONNECT MARKETPLACE  (client pays coach)
  StripeController / PaymentsController
    → omini_auths              (provider: stripe_connect, token, o_uid)
    → PaymentService → StripePaymentService
    → Stripe.api_key = coach_token  (never platform key)
    → stripe_account: coach_o_uid   (full isolation)
  StripeWebhooksController     (POST /stripe_connect_webhook)
    → invoices / payment_histories / client_subscriptions

PAYPAL MARKETPLACE  (client pays coach via PayPal)
  PaypalController
    → omini_auths              (provider: paypal, email)
    → PaypalCheckoutService    (Express Checkout)
    → payment_histories        (first_or_create on payment_token)
    → invoices / event_bookings

SHARED LAYER  (common to all domains)
  → invoices                   (coach-created, provider-agnostic)
  → payment_histories          (all successful payments)
  → invoice_histories          (immutable audit trail)
      

Two separate webhook endpoints — /stripe_webhook and /stripe_connect_webhook — with different signing secrets mean a connected account event can never accidentally trigger platform billing logic. This boundary is enforced at the infrastructure level, not just in code.
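To make that boundary concrete, here is a minimal sketch of why a payload signed for one endpoint fails verification at the other. It reimplements Stripe's documented `t=...,v1=...` HMAC-SHA256 header scheme with the standard library purely for illustration. The secrets and the helper names (`sign`, `verify_signature`, `constant_time_equal?`) are hypothetical; in a real controller the stripe gem's `Stripe::Webhook.construct_event` performs this check.

```ruby
require "openssl"

# Hypothetical secrets; real ones come from each endpoint's Stripe settings.
PLATFORM_SECRET = "whsec_platform_example"
CONNECT_SECRET  = "whsec_connect_example"

# Compares two strings without short-circuiting on the first difference.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# Builds a Stripe-Signature style header: "t=<unix time>,v1=<hex HMAC>".
def sign(payload, secret, timestamp)
  digest = OpenSSL::HMAC.hexdigest("SHA256", secret, "#{timestamp}.#{payload}")
  "t=#{timestamp},v1=#{digest}"
end

# Returns true only if the header was produced with this endpoint's secret
# and the timestamp is inside the tolerance window (in seconds).
def verify_signature(payload, header, secret, now:, tolerance: 300)
  parts = header.split(",").map { |pair| pair.split("=", 2) }.to_h
  timestamp = Integer(parts.fetch("t"))
  return false if (now - timestamp).abs > tolerance

  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, "#{timestamp}.#{payload}")
  constant_time_equal?(expected, parts.fetch("v1"))
rescue KeyError, ArgumentError
  false # malformed header
end
```

A connected-account event posted to the platform endpoint is signed with the wrong secret there, so it is rejected before any billing logic runs.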

03 · The Hard Parts

Five engineering problems
that actually mattered

Problem 1: Webhook idempotency. Stripe retries failed webhook deliveries for up to 72 hours. Without idempotency, a network timeout after processing but before responding would cause the same subscription-creation event to be handled twice: activating two plans, sending two emails, granting double SMS credits.

app/controllers/stripe_webhooks_controller.rb Ruby
# Insert FIRST; process ONLY if the insert succeeds
begin
  ProcessedStripeEvent.create!(
    stripe_event_id:   event.id,
    event_type:        event.type,
    stripe_created_at: Time.at(event.created)
  )
rescue ActiveRecord::RecordNotUnique
  # Unique index on stripe_event_id rejected the insert:
  # the event was already processed, so return 200 OK silently
  return head :ok
end

Billing::StripeBusinessPlanSyncService.new(event).handle
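The insert-first pattern can be exercised without a database. The sketch below is an illustration only: `ProcessedEventStore` is a hypothetical in-memory stand-in for the `processed_stripe_events` table and its unique index, and `handle_webhook` is a made-up wrapper, not the production controller.

```ruby
require "set"

# In-memory stand-in for the processed_stripe_events table and its unique
# index on stripe_event_id (hypothetical; the real store is PostgreSQL).
class ProcessedEventStore
  class DuplicateEvent < StandardError; end

  def initialize
    @ids = Set.new
    @lock = Mutex.new
  end

  # Mirrors an INSERT guarded by a unique index: raises on a repeat.
  def insert!(event_id)
    @lock.synchronize do
      raise DuplicateEvent, event_id unless @ids.add?(event_id)
    end
  end
end

# Runs the side effect (the block) at most once per event id.
def handle_webhook(store, event_id)
  store.insert!(event_id)
  yield
  :processed
rescue ProcessedEventStore::DuplicateEvent
  :duplicate_ignored
end
```

One trade-off worth stating: if the handler fails after the insert has been committed, later retries are treated as duplicates. Running the insert and the handler in a single database transaction is one way to close that gap.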

Problem 2: Race condition during checkout. Two browser tabs opened by the same coach could both reach the Stripe checkout creation endpoint before either received a response, creating two Stripe subscriptions that are both billed on day one.

app/controllers/subscriptions_controller.rb Ruby
current_user.with_lock do
  # Reuse unexpired pending checkout session
  if current_user.pending_checkout_session_id.present? &&
     current_user.checkout_session_expires_at&.future?  # nil-safe expiry check
    return render_existing_session
  end

  # Block if live Stripe subscription already exists
  ensure_no_live_stripe_subscription!(customer)

  # Only now: create new Checkout Session
  session = Stripe::Checkout::Session.create(...)
  current_user.update!(pending_checkout_session_id: session.id, ...)
end
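The lock-then-reuse flow above can be tried out in plain Ruby. This is a sketch under stated assumptions: `CheckoutGuard` is a hypothetical in-memory stand-in for the user row, a `Mutex` plays the role of the row lock that `with_lock` takes, and the session ids are fabricated placeholders rather than real Stripe ids.

```ruby
# In-memory stand-in for a user row. The Mutex plays the role of the
# database row lock taken by with_lock.
class CheckoutGuard
  attr_reader :sessions_created

  def initialize(session_ttl: 1800)
    @lock = Mutex.new
    @session_ttl = session_ttl # seconds a pending session stays reusable
    @pending_id = nil
    @expires_at = nil
    @sessions_created = 0
  end

  # Returns the pending session id, creating one only if none is live.
  def checkout_session_id(now: Time.now)
    @lock.synchronize do
      return @pending_id if @pending_id && @expires_at > now

      @sessions_created += 1
      @pending_id = "cs_test_#{@sessions_created}" # placeholder id
      @expires_at = now + @session_ttl
      @pending_id
    end
  end
end
```

Eight concurrent callers all receive the same session id, and only one session is ever created until it expires.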

Problem 3: Plan upgrade proration with interval change. Upgrading from monthly to annual required resetting the billing cycle anchor to now — otherwise Stripe prorated against the old monthly anchor, generating incorrect invoices.

app/controllers/subscriptions_controller.rb Ruby
interval_changed = current_plan.interval != new_plan.interval

# Keep the returned subscription so the payment check below can inspect it
stripe_subscription = Stripe::Subscription.update(
  stripe_subscription_id,
  items:                 rebuilt_items,
  proration_behavior:    'always_invoice',
  payment_behavior:      'pending_if_incomplete',
  billing_cycle_anchor:  interval_changed ? 'now' : 'unchanged'
)

# Only update local plan if payment intent actually succeeded
# Declined card = no local state change, user stays on old plan
if payment_captured?(stripe_subscription)
  business_plan.update!(plan: new_plan, ...)
end
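A worked example helps show what resetting the anchor changes. The sketch below is simplified arithmetic, not Stripe's exact algorithm: it assumes the upgrade credits the unused share of the old period and charges the full new price for a period starting now. The prices, the helper names, and the round-down rule are all assumptions made for illustration.

```ruby
# Credit for the unused part of the current period, in cents.
# Times are Unix timestamps; rounds down to whole cents (an assumption).
def unused_time_credit(amount_cents, period_start, period_end, now)
  remaining = period_end - now
  total     = period_end - period_start
  (amount_cents * remaining) / total
end

# Amount due immediately when the anchor resets to now: the full new price
# for a fresh period, minus the credit for unused time on the old price.
def due_on_upgrade(old_cents, new_cents, period_start, period_end, now)
  new_cents - unused_time_credit(old_cents, period_start, period_end, now)
end
```

With a hypothetical $30.00 monthly plan upgraded to a $300.00 annual plan ten days into a thirty-day period, the credit is $20.00 and the amount due is $280.00, with the annual period starting on the upgrade date.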

Problem 4: Connected account isolation. Every Stripe Connect API call must use the coach's token, not the platform key. A single wrong credential resolves to a different Stripe account entirely — charging or reading data from another coach's account.

app/services/stripe_payment_service.rb Ruby
stripe_auth = coach.omini_auths.find_by(provider: 'stripe_connect')

# Pass the coach's connected token per request, never the platform key.
# Assigning Stripe.api_key globally would leak the token across threads.
session = Stripe::Checkout::Session.create(
  {
    line_items:  line_items,
    mode:        'payment',
    success_url: success_url,
    cancel_url:  cancel_url,
    metadata:    metadata
  },
  {
    api_key:        stripe_auth.token,
    stripe_account: stripe_auth.o_uid  # explicit account header
  }
)
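One way to make the isolation rule hard to violate is to build the request options in a single guarded place. The helper below is a hypothetical sketch, not code from the production system: `ConnectAuth` stands in for an `omini_auths` row, and `connect_request_opts` and `CredentialIsolationError` are invented names. The `api_key` and `stripe_account` keys are the per-request options the stripe gem accepts as its second argument.

```ruby
# Minimal stand-in for a row in omini_auths (only the fields used here).
ConnectAuth = Struct.new(:token, :o_uid, keyword_init: true)

class CredentialIsolationError < StandardError; end

# Builds the per-request options for a connected-account call and refuses
# anything that looks like missing or platform-level credentials.
def connect_request_opts(auth, platform_key:)
  raise CredentialIsolationError, "no connected account" if auth.nil?
  raise CredentialIsolationError, "missing token" if auth.token.to_s.empty?
  raise CredentialIsolationError, "missing account id" if auth.o_uid.to_s.empty?
  raise CredentialIsolationError, "platform key used for coach call" if auth.token == platform_key

  { api_key: auth.token, stripe_account: auth.o_uid }
end
```

Every marketplace call can then take its options from this helper, so a missing or mixed-up credential fails loudly before any request leaves the application.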

Problem 5: Failed payment graceful degradation. Stripe retries failed subscription payments automatically. The wrong behaviour is to immediately terminate access on first failure. The correct behaviour is to keep access alive during retries and only terminate after the final retry fails.

app/services/billing/stripe_business_plan_sync_service.rb Ruby
def handle_payment_failed(invoice)
  subscription = Stripe::Subscription.retrieve(invoice.subscription)

  # next_payment_attempt is a field on the invoice, not the subscription
  if invoice.next_payment_attempt.present?
    # Stripe has another retry scheduled — keep access active
    sync_payment_status(subscription)
    send_payment_warning_email
  else
    # Final failure — no more retries scheduled
    Stripe::Subscription.cancel(subscription.id)
    Stripe::Invoice.void_invoice(invoice.id)
    business_plan.update!(is_plan_active: false, status: false)
    business.update!(terminated: true)
    send_suspension_emails
  end
end
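The retry-aware behaviour amounts to a small local state machine. The sketch below is illustrative only: the state and event names are hypothetical, and the one Stripe detail it relies on is that an invoice's `next_payment_attempt` is empty once no further retry is scheduled.

```ruby
# Local access states for a coach's plan (state and event names are
# hypothetical labels for this sketch).
TRANSITIONS = {
  [:active,   :payment_failed_retry_scheduled] => :past_due,
  [:past_due, :payment_failed_retry_scheduled] => :past_due,
  [:active,   :payment_failed_final]           => :terminated,
  [:past_due, :payment_failed_final]           => :terminated,
  [:active,   :payment_succeeded]              => :active,
  [:past_due, :payment_succeeded]              => :active
}.freeze

# Maps the invoice's next_payment_attempt (Unix time or nil) to an event.
def classify_failure(next_payment_attempt)
  next_payment_attempt.nil? ? :payment_failed_final : :payment_failed_retry_scheduled
end

# Unknown combinations leave the state unchanged.
def next_state(state, event)
  TRANSITIONS.fetch([state, event], state)
end

# Access survives every state except termination.
def access_allowed?(state)
  state != :terminated
end
```

A coach in `:past_due` keeps access and returns to `:active` if a retry succeeds. Only a failure with no retry scheduled moves the plan to `:terminated`.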

04 · Engineering Decisions

Why the system
is built this way

🔐

Two webhook endpoints, two signing secrets

Platform events and connected account events use different Stripe webhook secrets and different endpoints. A misconfigured connected account can never accidentally trigger platform billing logic.

📋

Insert-first idempotency pattern

The stripe_event_id is inserted with a unique constraint before any side effects run. If the insert fails, the event was already processed. No risk of double-activation, double-deduction, or double-email under any retry scenario.

🔒

Pessimistic locking during checkout creation

with_lock on the user record during checkout session creation prevents two simultaneous requests from both creating Stripe subscriptions. The second request reuses the first session or fails cleanly.

💾

Local invoice cache reduces Stripe API calls

stripe_invoices acts as a local cache for hosted URLs, PDFs, status, and amounts. Common reads never hit the Stripe API, avoiding rate limits and reducing latency for billing history pages.

Graceful degradation on payment failure

Access remains active during Stripe retry windows. Only the final failed payment triggers termination. This matches how real users expect billing failures to behave and reduces support tickets from false lockouts.

🔄

PayPal first_or_create idempotency

PaymentHistory.first_or_create on payment_token prevents a PayPal callback retry from marking the same invoice paid twice. The same pattern protects Stripe invoice payments from duplicate payment_histories.
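The find-or-create idea can be sketched without ActiveRecord. `PaymentLedger` below is a hypothetical in-memory stand-in for `payment_histories`, with a `Mutex` doing the job a unique index on `payment_token` would do in PostgreSQL. That index is an assumption on my part: in ActiveRecord, `first_or_create` on its own is a check followed by an insert, so it needs such a constraint to be safe under concurrency.

```ruby
# In-memory stand-in for payment_histories keyed by payment_token.
class PaymentLedger
  def initialize
    @by_token = {}
    @lock = Mutex.new
  end

  # Returns [record, created]. created is false when the token was already
  # recorded, which tells the caller to skip marking the invoice paid again.
  def first_or_create(payment_token, attrs)
    @lock.synchronize do
      existing = @by_token[payment_token]
      return [existing, false] if existing

      record = attrs.merge(payment_token: payment_token).freeze
      @by_token[payment_token] = record
      [record, true]
    end
  end

  def count
    @lock.synchronize { @by_token.size }
  end
end
```

A retried callback carrying the same token gets the original record back with `created` set to false, so the paid transition runs once.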

"The billing system taught me that financial engineering is not about integrating an API. It is about anticipating every way a distributed system can lie to you — duplicate events, concurrent requests, network timeouts between charge and response — and building explicit defences against each one."

05 · What I Learned

Ten things this system
taught me as an engineer

01

Idempotency is the first requirement of any payment system

Networks fail between charge and response. Stripe retries webhooks. Users double-click buttons. Every financial side effect must be safe to trigger multiple times without multiplying the result.

02

Credential isolation is a security boundary, not a code style

Platform Stripe credentials and connected account credentials must never be confused. One wrong variable assignment charges the wrong account. Explicit account headers on every connected API call enforce this at the Stripe API level.

03

Local state is the source of truth, not the payment provider

Stripe's state and your database can diverge. Always sync local records from webhook events, not from real-time API calls. Your database is what your users see — keep it accurate and cache aggressively.

04

Billing cycle anchors matter more than the subscription update itself

Upgrading a monthly plan to annual without resetting the billing anchor generates incorrect prorations. The billing_cycle_anchor parameter is one line of code with enormous financial consequences.

05

Graceful degradation protects both users and business

Immediately terminating access on first payment failure loses customers who would have paid if given time. Stripe's retry window exists for a reason. Honour it in your local state machine.

06

Stripe Prices are immutable — design around it

You cannot update amount, currency, or interval on an existing Stripe Price. Archive the old one and create a new one. Building this into the update flow from the start prevents production errors later.

07

Webhook signature verification is non-negotiable

Unsigned webhook payloads must never be processed in production. Anyone who can hit your endpoint can fake a payment success event. The signature check is the entire security model of webhook-driven billing.

08

Multiple payment providers means multiple failure modes

Stripe and PayPal fail differently, retry differently, and represent success differently. Building a shared invoice and payment_history layer on top abstracts these differences and keeps reporting consistent.

09

Checkout session reuse prevents duplicate subscriptions

Users open multiple tabs. Caching the pending checkout session and reusing it within its expiry window is a simple, effective defence against duplicate subscription creation that most tutorials never mention.

10

Audit trails are legal infrastructure, not nice-to-haves

invoice_histories creates an immutable record of every state transition. payment_histories records every successful charge. These are not logs — they are the evidence trail for disputes, chargebacks, and reconciliation.
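Lesson 06 above (immutable Prices) lends itself to a small decision helper. The sketch below is hypothetical: the attribute hash is flattened for brevity (in the Stripe API the interval sits under `recurring`), and the step names are labels for this illustration, not Stripe API calls.

```ruby
# Fields that cannot be changed on an existing Stripe Price.
IMMUTABLE_PRICE_FIELDS = %i[unit_amount currency interval].freeze

# Returns the steps needed to move from the current price attributes to the
# desired ones. Step names are hypothetical labels, not Stripe API calls.
def price_update_steps(current, desired)
  changed = desired.reject { |key, value| current[key] == value }.keys
  return [] if changed.empty?

  if (changed & IMMUTABLE_PRICE_FIELDS).any?
    %i[create_new_price repoint_references archive_old_price]
  else
    [:update_in_place]
  end
end
```

Changing only a nickname resolves to an in-place update, while changing the amount resolves to the create, repoint, and archive sequence.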

06 · What I'd Do Next

If I were continuing
to improve this system

Reliability

Stripe API call optimization

Several flows use Stripe list endpoints where a direct retrieve by ID would suffice. Replacing list calls with ID-based retrieval reduces API latency and avoids pagination issues under load.

Security

Guest invoice payment authorization

Public invoice payment flows rely on invoice ID and session ID combinations. Adding a signed token to public payment URLs prevents unauthorized access to invoices not intended for that payer.
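A signed token of this kind can be sketched with the standard library. Everything named here is hypothetical (`TOKEN_SECRET`, `issue_invoice_token`, `verify_invoice_token`), and the format assumes invoice ids contain no `.` character, which holds for UUIDs. In a Rails application, `ActiveSupport::MessageVerifier` covers the same ground.

```ruby
require "openssl"

# Hypothetical secret; in Rails this could live in encrypted credentials.
TOKEN_SECRET = "replace-with-a-long-random-secret"

# Compares two strings without short-circuiting on the first difference.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# Issues "<invoice id>.<expiry>.<signature>" for use in a public URL.
def issue_invoice_token(invoice_id, expires_at, secret: TOKEN_SECRET)
  payload = "#{invoice_id}.#{expires_at.to_i}"
  "#{payload}.#{OpenSSL::HMAC.hexdigest('SHA256', secret, payload)}"
end

# Returns the invoice id if the token is authentic and unexpired, else nil.
def verify_invoice_token(token, now: Time.now, secret: TOKEN_SECRET)
  invoice_id, expiry, signature = token.to_s.split(".", 3)
  return nil if signature.nil?

  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, "#{invoice_id}.#{expiry}")
  return nil unless constant_time_equal?(expected, signature)

  expiry.to_i > now.to_i ? invoice_id : nil
end
```

The payment page would then load the invoice only from the id returned by `verify_invoice_token`, never from a raw id in the URL.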

Performance

Credit balance Redis caching

SMS credit balance is calculated via a database aggregation query on every SMS send. Caching with Redis and invalidating on transaction creation would eliminate this query from the hot path entirely.
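The caching idea is the cache-aside pattern with explicit invalidation. In the sketch below a plain Hash stands in for Redis and a block stands in for the aggregation query. `CreditBalanceCache` and its methods are hypothetical names, and a real version would also need to handle concurrent writers and cache expiry.

```ruby
# Cache-aside sketch. A Hash stands in for Redis, and the block given to
# the constructor stands in for the database aggregation query.
class CreditBalanceCache
  attr_reader :db_queries

  def initialize(&aggregate_query)
    @store = {}
    @aggregate_query = aggregate_query
    @db_queries = 0
  end

  # Hot path: returns the cached balance, querying only on a miss.
  def balance(business_id)
    @store.fetch(business_id) do
      @db_queries += 1
      @store[business_id] = @aggregate_query.call(business_id)
    end
  end

  # Call after a credit transaction has been committed.
  def invalidate(business_id)
    @store.delete(business_id)
  end
end
```

Repeated sends read the cached value, and the aggregation runs again only after a new transaction invalidates the entry.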

Reliability

Stripe Price archive cleanup

Each payment option update archives the old price and creates a new one. Over time this creates significant archived price accumulation. A background job to clean up stale archived prices would maintain account hygiene.

UX

Real-time payment status via webhooks

Currently clients poll for payment confirmation. A WebSocket or server-sent events channel that pushes the webhook result to the open browser tab would eliminate polling latency entirely.

Security

Stripe Connect credential rotation

Connected account tokens in omini_auths have no automatic rotation. Adding refresh token rotation on a schedule and alerting on expired tokens would prevent silent payment failures from stale credentials.

07 · Outcome

What shipped and
what it changed

3×
Separate payment domains unified under one Rails application
$0
Duplicate charges in production after idempotency implementation
99%+
Platform uptime maintained through payment failures and retries
Webhook events processed, zero duplicate side effects

The billing system now handles three completely different payment flows cleanly: coaches subscribing to the platform, clients paying coaches through Stripe Connect, and clients paying coaches through PayPal. All three domains write to the same invoice and payment_history layer for consistent reporting.

What I am most proud of is what did not happen. No duplicate subscriptions in production. No coach accidentally charged through another coach's account. No invoice marked paid twice from a PayPal callback retry. The defensive engineering held under real production conditions.

08 · Tech Stack

Everything that went
into this system

Ruby on Rails 6.1 · Stripe Ruby SDK · Stripe Connect · Stripe Webhooks · PayPal Express Checkout · PostgreSQL · Delayed Job · Vue 2 · Stripe JS · Rails Credentials · UUID primary keys · JSONB columns · Row-level locking