Goal

  • Build a distributed Payment Gateway Service
  • Use a PSP (Stripe, Razorpay, etc.)
  • Handle high traffic load, auth measures, and failures

Non-Functional Requirements

  • High Availability (99.9% uptime)
  • Consistency over availability when forced to choose
  • Atomic: a transaction either passes or fails, never "maybe" (even if inconvenient)
  • Reliability, dealing with:
    1. Cascading failures -> a dead service must not jam up traffic
    2. Dead services -> know when a service is down and mitigate
    3. Idempotency -> if the user submits the same request multiple times, do not re-execute the transaction within some TTL
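
To make the 99.9% uptime target concrete, the allowed downtime can be worked out directly. This is a small sketch; the function name `downtime_budget_hours` is made up for illustration, and a year is assumed to be 365 days.

```python
def downtime_budget_hours(availability_pct: float, period_hours: float) -> float:
    """Hours of downtime allowed in a period for a given availability target."""
    return (1 - availability_pct / 100) * period_hours


HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year

yearly = downtime_budget_hours(99.9, HOURS_PER_YEAR)
monthly = downtime_budget_hours(99.9, 30 * 24)

# 99.9% leaves roughly 8.76 hours per year, or about 43 minutes per 30-day month.
print(f"{yearly:.2f} h/year, {monthly * 60:.1f} min per 30-day month")
```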

Components

(Figure: Pasted image 20241030223743.png)

  • Payment Service: interacts with the PSP and publishes to the message queue (MQ) if the payment passes
  • PSP -> Stripe / Razorpay (interacts with banks)
  • Wallet Service: aggregate account value
  • Ledger Service: user details
  • Service Heartbeat: with many services running, checks which are alive
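
The Service Heartbeat component could be sketched roughly as below. This is an in-memory illustration only: the class name `HeartbeatMonitor`, the `timeout_s` parameter, and the injectable `clock` are all assumptions, not part of the design above.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per service and reports which are alive."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s   # hypothetical: silence longer than this means "dead"
        self.clock = clock           # injectable so the logic can be tested without sleeping
        self.last_seen = {}          # service name -> timestamp of last heartbeat

    def beat(self, service: str) -> None:
        """Record a heartbeat from a service."""
        self.last_seen[service] = self.clock()

    def is_alive(self, service: str) -> bool:
        """A service is alive if it has sent a heartbeat within the timeout."""
        seen = self.last_seen.get(service)
        return seen is not None and self.clock() - seen <= self.timeout_s

    def dead_services(self) -> list:
        """Known services whose last heartbeat is older than the timeout."""
        return sorted(s for s in self.last_seen if not self.is_alive(s))
```

Each service would call `beat("wallet")` (or similar) periodically, and the gateway would poll `dead_services()` to decide where to stop routing traffic.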

Solutions

Cascading Failures -> apply rate limiting based on 2 triggers:

  1. "x%" of users see latency >= "y" threshold in a "z" time window
  2. Service not detected after "k" pings -> limit how many messages of that type can be added until the old ones are processed (a kind of circuit breaker)
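
The two triggers above can be sketched as a simple breaker. This is a minimal, single-process illustration: the class and parameter names are hypothetical, and it counts slow requests rather than distinct users, which is a simplification of trigger 1.

```python
import time
from collections import deque


class CircuitBreaker:
    """Opens when too many recent requests are slow, or after k missed pings."""

    def __init__(self, slow_pct, latency_threshold_s, window_s,
                 max_missed_pings, clock=time.monotonic):
        self.slow_pct = slow_pct                        # "x%"
        self.latency_threshold_s = latency_threshold_s  # "y"
        self.window_s = window_s                        # "z"
        self.max_missed_pings = max_missed_pings        # "k"
        self.clock = clock
        self.samples = deque()                          # (timestamp, latency_s)
        self.missed_pings = 0

    def _evict(self, now):
        # Drop samples that have fallen out of the time window.
        while self.samples and now - self.samples[0][0] > self.window_s:
            self.samples.popleft()

    def record_latency(self, latency_s):
        now = self.clock()
        self.samples.append((now, latency_s))
        self._evict(now)

    def record_ping(self, ok):
        # A successful ping resets the counter; a failed one increments it.
        self.missed_pings = 0 if ok else self.missed_pings + 1

    def is_open(self):
        """True means: stop (or throttle) adding new messages for this service."""
        self._evict(self.clock())
        if self.missed_pings >= self.max_missed_pings:
            return True
        if not self.samples:
            return False
        slow = sum(1 for _, lat in self.samples if lat >= self.latency_threshold_s)
        return 100.0 * slow / len(self.samples) >= self.slow_pct
```

A producer would check `is_open()` before enqueueing a message for the affected service and back off while it returns `True`.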

Idempotency

  • Use an idempotency key (a unique 32-bit signed payload) -> pass it in an HTTP header -> duplicate requests (e.g. when the user refreshes) are disallowed for a limited, adjustable TTL.
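
A rough sketch of the server side of this idea follows. It is in-memory and not thread-safe; the `IdempotencyStore` name and its methods are hypothetical, and a real deployment would more likely keep the keys in a shared store with expiry. Here a duplicate within the TTL gets the stored result back instead of re-running the payment; rejecting the duplicate outright would be another valid choice.

```python
import time


class IdempotencyStore:
    """Remembers results per idempotency key for a limited TTL."""

    def __init__(self, ttl_s: float, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self._entries = {}  # key -> (expires_at, stored result)

    def execute(self, key: str, handler):
        """Run handler() once per key within the TTL; replay the result otherwise."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # duplicate within TTL: do not charge again
        result = handler()
        self._entries[key] = (now + self.ttl_s, result)
        return result
```

The request handler would read the key from the HTTP header and wrap the PSP call, e.g. `store.execute(key, lambda: charge(order))`, where `charge` stands in for whatever function performs the payment.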

High Availability

  1. Hash-based sharding (abstract the hash function, since it may change). What does it solve?
    1. If one DB is down, another can be used
    2. If user traffic is high, open a new connection pool on another DB
  2. Replication can be:
    1. Async -> more available, but we need consistency
    2. Sync -> slower (less available), but better consistency (the better fit for this case)
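
Abstracting the hash function, as suggested in point 1, might look like the sketch below. `ShardRouter`, `sha256_hash`, and the shard names are illustrative assumptions. The router uses plain modulo placement, so changing the number of shards remaps most keys; consistent hashing is a common alternative if that matters.

```python
import hashlib
from typing import Callable, Sequence


def sha256_hash(key: str) -> int:
    """Default hash: first 8 bytes of SHA-256, stable across processes."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ShardRouter:
    """Maps a key (e.g. a user ID) to a shard using a pluggable hash function."""

    def __init__(self, shards: Sequence[str],
                 hash_fn: Callable[[str], int] = sha256_hash):
        if not shards:
            raise ValueError("at least one shard is required")
        self.shards = list(shards)
        self.hash_fn = hash_fn  # swap this out without touching callers

    def shard_for(self, key: str) -> str:
        return self.shards[self.hash_fn(key) % len(self.shards)]


router = ShardRouter(["db-0", "db-1", "db-2"])  # hypothetical shard names
print(router.shard_for("user-42"))
```

Because callers only ever see `shard_for`, the hash function can be replaced later by passing a different `hash_fn`.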
