System Design for Frontend Engineers: Latency, Consistency, and Why Every Decision Is a Tradeoff - AlexWebLab

When people say "system design," many frontend engineers hear "backend interview trivia." That framing is wrong. The browser is a distributed-system client. Every time your React app waits on an API, retries a failed mutation, renders stale cache data, or handles an optimistic update rollback, you are dealing with system-design tradeoffs whether you call them that or not.

This article is the foundation for the series. Not the memorized-interview version. The practical version: how to reason about latency, throughput, consistency, availability, and bottlenecks from the point of view of the user staring at your UI.

The Browser Sits at the End of a Long Chain

When a user clicks "Place order," the request leaves the browser and passes through a chain that might include:

DNS lookup
CDN edge
Load balancer
API gateway
Auth service
Order service
Payment provider
Database write
Queue for email and analytics

If any one of those pieces is slow, overloaded, or inconsistent, the user experiences the problem in the UI. That is why system design matters to frontend engineers: the frontend is where distributed-system behavior becomes visible.

Click button
  -> request starts
  -> edge cache miss
  -> origin fetch
  -> service calls downstream dependencies
  -> database commit succeeds
  -> analytics event times out
  -> partial success reaches the UI

The frontend engineer now has to answer hard questions:

Do we show success, failure, or "processing"?
Do we retry automatically?
Is it safe to retry, or will we create a duplicate order?
Can stale cache data be shown while fresh data loads?
How much latency can we hide with optimistic UI before correctness suffers?

That is system design.
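
One common answer to the duplicate-order question is an idempotency key: the client generates one key per user intent and reuses it on every retry, so the server can recognize a repeat. What follows is a minimal in-memory sketch under that assumption. `OrderServer`, `placeOrder`, `submitWithRetry`, and the key format are hypothetical names for illustration, and a real backend would persist keys rather than hold them in a `Map`.

```typescript
type Order = { id: number; item: string }

// Hypothetical stand-in for an order service that deduplicates by key.
class OrderServer {
  private byKey = new Map<string, Order>()
  private nextId = 1

  // Returns the already-created order when the same key is seen again.
  placeOrder(idempotencyKey: string, item: string): Order {
    const existing = this.byKey.get(idempotencyKey)
    if (existing) return existing
    const order: Order = { id: this.nextId++, item }
    this.byKey.set(idempotencyKey, order)
    return order
  }

  get orderCount(): number {
    return this.byKey.size
  }
}

// Client side: one key per click, reused for every retry of that click.
// Math.random keeps the sketch portable; crypto.randomUUID() is a better
// source of keys where it is available.
function submitWithRetry(server: OrderServer, item: string, attempts: number): Order {
  const key = `order-${Date.now()}-${Math.random().toString(36).slice(2)}`
  let result = server.placeOrder(key, item)
  for (let i = 1; i < attempts; i++) {
    result = server.placeOrder(key, item) // simulated retry of the same intent
  }
  return result
}
```

With a key like this in place, "is it safe to retry?" has a clear answer: retries of the same click return the same order, while a fresh click gets a fresh key and creates a new one.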

Every Decision Is a Tradeoff

One principle underlies everything in this series: every decision is a tradeoff. There is no architecture that is fast, cheap, simple, globally available, strongly consistent, and infinitely scalable at the same time.

Examples:

A CDN reduces latency globally, but invalidation becomes harder.
A replica database improves read throughput, but introduces replication lag.
A queue smooths traffic spikes, but makes workflows asynchronous and eventually consistent.
A rate limiter protects the backend, but can create UI friction if the client has no backoff strategy.

The wrong mental model is "what is the best architecture?" The right mental model is "which failure mode are we willing to accept for this product?"
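
The rate-limiter example hints at what a client-side backoff strategy might look like. The sketch below uses exponential backoff with full jitter and prefers a server-provided `Retry-After` value when one exists. The function names, the 200ms base, and the 10s cap are illustrative assumptions, not recommendations.

```typescript
// Delay before retry number `attempt` (0-based): exponential growth,
// capped at `capMs`, then randomized so many clients do not retry in lockstep.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 200,
  capMs: number = 10000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt)
  return Math.floor(random() * ceiling)
}

// Prefer the server's Retry-After hint (in seconds) when a response carries one.
function retryDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000
  }
  return backoffDelayMs(attempt)
}
```

The `random` parameter is injectable only so the behavior can be tested deterministically. In application code the default `Math.random` is enough.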

Latency, Throughput, and the User's Patience Budget

Two numbers matter constantly in product engineering:

Latency: how long one operation takes.
Throughput: how many operations a system can handle over time.

You can often improve one while hurting the other. Batching requests may improve throughput but increase per-request latency. Synchronous validation across many services may improve correctness but slow the response path.

From the UI side, users do not care which layer caused the problem. They only see outcomes:

Under 100ms: it feels immediate.
Around 300ms: it still feels responsive, but the delay is noticeable.
1 second and above: users start wondering if something is wrong.
3 seconds and above: drop-off risk increases sharply.

That is why system design and UX are linked. Architecture decides whether your interface feels instant, sluggish, or unreliable.
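
One way to make the patience budget concrete is to bucket measured latency and choose UI feedback accordingly. The thresholds below come from the list above. The bucket names, the treatment of the 100ms to 1s range as a single "responsive" bucket, and the feedback strings are assumptions made for this sketch.

```typescript
type Perception = 'immediate' | 'responsive' | 'slow' | 'at-risk'

// Bucket a measured latency using the rough thresholds from the list above.
function perceive(latencyMs: number): Perception {
  if (latencyMs < 100) return 'immediate'
  if (latencyMs < 1000) return 'responsive'
  if (latencyMs < 3000) return 'slow'
  return 'at-risk'
}

// Hypothetical mapping from perception to the feedback the UI shows.
const FEEDBACK: Record<Perception, string> = {
  immediate: 'none',
  responsive: 'subtle pressed state',
  slow: 'spinner',
  'at-risk': 'progress message with cancel option',
}

function feedbackFor(latencyMs: number): string {
  return FEEDBACK[perceive(latencyMs)]
}
```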

Consistency Is a Product Decision, Not Just a Database Property

Consistency is usually explained with databases, but the frontend feels it first.

Imagine a dashboard where the user updates a profile photo. The write goes to the primary database. The next page load reads from a replica that is 800ms behind. The UI shows the old photo. Technically the system is working. Product-wise it feels broken.

This is why you need to know the difference between:

Strong consistency: every read reflects the latest successful write.
Eventual consistency: reads may temporarily return older data, but the system converges.

The frontend response must match the backend guarantee:

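// Assumes `api`, a TanStack Query-style `queryClient`, and the `setUser` state setter are in scope,
// and that the 'user-profile' query feeds the same user state.
async function saveProfilePhoto(file: File) {
  // Optimistic step: show the new photo at once from a local object URL.
  const optimisticUrl = URL.createObjectURL(file)
  setUser((current) => ({ ...current, avatarUrl: optimisticUrl }))

  try {
    await api.uploadAvatar(file)
  } catch (error) {
    // Upload failed: log it; the refetch below restores the server's copy.
    console.error('Avatar upload failed', error)
  } finally {
    // On success or failure, refetch so the UI converges on what the server stored.
    await queryClient.invalidateQueries({ queryKey: ['user-profile'] })
  }
}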

If the backend is eventually consistent, optimistic UI and deliberate revalidation are not optional polish. They are how you keep the product feeling coherent.
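
For the lagging-replica case, one possible complement is a read-your-own-writes overlay: keep showing the local write until the server copy has caught up. The sketch assumes the backend returns a monotonically increasing `version` with each write and each read. That field and the names `Profile`, `PendingWrite`, and `visibleAvatar` are hypothetical.

```typescript
type Profile = { avatarUrl: string; version: number }
type PendingWrite = { avatarUrl: string; version: number }

// Prefer the local write until the server copy is at least as new.
function visibleAvatar(server: Profile, pending: PendingWrite | null): string {
  if (pending !== null && server.version < pending.version) {
    return pending.avatarUrl // replica has not caught up yet
  }
  return server.avatarUrl // server copy reflects (or supersedes) our write
}
```

In the profile-photo scenario, a read from a replica that is still on the old version would keep rendering the user's new photo instead of flashing the old one.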

Availability and Reliability Are Different Things

People often blur these together.

Availability asks: can the system respond right now?
Reliability asks: does it keep behaving correctly over time?

A service can be highly available and still unreliable if it returns stale data, times out frequently under load, or drops background jobs. Conversely, a very strict system may preserve correctness while sacrificing availability during incidents.

Frontend engineers need this distinction because the UI has to encode it:

If the write path is unavailable, disable the action and tell the user when or how to retry.
If the read path is available but stale, show a freshness hint.
If the system accepted the action asynchronously, show a processing state rather than lying with a fake success.
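
These three rules can be encoded as explicit UI states rather than ad hoc booleans. The sketch below assumes a backend that answers asynchronous writes with HTTP 202 and signals overload with 429 or 503. The type and function names, the messages, and the 60-second staleness threshold are illustrative.

```typescript
type ActionUi =
  | { kind: 'success' }
  | { kind: 'processing'; message: string }
  | { kind: 'unavailable'; canRetry: true }
  | { kind: 'failed'; message: string }

// Map a write response's status code to what the UI should claim.
function uiForWriteResponse(status: number): ActionUi {
  if (status === 202) return { kind: 'processing', message: 'We are processing your request.' }
  if (status >= 200 && status < 300) return { kind: 'success' }
  if (status === 429 || status === 503) return { kind: 'unavailable', canRetry: true }
  return { kind: 'failed', message: 'Something went wrong.' }
}

// Return a freshness hint for cached reads, or null when the data is recent.
function freshnessHint(fetchedAtMs: number, nowMs: number, staleAfterMs: number = 60000): string | null {
  const ageMs = nowMs - fetchedAtMs
  if (ageMs < staleAfterMs) return null
  return `Updated ${Math.round(ageMs / 60000)} min ago`
}
```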

The Four Questions to Ask About Any System

When you look at a new architecture, ask these four questions first:

Where is the bottleneck? CPU, database locks, network round-trips, cache misses, third-party APIs?
What happens under failure? Retry, fail open, fail closed, queue, degrade gracefully?
What is allowed to be stale? Product catalog, analytics, profile data, payment status?
How does the frontend learn the truth? Polling, push events, revalidation, optimistic rollback?

These questions are more valuable than memorizing architecture buzzwords because they force you to reason from product behavior back to infrastructure.
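
For the fourth question, polling is usually the simplest option to sketch. The example below polls a status endpoint until the job settles or the attempt budget runs out. `JobStatus`, `decide`, `pollUntilSettled`, and the default limits are hypothetical, and `fetchStatus` stands in for whatever call your API exposes.

```typescript
type JobStatus = 'pending' | 'succeeded' | 'failed'
type PollDecision = 'done' | 'poll-again' | 'give-up'

// Pure decision step: stop on a terminal status, otherwise respect the budget.
function decide(status: JobStatus, attempt: number, maxAttempts: number): PollDecision {
  if (status !== 'pending') return 'done'
  return attempt + 1 < maxAttempts ? 'poll-again' : 'give-up'
}

// Poll until the job settles, waiting `intervalMs` between attempts.
async function pollUntilSettled(
  fetchStatus: () => Promise<JobStatus>,
  maxAttempts: number = 10,
  intervalMs: number = 1000,
): Promise<JobStatus | 'timed-out'> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus()
    const decision = decide(status, attempt, maxAttempts)
    if (decision === 'done') return status
    if (decision === 'give-up') break
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
  }
  return 'timed-out'
}
```

A `'timed-out'` result is deliberately distinct from `'failed'`: the UI should say "still processing, check back later" rather than claim an outcome it does not know.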

A Frontend-Centric Mental Model

For frontend and React work, system design usually shows up in six recurring forms:

Read path design: cache, CDN, replicas, indexes, and query shape.
Write path design: synchronous writes, queues, idempotency, and retries.
Failure containment: circuit breakers, timeouts, fallback data, disabled UI.
Traffic shaping: rate limiting, autoscaling, batching, and load balancing.
Consistency management: optimistic UI, invalidation, revalidation, conflict resolution.
Observability in the product: loading states, error states, freshness indicators, and recovery paths.

If you train yourself to map every feature onto those six concerns, system design stops feeling abstract.
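
As one illustration of the failure-containment concern, here is a minimal circuit-breaker sketch that a client could consult before enabling an action. The class shape, the failure threshold, and the cooldown are assumptions for this example and do not describe any particular library's API. Time is passed in explicitly so the behavior is easy to test.

```typescript
class CircuitBreaker {
  private failures = 0
  private openedAt: number | null = null
  private readonly threshold: number
  private readonly cooldownMs: number

  constructor(threshold: number = 3, cooldownMs: number = 5000) {
    this.threshold = threshold
    this.cooldownMs = cooldownMs
  }

  // Closed: always allow. Open: allow a trial request only after the cooldown.
  canRequest(nowMs: number): boolean {
    if (this.openedAt === null) return true
    return nowMs - this.openedAt >= this.cooldownMs
  }

  // A success closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0
    this.openedAt = null
  }

  // Consecutive failures at or above the threshold open (or re-open) the breaker.
  recordFailure(nowMs: number): void {
    this.failures += 1
    if (this.failures >= this.threshold) this.openedAt = nowMs
  }
}
```

In a React component, something like `disabled={!breaker.canRequest(Date.now())}` would turn the breaker state into the disabled UI mentioned above.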

Conclusion

System design for frontend engineers is the discipline of understanding how distributed-system tradeoffs surface in user-facing behavior. Latency becomes loading states. Consistency becomes stale or fresh UI. Availability becomes whether the button should be clickable. Reliability becomes whether retrying is safe.

The rest of this series drills into the specific building blocks: monoliths and microservices, gateways and queues, rate limiters and failover, replication and sharding, caches and CDNs, and the data-system tradeoffs that shape what users ultimately experience.