Streaming SSR: How Servers Send HTML Before the Page Is Ready
Traditional server-side rendering is synchronous: the server fetches everything the page needs, renders the full HTML string, and sends it all at once. The browser gets nothing until the server is done — and the server's speed is limited by its slowest data source.
Streaming SSR inverts this. The server starts sending HTML immediately with the parts it already knows, then flushes additional chunks as deferred data resolves. The browser can start parsing and rendering long before the server has finished.
The Problem with Waterfall SSR
In a classic Next.js Pages Router application, getServerSideProps runs before any HTML is sent:
Browser → Request → Server fetches A, B, C → Renders HTML → Sends all at once → Browser parses
If fetching A, B, and C takes 800ms total, the browser waits 800ms before it sees a single byte. For users on slow connections, or pages with many data sources, this is a significant delay.
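The arithmetic behind that delay can be sketched directly. A minimal model, assuming hypothetical per-source latencies (the names and the 10ms shell cost are illustrative, not measured):

```typescript
// Hypothetical latencies (ms) for the page's three data sources.
const latencies: Record<string, number> = { a: 300, b: 250, c: 250 }

// Waterfall SSR with sequential fetches: nothing is sent until every
// fetch has finished, so time-to-first-byte is the sum of all latencies.
function ttfbWaterfallSequential(l: Record<string, number>): number {
  return Object.values(l).reduce((sum, ms) => sum + ms, 0)
}

// Even if the server fetches in parallel, waterfall SSR still waits
// for the slowest source before sending the first byte.
function ttfbWaterfallParallel(l: Record<string, number>): number {
  return Math.max(...Object.values(l))
}

// Streaming SSR: the shell is flushed before any data fetch resolves,
// so time-to-first-byte is independent of the slowest source.
const SHELL_RENDER_MS = 10 // illustrative shell render cost
function ttfbStreaming(): number {
  return SHELL_RENDER_MS
}
```

With these numbers, waterfall SSR delays the first byte by 800ms (sequential) or 300ms (parallel), while streaming keeps it near the shell render cost.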
How Streaming Works at the HTTP Level
HTTP/1.1 supports chunked transfer encoding: the server sends a Transfer-Encoding: chunked header and writes the response body in multiple pieces. The browser receives and processes each chunk as it arrives, without waiting for the response to complete.
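The wire format is simple: each chunk is preceded by its size in hexadecimal, and a zero-length chunk terminates the body. A minimal sketch of the framing (not a full HTTP implementation):

```typescript
// Frame body pieces as an HTTP/1.1 chunked transfer encoding body:
// each chunk is "<hex byte length>\r\n<data>\r\n", and the body ends
// with the zero-length chunk "0\r\n\r\n".
function frameChunked(pieces: string[]): string {
  const framed = pieces
    .filter((p) => p.length > 0) // a zero-length chunk would end the body early
    .map((p) => {
      const size = new TextEncoder().encode(p).length // byte length, not char count
      return size.toString(16) + "\r\n" + p + "\r\n"
    })
  return framed.join("") + "0\r\n\r\n"
}
```

A streaming server writes each frame as it becomes available rather than joining them up front; `frameChunked(["<html>…", "<div>late content</div>"])` just shows what the concatenated bytes on the wire look like.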
React's renderToPipeableStream API (Node.js) and renderToReadableStream API (edge runtimes) both produce streaming output compatible with chunked transfer.
React Suspense as the Streaming Boundary
React's streaming integration is built around <Suspense>. When the server encounters a Suspense boundary wrapping a component that is not yet ready, it:
- Sends the fallback (spinner, skeleton) in the initial HTML flush
- Continues rendering and streaming other parts of the page
- When the suspended content resolves, streams the rendered HTML in a hidden element, followed by a small inline <script> that swaps it in for the fallback in place
// App.jsx
import { Suspense } from 'react'

export default function Page() {
  return (
    <Layout>
      <Hero /> {/* fast — ships in first flush */}
      <Suspense fallback={<Skeleton />}>
        <SlowDataSection /> {/* streams in when ready */}
      </Suspense>
    </Layout>
  )
}
The user sees the hero immediately, then the skeleton, then the real content when the data is available — all from a single server request with no client-side JavaScript required for the data fetch.
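The flush sequence for the example above can be sketched as plain strings. The boundary markers and the `$RC` swap call below mimic the shape of React 18's streamed output in simplified form; the `B:0`/`S:0` ids and the `<!--slot-->` placeholder are illustrative:

```typescript
// First flush: shell HTML with the Suspense fallback in place.
// The boundary is marked with comment nodes and a template placeholder.
function firstFlush(shell: string, fallback: string): string {
  return shell.replace(
    "<!--slot-->",
    `<!--$?--><template id="B:0"></template>${fallback}<!--/$-->`
  )
}

// Later flush: the resolved content, hidden, plus a tiny inline script
// that moves it into the boundary and removes the fallback.
function laterFlush(content: string): string {
  return (
    `<div hidden id="S:0">${content}</div>` +
    `<script>$RC("B:0","S:0")</script>`
  )
}
```

The browser renders the first flush immediately; when the second flush arrives, the inline script swaps the skeleton for the real markup without any client-side data fetching.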
Next.js App Router and Streaming
The Next.js App Router is built on streaming. By default, an async Server Component with no Suspense boundary above it blocks the initial flush of its route segment until its data resolves. Adding explicit Suspense boundaries gives you control over what shows while data loads:
// app/dashboard/page.tsx
import { Suspense } from 'react'
import { Stats } from './Stats'
import { RecentOrders } from './RecentOrders'
import { StatsSkeleton, OrdersSkeleton } from './skeletons' // illustrative path

export default function Dashboard() {
  return (
    <div>
      <h1>Dashboard</h1>
      <Suspense fallback={<StatsSkeleton />}>
        <Stats />
      </Suspense>
      <Suspense fallback={<OrdersSkeleton />}>
        <RecentOrders />
      </Suspense>
    </div>
  )
}
Stats and RecentOrders each fetch their own data. They can resolve and stream in independently — neither blocks the other.
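That independence can be modeled outside React: flush order follows resolution order, not the order the boundaries appear in the tree. A sketch with hypothetical resolve times:

```typescript
// Each boundary carries an assumed time at which its data resolves.
// The server flushes boundaries as they resolve, so flush order is by
// resolution time, not by position in the markup.
function flushOrder(
  boundaries: { name: string; resolvesAtMs: number }[]
): string[] {
  return [...boundaries]
    .sort((a, b) => a.resolvesAtMs - b.resolvesAtMs)
    .map((b) => b.name)
}
```

If RecentOrders resolves in 150ms and Stats takes 400ms, the orders section streams in first even though Stats appears first in the JSX.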
Streaming vs Static Generation
Streaming SSR is not always better than static generation. Static pages (generated at build time) have near-zero time to first byte since they are served directly from a CDN. Streaming SSR is the right tool when:
- Content is personalized per user (can't be cached globally)
- Data is highly dynamic and must be fresh per request
- Some parts of the page are slow but others are fast and should show immediately
If your page can be statically generated, that is still the fastest option.
The Loading UI Convention in Next.js
The App Router's loading.tsx file creates an automatic Suspense boundary around the page segment:
app/
dashboard/
loading.tsx ← shown while page.tsx is loading
page.tsx
This is syntactic sugar for wrapping the page in <Suspense fallback={<Loading />}>, applied at the routing level, which enables instant loading states during client-side navigation as well as on the initial visit.
Caveats
Streaming requires HTTP/1.1 chunked transfer encoding or HTTP/2 streams, both of which browsers support universally. In practice, intermediaries that buffer the full response (reverse proxies, CDNs, compression middleware) can silently defeat streaming, so the delivery path needs checking too. The complexity also shifts to understanding Suspense boundaries: placing them at the wrong granularity (too coarse or too fine) can degrade the perceived experience. A skeleton that flashes briefly before the real content appears can feel worse than a slightly longer wait for the full content.
Conclusion
Streaming SSR is a fundamental shift in how servers deliver pages: from batch-and-send to stream-as-ready. React's Suspense model is the primitive that makes streaming composable — each boundary is a natural breakpoint where slow data can be deferred without blocking the rest of the page. For dynamic, data-heavy applications, streaming is the key to making server rendering feel as fast as client rendering.
