Principles of frontend performance optimization from the perspective of the browser rendering pipeline

Preface

In the article "What happens after entering a URL in the browser", we saw that once the browser receives HTML data from the Network Process, it hands that data to the Render Process for rendering. From a frontend engineering perspective, performance optimization largely means tuning our code so that the Render Process can render the page faster. That is why understanding how the Render Process renders a page matters so much for frontend performance optimization!

Browser Rendering Pipeline

The browser rendering pipeline can basically be divided into the following steps:

Build the DOM tree
Compute styles
Layout
Layering
Paint
Compositing (tiling, rasterization, compositing)

Because this is a pipeline, the output of each step is the input of the next. If we focus on what each step takes in and what it hands to the next step, the whole rendering pipeline becomes easy to follow:
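
To make the "output of one step is the input of the next" idea concrete, here is a tiny sketch. The names Stage, chain, countTags, and summarize are all made up for illustration, and the two toy stages have nothing to do with what the browser really computes; the point is only that two stages fit together when their types line up.

```typescript
// A stage is just a function from its input to its output.
type Stage<In, Out> = (input: In) => Out;

// Joins two stages. The compiler only accepts the pair when the first
// stage's output type matches the second stage's input type.
function chain<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input: A) => second(first(input));
}

// Hypothetical toy stages standing in for two neighbouring pipeline steps.
const countTags: Stage<string, number> = (html) =>
  (html.match(/<[a-z]/gi) || []).length; // counts opening tags only
const summarize: Stage<number, string> = (count) => `${count} element(s)`;

const toyPipeline = chain(countTags, summarize);
```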

Building the DOM tree

The input is the HTML file. The HTML parser module inside the Render Process parses it into a tree-structured DOM Tree, which is then passed to the next stage: computing styles.
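
As a rough sketch of "HTML text in, tree out", the function below builds a tree with a stack. It is nowhere near a real HTML parser: it assumes well-formed markup, ignores attributes, comments, and void elements, and the names DomNode and buildDomTree are invented for this example.

```typescript
interface DomNode {
  tag: string;          // element name, or "#text" / "#document"
  text?: string;        // only set on text nodes
  children: DomNode[];
}

function buildDomTree(html: string): DomNode {
  const root: DomNode = { tag: "#document", children: [] };
  const stack: DomNode[] = [root]; // the currently open elements
  // Matches a closing tag, an attribute-free opening tag, or a run of text.
  const tokenPattern = /<\/([a-zA-Z][a-zA-Z0-9]*)>|<([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(html)) !== null) {
    const closeTag = match[1];
    const openTag = match[2];
    const text = match[3];
    const parent = stack[stack.length - 1];
    if (openTag) {
      // Opening tag: attach a new node and descend into it.
      const node: DomNode = { tag: openTag.toLowerCase(), children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (closeTag) {
      // Closing tag: go back up one level (never pop the document itself).
      if (stack.length > 1) stack.pop();
    } else if (text && text.trim() !== "") {
      parent.children.push({ tag: "#text", text: text.trim(), children: [] });
    }
  }
  return root;
}
```
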
Computing styles

The goal of this stage is to compute the style of every DOM node. When the Render Process receives CSS (from a <style> element, a <link rel="stylesheet" href=""> reference, or an inline style attribute), it first converts the CSS text into styleSheets, a structure the browser can work with. It then computes the style of each DOM node (resolving inheritance and the cascade, normalizing values such as em to px, and so on) and stores the result in a ComputedStyle structure, which is passed to the next stage: layout.
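
The inheritance and value-normalization part can be sketched roughly as follows. This assumes selector matching has already happened and each node carries its own declarations in ownStyle; that field, the two-property INHERITED list, and the root defaults are simplifications made up for this example, not how a browser stores things.

```typescript
type StyleMap = Record<string, string>;

interface StyledNode {
  tag: string;
  ownStyle: StyleMap;       // declarations that matched this node (assumed given)
  children: StyledNode[];
}

interface ComputedNode {
  tag: string;
  computed: StyleMap;
  children: ComputedNode[];
}

// Properties that inherit by default; the real list is much longer.
const INHERITED = ["color", "font-size"];
// Assumed defaults for the root element.
const ROOT_DEFAULTS: StyleMap = { color: "black", "font-size": "16px" };

// Converts an em length to px relative to the parent's font size.
function toPx(value: string, parentFontSizePx: number): string {
  const em = /^(\d*\.?\d+)em$/.exec(value);
  return em ? `${parseFloat(em[1]) * parentFontSizePx}px` : value;
}

function computeStyles(node: StyledNode, parentComputed: StyleMap = ROOT_DEFAULTS): ComputedNode {
  const computed: StyleMap = {};
  // 1. Start from the inherited values of the parent.
  for (const prop of INHERITED) computed[prop] = parentComputed[prop];
  // 2. Declarations on the node itself override inherited values.
  for (const prop of Object.keys(node.ownStyle)) computed[prop] = node.ownStyle[prop];
  // 3. Normalize relative units into absolute ones.
  computed["font-size"] = toPx(computed["font-size"], parseFloat(parentComputed["font-size"]));
  return {
    tag: node.tag,
    computed,
    children: node.children.map((child) => computeStyles(child, computed)),
  };
}
```
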
Layout

Layout combines the DOM Tree with ComputedStyle, leaves out elements that are never rendered (such as <head> and anything with display:none), and computes the position and size of each remaining node on the page. The result is a Layout Tree, which is passed to the layering stage.
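
The "leave out what is never rendered" step might look something like this. RenderNode and LayoutNode are hypothetical shapes, and the actual geometry calculation (positions and sizes) is deliberately left out.

```typescript
interface RenderNode {
  tag: string;
  computed: Record<string, string>; // computed style of the node
  children: RenderNode[];
}

interface LayoutNode {
  tag: string;
  children: LayoutNode[];
}

// Returns the layout subtree for a node, or null when the node (and
// therefore everything below it) does not take part in layout.
function buildLayoutTree(node: RenderNode): LayoutNode | null {
  if (node.tag === "head" || node.computed["display"] === "none") {
    return null;
  }
  const children: LayoutNode[] = [];
  for (const child of node.children) {
    const laidOut = buildLayoutTree(child);
    if (laidOut !== null) children.push(laidOut);
  }
  return { tag: node.tag, children };
}
```
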
Layering

Some pages use complex visual effects (such as opacity, z-index, and CSS 3D transforms), so the Render Process also generates dedicated layers for certain nodes. Together these form a Layer Tree, which is passed to the paint stage.
Paint

At this stage, the Render Process turns the Layer Tree into a list of paint instructions (roughly, things like "draw blue at coordinates (100, 30)") and submits that list to the compositor thread.
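
A paint list is easy to picture as a flat array of drawing commands. The sketch below assumes every layer is just a colored rectangle; PaintLayer, FillRect, and buildPaintList are invented names, and real paint records carry far more than a fill command.

```typescript
interface PaintLayer {
  x: number;
  y: number;
  width: number;
  height: number;
  color: string;
  children: PaintLayer[];
}

interface FillRect {
  op: "fillRect";
  x: number;
  y: number;
  width: number;
  height: number;
  color: string;
}

// Walks the layer tree depth-first, so a parent is always drawn before
// the children that sit on top of it.
function buildPaintList(layer: PaintLayer, list: FillRect[] = []): FillRect[] {
  list.push({
    op: "fillRect",
    x: layer.x,
    y: layer.y,
    width: layer.width,
    height: layer.height,
    color: layer.color,
  });
  for (const child of layer.children) buildPaintList(child, list);
  return list;
}
```
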
Compositor thread

The compositor thread splits each layer into tiles, so the browser does not have to render the whole page at once, and rasterizes the tiles into bitmaps, giving priority to tiles near the viewport. Rasterization is usually accelerated through the GPU Process. Once the tiles are rasterized, the compositor thread sends a DrawQuad command to the Browser Process.
Finally, the Browser Process composes the page from the received DrawQuad message and displays it on the screen.
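
The tiling and "viewport first" idea can be sketched as below. The tile size, the distance measure, and the names Rect, splitIntoTiles, distanceToViewport, and rasterOrder are all assumptions made for illustration; the browser's real prioritization is more involved.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Cuts a layer into tiles of at most tileSize x tileSize.
function splitIntoTiles(layerWidth: number, layerHeight: number, tileSize: number): Rect[] {
  const tiles: Rect[] = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return tiles;
}

// 0 when the tile touches the viewport, otherwise the gap between them.
function distanceToViewport(tile: Rect, viewport: Rect): number {
  const dx = Math.max(viewport.x - (tile.x + tile.width), tile.x - (viewport.x + viewport.width), 0);
  const dy = Math.max(viewport.y - (tile.y + tile.height), tile.y - (viewport.y + viewport.height), 0);
  return Math.sqrt(dx * dx + dy * dy);
}

// Tiles closest to the viewport come first; ties are broken top to bottom,
// then left to right.
function rasterOrder(tiles: Rect[], viewport: Rect): Rect[] {
  return tiles.slice().sort(
    (a, b) =>
      distanceToViewport(a, viewport) - distanceToViewport(b, viewport) ||
      a.y - b.y ||
      a.x - b.x
  );
}
```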

Performance optimization based on the rendering pipeline

With a rough picture of the rendering pipeline in mind, we can, like a master butcher carving up an ox, optimize each step that our code can influence and so improve overall performance. From the browser's perspective, performance optimization splits into loading-stage optimization and interaction-stage optimization:

Loading-stage optimization

Resources such as images and videos do not block the first render of a page, but JavaScript and CSS files do. When the HTML parser encounters a <script> tag while building the DOM tree, it pauses parsing until that script has been downloaded and executed. CSS blocks rendering because the Layout Tree cannot be built without computed styles. (In addition, because a script may read or modify styles, it cannot run until the stylesheets that precede it have loaded and the CSSOM has been built.)

JavaScript file optimization:
- Use a CDN (over HTTP/1.1 the browser opens only a limited number of TCP connections per domain, six in Chrome; serving static files from a separate CDN domain gives them their own connection pool)
- Reduce JS file size through minification and compression (for example, with webpack plugins)
- If a JS file does not manipulate the DOM, load it asynchronously with async or defer. The difference: an async script executes as soon as it finishes downloading, while a defer script waits until the document has been parsed and runs just before DOMContentLoaded fires.
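
The difference between async and defer is easier to see in a small simulation. This is a deliberately simplified model, not browser code: it assumes downloads never delay parsing and that running a script takes no time, and ScriptInfo, loadedAt, and executionOrder are names made up for the example.

```typescript
type ScriptMode = "async" | "defer";

interface ScriptInfo {
  src: string;
  mode: ScriptMode;
  loadedAt: number; // time (ms) at which the download completes
}

// Returns the script URLs in the order in which they would execute.
// `scripts` is given in document order; `parseDoneAt` is when HTML parsing ends.
function executionOrder(scripts: ScriptInfo[], parseDoneAt: number): string[] {
  const runs: { src: string; at: number; index: number }[] = [];
  let deferClock = parseDoneAt; // defer scripts never run before parsing ends
  scripts.forEach((script, index) => {
    if (script.mode === "async") {
      // async: runs as soon as it is downloaded, whatever the document order.
      runs.push({ src: script.src, at: script.loadedAt, index });
    } else {
      // defer: runs after parsing, in document order, so a slow script
      // holds back the defer scripts that follow it.
      deferClock = Math.max(deferClock, script.loadedAt);
      runs.push({ src: script.src, at: deferClock, index });
    }
  });
  runs.sort((a, b) => a.at - b.at || a.index - b.index);
  return runs.map((run) => run.src);
}
```

With parsing finishing at 100 ms, an async script that loads at 10 ms runs first even if it appears later in the document, while two defer scripts keep their document order regardless of which one downloaded first.
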
CSS file optimization:
- Split a large CSS file into smaller files by purpose, and load each file only in the situations that need it (for example, through a media query on the <link> tag).
- Use layering. When JS changes an element's style directly, for example to move, resize, or fade it, the change has to go through the rendering pipeline on the main thread as a reflow or repaint.
  - Reflow: a geometric property of the element (such as its size) changed, so the browser has to recompute styles and redo layout, layering, and every later stage. In other words, the pipeline re-runs from the compute styles stage onward.
  - Repaint: only a paint property such as color changed. Styles are still recomputed, but layout and layering are skipped, so it is cheaper than a reflow, though still costly.
  - If you declare will-change in CSS, however, you tell the Render Process in advance that the element is about to change. The Render Process then gives the element its own layer, and changes to compositor-friendly properties such as transform and opacity are handled on the compositor thread without triggering a reflow or repaint on the main thread.
```
.box {
  /* Only compositor-friendly properties are listed; a background-color change would still need a repaint. */
  will-change: transform, opacity;
}
```
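
On the JS side, the same idea might look like the sketch below: move the element with transform instead of top/left, and switch the will-change hint on only while the animation runs, since keeping it on permanently costs memory for the extra layer. ElementLike is a stand-in type so the snippet can run outside a browser (a real HTMLElement has the same style.transform and style.willChange properties); the three function names are made up.

```typescript
interface StyleLike {
  transform: string;
  willChange: string;
}

interface ElementLike {
  style: StyleLike;
}

// Call shortly before the animation starts.
function startAnimationHint(el: ElementLike): void {
  el.style.willChange = "transform";
}

// Moves the element without touching layout properties such as top/left.
function translateTo(el: ElementLike, x: number, y: number): void {
  el.style.transform = `translate(${x}px, ${y}px)`;
}

// Call once the animation is over to release the hint.
function endAnimationHint(el: ElementLike): void {
  el.style.willChange = "auto";
}
```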

Interaction-stage optimization

The interaction stage covers everything from the moment the page finishes loading through the user's interaction with it. The most important factor here is JavaScript.

Reduce the execution time of JavaScript
- Break a large JavaScript task into many small tasks. A single JS task that occupies the main thread for a long time blocks rendering and input handling, and the user experience suffers badly.
  - For example: lazy-load routes in an SPA router
- Use web workers to run JS that is time-consuming but does not touch the DOM.
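
Breaking one long task into small ones could be sketched like this. The helper processInChunks and its parameters are hypothetical; the scheduler is passed in so the snippet does not depend on any browser API.

```typescript
type ScheduleFn = (task: () => void) => void;

// Handles `items` a few at a time and hands control back to the caller's
// scheduler between chunks, so no single task runs for long.
function processInChunks<T>(
  items: T[],
  chunkSize: number,
  handle: (item: T) => void,
  schedule: ScheduleFn,
  onDone?: () => void
): void {
  const size = Math.max(1, Math.floor(chunkSize)); // guard against 0 or negative sizes
  let index = 0;

  function runChunk(): void {
    const end = Math.min(index + size, items.length);
    while (index < end) {
      handle(items[index]);
      index++;
    }
    if (index < items.length) {
      schedule(runChunk); // the rest runs in a later task
    } else if (onDone) {
      onDone();
    }
  }

  runChunk();
}
```

In a browser, the scheduler could be something like `(task) => setTimeout(task, 0)`, which lets rendering and input handling run between chunks.
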
Avoid forced synchronous layout and layout thrashing
- Forced synchronous layout happens when JS changes the DOM or its styles and then, within the same task, reads a layout-dependent property. The browser has to run layout immediately to return an up-to-date value. For example, function foo appends a child node to .testDom and then immediately reads the offsetHeight of .testDom.
- Layout thrashing occurs when a single function forces synchronous layout over and over, typically by alternating writes and reads inside a loop.
- The Virtual DOM, in a sense, also addresses the problem of JS touching the DOM many times. Changes are first made to the Virtual DOM in JS, and then applied to the real DOM in one batch.
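
The usual fix for layout thrashing is to do all reads first and all writes afterwards. The sketch below imitates the browser with a small fake: createBox returns an object whose offsetHeight getter counts a "forced layout" whenever it is read after a write. LayoutStats, Box, and both doubleHeights functions are invented for this illustration; with real elements the write would be something like setting el.style.height.

```typescript
interface LayoutStats {
  layouts: number; // how many times a read forced a layout
  dirty: boolean;  // true when a write has invalidated the current layout
}

interface Box {
  readonly offsetHeight: number;
  setHeight(px: number): void;
}

// Fake element: reading offsetHeight while the layout is dirty forces a layout.
function createBox(stats: LayoutStats, initialHeight: number): Box {
  let height = initialHeight;
  return {
    get offsetHeight(): number {
      if (stats.dirty) {
        stats.layouts++;
        stats.dirty = false;
      }
      return height;
    },
    setHeight(px: number): void {
      height = px;
      stats.dirty = true;
    },
  };
}

// Read, write, read, write...: every read after a write forces a layout.
function doubleHeightsInterleaved(boxes: Box[]): void {
  for (const box of boxes) {
    box.setHeight(box.offsetHeight * 2);
  }
}

// All reads first, then all writes: no read ever sees a dirty layout.
function doubleHeightsBatched(boxes: Box[]): void {
  const heights = boxes.map((box) => box.offsetHeight);
  boxes.forEach((box, i) => box.setHeight(heights[i] * 2));
}
```

With three boxes, the interleaved version forces two layouts in this model, while the batched version forces none and leaves a single layout for the browser to do at the next frame.
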
Avoid frequent garbage collection
- For example, if a variable or object can be shared across iterations of a for loop, declare it outside the loop rather than inside, so that a new object does not have to be allocated and collected on every iteration.
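
A minimal sketch of the loop advice, with made-up function names. Both functions compute the same result; the second reuses one scratch object instead of allocating a new one per iteration. Modern engines can sometimes optimize short-lived objects away on their own, so treat this as something to measure rather than a guaranteed win.

```typescript
interface Point {
  x: number;
  y: number;
}

// Allocates a fresh Point on every iteration; each one becomes garbage
// as soon as the iteration ends.
function sumDistancesAllocating(coords: number[][]): number {
  let total = 0;
  for (let i = 0; i < coords.length; i++) {
    const point: Point = { x: coords[i][0], y: coords[i][1] };
    total += Math.sqrt(point.x * point.x + point.y * point.y);
  }
  return total;
}

// Declares the Point once outside the loop and overwrites its fields.
function sumDistancesReusing(coords: number[][]): number {
  const point: Point = { x: 0, y: 0 };
  let total = 0;
  for (let i = 0; i < coords.length; i++) {
    point.x = coords[i][0];
    point.y = coords[i][1];
    total += Math.sqrt(point.x * point.x + point.y * point.y);
  }
  return total;
}
```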

ChangeLog

20221026 - init
20260501 - translated by claude code

Ref

Geek Time: 《Browser Principles and Practice》

[Frontend Note] Frontend performance optimization from the browser's perspective

Analyzing frontend performance optimization through the browser rendering principles