In-depth Babel Principles Series (I): Introduction to Babel Workflow and Project Structure

Posted on 2021-06-22. Edited on 2025-07-02. In Babel.

I recently came across some material about Babel and became interested in it, so I plan to study how Babel works and summarize what I learn. The topic is too large for a single post, so I will split it across several.

Workflow

It is worth pointing out that the structure in the figure, where the Traverser calls multiple Transformers, is a microkernel.
That is, Babel's core code is really only the left column of the figure. The Parser has built-in support for many syntaxes, for example JSX, TypeScript, Flow, and the latest ECMAScript specification. Currently, for the sake of efficiency, the parser does not support extensions.
All other features are implemented as plugins, although Babel ships some built-in transformers for common tasks, such as converting ES2015+ code.

Parsing (Tokenizer + Parser)

At this point the source code is just a string. The first step of the analysis is to convert that string into an AST (abstract syntax tree), which all later steps operate on.
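Before the parser can build a tree, the tokenizer splits the source string into tokens. The following is a minimal sketch of that idea, not Babel's actual tokenizer. The tokenize function and its token shape are hypothetical, and it only handles the tiny subset of syntax needed for the example statement used in this post.

```javascript
// Minimal tokenizer sketch (hypothetical; not Babel's real tokenizer).
// Handles only identifiers/keywords, single-quoted strings, and "=" / ";".
const KEYWORDS = new Set(["const", "let", "var"]);

function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) {
      i++; // skip whitespace
      continue;
    }
    if (/[A-Za-z_$]/.test(ch)) {
      // identifier or keyword
      const start = i;
      while (i < source.length && /[A-Za-z0-9_$]/.test(source[i])) i++;
      const value = source.slice(start, i);
      const type = KEYWORDS.has(value) ? "Keyword" : "Identifier";
      tokens.push({ type, value, start, end: i });
      continue;
    }
    if (ch === "'") {
      // single-quoted string; escape sequences are not handled in this sketch
      const start = i;
      i++;
      while (i < source.length && source[i] !== "'") i++;
      if (i >= source.length) throw new SyntaxError("Unterminated string");
      i++; // consume the closing quote
      tokens.push({ type: "String", value: source.slice(start, i), start, end: i });
      continue;
    }
    if (ch === "=" || ch === ";") {
      tokens.push({ type: "Punctuator", value: ch, start: i, end: i + 1 });
      i++;
      continue;
    }
    throw new SyntaxError(`Unexpected character ${ch} at ${i}`);
  }
  return tokens;
}

const tokens = tokenize("const text = 'Hello World';");
console.log(tokens.map((t) => `${t.type}(${t.value})`).join(" "));
```

Each token records start and end offsets, which is where the start and end fields on AST nodes ultimately come from.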

There is an online AST converter where we can experiment: we write code and it translates the code into an AST.

If I write nothing, the AST still has a root node:

// AST
{
  "type": "Program",
  "start": 0,
  "end": 0,
  "body": [],
  "sourceType": "module"
} // It can be seen as an object with some fields, the root node of this code tree.

Then I write the statement const text = 'Hello World'; and the AST becomes:

{
  "type": "Program",
  "start": 0,
  "end": 27,
  "body": [
    {
      "type": "VariableDeclaration",
      "start": 0,
      "end": 27,
      "declarations": [
        {
          "type": "VariableDeclarator",
          "start": 6,
          "end": 26,
          "id": {
            "type": "Identifier",
            "start": 6,
            "end": 10,
            "name": "text"
          },
          "init": {
            "type": "Literal",
            "start": 13,
            "end": 26,
            "value": "Hello World",
            "raw": "'Hello World'"
          }
        }
      ],
      "kind": "const"
    }
  ],
  "sourceType": "module"
}

From this output we can take a brief look at the structure of AST nodes.

Each node has type, start, and end. type indicates the kind of node. For example, the root node has type Program and represents all of the code, so its start is 0 and its end is the last position in the source.

Different node types also define their own properties. For example, a VariableDeclaration node has a kind property, which indicates whether the variable is declared with const, var, or let, and a declarations property, which holds the actual declarations. declarations is an array, because one VariableDeclaration can declare more than one variable. Each element is a VariableDeclarator, whose init property indicates the value the variable is initialized with.

To summarize the characteristics of an AST:

Nodes are typed. In the trees we meet when first learning data structures, the nodes are as simple as possible; here they are more complex because each one has a type.
Parents and children are linked through node properties. In a textbook binary tree, the children are always the left child and the right child. In an AST, different node types have different properties: the children of a Program node are in its body property, and the children of a VariableDeclaration node are in its declarations property (its kind property holds a plain string rather than a node). In other words, the properties of a node are treated as its children, the children have types of their own, and together they form a tree.
A parent node is the combination of all its child nodes. The const text = 'Hello World' represented by the VariableDeclaration is split into a VariableDeclarator, which is split again into the Identifier text and the Literal 'Hello World'.
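The second point, that children are found through node properties, can be sketched in a few lines. This is only an illustration under one stated assumption: any object with a string type field counts as a node. The names sampleAst, isAstNode, childrenOf, and printTree are made up for this example.

```javascript
// Sketch: find a node's children by inspecting its properties.
// Assumption: any object with a string `type` field is treated as a node.
const sampleAst = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "text" },
          init: { type: "Literal", value: "Hello World" },
        },
      ],
    },
  ],
};

const isAstNode = (value) =>
  value !== null && typeof value === "object" && typeof value.type === "string";

// Collect child nodes from every property, whether it holds one node or an array.
function childrenOf(node) {
  const children = [];
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) children.push(...value.filter(isAstNode));
    else if (isAstNode(value)) children.push(value);
  }
  return children;
}

// Build one line per node, indented by its depth in the tree.
function printTree(node, depth = 0, lines = []) {
  lines.push(`${"  ".repeat(depth)}${node.type}`);
  for (const child of childrenOf(node)) printTree(child, depth + 1, lines);
  return lines;
}

console.log(printTree(sampleAst).join("\n"));
```

Scanning every property keeps the sketch short. As far as I know, Babel itself keeps an explicit list of traversable keys for each node type instead, which avoids mistaking ordinary data for child nodes.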

I hope the analysis above gives you an intuitive understanding of the AST: it is a tree whose nodes are typed.

It is then necessary to understand the type system of the nodes; see Babel's AST type system specification. The type system is an abstraction over the various elements of code: identifiers, literals, declarations, and expressions. A tree of nodes of these types is therefore enough to express our code. Note that the sample output above uses the ESTree-style Literal type; Babel's own AST represents the same string as a StringLiteral.

Traverser (babel-traverse)

Step 2: Transform. Now that we have the AST, it is time to manipulate it. In Babel, babel-traverse is used for this.

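// Installation (babylon is the parser used in the experiment below)
// Note: in Babel 7 and later these packages are published as @babel/parser and @babel/traverse
npm install --save babylon babel-traverse

// Experimental code
import * as babylon from "babylon";
import traverse from "babel-traverse";

const code = `const text = 'Hello World';`;
// Parse the source string into an AST
const ast = babylon.parse(code);

// Walk the AST; enter() is called for each node as the traversal reaches it
traverse(ast, {
  enter(path) {
    console.log('path', path);
  }
});

console.log('ast', ast);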

The babel-traverse library exposes the traverse method. Its first parameter is the AST and its second parameter is a visitor object. Here we wrote an enter method whose parameter is a path. Why a path and not a node?

In fact, the path contains the node as its node property, and it also carries many other properties used for analysis, such as scope, which is used for scope analysis.
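To make the difference between a path and a node more concrete, here is a minimal sketch of a traverser that wraps each node in a small path object before calling enter. The path shape used here ({ node, parentPath, key }) is a simplification assumed for illustration; the real path object in babel-traverse carries far more, and simpleTraverse is a hypothetical name.

```javascript
// Sketch of a traverser that passes a "path" to enter(), not a bare node.
// Assumption: this path shape ({ node, parentPath, key }) is a simplification;
// Babel's real path object carries much more, such as scope information.
const declaration = {
  type: "VariableDeclaration",
  kind: "const",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "text" },
      init: { type: "Literal", value: "Hello World" },
    },
  ],
};

const looksLikeNode = (v) =>
  v !== null && typeof v === "object" && typeof v.type === "string";

function simpleTraverse(node, visitor, parentPath = null, key = null) {
  // The path records where the node sits in the tree, not just the node itself.
  const path = { node, parentPath, key };
  if (visitor.enter) visitor.enter(path);
  for (const [prop, value] of Object.entries(node)) {
    for (const child of Array.isArray(value) ? value : [value]) {
      if (looksLikeNode(child)) simpleTraverse(child, visitor, path, prop);
    }
  }
}

const visited = [];
simpleTraverse(declaration, {
  enter(path) {
    const where = path.parentPath
      ? `${path.parentPath.node.type}.${path.key}`
      : "(root)";
    visited.push(`${path.node.type} at ${where}`);
  },
});
console.log(visited.join("\n"));
```

Because the path links back to its parent, a visitor can answer questions such as "which property of which parent am I in?", which a bare node cannot.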

Generator (babel-generator)

Step 3: Generate. In Babel, babel-generator is used for this.

npm install --save babel-types babel-generator

// Add babel-generator
import * as babylon from "babylon";
import traverse from "babel-traverse";
import * as t from "babel-types";
import generate from "babel-generator";

const code = `const text = 'Hello World';`;
const ast = babylon.parse(code);

traverse(ast, {
  enter(path) {
    // Rename the identifier `text` to `alteredText`
    if (t.isIdentifier(path.node, { name: "text" })) {
      path.node.name = 'alteredText';
    }
  }
});

// generate() returns an object; the generated source is in its `code` property
const genCode = generate(ast, {}, code);

console.log('genCode', genCode.code);
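To show what generation amounts to, here is a minimal sketch of a generator that turns an already-transformed AST back into source text. It assumes an ESTree-like AST and supports only the node types that appear in this post. It is not how babel-generator is implemented, and generateCode is a hypothetical name.

```javascript
// Sketch of a code generator for the few node types used in this post.
// Assumption: an ESTree-like AST; this is not babel-generator's implementation.
function generateCode(node) {
  switch (node.type) {
    case "Program":
      return node.body.map(generateCode).join("\n");
    case "VariableDeclaration":
      return `${node.kind} ${node.declarations.map(generateCode).join(", ")};`;
    case "VariableDeclarator":
      return node.init
        ? `${generateCode(node.id)} = ${generateCode(node.init)}`
        : generateCode(node.id);
    case "Identifier":
      return node.name;
    case "Literal":
      // Prefer the original source text when it is available.
      return node.raw !== undefined ? node.raw : JSON.stringify(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// The AST from earlier, after the identifier was renamed to alteredText.
const renamedAst = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "alteredText" },
          init: { type: "Literal", value: "Hello World", raw: "'Hello World'" },
        },
      ],
    },
  ],
};

console.log(generateCode(renamedAst)); // const alteredText = 'Hello World';
```

Generation is a depth-first walk in which each node type knows how to print itself from the printed form of its children.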

Microkernels and Plugins

We have now walked through Babel's workflow and seen that its core functionality is small. It comes down to four steps: splitting the code into tokens, building the token sequence into an AST, performing operations on the AST, and finally converting the processed AST into new code.

The core is small, yet it has to support complex functionality. To make that possible, the third step, processing the AST, provides a plugin mechanism (implemented through the visitor pattern). This architectural approach is called a microkernel.

A detailed explanation can be found in this blog: https://bobi.ink/2019/10/01/babel/#访问者模式
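As a rough illustration of the microkernel idea, the sketch below has a core that only walks the tree and dispatches to visitor methods supplied by plugins. The plugin shape ({ name, visitor }) and the walk function are hypothetical simplifications, not Babel's actual plugin API.

```javascript
// Sketch of a microkernel: the core only walks the tree and dispatches
// to visitor methods keyed by node type. All behavior lives in plugins.
// Assumption: the plugin shape ({ name, visitor }) is hypothetical.
const nodeLike = (v) =>
  v !== null && typeof v === "object" && typeof v.type === "string";

function walk(node, plugins) {
  // Give every plugin a chance to handle this node type.
  for (const plugin of plugins) {
    const handler = plugin.visitor[node.type];
    if (handler) handler(node);
  }
  for (const value of Object.values(node)) {
    for (const child of Array.isArray(value) ? value : [value]) {
      if (nodeLike(child)) walk(child, plugins);
    }
  }
}

// Plugin 1: rename the identifier `text`.
const renamePlugin = {
  name: "rename-text",
  visitor: {
    Identifier(node) {
      if (node.name === "text") node.name = "alteredText";
    },
  },
};

// Plugin 2: turn `const` declarations into `var`.
const constToVarPlugin = {
  name: "const-to-var",
  visitor: {
    VariableDeclaration(node) {
      if (node.kind === "const") node.kind = "var";
    },
  },
};

const tree = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "text" },
          init: { type: "Literal", value: "Hello World" },
        },
      ],
    },
  ],
};

walk(tree, [renamePlugin, constToVarPlugin]);
console.log(tree.body[0].kind, tree.body[0].declarations[0].id.name); // var alteredText
```

The core never needs to know what a plugin does. Adding a new transformation means adding another visitor object, and the tree is still walked only once.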

Reference links:

https://juejin.cn/post/6844903905961181191

https://www.babeljs.cn/docs/