How CodeSandbox runs npm modules in the browser

More and more web IDEs now run in the browser. By feature set, they fall into two types. The first moves the functionality of a local IDE to the web almost unchanged. The most popular front-end IDE, VS Code, is an example: with cloud and container support, VS Code in the browser offers almost exactly the same features as the local version. The second type focuses on page development with real-time code parsing, compiling, and previewing, and its packaging and build step is not limited to a server-side implementation (for example, one based on Docker containers). Some of these products implement compiling, packaging, building, and running entirely in browser-side code, capabilities that previously required our usual setup of a local IDE, a local Node build, a local server, and a browser preview. Representative products include CodeSandbox, CodePen, StackBlitz, and JSFiddle.

In other words, the first type only moves code editing to the web. Code storage, compilation, packaging, and execution all happen in the cloud. The end result is no different from local development, except that we don't have to download an editor.

The second type performs part of the compiling and packaging, as well as the final run, in the browser. Because of browser limitations, the size of the application it can support is limited. (PS: I recently ran into an error that appears to say the code size exceeds 500K. I'll leave that as an open issue to solve later.)

Moving something like a local webpack build into the browser seems almost unbelievable. As mentioned above, there are two common ways to do it. One is to run the webpack build on the server and send the built code to the browser to be parsed and executed; for a related write-up, see the article "Building a front-end online compiler based on webpack". The other is for the server to provide only the code of the dependency packages (installed from npm) and return it to the client, with the build itself done entirely in the browser, in effect implementing webpack in the browser. CodeSandbox follows this pattern. Today we will look at how the author of CodeSandbox describes the way all of this was achieved.

Note that the implementation of CodeSandbox has gone through several iterations, but what changed between them is only how the server provides dependencies. In each case the dependencies are loaded on the server and then returned to the client for use.

Content organization

Much of what follows is translated from the CodeSandbox author's original article, and some parts are hard to follow, so I will first outline the whole process:

  • In the first version, dependencies had to be downloaded locally in advance. The required dependencies were analyzed dynamically at runtime, and each `require` was stubbed to resolve to the locally downloaded copy. This could not support all dependencies, and the recursive analysis was a performance bottleneck.
  • The second version borrowed the idea of webpack's DllPlugin. The dependency list is sent to the backend, which looks up a cache entry by the hash of the dependencies. On a miss, it resolves the dependencies, installs them with yarn, bundles them into a DLL, and returns it to the caller. This version has two problems. First, a file that is not reachable through the declared dependency graph cannot be bundled. Second, the cache is keyed by the whole dependency combination, so the same package appearing in two different dependency sets is not reused.
  • To solve the first problem, the author configured the webpack bundler with additional entries of his own.
  • To solve the second problem, the author adopted serverless and split the dependencies apart. The server caches each dependency independently and simply returns the downloaded dependencies to the front end, which does the actual bundling. The front end can therefore bundle on demand. The backend cannot, because it never sees the user's code and so has no way to know what is needed.
  • To support offline use, the author then added another layer of caching on the front end.
  • That gives us the CodeSandbox we use today.

First version

In this version, CodeSandbox implemented its own algorithm that used a `require`-like loader to load dependencies one by one to the local side (I take "local" to mean the user's own browser). The author of CodeSandbox himself does not count this first version as full npm support.

In other words, this version did not load dependencies from the npm registry in real time based on what the code depended on. It downloaded the dependencies locally in advance and then stubbed `require` in the code, which is why the author says it did not support all npm dependencies.

It also seems that each `require` call triggered an analysis of what that module depended on, and then recursed level by level. For a project with complex dependencies, this recursion becomes a serious performance bottleneck.
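To make the stubbing idea concrete, here is a minimal sketch. It is not CodeSandbox's code: the `preloaded` table, the `stubRequire` name, and the two stand-in packages are all assumptions made for illustration.

```javascript
// Hypothetical table of dependencies that were downloaded ahead of time.
// In this sketch each entry is simply a stand-in for the module's exports.
const preloaded = {
  "left-pad": (s, n) => String(s).padStart(n, " "),
  "is-even": (n) => n % 2 === 0,
};

// Stubbed require: it only knows about the preloaded table, so any package
// outside of it fails. That is why this approach cannot cover all of npm.
function stubRequire(name) {
  if (!Object.prototype.hasOwnProperty.call(preloaded, name)) {
    throw new Error(`Dependency "${name}" was not preloaded`);
  }
  return preloaded[name];
}

const leftPad = stubRequire("left-pad");
const padded = leftPad("7", 3);
```

Anything missing from the table, for example `stubRequire("react")`, throws immediately, which mirrors the limitation described above.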

Webpack version

At the time of the first version, the author thought that full support for npm was impossible, until someone actually implemented it.

So the author started thinking about a general solution and began designing an algorithm. That algorithm was fairly complex and was never actually used, so I won't go into it here. If you are interested, see the reference links at the end of the article.

The author then looked at how webpack's DllPlugin works.

Simply put, DllPlugin bundles a project's dependencies into a DLL and exposes an interface to them. For details, see the official DllPlugin documentation.

Webpack's DllPlugin bundles dependencies and uses a manifest to record which dependencies are included in the generated JS bundle. The manifest looks like this:

{
  "name": "dll_bundle",
  "content": {
    "./node_modules/fbjs/lib/emptyFunction.js": 0,
    "./node_modules/fbjs/lib/invariant.js": 1,
    "./node_modules/fbjs/lib/warning.js": 2,
    "./node_modules/react": 3,
    "./node_modules/fbjs/lib/emptyObject.js": 4,
    "./node_modules/object-assign/index.js": 5,
    "./node_modules/prop-types/checkPropTypes.js": 6,
    "./node_modules/prop-types/lib/ReactPropTypesSecret.js": 7,
    "./node_modules/react/cjs/react.development.js": 8
  }
}

Each path maps to a module id. If I wanted to import React, I would just call `dll_bundle(3)` and I'd have React! This is exactly what we need.
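The lookup can be sketched as follows. The manifest is a trimmed copy of the one shown above; the `fakeModules` table and the `requireFromDll` helper are assumptions standing in for the real webpack-generated bundle, which is not reproduced here.

```javascript
// Trimmed copy of the manifest: module path -> module id.
const manifest = {
  name: "dll_bundle",
  content: {
    "./node_modules/react": 3,
    "./node_modules/object-assign/index.js": 5,
  },
};

// Stand-in for the function the DLL bundle exposes (module id -> exports).
// A real bundle is generated by webpack; this table is made up.
const fakeModules = {
  3: { name: "React (stand-in)" },
  5: { name: "object-assign (stand-in)" },
};
function dll_bundle(id) {
  if (!(id in fakeModules)) throw new Error(`Unknown module id ${id}`);
  return fakeModules[id];
}

// Resolve a path through the manifest, then ask the bundle for that id.
function requireFromDll(path) {
  const id = manifest.content[path];
  if (id === undefined) throw new Error(`${path} is not in the DLL manifest`);
  return dll_bundle(id);
}

const React = requireFromDll("./node_modules/react");
```

A path that is absent from the manifest cannot be loaded at all, which is the root of the limitation discussed later in this section.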

So the author got to work and, building on the idea of webpack's DllPlugin, came up with the following system:

For every packaging request, I create a new directory at `tmp/:hash`, run `yarn add ${dependencyList}`, and then let webpack do the bundling. As a caching solution, I also save the new bundle to gcloud. This looks much simpler than the earlier design, mostly because yarn now installs the dependencies and webpack does the bundling, replacing my previous implementation.
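The request flow can be sketched roughly like this. It is an illustration under assumptions, not the author's service: the install-and-bundle step is passed in as a function instead of actually running yarn and webpack, an in-memory `Map` stands in for gcloud storage, and the SHA-1 key format is invented.

```javascript
const crypto = require("crypto");

// Stable cache key: sort the names so {a, b} and {b, a} hash identically.
function hashDependencies(deps) {
  const canonical = Object.keys(deps)
    .sort()
    .map((name) => `${name}@${deps[name]}`)
    .join(",");
  return crypto.createHash("sha1").update(canonical).digest("hex");
}

// installAndBundle(deps, dir) stands in for "yarn add" followed by webpack.
function createBundleService(installAndBundle, cache = new Map()) {
  return function getBundle(deps) {
    const hash = hashDependencies(deps);
    if (!cache.has(hash)) {
      cache.set(hash, installAndBundle(deps, `tmp/${hash}`));
    }
    return cache.get(hash);
  };
}

let builds = 0;
const getBundle = createBundleService((deps, dir) => {
  builds += 1;
  return `bundle built in ${dir}`;
});

const first = getBundle({ react: "16.0.0", "react-dom": "16.0.0" });
const second = getBundle({ "react-dom": "16.0.0", react: "16.0.0" }); // cache hit
const third = getBundle({ react: "16.0.0" }); // different combination: rebuilt
```

The third call shows the weakness of keying the cache on the whole combination: `react` was already built as part of the first request, but it is installed and bundled again.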

This may be a little hard to follow, so here is my reading. In the first version, CodeSandbox had to download the dependencies locally in advance. When execution actually reached a `require`, it analyzed the dependencies recursively and stubbed the `require` to the locally downloaded copy.

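In this version, with the help of webpack, the dependencies are installed and bundled on the server, and the finished bundle is returned to the browser.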

However, this system still has a major limitation: it cannot load files that are not in webpack's dependency graph. For example, the following:

require('react-icons/lib/fa/fa-beer')

will not work, because the file is never required anywhere starting from a dependency's entry point, so it is not included in the bundle. (Webpack bundles the dependent modules listed in package.json and whatever each of them depends on. Files outside that graph are not bundled.)

Webpack with entries

To remove this limitation, that is, files outside webpack's dependency graph not making it into the final DLL, the author added entry configuration manually so that webpack also bundles those files. After a lot of tuning, the system could support any (? Translator's note: the author added a question mark here, indicating he was not sure it really supports any) combination of dependencies. So react-icons can be loaded, and so can CSS files.
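One way to picture the extra entries is the sketch below. It only builds a plain webpack-style `entry` object. How the author actually discovered which deep files to add is not described in the source, so the `extraFiles` argument, the `buildDllEntry` helper, and the version numbers are all assumptions.

```javascript
// Build the entry list for the DLL bundle: the packages themselves plus any
// deep files that are not reachable from a package's main entry point.
function buildDllEntry(dependencies, extraFiles) {
  const requests = [...Object.keys(dependencies), ...extraFiles];
  // A Set removes duplicates while keeping insertion order.
  return { dll_bundle: [...new Set(requests)] };
}

const config = {
  entry: buildDllEntry(
    { react: "16.0.0", "react-icons": "2.2.7" }, // placeholder versions
    ["react-icons/lib/fa/fa-beer"]
  ),
};
```

With the deep file listed as an entry, it becomes part of the dependency graph and therefore ends up in the bundle and its manifest.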

Moving to serverless

What is Serverless?

With serverless, you define a function that is executed when a request arrives: the function is started, handles the request, and shuts itself down after a while to release its resources. This also gives you very high scalability: if 1000 requests arrive at the same time, 1000 instances can be started immediately. And you only pay for the time the function actually runs.

How serverless fits in

Serverless sounds perfect for our service: it does not run all the time, and when many requests arrive at once we need high concurrency. So I eagerly started using a framework called Serverless.

Thanks to Serverless, migrating our service went very smoothly, and I had a working version within two days. I created three serverless functions:

  • A metadata resolver: resolves versions and peerDependencies, and calls the packager function;
  • A packager: installs and bundles the actual dependencies;
  • An uglifier (compression and obfuscation): asynchronously uglifies the bundles produced by the packager.

A few days later I ran into a limitation: a lambda function has at most 500 MB of disk space, which means some combinations of dependencies cannot be installed. (Translator's note: when building, the backend has to bundle all the dependencies, and the code is loaded into memory.) This was a devastating limitation, and I had to switch the service back to the original implementation.

A few months later, I released a new bundler for CodeSandbox. This bundler is very powerful and makes it easy to support more frameworks such as Preact or Vue. Supporting these frameworks brought some interesting requests. For example, to use React code in Preact, you need to alias `require('react')` to `require('preact-compat')`. For Vue, you might want `@/components/App.vue` to resolve to a file in your sandbox. Our server-side packager doesn't handle this kind of thing, but our browser-side bundler does.
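Aliasing of this kind can be sketched as a small resolver step. The `resolveAlias` helper and the alias table are illustrative assumptions; in particular, mapping `@` to `/src` follows the vue-cli convention and is not confirmed by the source.

```javascript
// Hypothetical alias table for the two cases mentioned above.
const aliases = {
  react: "preact-compat",
  "@": "/src", // assumed, following the vue-cli convention
};

// Rewrite a request if it matches an alias exactly or as a path prefix.
function resolveAlias(request, aliasTable) {
  for (const [from, to] of Object.entries(aliasTable)) {
    if (request === from) return to;
    if (request.startsWith(from + "/")) return to + request.slice(from.length);
  }
  return request; // no alias applies
}

const reactTarget = resolveAlias("react", aliases);
const vueTarget = resolveAlias("@/components/App.vue", aliases);
const untouched = resolveAlias("react-dom", aliases);
```

Here `react` resolves to `preact-compat`, the `@/` path resolves under `/src`, and `react-dom` is left alone because it only shares a name prefix with `react`, not a path prefix.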

That's when I started thinking that the browser-side bundler could do the actual bundling. If the server only sends the relevant files to the browser (without bundling on the server) and the browser-side bundler bundles the dependencies, it should be faster, because we process only part of each package rather than the whole thing.

The server-side build based on webpack's DllPlugin recursively traverses all dependencies from the entries and bundles everything, whereas the browser bundles only on demand. So it is faster for two reasons. First, the server no longer builds anything: it only collects the dependency files recursively and sends them to the browser, which saves both build time and server cost. Second, the browser-side build is on demand rather than a full build.

This solution has a very big advantage: **we can install and cache dependencies independently** (recall that the webpack version cached not individual dependencies but the whole dependency combination), and then merge them on the client. This means that if you request a new dependency on top of existing ones, we only need to collect files for the new dependency! It also neatly solves the 500 MB AWS Lambda limit, because the server installs only one dependency at a time. We can also drop webpack from the packager, because the packager is now only responsible for finding the relevant dependency files and sending them to the browser.
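A rough sketch of per-dependency caching plus client-side merging is shown below. The `createPackager` and `mergeDependencies` names, the `name@version` key, and the file-map shape are assumptions for illustration, and the file collection step is injected instead of really installing anything.

```javascript
// Server side (sketch): one cache entry per name@version, not per combination.
function createPackager(collectFiles) {
  const cache = new Map();
  return function fetchDependency(name, version) {
    const key = `${name}@${version}`;
    if (!cache.has(key)) cache.set(key, collectFiles(name, version));
    return cache.get(key);
  };
}

// Client side (sketch): merge the per-dependency file maps into one.
function mergeDependencies(fileMaps) {
  return Object.assign({}, ...fileMaps);
}

let installs = 0;
const fetchDependency = createPackager((name) => {
  installs += 1;
  return { [`/node_modules/${name}/index.js`]: `// contents of ${name}` };
});

const sandboxA = mergeDependencies([
  fetchDependency("react", "16.0.0"),
  fetchDependency("react-dom", "16.0.0"),
]);
const sandboxB = mergeDependencies([
  fetchDependency("react", "16.0.0"), // reused from sandboxA
  fetchDependency("preact", "8.2.0"),
]);
```

Two sandboxes with overlapping dependencies trigger three installs rather than four, because `react` is collected only once.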

Adding a browser cache

The author says he did not simply request files dynamically from unpkg.com because he wanted to support offline use: even without a network connection, you can compile, bundle, build, and preview in the browser, provided the relevant files have already been cached there. With the author's per-dependency packaging on the server, all the files of a dependency are cached in the browser, whereas requesting from unpkg.com fetches one file of a dependency at a time, which makes it easy to end up with a missing dependency file.

In other words, each time a dependency is requested, the local cache is checked first, and the backend is contacted only on a miss.
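The cache-first lookup might look like the sketch below. The names are hypothetical, and the server request is a synchronous injected function purely to keep the example short; a real implementation would be asynchronous and would persist the cache in browser storage.

```javascript
function createDependencyLoader(requestFromServer) {
  const localCache = new Map();
  return function load(key, { online = true } = {}) {
    // 1. Always try the local cache first.
    if (localCache.has(key)) return localCache.get(key);
    // 2. On a miss, the network is required.
    if (!online) {
      throw new Error(`${key} is not cached and the network is unavailable`);
    }
    const files = requestFromServer(key);
    localCache.set(key, files);
    return files;
  };
}

let serverRequests = 0;
const load = createDependencyLoader((key) => {
  serverRequests += 1;
  return { [`${key}/index.js`]: "// file contents" };
});

load("react@16.0.0"); // miss: asks the server once
const offlineCopy = load("react@16.0.0", { online: false }); // hit: works offline
```

Once a dependency has been fetched, later loads succeed without a network, while a dependency that was never cached fails when offline.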

Final version

**Packaging and running in CodeSandbox does not depend on the server. The server is only contacted when a required dependency is not cached on the client.** CodeSandbox consists of three parts:

  • **Editor**: the editor. It is mainly used to modify files. CodeSandbox integrates VS Code here, and notifies the Sandbox to transpile after files change.
  • **Sandbox**: the code runner. **The Sandbox runs in a separate iframe and is responsible for transpiling (Transpiler) and running (Evaluation) the code.** In the UI, the Editor is on the left and the Sandbox is on the right.
  • **Packager**: the package manager. Similar to yarn and npm, it is responsible for fetching and caching npm dependencies.

Ives van Hoorne, the author of CodeSandbox, also tried porting webpack to run in the browser, because almost all CLIs today are built on webpack. If webpack could be ported to the browser, CodeSandbox could take advantage of webpack's powerful ecosystem and transform mechanisms (loaders and plugins) and be compatible with the various CLIs at low cost.

However, webpack is too heavy 😱. Its compressed size of 3.5 MB is barely acceptable, and the bigger problem is that it requires simulating the Node runtime environment in the browser, which costs more than it gains.

So CodeSandbox decided to build its own bundler, one that is lighter and optimized for the CodeSandbox platform. For example, CodeSandbox only cares about builds for the development environment, where the goal is simply to get the code running. Compared with webpack, the following features were cut:

  • Production mode. CodeSandbox only considers development mode and does not need production features such as:
    • Code minification and optimization
    • Tree-shaking
    • Performance optimization
    • Code splitting
  • File output. There is no need to emit chunks.
  • Server communication. The Sandbox transpiles and runs code in place, whereas webpack needs a long-lived connection to the development server to receive instructions, for example for HMR.
  • Static file processing (such as images). These files need to be uploaded to CodeSandbox's server.
  • The plugin mechanism, and more.

So CodeSandbox's bundler can be seen as a simplified webpack that is optimized for the browser environment, for example by using workers to transpile in parallel.

Project build process

packager -> transpilation -> evaluation

Building the Sandbox is divided into three phases:

  • **Packager**: the package loading phase, which downloads and processes all npm module dependencies
  • **Transpilation**: the transpilation phase, which transpiles all changed code and builds the module dependency graph
  • **Evaluation**: the execution phase, which runs the module code with `eval` for the preview

Packager

Since CodeSandbox itself covers the build step, `devDependencies` are not needed. That is, **in CodeSandbox we only need to install the dependencies the code actually needs at runtime, which can eliminate hundreds of dependency downloads. So for the time being there is no need to worry about overwhelming the browser.**

Before the Packager downloads dependencies, the code actually goes through the Transpilation phase so that dependencies are analyzed on demand, and the result of that analysis is handed to the Packager.

Transpilation

This stage starts from the application's entry file, transpiles the source code, parses the AST to find the modules it depends on, and then transpiles those recursively, finally forming a dependency graph.
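The recursive walk can be sketched as follows. To stay short, this sketch swaps AST parsing for a regular expression that finds `require("...")` calls and skips the transpile step entirely; the in-memory `files` table, the root-only path resolution, and all helper names are assumptions.

```javascript
// Hypothetical in-memory project, all files at the root.
const files = {
  "/index.js": 'const app = require("./app"); require("./styles.css");',
  "/app.js": 'const util = require("./util"); module.exports = util;',
  "/util.js": "module.exports = 42;",
  "/styles.css": "body { margin: 0 }",
};

// Stand-in for AST analysis: collect the argument of every require() call.
function findRequires(code) {
  const pattern = /require\(\s*["']([^"']+)["']\s*\)/g;
  return Array.from(code.matchAll(pattern), (match) => match[1]);
}

// Resolve "./name" against the root, trying the bare path and then ".js".
function resolve(request, fileTable) {
  const base = "/" + request.replace(/^\.\//, "");
  const found = [base, base + ".js"].find((path) => path in fileTable);
  if (!found) throw new Error(`Cannot resolve ${request}`);
  return found;
}

// Depth-first walk from the entry; visited modules are not processed twice.
function buildGraph(entry, fileTable, graph = {}) {
  if (entry in graph) return graph;
  graph[entry] = findRequires(fileTable[entry]).map((r) => resolve(r, fileTable));
  graph[entry].forEach((dep) => buildGraph(dep, fileTable, graph));
  return graph;
}

const graph = buildGraph("/index.js", files);
```

The result maps each module path to the list of modules it depends on, which is the shape the later stages need in order to know what to run.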

The entire transpiler of CodeSandbox runs in a separate iframe.

The Editor is responsible for changing the source code. Source changes are passed to the Compiler through `postMessage`, carrying a Module and a template:

  • **Module** contains all the source code and module paths, including package.json. The Compiler reads the npm dependencies from package.json.
  • **template** identifies the Compiler's Preset, such as `create-react-app` or `vue-cli`. A preset defines the loader rules for transpiling different file types, and also determines the application's HTML template and entry file. As noted above, these templates are currently predefined.

The overall process can be divided into four stages:

  • **Configuration stage**: creates the Preset object, determines the entry file, and so on. CodeSandbox currently supports only a limited set of application templates, such as vue-cli and create-react-app. Directory conventions differ between templates, for example the entry file and the HTML template file. File handling rules differ as well; for example, vue-cli needs to handle `.vue` files.
  • **Dependency download stage**: the Packager stage, which downloads all of the project's dependencies and generates the Manifest object.
  • **Change calculation stage**: works out which modules were added, updated, or removed, based on the source code passed in by the Editor.
  • **Transpilation stage**: where transpilation actually starts. The modules marked for update in the previous stage are re-transpiled first. Then, starting from the entry file, the code is transpiled and a new dependency graph is built. Modules that have not changed, and their submodules, are not transpiled again.
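The change calculation stage can be illustrated with a simple diff of two snapshots. The path-to-source snapshot shape and the `diffModules` helper are assumptions, not CodeSandbox's actual data structures.

```javascript
// Compare two snapshots of { path: sourceCode } and classify each module.
function diffModules(previous, next) {
  const added = [];
  const updated = [];
  const removed = [];
  for (const path of Object.keys(next)) {
    if (!(path in previous)) added.push(path);
    else if (previous[path] !== next[path]) updated.push(path);
  }
  for (const path of Object.keys(previous)) {
    if (!(path in next)) removed.push(path);
  }
  return { added, updated, removed };
}

const before = {
  "/index.js": "render(App)",
  "/App.js": "export default 1",
  "/old.js": "export default 0",
};
const after = {
  "/index.js": "render(App)",
  "/App.js": "export default 2",
  "/new.js": "export default 3",
};
const changes = diffModules(before, after);
```

Only `/App.js` and `/new.js` would need transpiling in the next stage; `/index.js` is unchanged and `/old.js` is dropped.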

Evaluation

Although it is called a bundler, CodeSandbox does not actually bundle: it does not pack all modules into chunk files the way webpack does.

Transpilation starts from the entry file, analyzes the file's module imports, and recursively transpiles the modules it depends on. By the Evaluation stage, CodeSandbox has built a complete dependency graph, and it is time to run the application.
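Running modules without bundling them can be sketched with a CommonJS-style loader. This sketch uses `new Function` rather than a direct `eval` call, assumes every `require` path is already resolved to an absolute key, and uses a made-up `transpiled` table in place of real transpiler output.

```javascript
// Evaluate modules on demand, starting from the entry, with a module cache.
function evaluate(entry, modules) {
  const cache = {};
  function requireModule(path) {
    if (cache[path]) return cache[path].exports;
    if (!(path in modules)) throw new Error(`Module not found: ${path}`);
    const module = { exports: {} };
    cache[path] = module; // register first so cyclic requires terminate
    // Wrap the source in a function that receives the CommonJS bindings.
    const run = new Function("require", "module", "exports", modules[path]);
    run(requireModule, module, module.exports);
    return module.exports;
  }
  return requireModule(entry);
}

// Hypothetical transpiled output: path -> CommonJS source text.
const transpiled = {
  "/index.js":
    'const greet = require("/greet.js"); module.exports = greet("sandbox");',
  "/greet.js": 'module.exports = (name) => "hello " + name;',
};

const result = evaluate("/index.js", transpiled);
```

Each module is executed at most once and only when something requires it, so no combined chunk file is ever produced.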

Reference links:

https://www.yuque.com/wangxiangzhong/aob8up/uf99c5?language=en-us

Commentary by the CodeSandbox author:

https://segmentfault.com/a/1190000019679430