JavaScript Execution Mechanism (7) How to quickly find properties on objects
In the last blog we talked about the overall execution process of V8, and this time I will continue to summarize how V8 implements fast lookup of properties on objects.
Note that this part of the content is not a scope chain, nor a prototype chain. The scope chain is used to find objects, such as obj. The prototype chain is to go to its prototype if there is no attribute a on obj, and the topic of this article is how to quickly find a on obj.
Fast and Slow Attributes
An object in JavaScript is a collection of group properties and values. From the perspective of the JavaScript language, a JavaScript object is like a dictionary, with a string as the key name, and any object can be used as a key value, which can be read and written by the key name.
However, when implementing object storage in V8, the dictionary storage method was not completely adopted, mainly due to performance considerations. Because dictionaries are non-linear data structures, query efficiency will be lower than linear data structures. In order to improve storage and search efficiency, V8 adopts a complex storage strategy.
General and sorting properties
1 |
|
In the above code, we use the constructor function Foo to create a bar object. In the constructor function, we set a lot of properties to the bar object, including numerical properties and string properties, and then we enumerate all the properties in the bar object. Properties, and print them one by one, the following is the result of executing this code:
1 |
|
Observing the printed data, we found that the printed attribute order is not the order we set. When we set the attributes, we set them out of order. For example, we set 100 first, and then set 1, but the output content is very regular. In general, it is reflected in the following two points:
The set number properties are printed first, and they are printed in the order of number size;
The set string property is still printed in the previous setting order. For example, we set it in the order of B, A, and C, and it is still printed in this order.
The reason for this result is that the ECMAScript specification defines that numeric properties should be sorted in ascending order according to the size of the index value, and string properties should be sorted in ascending order according to the order in which they were created. Here we refer to numeric properties in objects as sorted properties, elements in V8, string properties as regular properties, and properties ** in V8.
In V8, in order to effectively improve the performance of storing and accessing these two attributes, two linear data structures are used to save the sorted attributes and the general attributes respectively. The specific structures are shown in the following figure:
Through the above figure, we can find that the bar object contains two hidden properties: the elements property and the properties property. The elements property points to the elements object. In the elements object, the sorted properties are stored in order, and the properties property points to the properties object. In the properties object, the regular properties are saved in the order in which they were created.
After decomposing into these two linear data structures, if an indexing operation is performed, V8 will first read all the elements in order from the elements attribute, and then read all the elements in the properties attribute, thus completing an indexing operation.
Speed attribute
Save different properties to the elements property and properties property respectively, which undoubtedly simplifies the complexity of the program, but when searching for elements, there is one more step, such as executing the statement bar. B to find the property value of B, then in V8 It will first find the object properties pointed to by the properties property, and then find the B property in the properties object. This method adds a step to the search process, so it will affect the search efficiency of the element.
For this reason, V8 has adopted a trade-off strategy to speed up the efficiency of finding properties. This strategy is to store some regular properties directly to the object itself, which we call in-object properties. You can see the following image for the representation of objects in memory:
After using the properties within the object, the regular properties are saved to the bar object itself, so that when using bar. B again to find the property value of B, V8 can directly obtain the value from the bar object itself. This method Reduce the steps of finding property values and increase search efficiency.
However, the number of properties in the object is fixed, 10 by default. If the added properties exceed the space allocated by the object, they will be saved in the regular property storage. Although the property storage has an extra layer of indirection, it can be expanded freely.
Usually, we call the properties saved in the linear data structure “fast properties”, because the properties can be accessed only through the index in the linear data structure. Although the speed of accessing the linear structure is fast, if a large number of properties are added or deleted from the linear structure, the execution efficiency will be very low, mainly because of the large amount of time and memory overhead.
Therefore, if an object has too many attributes, V8 will adopt another storage strategy, that is, the “slow attribute” strategy, but the slow attribute object will have an independent nonlinear data structure (dictionary) as the attribute storage container. All attribute meta-information is no longer stored linearly, but is directly stored in the attribute dictionary.
Examples illustrate
1 |
|
We can run this code in the browser console and use the snapshot function of Memory to see the properties in these three objects.
First of all, you can see that all the sorted properties are in elements, while the regular properties are directly on the object, not on properties
Then the second one, with the properties property on the object, has 10 regular properties sorted by key inside.
Finally, there is the third one. We can see that the properties in the properties at this time are already out of order, it is a map.
Hidden classes
In the last part, when talking about the speed attribute, there is another attribute in the snapshot of chrome, called map, also known as hidden class, what does this map do?
We know that JavaScript is a dynamic language, and its execution efficiency is lower than that of static languages. In order to improve the execution speed of JavaScript, V8 draws on many features of static languages, such as the implementation of the JIT mechanism, and the introduction of hidden objects in order to improve the property access speed of objects. Class, in order to speed up the operation and introduce the internal connection cache. Today we will focus on analyzing the hidden class in V8 to see how it improves the speed of accessing object property values.
Why is static language more efficient?
Since hidden classes borrow some of the features of static languages, to explain this problem, let’s first analyze why static languages are more efficient than dynamic languages.
We through the following two pieces of code, to compare the dynamic language and static language at runtime some features, a dynamic language is JavaScript, another static language C++ source code, specific source code you can refer to the following figure:
So at runtime, what is the difference between the execution process of these two pieces of code?
We know that the properties of objects can be modified at runtime in JavaScript, so when V8 uses an object, such as start.x, it does not know whether there is x in the object, nor does it know whether x is relative to x. What is the offset of the object, and it can be said that V8 does not know the specific shape of the object.
So, when you want to query the x property in the object start in JavaScript, V8 will query step by step according to the specific rules, which is very slow and time-consuming
This dynamic way of querying object properties is different from C++ static language. C++ the structure of an object needs to be defined before declaring it. We can also call it a shape. For example, the Point structure is a shape. We can use this shape to Define specific objects. C++ code needs to be compiled before execution. When compiling, the shape of each object is fixed, that is to say, the shape of Point cannot be changed during the execution of the code. So when accessing the property of an object in the C++, it is natural to know the offset value of the property relative to the address of the object. For example, when using start.x in the C++, the compiler will directly write the address of x relative to the start. In the assembly instruction, then when the x property in the object start is used, the CPU can directly go to the memory address to remove the content, without any intermediate search links. Because in static languages, the property values of objects can be queried directly through offset queries, which is one reason for the high execution efficiency of static languages.
What is a hidden class
Since the query efficiency of static language is so high, can this static feature be introduced into V8?
The answer is feasible. One of the current ideas is to make objects in JavaScript static, that is, V8 will assume that objects in JavaScript are static during the running of JavaScript. Specifically, V8 makes the following two assumptions for each object:
No new properties will be added after the object is created;
Attributes will not be deleted after the object is created.
After meeting these two assumptions, V8 can do deep optimization of objects in JavaScript, so how to optimize it?
Specifically, V8 will create a hidden class for each object. The hidden class of the object records some basic layout information of the object, including the following two points:
- all properties contained in the object;
The offset of each property relative to the object.
After having a hidden class, then when V8 accesses a property in an object, it will first go to the hidden class to find the offset of the property relative to its object. With the offset and property type, V8 You can directly go to memory to retrieve the property value without going through a series of search processes, which greatly improves the efficiency of finding objects in V8.
1 | let point = {x:100,y:200} |
The hidden class describes the property layout of the object, which mainly includes the property name and the offset corresponding to each property. For example, the hidden class of the point object includes the x and y properties, the offset of x is 4, and the offset of y is 4. The offset is 8.
After having the map, when you use point.x to access the x property again, V8 will query the offset of the x property in the map of the point relative to the point object, and then add the offset to the starting position of the point object to get the x The value of the property is located in the memory, and with this location, the value of x is obtained, so we save a more complicated search process.
Multiple objects share a hidden class Now we know that in V8, each object has a map property whose value points to the hidden class of the object.
However, if the shape of two objects is the same, V8 will be able to reuse the same hidden class for them, which has two advantages:
Reduce the number of hidden classes created and indirectly speed up code execution;
Reduced storage space for hidden classes.
So, when the shape of two objects is the same, the following two points must be met:
the same property name;
Equal number of attributes.
Rebuild hidden classes
Regarding hidden classes, there is one more issue you need to pay attention to. At the beginning of this lesson, we mentioned that in order to implement hidden classes in V8, two assumptions are required:
No new properties will be added after the object is created;
Attributes will not be deleted after the object is created.
However, JavaScript is still a dynamic language. During execution, the shape of an object can be changed. If the shape of an object changes, the hidden class will also change, which means that V8 has to rebuild the newly changed object. Build a new hidden class, which is a big overhead for V8’s execution efficiency. It is commonly understood that adding new properties to an object, deleting new properties, or changing the data type of a property will change the shape of the object, which will inevitably trigger V8 to rebuild a new hidden class for the object after changing shape.
Best practices
Okay, now we know that V8 will assign a hidden class to each object during execution.
If the shape of the object has not changed, then the object will always use the hidden class.
- If the shape of the object changes, V8 will rebuild a new hidden class for the object.
We certainly hope that the hidden class in the object will not be changed casually, because this will trigger V8 to refactor the hidden class of the object, which directly affects the execution performance of the program. Then in actual work, we should try to pay attention to the following points:
When initializing objects with literals, ensure that the order of properties is consistent
Try to use literals to initialize complete object properties at once
Try to avoid using the delete method
Internal connection cache
First look at a piece of code
1 |
|
Usually the process of obtaining o.x in V8 is like this: find the hidden class of object o, then find the offset of x attribute through the hidden class, and then obtain the attribute value according to the offset. In this code, the loadX function will be executed repeatedly, so the process of obtaining o.x also needs to be executed repeatedly. Is there a way to simplify this search process again, it is best to find the attribute value of x in one step? The answer is, yes.
In fact, this is a question about internal connection caching. We can see that the function loadX is repeatedly executed many times in a for loop, so V8 will do everything possible to compress the search process to improve the efficiency of object search. The strategy to speed up the execution of this function is the internal connection cache (Inline Cache), referred to as IC.
What is internal connection cache
To answer this question, we need to know the working principle of IC. In fact, the principle of IC is very simple, intuitively understood, that is, in the process of executing function in V8, it will observe the key intermediate data on some call points (CallSite) in function, and then cache these data. When the function is executed again, V8 can directly use these intermediate data, saving the process of obtaining these data again. Therefore, V8 can effectively improve the execution efficiency of some repetitive codes by using IC.
The IC maintains a FeedBack Vector for each function, which records some key intermediate data during the execution of the function. For the relationship between function and feedback vector, you can refer to the following figure:
The feedback vector is actually a table structure, which consists of many items, each of which is called a slot (Slot). V8 will write the intermediate data of the loadX function into the slot of the feedback vector in turn. For example, the following function:
1 | function loadX(o) { |
When V8 executes this function, it will determine that o.y = 4 and return o.x are CallSites because they use objects and properties, then V8 will assign each call in the feedback vector of the loadX function. Point is assigned a slot. Each slot includes the slot index (slot index), the type of slot (type), the state of the slot (state), the address of the hidden class (map), and the offset of the attribute. For example, the above function Both call points use the object o, then the map attribute in the two slots of the feedback vector also points to the same hidden class, so the map address of the two slots is the same.
Polymorphism and metamorphism
Well, by caching the basic information in the execution process, we can improve the efficiency of the next execution of the function, but there is a premise, that is, when executed multiple times, the shape of the object is fixed, if the shape of the object is not fixed, then how will V8 handle it? Let’s adjust the above loadX function code, the adjusted code is as follows:
1 |
|
When loadX is executed for the first time, V8 will record the hidden class of o in the feedback vector and record the offset of the attribute x. Then when the loadX function is called again, V8 will take out the hidden class recorded in the feedback vector and compare it with the hidden class of the new o1 and find that it is not a hidden class, so V8 cannot use the offset recorded in the feedback vector at this time. amount information. Faced with this situation, V8 will choose to record the new hidden class in the feedback vector as well, and at the same time record the offset of the attribute value. At this time, the first slot in the feedback vector contains the two hidden classes and offset.
Now that we know that a slot in a feedback vector can contain information about multiple hidden classes, then:
If a slot contains only one hidden class, then we call this state monomorphic.
If a slot contains 2-4 hidden classes, we call this state polymorphic.
- If there are more than 4 hidden classes in a slot, we call this state a magamorphic.
In general, we only need to remember one thing, which is that monomorphic performance is better than polymorphic and supermorphic, so we need to slightly avoid polymorphic and supermorphic situations.