Difference between Set, Map and Object in JavaScript

Recently, in reading the computer foundation section of “Web Efficient Programming and Practice Optimization”, I saw that he talked about the difference between Set and Map in JS, and the implementation of Object, which solved the usual doubts. Here is a brief summary:

Set

Sets are generally implemented using Red-Black Tree, which is a balanced lookup binary tree with a lookup time complexity of O (logN). In fact, the Set Chrome V8 is implemented with a hash, which is a hash Set

Suppose we have a piece of code like this:

1
2
3
4
5
6
var set = new Set(); 
//The data is 20 numbers
var data = [3, 62, 38, 42, 14, 4, 14, 33, 56, 20, 21, 63, 49, 41, 10, 14, 24, 59, 49, 29];
for(var i = 0; i < data.length; i++){
set.add(data[i]);
}

A key part of hashing is the hash algorithm, which hashes a bunch of numbers or strings to get their random values

1
2
3
4
5
6
7
8
9
10
11
function ComputeIntegerHash(key, seed) { 
var hash = key;
hash = hash ^ seed; //seed = 505553720
hash = ~hash + (hash << 15); // hash = (hash << 15) - hash - 1;
hash = hash ^ (hash >>> 12);
hash = hash + (hash << 2);
hash = hash ^ (hash >>> 4);
hash = (hash * 2057) | 0; // hash = (hash + (hash << 3)) + (hash << 11);
hash = hash ^ (hash >>> 16);
return hash & 0x3fffffff;
}

Do various bit operations on the number to get a relatively random number, and then scatter the number. The purpose of scattering is to get which index of the array the number is placed in.

1
2
3
4
5
6
var capacity = 64; 
var indexes = [];
for(var i = 0; i < data.length; i++){
indexes.push(ComputeIntegerHash(data[i], seed) & (capacity - 1)); //去掉高位
}
console.log(indexes)

There are 20 numbers, and the capacity starts to increase from 16, and doubles each time. When it reaches 64, it can ensure that capacity > size * 2, because ** the repeated value of the scattering result can only be compared when the capacity is twice the actual storage size. Low **.

Now to find if key = 56 exists in this Set, first hash 56, and then scatter, according to the same process when storing

1
2
3
4
5
function SetHas(key){ 
var index = ComputeIntegerHash(56, seed) & this.capacity;
//There may be duplicate values, so it is necessary to verify that the keys stored in the hit index are equal
return setArray[index] ! null && setArray[index] = key;
}

The above is a typical implementation of a hash storage structure, but Chrome V8’s Set/Map is not implemented in this way, slightly different.

The hash algorithm is the same, but the scattering is not used to remove the high bit capacity, but with half of the capacity, called the number of buckets, this is to remove the high bit by hash value and capacity to do a bit and operation, the effect is equivalent to the remainder.

Insertion process

Suppose we insert 9, 33, 68, 57 in order.

The storage structure of Set is divided into three parts. The first part has three elements, which represent the number of valid elements, the number of deleted elements, and the number of buckets. The addition of the first two numbers represents the total number of elements.

After inserting 9, the number of elements plus 1 becomes 1, and the number of buckets initialized is 2.

The second part corresponds to buckets. Buckets [0] represents the index of the original data source stored in the first bucket. The source code is called entry. 9 is in the data array, and the index is 0, so its storage value in the bucket is 0, and the scattering value of the bucket is 0, so bucket [0] = 0.

The third part is the space to record the key value. The entry of 9 is 0, so it is placed in the position of 3 + buckets.length + entry * 2 = 5. Each key value has two element spaces. The first one stores the key value., the second is keyChain, its role will be mentioned below

After inserting 33

Insert 68, the bucket value of 68 is also 1, and 33 is repeated, because entry = buckets [1] = 1, not empty, indicating that it has been stored before, the position of the array pointed to by entry is 1 is 3 + buckets.length + entry * 2 = 7, that is to say, the previous number is placed in the position of array 7, so the adjacent element of 68 stores the value keyChain as 7, and the entry of bucket [1] becomes 68 is 2

Insert 57

Search process

Now to find the number 33, through the same hash scattering, get bucket = 1 of 33, bucket [1] = 3, the index position pointed to by 3 is 11, but 11 is 57, not the 33 to be found, so check The adjacent elements are 9, which is non-empty, and you can continue to search. The position 9 stores 68, which is also not equal to 33, while the adjacent index = 10 points to the position 7, and 7 stores 33. By comparing the key values Equal, so there is a number 33 in this Set

Here are some highlights to note:

  • The data here is a total of 4 numbers, but there are many times to compare. The key value is compared 3 times, and the adjacent keyChain value of the key value is compared 2 times, a total of 5 times, which is more than the direct for loop. Therefore, when the amount of data is relatively small, the use of hash storage speed is slower and slower, but when the amount of data is too large, the advantage will be more obvious

  • Another problem is that once the capacity is not enough, it will trigger expansion, and all the above hash calculation processes will be repeated.

Map

It is basically the same as Set. The difference is that map has more places to store values.

Of course, it does not directly store the string “hello”, but stores the pointer address of hello, pointing to the memory location where hello is actually stored.

JS

You can refer to this blog: https://sunra.top/posts/37711/