Node.js Asynchronous I/O and the Event-Driven Model

This post summarizes the two major features of Node.js, asynchronous I/O and the event-driven model, based on the general understanding I gained from the book “Easy to Understand Node.js” and several blog posts.

**Note: the content of this article is based on Node 11 and above.**

Synchronous and asynchronous, blocking and non-blocking

“Blocking” versus “non-blocking” and “synchronous” versus “asynchronous” cannot simply be taken literally. Here is an answer from a distributed-systems perspective.
**1. Synchronous and asynchronous**
Synchronous and asynchronous are about the **message communication mechanism** (synchronous communication / asynchronous communication).
Synchronous means that when a *call* is issued, the *call* does not return until the result is obtained. Once the call returns, the return value is available.
In other words, the *caller* actively waits for the result of the *call*.

Asynchronous is the opposite. **After the *call* is issued, it returns immediately, so no result is returned.** In other words, when an asynchronous call is issued, the caller does not get the result immediately. Instead, after the *call* is issued, the *callee* informs the caller through a status or a notification, or handles the call through a callback function.

Node.js is a typical example of the asynchronous programming model.

Take a popular example:
You call the bookstore owner and ask whether he has the book “Distributed Systems”. With a synchronous communication mechanism, the owner says “wait a minute, I'll check”, then checks, and only when the check is done (it may take 5 seconds, it may take a day) tells you the result (returns the result).
With an asynchronous communication mechanism, the owner tells you right away that he will check and call you back, then hangs up (no result is returned). Once he has checked, he takes the initiative to call you. Here the owner “calls back” by phoning you.
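
The bookstore example can be sketched in a few lines of Node.js. This is only a minimal illustration: `findBookSync`, `findBookAsync` and the `stock` set are hypothetical names invented here, and the "check" is simulated with a timer.

```javascript
// Hypothetical bookstore stock, used only for this illustration.
const stock = new Set(['Distributed Systems']);
const events = [];

// Synchronous: the call itself returns the result.
function findBookSync(title) {
  return stock.has(title);
}

// Asynchronous: the call returns at once without a result;
// the callee "calls back" later with the answer.
function findBookAsync(title, callback) {
  setTimeout(() => callback(stock.has(title)), 0);
}

events.push('sync result: ' + findBookSync('Distributed Systems'));

const returned = findBookAsync('Distributed Systems', (found) => {
  events.push('async result: ' + found);
  console.log(events);
});

// The asynchronous call has already returned, but carries no result.
events.push('async call returned: ' + returned);
```

The synchronous call hands back `true` directly, while the asynchronous call returns `undefined` and delivers the answer later through the callback.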

Blocking and non-blocking
Blocking and non-blocking focus on the state of the program while waiting for the result (message, return value) of the call.

A blocking call means that the current thread is suspended until the result of the call comes back. The calling thread resumes only after the result is obtained.
A non-blocking call means that the call does not block the current thread even when the result cannot be obtained immediately.

Take the example above again.
You call the bookstore owner and ask whether he has the book “Distributed Systems”. If it is a blocking call, you “hang” on the line until you get the answer. If it is a non-blocking call, you go off and do something else regardless of whether the owner has answered yet. Of course, you should check back every few minutes to see whether the owner has returned a result.
Blocking and non-blocking here have nothing to do with synchronous and asynchronous, nor with how the owner delivers the answer to you.
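
A rough sketch of the non-blocking caller who "checks back from time to time" might look like this. The `askBookstore` function and the `ticket` object are assumptions made up for the illustration; the 30 ms and 10 ms values are arbitrary.

```javascript
// Hypothetical bookstore whose answer only becomes available later.
function askBookstore() {
  const ticket = { ready: false, result: null };
  setTimeout(() => {
    ticket.ready = true;
    ticket.result = 'in stock';
  }, 30);
  return ticket; // returns immediately: the caller is not blocked
}

const ticket = askBookstore();
const checks = [];
checks.push(ticket.ready); // checked right after the call: not ready yet

// Non-blocking caller: does other things and polls every 10 ms.
const poll = setInterval(() => {
  checks.push(ticket.ready);
  if (ticket.ready) {
    clearInterval(poll);
    console.log('result:', ticket.result, 'after', checks.length, 'checks');
  }
}, 10);
```

A blocking caller would instead spin or sleep on the spot until `ticket.ready` became true, which in single-threaded JavaScript would also prevent the timer that sets it from ever running.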

A few examples

Before we start, let's look at a few simple examples, which are also some of the more confusing cases I ran into when using Node.js.


example

var fs = require("fs");
var debug = require('debug')('example1');

debug("begin");

setTimeout(function(){
    debug("timeout1");
});

setTimeout(function(){
    debug("timeout2");
});

debug('end');
/** Run result
Sat, 21 May 2016 08:41:09 GMT example1 begin
Sat, 21 May 2016 08:41:09 GMT example1 end
Sat, 21 May 2016 08:41:09 GMT example1 timeout1
Sat, 21 May 2016 08:41:09 GMT example1 timeout2
*/

question 1

Why are 'timeout1' and 'timeout2' printed after 'end'?


example

var fs = require("fs");
var debug = require('debug')('example2');

debug("begin");

setTimeout(function(){
    debug("timeout1");
});

setTimeout(function(){
    debug("timeout2");
});

debug('end');

while(true);
/** Run result
Sat, 21 May 2016 08:45:47 GMT example2 begin
Sat, 21 May 2016 08:45:47 GMT example2 end
*/

question 2

Why are 'timeout1' and 'timeout2' never printed? What exactly does 'while (true)' block?


example

var fs = require("fs");
var debug = require('debug')('example3');

debug("begin");

setTimeout(function(){
    debug("timeout1");
    while (true);
});

setTimeout(function(){
    debug("timeout2");
});

debug('end');
/** Run result
Sat, 21 May 2016 08:49:12 GMT example3 begin
Sat, 21 May 2016 08:49:12 GMT example3 end
Sat, 21 May 2016 08:49:12 GMT example3 timeout1
*/

question 3

Why does the callback function in timeout1 block the execution of the callback function in timeout2?


example

var fs = require("fs");
var debug = require('debug')('example4');

debug("begin");

setTimeout(function(){
    debug("timeout1");
    /**
     * Simulate computationally intensive work
     */
    for(var i = 0 ; i < 1000000 ; ++i){
        for(var j = 0 ; j < 100000 ; ++j);
    }
});

setTimeout(function(){
    debug("timeout2");
});

debug('end');
/**
Sat, 21 May 2016 08:53:27 GMT example4 begin
Sat, 21 May 2016 08:53:27 GMT example4 end
Sat, 21 May 2016 08:53:27 GMT example4 timeout1
Sat, 21 May 2016 08:54:09 GMT example4 timeout2 // note that this line is printed much later
*/

question 4

As in the question above, why would the computationally intensive work of timeout1 block the callback function of timeout2?


example

var fs = require("fs");
var debug = require('debug')('example5');

debug("begin");

fs.readFile('package.json','utf-8',function(err,data){
    if(err)
        debug(err);
    else
        debug("get file content");
});

setTimeout(function(){
    debug("timeout2");
});

debug('end');
/** Run result
Sat, 21 May 2016 08:59:14 GMT example5 begin
Sat, 21 May 2016 08:59:14 GMT example5 end
Sat, 21 May 2016 08:59:14 GMT example5 timeout2
Sat, 21 May 2016 08:59:14 GMT example5 get file content
*/

question 5

Why does the I/O operation of reading a file not block the execution of 'timeout2'?


Next, with these questions in mind, let's see how asynchronous I/O and the event-driven model work in Node.js.

IO (asynchronous)

First, let's sort out a few easily confused concepts: blocking I/O and non-blocking I/O, synchronous I/O and asynchronous I/O.

I used to naively think that non-blocking I/O was the same thing as asynchronous I/O T_T; clearly I had not properly understood 'apue'.

Blocking I/O

In simple terms, blocking I/O means that when the user issues a read on a file descriptor, the process is blocked until all the data to be read is ready and returned to the user; only then is the process released from the blocked state.

**Non-blocking I/O** is the opposite: when the user issues an operation on a file descriptor, the function returns immediately without any waiting, and the process continues to execute. But how does the program know when the data to be read is ready? The simplest way is polling.

In addition, there is a mode called I/O multiplexing, which uses one blocking function to listen on multiple file descriptors at the same time and returns as soon as one of them is ready. On Linux, 'select', 'poll' and 'epoll' all provide I/O multiplexing.

Synchronous I/O

So what is the difference between synchronous I/O and asynchronous I/O? Is asynchronous I/O achieved as soon as non-blocking I/O is achieved?

Actually not.

  • Synchronous I/O blocks the process while the I/O operation is performed, so blocking I/O, non-blocking I/O and I/O multiplexing are all synchronous I/O.
  • Asynchronous I/O does not cause any blocking while the I/O operation is performed.

Non-blocking I/O does not block, so why is it not asynchronous I/O? Because once the data is ready, the process is still blocked while it fetches the data from the kernel. That is why non-blocking I/O is not asynchronous I/O.

[Figure: a borrowed picture illustrating the difference between these I/O models]

Event driven

The event-driven model is the second major feature of Node.js. What does event-driven mean? Simply put, the program listens for state changes of events and responds to them. For example, when a file is being read and the read completes or fails, the corresponding state is triggered, and the corresponding callback function is called to handle it.

Thread-driven and event-driven

So what is the difference between thread-driven programming and event-driven programming?

  • Thread-driven means that when a request is received, a thread is dedicated to processing that request. Usually there is a thread pool: if the pool has idle threads, one is taken from the pool to process the request; if it has none, the new request is queued until a thread becomes idle.
  • Event-driven means that when a new request comes in, it is pushed into a queue, and a loop then watches the queue for event state changes. When an event whose state has changed is detected, the processing code for that event, usually a callback function, is executed.

For event-driven programming, if a callback function is computationally intensive or performs blocking I/O, that callback blocks the execution of all subsequent event callbacks. This point is particularly important.
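
The effect can be seen in a toy model of an event-driven loop. This is a deliberately simplified sketch, not how Node.js is implemented: `queue`, `emit` and `runLoop` are names invented for the illustration, and the 20 ms busy-wait stands in for computationally intensive work.

```javascript
// Toy model of an event-driven loop (not Node's real implementation).
const queue = [];   // pending events, oldest first
const handled = []; // order in which callbacks finished

function emit(name, callback) {
  queue.push({ name, callback });
}

function runLoop() {
  while (queue.length > 0) {
    const event = queue.shift(); // take the oldest event
    event.callback();            // runs to completion; nothing else runs meanwhile
    handled.push(event.name);
  }
}

// A slow callback: busy-wait for about 20 ms.
emit('request A', () => {
  const end = Date.now() + 20;
  while (Date.now() < end);
});
// A fast callback that still has to wait for 'request A' to finish.
emit('request B', () => {});

const start = Date.now();
runLoop();
const elapsed = Date.now() - start;
console.log(handled, 'elapsed ms:', elapsed);
```

Because there is only one loop, 'request B' is handled only after the slow callback of 'request A' has returned, however trivial its own work is.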

Event-Driven and Asynchronous I/O for NodeJS

Event-driven model

Many concepts have been introduced above; now let's look at how event-driven and asynchronous I/O are implemented in Node.js.

Node.js runs on a **single thread**. An **event loop** repeatedly takes messages out of the **message queue (event queue)** and processes them, which basically means calling the callback function that corresponds to each message. A message is pushed into the message queue whenever an event's state changes.

The event-driven model of Node.js comes down to the following points:

  • Because it is **single-threaded**, the **event loop** is suspended while the code in the JS file is executed sequentially.
  • When the JS file has finished executing, the event loop starts running and takes messages from the message queue to execute their callback functions.
  • Because it is **single-threaded**, the **event loop** is suspended while a callback function is executed.
  • For I/O operations, Node.js starts a separate thread to perform the asynchronous I/O and pushes a message into the **message queue** when the operation is finished.

Let's start with a simple JS file to see how Node.js executes it.

var fs = require("fs");
var debug = require('debug')('example1');

debug("begin");

fs.readFile('package.json','utf-8',function(err,data){
    if(err)
        debug(err);
    else
        debug("get file content");
});

setTimeout(function(){
    debug("timeout2");
});

debug('end'); // The event loop is paused until this point

  1. Execute debug("begin") synchronously.
  2. Call fs.readFile() asynchronously; at this point a new thread is started to perform the asynchronous I/O.
  3. Call setTimeout() asynchronously and immediately push the timeout message into the **message queue**.
  4. Execute debug("end") synchronously.
  5. Start the event loop and take a message out of the message queue (currently the timeout).
  6. Execute the callback function corresponding to that message (the **event loop** is suspended again).
  7. After the **callback function** finishes, the **event loop** resumes (at this point the **message queue** is empty, because the file has not been read yet).
  8. The asynchronous I/O finishes reading the file and pushes a message into the **message queue** (the message contains the file content or an error).
  9. The **event loop** gets the message and executes the callback.
  10. The program exits.
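
The steps above can be reproduced with built-in modules only. The following is a sketch under a few assumptions: it writes its own small temporary file (the file name is made up) instead of relying on 'package.json', and it records the order in an array rather than using the 'debug' package. The relative order of the timer and the file callback is what is typically observed, not something the sketch guarantees.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a small temporary file so the sketch does not depend on package.json.
const file = path.join(os.tmpdir(), 'event-loop-sketch-' + process.pid + '.txt');
fs.writeFileSync(file, 'hello');

const order = [];
order.push('begin'); // step 1: synchronous

// Step 2: the read is handed off; the callback runs later via the event loop.
fs.readFile(file, 'utf-8', (err, data) => {
  order.push(err ? 'read failed' : 'got file content');
  console.log(order);
  fs.unlink(file, () => {}); // clean up the temporary file
});

// Step 3: the timer callback is queued, not run.
setTimeout(() => order.push('timeout'), 0);

order.push('end'); // step 4: synchronous, still before any callback
```

Whatever the timing of the two callbacks, 'begin' and 'end' are always the first two entries, because no callback can run until the script itself has finished.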

[Figure: a borrowed picture illustrating the event-driven model of Node.js]

How does the JS event loop work?

There is nothing wrong with the understanding that “synchronous operations are performed first and asynchronous operations wait in the event queue”, but going deeper brings in other concepts, such as the event table and the event queue. Let's take a look at the running process:

  1. First determine whether a piece of JS is synchronous or asynchronous: synchronous tasks enter the main thread to run, and asynchronous tasks enter the event table.
  2. Asynchronous tasks register events in the event table. When the trigger condition is met (for example a delay has elapsed or an ajax response has arrived), the callback is pushed into the event queue.
  3. Synchronous tasks run on the main thread until they are done. When the main thread is idle, it checks the event queue for executable asynchronous tasks; if there are any, they are pushed onto the main thread.

setTimeout(() => {
    console.log('2 seconds up')
}, 2000)

Let's use the steps above to analyze this script. setTimeout is an asynchronous operation, so it first enters the event table; the registered event is its callback, and the trigger condition is that 2 seconds have passed. When the condition is met, the callback is pushed into the event queue. When the main thread is idle, it checks the event queue for an executable task and runs it.

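const fun = () => console.log(2) // hypothetical callback, added so the snippet runs
console.log(1) // synchronous task, runs on the main thread
setTimeout(fun, 0) // asynchronous task: placed in the event table, pushed into the event queue after 0 seconds
console.log(3) // synchronous task, runs on the main thread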

1 and 3 are synchronous tasks and are executed immediately. After they finish, the main thread is free and goes to the event queue to see whether any task is waiting to be executed. This is why the setTimeout callback is executed last even though its delay is 0 milliseconds.

One thing to note about setTimeout is that the delay time is sometimes not so accurate.

setTimeout(() => {
    console.log('2 seconds up')
}, 2000)
sleep(9999999999) // sleep is assumed to be a long-running synchronous function (not defined here)

Analyze the running process:

  1. The console.log callback enters the Event Table and is registered; the timer starts.
  2. The sleep function is executed. Although sleep is a synchronous task, it performs a lot of work and takes far more than 2 seconds.
  3. The 2 seconds are up and the timer event completes; the callback enters the Event Queue, but sleep has not finished, the main thread is still occupied, and the callback can only wait.
  4. sleep finally finishes, and the callback finally moves from the Event Queue to the main thread for execution, by which time far more than 2 seconds have passed.

In fact, a delay of 2 seconds only means that after 2 seconds the function passed to setTimeout is pushed into the event queue, and tasks in the event queue are executed only when the main thread is idle. In other words, setTimeout adds the task to be executed (the console.log callback in this case) to the Event Queue after the specified time; because a single thread executes tasks one by one, if an earlier task takes too long the callback can only wait, so the real delay ends up far greater than 2 seconds.

We also often see code like setTimeout(fn, 0). It means the task is to be executed at the earliest idle time of the main thread, without waiting any extra seconds: as soon as all synchronous tasks on the main thread's execution stack have finished and the stack is empty, the callback runs. But even if the main thread is idle, a delay of exactly 0 milliseconds is not actually reached. According to the HTML standard, browsers enforce a minimum of 4 milliseconds for deeply nested timers, and Node.js treats a delay smaller than 1 as 1 millisecond.
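
The gap between the requested and the real delay is easy to measure. Below is a minimal sketch with much smaller numbers than in the text: the 20 ms timer and the 100 ms busy-wait are arbitrary values chosen for the illustration, and the busy-wait loop plays the role of the long synchronous task.

```javascript
const requestedDelay = 20; // ms asked of setTimeout
const busyTime = 100;      // ms the main thread stays occupied

const scheduledAt = Date.now();
let actualDelay = null;

setTimeout(() => {
  actualDelay = Date.now() - scheduledAt;
  console.log('requested', requestedDelay, 'ms, actually waited', actualDelay, 'ms');
}, requestedDelay);

// Synchronous work that keeps the main thread busy past the timer's due time.
const end = Date.now() + busyTime;
while (Date.now() < end);
```

The callback cannot run before the busy loop ends, so the measured delay is at least the busy time rather than the requested 20 ms.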

About setInterval: taking setInterval(fn, ms) as an example, setInterval runs in a loop, placing the registered function into the Event Queue at the specified interval. It does not execute fn every ms milliseconds; rather, fn enters the Event Queue every ms milliseconds. Note that once the execution time of the callback fn exceeds the delay ms, no interval between executions is visible at all.

The concepts above are basic and easy to understand, but unfortunately they are not the whole truth, because Promise, async/await and process.nextTick (Node) are also involved, so a more refined classification of tasks is needed:

**We refer to tasks initiated by the host as macro tasks, and tasks initiated by the JavaScript engine as micro tasks.**

Macro-task: the overall script code, setTimeout, setInterval, MessageChannel, postMessage, setImmediate.
Micro-task: Promise, process.nextTick, MutationObserver.

When dividing macro tasks and microtasks, async/await is not mentioned because the essence of async/await is a Promise.
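
As a small illustration of this point, the sketch below (the function name `run` and the labels are made up) shows that the code after an `await` behaves like a promise callback: it runs after the current synchronous code, but before a timer callback.

```javascript
const order = [];

async function run() {
  order.push('async start');  // runs synchronously, like a Promise executor
  await null;                 // the rest of the function is resumed as a microtask
  order.push('after await');
}

setTimeout(() => {
  order.push('timeout');      // macro task: runs after the microtasks
  console.log(order);
}, 0);

run();
order.push('sync end');
```

The part of `run` before the `await` executes immediately, 'sync end' follows, and 'after await' is recorded before 'timeout'.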

**What is the event loop mechanism?** Different types of tasks enter the corresponding Event Queue: for example, setTimeout and setInterval enter the same (macro task) Event Queue, while Promise and process.nextTick enter the same (microtask) Event Queue.

  1. “Macro tasks” and “micro tasks” are both kept in queues. When a piece of code is executed, the synchronous code in the macro task is executed first.
  2. In the first round of the event loop, the whole JS script is run as one macro task.
  3. If a macro task such as setTimeout is encountered during execution, the function passed to setTimeout is pushed into the “macro task queue” and is called when the next macro task is executed.
  4. If a microtask such as promise.then() is encountered during execution, it is pushed into the “**microtask queue of the current macro task**” (that is, these microtasks still belong to the current macro task). After the synchronous code of this round's macro task is completed, all microtasks are executed in turn.
  5. In the first round of the event loop, once the synchronous script and everything in the microtask queue have been executed, this round ends and the second round begins.
  6. The second round works the same way. **First, the synchronous code in the macro task is executed.** If other macro tasks are encountered, their callbacks are added to the “macro task queue”; if a microtask is encountered, it is pushed into the “microtask queue of the current macro task”. After the synchronous code of this round's macro task is completed, all current microtasks are executed in turn.
  7. Start the third round and repeat…
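
The rounds described above can be mimicked with two plain arrays. This is a toy simulation for building intuition, not the engine's actual data structures; `macroQueue`, `microQueue` and `runEventLoop` are names invented for the sketch.

```javascript
// Toy simulation of the macro task / microtask rounds.
const macroQueue = [];
const microQueue = [];
const output = [];

function runEventLoop() {
  while (macroQueue.length > 0) {
    const task = macroQueue.shift(); // one macro task per round
    task();                          // its synchronous code runs first
    while (microQueue.length > 0) {  // then every pending microtask is drained
      microQueue.shift()();
    }
  }
}

// The whole script is the first macro task.
macroQueue.push(() => {
  macroQueue.push(() => output.push('timeout')); // stands in for setTimeout(fn, 0)
  microQueue.push(() => output.push('then'));    // stands in for promise.then(fn)
  output.push('sync');
});

runEventLoop();
console.log(output); // [ 'sync', 'then', 'timeout' ]
```

The inner `while` loop is the important part: microtasks registered during a round are finished before the next macro task is taken from the queue.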

The following code helps to understand the mechanism above in depth.

setTimeout(function() {
    console.log('4')
})

new Promise(function(resolve) {
    console.log('1') // synchronous task
    resolve()
}).then(function() {
    console.log('3')
})
console.log('2')

  1. This code enters the main thread as a macro task.
  2. setTimeout is encountered first; its callback function is registered and dispatched to the macro task Event Queue.
  3. Next the Promise is encountered: the function passed to new Promise is executed immediately, and the then callback is dispatched to the microtask Event Queue.
  4. console.log() is encountered and executed immediately.
  5. The overall script, as the first macro task, finishes executing. The loop checks for executable microtasks and executes the then callback. (The first round of the event loop is over and the second round begins.)
  6. The second round starts from the macro task Event Queue, where the callback function of setTimeout is found and executed immediately.

Execution result: 1 - 2 - 3 - 4

new Promise(function (resolve) {
    console.log('1') // macro task 1
    resolve()
}).then(function () {
    console.log('3') // microtask of macro task 1
})
setTimeout(function () { // macro task 2
    console.log('4')
    setTimeout(function () { // macro task 5
        console.log('7')
        new Promise(function (resolve) {
            console.log('8')
            resolve()
        }).then(function () {
            console.log('10')
            setTimeout(function () { // macro task 7
                console.log('12')
            })
        })
        console.log('9')
    })
})
setTimeout(function () { // macro task 3
    console.log('5')
})
setTimeout(function () { // macro task 4
    console.log('6')
    setTimeout(function () { // macro task 6
        console.log('11')
    })
})
console.log('2') // macro task 1

  1. All the code enters the main thread for execution as the first macro task.
  2. First 1 is output, since it is synchronous code. The then callback then enters the microtask queue of macro task 1 as a microtask.
  3. The three outermost setTimeout calls enter the macro task queue in order as macro task 2, macro task 3 and macro task 4.
  4. 2 is output, and the synchronous code of macro task 1 is done. Next the microtask of macro task 1 is executed and outputs 3. The first round of the event loop is complete.
  5. Macro task 2 is executed and outputs 4; the setTimeout inside it is placed in the macro task queue as macro task 5.
  6. Macro task 3 is executed and outputs 5; macro task 4 is executed and outputs 6, and the setTimeout inside macro task 4 becomes macro task 6.
  7. Macro task 5 is executed and outputs 7 and 8. The then callback enters the microtask queue of macro task 5 as its microtask.
  8. The synchronous code outputs 9; the synchronous code of macro task 5 is done, and now the microtask of macro task 5 is executed.
  9. 10 is output, and the setTimeout that follows enters the macro task queue as macro task 7. Macro task 5 is complete.
  10. Macro task 6 is executed and outputs 11, then macro task 7 is executed and outputs 12.

This case is admittedly a bit contrived; its purpose is to let everyone understand the order of execution among macro tasks and the relationship between macro tasks and micro tasks.

We call everything the main thread (execution queue) executes from start to finish a tick. Once the content of the main thread has been executed, the loop goes back to the head of the macro task queue and takes a new task to run on the main thread as the next tick. If other methods register a new microtask, or the macro task itself registers one, all microtasks are taken out and completed before the end of this tick.

  1. Microtasks added during the current tick are not left for the next tick; they are triggered at the end of the current tick.
  2. In an event loop, after the task of a tick has been executed, there is a separate step called “perform a microtask checkpoint”. This step checks whether there are microtasks; if there are, the microtask queue is treated as an execution queue and run until it is empty.
  3. A microtask generated in a tick is always executed before a macro task generated in the same tick, because microtasks registered before the end of the tick are executed at the microtask checkpoint, while the macro task has to wait for the next tick.

(Note that all macro tasks and microtasks discussed here are tasks that have already entered the event loop, that is, they have moved from the event table to the event queue after their asynchronous condition was met.)
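
A short sketch of the microtask checkpoint: a microtask that registers another microtask still gets it executed within the same tick, ahead of a macro task that was registered earlier. The labels are made up for the illustration.

```javascript
const order = [];

// Macro task, registered first.
setTimeout(() => {
  order.push('macro task');
  console.log(order);
}, 0);

Promise.resolve().then(() => {
  order.push('microtask 1');
  // Registered during the checkpoint itself, yet still run before the macro task.
  Promise.resolve().then(() => order.push('microtask 2'));
});

order.push('sync');
```

The recorded order is 'sync', 'microtask 1', 'microtask 2', 'macro task': the checkpoint keeps going until the microtask queue is empty.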

new Promise(function (resolve) {
    console.log('1') // macro task 1
    resolve()
}).then(function () {
    console.log('3') // microtask of macro task 1
})
setTimeout(function () { // macro task 2
    console.log('4')
    setTimeout(function () { // macro task 5
        console.log('7')
        new Promise(function (resolve) {
            console.log('8')
            resolve()
        }).then(function () {
            console.log('10')
            setTimeout(function () { // macro task 7
                console.log('12')
            })
        })
        console.log('9')
    })
}, 2000)
setTimeout(function () { // macro task 3
    console.log('5')
}, 1000)
setTimeout(function () { // macro task 4
    console.log('6')
    setTimeout(function () { // macro task 6
        console.log('11')
    }, 2000)
})
console.log('2') // macro task 1

If we further add delay times to the timeouts, as in the code above, so that their callback functions do not enter the macro task queue immediately, what is the print order?

Give the answer first: 1, 2, 3, 6, 5, 4, 7, 8, 9, 10, 11, 12 (or 1, 2, 3, 6, 5, 4, 11, 7, 8, 9, 10, 12)

In fact, according to the logic we just talked about, the order should be 1, 2, 3, 6, 5, 4, 11,

Why are there two different possibilities?

Because Chrome based on

The difference between this example and the one above is that by setting different timeouts, the callback functions are pushed into the macro task queue in a different order.
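
A minimal sketch of that effect, with two timers whose delays (40 ms and 10 ms, arbitrary values) are the reverse of their registration order:

```javascript
const order = [];

// Registered first, but due later.
setTimeout(() => order.push('registered first, 40 ms'), 40);
// Registered second, but due earlier, so its callback is queued first.
setTimeout(() => order.push('registered second, 10 ms'), 10);

setTimeout(() => console.log(order), 80);
```

The callback with the shorter delay reaches the macro task queue first, so it runs first even though it was registered second.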

Question answer

Okay, now I can answer the questions above.


question 1

Why are 'timeout1' and 'timeout2' printed after 'end'?

answer 1

Because at this point the asynchronous functions have only pushed 'timeout1' and 'timeout2' into the queue; the event loop is still suspended.


question 2

Why are 'timeout1' and 'timeout2' never printed? What exactly does 'while (true)' block?

answer 2

Because the event loop is blocked right here: it is blocked before it has even started.


question 3,4

answer 3,4

Because the event loop only continues once the callback function has finished executing and returned, a long-running callback blocks the operation of the event loop.


question 5

Why does the I/O operation of reading the file not block the execution of 'timeout2'?

answer 5

Because the I/O operation is asynchronous: it runs on a new thread and does not block the event loop.

Reference link:

https://www.zhihu.com/question/19732473

https://segmentfault.com/a/1190000005173218

https://juejin.im/post/5c148ec8e51d4576e83fd836