Source Code Analysis of the Execution Flow of Node's Multi-Process exec Method

Webmaster

September 10, 2022, 05:31

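While building an in-house scaffolding tool, I used Node's synchronous and asynchronous process execution, but never looked closely at why it is designed the way it is. This article works through that by way of a few questions. ⚠️ Note: the source analyzed here is Node 12. You may well be on version 16 or later and writing TS, but Node's basic approach has not changed much.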

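Questions to think about

Question 1: What exactly is the difference between exec and execFile?
Question 2: Why are exec, execFile, and fork all implemented through spawn, and what does spawn actually do?
Question 3: Why does spawn take no callback, while exec and execFile do?
Question 4: Why, after calling spawn, do you have to call child.stdout.on('data', callback) yourself on the returned child? What exactly are child.stdout and child.stderr?
Question 5: Why are there so many events (data, error, exit), and in what order do they fire?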

I. Source code analysis

Method: follow the source in the order in which the functions are called.

exec source analysis

Source directory structure

child_process.js

exec
execFile
spawn

internal/child_process.js

ChildProcess
spawn

Code execution flow

Run local code
First, create index.js locally:

    const cp = require('child_process');
    const child = cp.exec('ls -la|grep node_modules', function(err, stdout, stderr) {
        console.log(err, stdout, stderr);
    });


Executing cp.exec
This calls the exec method of Node's built-in child_process library, which first normalizes its arguments with normalizeExecArgs:

function exec(command, options, callback) {
    const opts = normalizeExecArgs(command, options, callback);
    return module.exports.execFile(opts.file,
                                   opts.options,
                                   opts.callback);
}

Argument handling:

opts: {
    file: "ls -la|grep node_modules",
    options: { shell: true },
    callback: ...,
}

Result of the argument handling:

file is the command that was passed in
options gains shell: true
callback is unchanged

Return value:

return module.exports.execFile
exec calls execFile directly.

In other words, normalizeExecArgs inside exec converts the arguments into the form that execFile expects.
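The normalization described above can be sketched roughly as follows. This is a simplified illustration rather than the actual Node source: the name normalizeExecArgsSketch is made up for this example, and argument validation is left out.

```javascript
// Simplified sketch of the argument normalization done by exec.
// NOT the actual Node source; validation is omitted.
function normalizeExecArgsSketch(command, options, callback) {
  // exec(command, callback): the second argument is really the callback.
  if (typeof options === 'function') {
    callback = options;
    options = undefined;
  }

  // Shallow-copy so the caller's options object is not mutated.
  options = { ...options };
  // A string names a specific shell; anything else becomes `true`.
  options.shell = typeof options.shell === 'string' ? options.shell : true;

  return { file: command, options, callback };
}

const opts = normalizeExecArgsSketch('ls -la|grep node_modules', () => {});
console.log(opts.file, opts.options);
```

The point to notice is that the whole command line stays in file as a single string; nothing is split into arguments at this stage.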

Entering execFile
a. It again starts by normalizing its arguments [normalizeExecFileArgs].
b. It calls spawn, whose main job is to create a child process and run it asynchronously.

const child = spawn(file, args, {
    cwd: options.cwd,
    env: options.env,
    gid: options.gid,
    uid: options.uid,
    shell: options.shell,
    // ...
});

Arguments passed to spawn:

file: "ls -la|grep node_modules"
args: none
options: {shell: true}; only shell is set explicitly, to true, which means the command has to be run through the system shell.


Calling spawn
a. Normalize the arguments [normalizeSpawnArguments].
b. Call child = new ChildProcess().

function spawn(file, args, options) {
    const opts = normalizeSpawnArguments(file, args, options);
    const child = new ChildProcess();
    // The child object holds a _handle, which is the actual process:
    // _handle = Process { onexit: Function, ... }. It is started by calling
    // child.spawn, which in the end calls _handle.spawn.
}

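What the returned opts contains:

file: '/bin/sh'

// This is the shell's main executable. Run locally, it can be used in two ways:

Method 1: run a shell script file
    /bin/sh test.shell
Method 2: run a shell statement directly
    /bin/sh -c "ls -la|grep node_modules"

Because shell = true was passed in, file is set to /bin/sh, meaning the command is run through the shell executable.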

args: ["/bin/sh", "-c", "ls -la|grep node_modules"]
options: {
cwd: null,
...
shell: true,
...
}
envPairs: // the operating system's environment variables
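As a rough illustration of how file and args end up in this shape when shell is true, here is a minimal sketch. It covers POSIX systems only, leaves out the Windows branch and all validation, and the name wrapInShellSketch is hypothetical rather than anything in the Node source.

```javascript
// Sketch of the shell handling on POSIX systems only; the Windows branch
// and all validation are left out. NOT the actual Node source.
function wrapInShellSketch(file, args = [], options = {}) {
  if (options.shell) {
    // The whole command line becomes one string handed to `sh -c`.
    const command = [file, ...args].join(' ');
    file = typeof options.shell === 'string' ? options.shell : '/bin/sh';
    args = ['-c', command];
  }
  // The first element of args is the executable itself (argv[0]).
  return { file, args: [file, ...args] };
}

const wrapped = wrapInShellSketch('ls -la|grep node_modules', [], { shell: true });
console.log(wrapped);
```

Because the pipe character only means something to a shell, this wrapping is what allows a command such as ls -la|grep node_modules to work at all.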


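new ChildProcess
a. ChildProcess comes from Node's internal library [internal/child_process], which only Node's built-in libraries can require.
b. That internal library defines a ChildProcess class representing a child process, so new ChildProcess() inside spawn creates the child process object.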

child_process.js

const child_process = require('internal/child_process')
 
const {
    ...
    ChildProcess,
    ...
    } = child_process;

ChildProcess in the internal/child_process library

internal/child_process.js

const { Process } = internalBinding('process_wrap');
// Process comes from the C++ binding
this._handle = new Process();


Once the child process object exists, spawn calls child.spawn to run the command in the child process, and then immediately returns the child (return child):
```
function spawn(file, args, options) {
  const opts = normalizeSpawnArguments(file, args, options);
  const child = new ChildProcess();
  // ...
  child.spawn({
    file: opts.file, // "/bin/sh"
    args: opts.args, // ["/bin/sh", "-c", "ls -la|grep node_modules"]
    cwd: options.cwd,
    // ...
  });

  return child;
}
```
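child.spawn, defined in internal/child_process, is the method that actually starts the command. Its core is the call to this._handle.spawn. Up to this point only a process object has been created and no real resources have been allocated; calling this._handle.spawn starts the process, and the child's process ID is assigned as a result.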
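From the caller's side, this behavior can be observed with a small experiment. The sketch below assumes nothing beyond the public child_process API; it launches the current node binary (process.execPath) so that it does not depend on any particular shell command being available.

```javascript
const { spawn } = require('child_process');

// process.execPath is the running node binary.
const child = spawn(process.execPath, ['-e', 'console.log("hello from child")']);

// spawn() has already returned: the pid is known, but no output has arrived yet.
console.log('pid assigned synchronously:', typeof child.pid === 'number');

// stdout and stderr are readable streams. spawn takes no callback, so the
// caller has to attach listeners to see any output.
child.stdout.on('data', (chunk) => console.log('stdout chunk:', chunk.toString().trim()));
child.stderr.on('data', (chunk) => console.error('stderr chunk:', chunk.toString().trim()));
```

The pid line is printed before any stdout chunk, which matches the description above: creating and starting the process is synchronous, while its output arrives asynchronously.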
After spawn returns, execution continues in execFile
a. Listen on the output stream: child.stdout.on('data')
b. Listen on the error stream: child.stderr.on('data')

// ...
if (child.stdout) {
    if (encoding)
        child.stdout.setEncoding(encoding);

    child.stdout.on('data', function onChildStdout(chunk) {
        const encoding = child.stdout.readableEncoding;
        // ...
    });
}
// ...
if (child.stderr) {
    if (encoding)
        child.stderr.setEncoding(encoding);

    child.stderr.on('data', function onChildStderr(chunk) {
        const encoding = child.stderr.readableEncoding;
        // ...
    });
}
// ...

child.addListener('close', exithandler);
child.addListener('error', errorhandler);
return child;

This is why execFile can take a callback: it registers these listeners itself.
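To make that idea concrete, here is a minimal sketch of a callback-style wrapper built on spawn. It is an illustration under simplifying assumptions, not the real execFile: the name execFileSketch is hypothetical, and encoding, maxBuffer, timeout and kill handling are all omitted.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper for illustration; the real execFile does much more.
function execFileSketch(file, args, callback) {
  const child = spawn(file, args);
  const stdoutChunks = [];
  const stderrChunks = [];
  let called = false;

  // Make sure the callback runs only once, even if both events fire.
  function finish(error) {
    if (called) return;
    called = true;
    callback(
      error,
      Buffer.concat(stdoutChunks).toString(),
      Buffer.concat(stderrChunks).toString()
    );
  }

  // Buffer everything the child writes.
  if (child.stdout) child.stdout.on('data', (chunk) => stdoutChunks.push(chunk));
  if (child.stderr) child.stderr.on('data', (chunk) => stderrChunks.push(chunk));

  // 'error' fires when the process could not be spawned.
  child.on('error', (error) => finish(error));
  // 'close' fires after the process has ended and its stdio streams are closed.
  child.on('close', (code, signal) => {
    if (code === 0) return finish(null);
    const error = new Error(`Command failed: ${file}`);
    error.code = code;
    error.signal = signal;
    finish(error);
  });

  return child;
}

execFileSketch(process.execPath, ['-e', 'process.stdout.write("done")'], (error, stdout) => {
  console.log(error, stdout);
});
```

Buffering until close is what lets the callback receive the complete output in one piece, and it is also why this style suits short-lived commands better than long-running ones.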

c. Listen for the child process's close and error events.
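The order in which these events arrive can be checked with a small experiment. The sketch below only uses the public child_process API and records the events as they fire; the variable names are arbitrary.

```javascript
const { spawn } = require('child_process');

const order = [];
const proc = spawn(process.execPath, ['-e', 'console.log("output")']);

proc.stdout.once('data', () => order.push('data'));
proc.on('error', () => order.push('error')); // only fires if spawning fails
proc.on('exit', () => order.push('exit'));   // the process has ended
proc.on('close', () => {                     // ...and its stdio streams are closed
  order.push('close');
  console.log(order.join(' -> '));
});
```

Whatever the relative timing of data and exit turns out to be on a given run, close comes last, because it is only emitted once the process has ended and its stdio streams have been closed. That makes close the safe place to hand buffered output to a callback.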


Summary

1. Execution flow of the exec method

(Figure: execution flow of the exec method)

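2. Differences between exec, execFile, fork, and spawn

exec: runs the given shell command through /bin/sh -c. Underneath it calls execFile; it only preprocesses the arguments.
execFile: runs the given file with args. Underneath it calls spawn to create and run the child process, and registers the listeners itself so that all stdout (output stream) and stderr (error stream) data is handed back at once through the callback. It suits tasks that finish quickly and whose final result you want directly.
spawn: requires the internal internal/child_process module and instantiates a ChildProcess object; at that point no child process exists yet. It then calls child.spawn to create the process and run the command, which comes down to this._handle.spawn in internal/child_process, implemented in the C++ process_wrap file. That file handles process creation, because setting up a process's memory space has to be done in C++; JS cannot operate on memory directly. The spawn provided by process_wrap creates the child process and starts it, and this part is synchronous. What follows is asynchronous: data travels one way through pipes, one each for the input, output, and error streams. When communication ends, the child process triggers the onexit callback, and the corresponding Sockets run their close callbacks.
fork: calls spawn to create the child process and run the command, using node as the executable, and creates an IPC channel through setupChannel so that parent and child can communicate in both directions.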
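The two-way IPC channel that sets fork apart can be tried out with the sketch below. To keep it self-contained it writes a small child script to a temporary directory first; that file, its contents, and the message shape ({ text } and { echoed }) are all made up for this example.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical child script, written to a temp directory so the example
// does not depend on any other file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
const childFile = path.join(dir, 'child.js');
fs.writeFileSync(childFile, `
  process.on('message', (msg) => {
    // Reply to the parent over the same IPC channel.
    process.send({ echoed: msg.text.toUpperCase() });
  });
`);

const worker = fork(childFile);

worker.on('message', (reply) => {
  console.log('parent received:', reply);
  // Closing the channel leaves the child with nothing to wait for, so it exits.
  worker.disconnect();
});

// Parent to child: the other direction of the channel.
worker.send({ text: 'hello' });
```

Unlike the stdout and stderr pipes, which carry raw bytes in one direction, the IPC channel carries serialized messages both ways, which is why send and the message event exist only on children created with an IPC channel.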

Reposted from: https://segmentfault.com/a/1190000042253211
