[Webpack5 Source Code] seal Phase Analysis with Flowcharts (Part 1)

This article is based on webpack 5.74.0.

This is one article in a six-part series on webpack5's core flow, which uses flowcharts to analyze how webpack5 builds a project.


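Preface

  1. Because webpack5's codebase is very complex, this article only analyzes the js file type to keep things manageable. Other types (css, image) are not covered, and all examples are js-based
  2. To improve readability, the source code is trimmed, reordered, and altered where needed; treat all source code in this article as pseudocode
  3. Readers are assumed to know the basics of tapable, loader, and plugin. Code involving asyncQueue, tapable, loader, and plugin is shown directly without much explanation
  4. Because webpack5's codebase is very complex, only the core code is extracted and explained

"Core code" here means what I consider to be core, so some parts that you would also consider core are bound to be missing. If you notice a gap, please consult other articles or let me know via private message or the comment section.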

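Contents

Starting from compilation entry -> make -> seal, the article first gives an overview of the seal phase (flowcharts plus simplified code), then analyzes the core modules extracted from the flowchart in detail. On top of the source-level walkthrough of the seal phase, it focuses on:

  • The relationships among Module, Chunk, ChunkGroup, and ChunkGraph
  • The differences between the seal phase and the make phase
  • The chunk-splitting rules of SplitChunksPlugin

The goal is a clear understanding of how Chunks are built in complex cases.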

1. Overview of the seal phase

1.1 Compilation entry -> make -> seal

//node_modules/webpack/lib/webpack.js
const webpack = (options, callback) => {
  const { compiler, watch, watchOptions } = create(options);
  compiler.run();
  return compiler;
}

// node_modules/webpack/lib/Compiler.js
class Compiler {
    run(callback) {
        const run = () => {
            this.compile(onCompiled);
        }
        run();
    }
    compile(callback) {
        const params = this.newCompilationParams();
        this.hooks.beforeCompile.callAsync(params, err => {
            const compilation = this.newCompilation(params);
            this.hooks.make.callAsync(compilation, err => {
                compilation.seal(err => {
                    this.hooks.afterCompile.callAsync(compilation, err => {
                        return callback(null, compilation);
                    });
                });
            });
        });
    }
}
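
The callback nesting above fixes the order beforeCompile -> make -> seal -> afterCompile. The sketch below reproduces that order with a toy hook class. `ToyAsyncHook` and `ToyEntryPlugin` are hypothetical names made up for this illustration; the real hooks come from tapable and do considerably more.

```javascript
// Toy stand-in for an async series hook: runs its taps in order, then the final callback.
// This is NOT tapable's implementation, only an illustration of the call order.
class ToyAsyncHook {
  constructor() {
    this.taps = [];
  }
  tapAsync(name, fn) {
    this.taps.push({ name, fn });
  }
  callAsync(arg, done) {
    let i = 0;
    const next = err => {
      if (err || i === this.taps.length) return done(err);
      this.taps[i++].fn(arg, next);
    };
    next();
  }
}

const log = [];
const hooks = {
  beforeCompile: new ToyAsyncHook(),
  make: new ToyAsyncHook(),
  afterCompile: new ToyAsyncHook()
};

// A plugin such as EntryPlugin taps `make`; here the tap only records that it ran.
hooks.make.tapAsync("ToyEntryPlugin", (compilation, cb) => {
  log.push("make");
  cb();
});

const compilation = {
  seal(cb) {
    log.push("seal");
    cb();
  }
};

const compile = callback => {
  hooks.beforeCompile.callAsync({}, err => {
    if (err) return callback(err);
    log.push("beforeCompile done");
    hooks.make.callAsync(compilation, err => {
      if (err) return callback(err);
      compilation.seal(err => {
        if (err) return callback(err);
        hooks.afterCompile.callAsync(compilation, err => {
          log.push("afterCompile done");
          callback(err || null, compilation);
        });
      });
    });
  });
};

compile(() => console.log(log.join(" -> ")));
```

seal() only starts once every tap on make has called back, which is why the module graph is complete by the time chunks are created.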


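1.2 Overview of the seal phase

  • create chunks: iterate over this.entries and build the Chunks, including Chunks formed from entry files, Chunks formed from async dependencies, and so on
  • optimize: optimize the resulting Chunks; this involves the SplitChunksPlugin plugin
  • code generation: generate the final code from those Chunks, including the runtime and the code of each module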
seal(callback) {
    const chunkGraph = new ChunkGraph(
        this.moduleGraph,
        this.outputOptions.hashFunction
    );
    this.chunkGraph = chunkGraph;
    //...

    this.logger.time("create chunks");
    /** @type {Map<Entrypoint, Module[]>} */
    const chunkGraphInit = new Map();
    for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
        const chunk = this.addChunk(name);
        const entrypoint = new Entrypoint(options);
        //...
    }
    //...
    buildChunkGraph(this, chunkGraphInit);
    this.logger.timeEnd("create chunks");

    this.logger.time("optimize");
    //...
    while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {
        /* empty */
    }
    //...
    this.logger.timeEnd("optimize");

    this.logger.time("code generation");
    this.codeGeneration(err => {
        //...
        this.logger.timeEnd("code generation");
    });
}
const buildChunkGraph = (compilation, inputEntrypointsAndModules) => {
    // PART ONE
    logger.time("visitModules");
    visitModules(...);
    logger.timeEnd("visitModules");

    // PART TWO
    logger.time("connectChunkGroups");
    connectChunkGroups(...);
    logger.timeEnd("connectChunkGroups");

    for (const [chunkGroup, chunkGroupInfo] of chunkGroupInfoMap) {
        for (const chunk of chunkGroup.chunks)
            chunk.runtime = mergeRuntime(chunk.runtime, chunkGroupInfo.runtime);
    }

    // Cleanup work
    logger.time("cleanup");
    cleanupUnconnectedGroups(compilation, allCreatedChunkGroups);
    logger.timeEnd("cleanup");
};

1.3 Overall flowchart of the seal phase

[Figure: overall flowchart of the seal phase]

1.4 Key concepts

Dependency & Module

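A single file is first turned into a Dependency. There are different Dependency types, for example EntryDependency.

Different Dependency types can use different ModuleFactory implementations to perform the Dependency -> NormalModule conversion.

The NormalModule built from a file carries much more than the original source code: the loaders used, its dependencies, its exports, and so on.

[Figure: Dependency and Module, from "An in-depth perspective on webpack's bundling process"]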

Chunk & ChunkGroup & EntryPoint

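A Chunk wraps one or more Modules.

A ChunkGroup consists of one or more Chunks. A ChunkGroup can be the parent or child of another ChunkGroup.

An EntryPoint is an entry-type ChunkGroup and contains the entry Chunk.

[Figure: Chunk, ChunkGroup, and EntryPoint, from "An in-depth perspective on webpack's bundling process"]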
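
The relationships just described can be sketched with a few toy classes. The class names below carry a `Toy` prefix because they are reduced to the bare minimum and are not webpack's real classes. In particular, the toy `addChild()` updates both sides of the parent/child link at once, which is a simplification made for this illustration.

```javascript
// Toy classes; the field names mirror the concepts above, not webpack's full API.
class ToyChunk {
  constructor(name) {
    this.name = name;
    this.groups = new Set(); // the chunk groups this chunk belongs to
  }
}

class ToyChunkGroup {
  constructor(name) {
    this.name = name;
    this.chunks = [];
    this.parents = new Set();
    this.children = new Set();
  }
  // Returns true only when the chunk was not in the group yet.
  pushChunk(chunk) {
    if (this.chunks.includes(chunk)) return false;
    this.chunks.push(chunk);
    return true;
  }
  // Simplification: link parent and child in one call.
  addChild(group) {
    this.children.add(group);
    group.parents.add(this);
  }
}

// An entry point is just a chunk group that also remembers its entry chunk.
class ToyEntrypoint extends ToyChunkGroup {
  setEntrypointChunk(chunk) {
    this.entrypointChunk = chunk;
  }
}

const connectGroupAndChunk = (group, chunk) => {
  if (group.pushChunk(chunk)) chunk.groups.add(group);
};

const mainGroup = new ToyEntrypoint("main");
const mainChunk = new ToyChunk("main");
mainGroup.setEntrypointChunk(mainChunk);
connectGroupAndChunk(mainGroup, mainChunk);

// A group created for an async import becomes a child of the group that imports it.
const lazyGroup = new ToyChunkGroup("lazy");
const lazyChunk = new ToyChunk("lazy");
connectGroupAndChunk(lazyGroup, lazyChunk);
mainGroup.addChild(lazyGroup);

console.log([...lazyGroup.parents].map(g => g.name)); // names of the parent groups
```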

ChunkGraph

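It manages the relationships among module, chunk, and chunkGroup.

The class diagrams below do not list every property, only the ones I consider important. The two figures are meant to help understand what ChunkGraph does and how it manages these relationships, not to serve as a summary.

[Figure: two class diagrams illustrating ChunkGraph and the relationships it manages]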
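
To make the bookkeeping role of ChunkGraph more concrete, here is a minimal sketch, assuming only two lookup tables (one per module, one per chunk). `ToyChunkGraph` is a hypothetical reduction; the method names follow the ones quoted later in this article, but the real class tracks much more state.

```javascript
// Minimal sketch of the bookkeeping: two maps that always get updated together.
class ToyChunkGraph {
  constructor() {
    this._modules = new Map(); // module -> { chunks, entryInChunks }
    this._chunks = new Map(); // chunk -> { modules, entryModules }
  }
  _getChunkGraphModule(module) {
    let cgm = this._modules.get(module);
    if (cgm === undefined) {
      cgm = { chunks: new Set(), entryInChunks: undefined };
      this._modules.set(module, cgm);
    }
    return cgm;
  }
  _getChunkGraphChunk(chunk) {
    let cgc = this._chunks.get(chunk);
    if (cgc === undefined) {
      cgc = { modules: new Set(), entryModules: new Map() };
      this._chunks.set(chunk, cgc);
    }
    return cgc;
  }
  connectChunkAndModule(chunk, module) {
    this._getChunkGraphModule(module).chunks.add(chunk);
    this._getChunkGraphChunk(chunk).modules.add(module);
  }
  connectChunkAndEntryModule(chunk, module, entrypoint) {
    const cgm = this._getChunkGraphModule(module);
    if (cgm.entryInChunks === undefined) cgm.entryInChunks = new Set();
    cgm.entryInChunks.add(chunk);
    this._getChunkGraphChunk(chunk).entryModules.set(module, entrypoint);
  }
  isModuleInChunk(module, chunk) {
    return this._getChunkGraphChunk(chunk).modules.has(module);
  }
}

const toyGraph = new ToyChunkGraph();
const chunkA = { name: "A" };
const moduleOne = { request: "./1.js" };
const entrypointA = { name: "A" };

toyGraph.connectChunkAndEntryModule(chunkA, moduleOne, entrypointA);
console.log(toyGraph.isModuleInChunk(moduleOne, chunkA)); // entry registration alone does not add the module
toyGraph.connectChunkAndModule(chunkA, moduleOne);
console.log(toyGraph.isModuleInChunk(moduleOne, chunkA));
```

Note that registering a module as an entry module and adding it to the chunk's module set are two separate steps, which matches the two separate actions discussed in section 4.4.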

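2. Iterating over this.entries to create Chunks and ChunkGroups

  1. Initialize with new ChunkGraph()
  2. Iterate over this.entries; for each name, call addChunk() to create a new Chunk, and create the matching new Entrypoint(), which is a ChunkGroup
  3. Store a series of objects (namedChunkGroups, entrypoints, chunkGroups) for later use
  4. Connect the chunk and the ChunkGroup: connectChunkGroupAndChunk()
  5. Finally, iterate over the dependencies of each item in this.entries, because an entry Chunk may contain several files. For example, with entry: {A: ["1.js", "2.js"]}, Chunk A contains 1.js and 2.js, and the dependencies of that entry are 1.js and 2.js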
seal() {
    const chunkGraph = new ChunkGraph(
        this.moduleGraph,
        this.outputOptions.hashFunction
    );
    this.chunkGraph = chunkGraph;
    for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
        // 1. Get the chunk object
        const chunk = this.addChunk(name);
        // 2. Create an Entrypoint from options; an entrypoint is a chunkGroup object
        const entrypoint = new Entrypoint(options);
        // 3. Populate several Map/collection objects
        if (!options.dependOn && !options.runtime) {
            entrypoint.setRuntimeChunk(chunk); // used later when generating runtime code
        }
        entrypoint.setEntrypointChunk(chunk);
        this.namedChunkGroups.set(name, entrypoint);
        this.entrypoints.set(name, entrypoint);
        this.chunkGroups.push(entrypoint);
        // 4. Connect chunkGroup and chunk
        // const connectChunkGroupAndChunk = (chunkGroup, chunk) => {
        //     if (chunkGroup.pushChunk(chunk)) {
        //         chunk.addGroup(chunkGroup);
        //     }
        // };
        connectChunkGroupAndChunk(entrypoint, chunk);

        for (const dep of [...this.globalEntry.dependencies, ...dependencies]) {
            entrypoint.addOrigin(null, { name }, /** @type {any} */(dep).request);

            const module = this.moduleGraph.getModule(dep);
            if (module) {
                chunkGraph.connectChunkAndEntryModule(chunk, module, entrypoint);
                //...
            }
        }
    }
}

2.1 this.entries

What is this.entries?

In the analysis of hooks.make.tapAsync(), we saw that the entry file entry is passed in first, createDependency() builds an EntryDependency from it, and then compilation.addEntry() kicks off the make phase.

// node_modules/webpack/lib/EntryPlugin.js
apply(compiler) {
    const { entry, options, context } = this;
    const dep = EntryPlugin.createDependency(entry, options);
    compiler.hooks.make.tapAsync("EntryPlugin", (compilation, callback) => {
        compilation.addEntry(context, dep, options, err => {
            callback(err);
        });
    });
}
static createDependency(entry, options) {
  const dep = new EntryDependency(entry);
  // TODO webpack 6 remove string option
  dep.loc = { name: typeof options === "object" ? options.name : options };
  return dep;
}

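Inside addEntry():

  • create the entryData object
  • entryData[target].push(entry)
  • this.entries.set(name, entryData)

In other words, this.entries maps each entry name to its entryData, which holds the array of entry-type Dependency objects.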

// node_modules/webpack/lib/Compilation.js
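addEntry(context, entry, optionsOrName, callback) {
    // a plain string is treated as the entry name
    const options =
        typeof optionsOrName === "object" ? optionsOrName : { name: optionsOrName };
    this._addEntryItem(context, entry, "dependencies", options, callback);
}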
_addEntryItem(context, entry, target, options, callback) {
    const { name } = options;
    let entryData =
        name !== undefined ? this.entries.get(name) : this.globalEntry;
    if (entryData === undefined) {
        entryData = {
            dependencies: [],
            includeDependencies: [],
            options: {
                name: undefined,
                ...options
            }
        };
        entryData[target].push(entry);
        this.entries.set(name, entryData);
    } else {
        entryData[target].push(entry);
        //...
    }
    //...
    this.addModuleTree();
}
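
As a concrete illustration of how this.entries fills up, the sketch below replays the same bookkeeping for entry: {A: ["./1.js", "./2.js"], B: "./3.js"}. It assumes addEntry() is called once per file of an entry; `addEntryItem`, `entries`, and the plain-object dependencies are hypothetical stand-ins for the real method and classes.

```javascript
// Stand-ins for compilation.entries and compilation.globalEntry.
const entries = new Map();
const globalEntry = {
  dependencies: [],
  includeDependencies: [],
  options: { name: undefined }
};

// Same branching as _addEntryItem above, without the module building part.
function addEntryItem(dep, target, options) {
  const { name } = options;
  let entryData = name !== undefined ? entries.get(name) : globalEntry;
  if (entryData === undefined) {
    entryData = {
      dependencies: [],
      includeDependencies: [],
      options: { name: undefined, ...options }
    };
    entryData[target].push(dep);
    entries.set(name, entryData);
  } else {
    entryData[target].push(dep);
  }
}

// Assumption: one call per file of the entry.
for (const request of ["./1.js", "./2.js"]) {
  addEntryItem({ type: "entry", request }, "dependencies", { name: "A" });
}
addEntryItem({ type: "entry", request: "./3.js" }, "dependencies", { name: "B" });

for (const [name, { dependencies }] of entries) {
  console.log(name, dependencies.map(d => d.request));
}
```

The second call for "A" takes the else branch and appends to the existing entryData, so one entry name ends up with several dependencies.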

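Back to the seal phase: iterating over this.entries at the start is really iterating over the entry files, where name is the entry's name and dependencies holds its EntryDependency objects. In summary:

[Figure: structure of this.entries]

During the iteration, for each entry we call addChunk() to build a Chunk object and new Entrypoint() to build a ChunkGroup object, then use connectChunkGroupAndChunk() to connect the ChunkGroup and the Chunk.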

seal() {
    const chunkGraph = new ChunkGraph(
        this.moduleGraph,
        this.outputOptions.hashFunction
    );
    this.chunkGraph = chunkGraph;
    for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
        // 1. Get the chunk object
        const chunk = this.addChunk(name);
        // 2. Create an Entrypoint from options; an entrypoint is a chunkGroup object
        const entrypoint = new Entrypoint(options);
        // 3. Populate several Map/collection objects
        if (!options.dependOn && !options.runtime) {
            entrypoint.setRuntimeChunk(chunk); // used later when generating runtime code
        }
        entrypoint.setEntrypointChunk(chunk);
        this.namedChunkGroups.set(name, entrypoint);
        this.entrypoints.set(name, entrypoint);
        this.chunkGroups.push(entrypoint);
        // 4. Connect chunkGroup and chunk
        // const connectChunkGroupAndChunk = (chunkGroup, chunk) => {
        //     if (chunkGroup.pushChunk(chunk)) {
        //         chunk.addGroup(chunkGroup);
        //     }
        // };
        connectChunkGroupAndChunk(entrypoint, chunk);
        //...
    }
}
addChunk(name) {
    // If a chunk with this name already exists in namedChunks, return it
    if (name) {
        const chunk = this.namedChunks.get(name);
        if (chunk !== undefined) {
            return chunk;
        }
    }
    // Create a new Chunk instance
    const chunk = new Chunk(name, this._backCompat);
    this.chunks.add(chunk);
    if (this._backCompat)
        // Register in the chunk -> ChunkGraph map
        ChunkGraph.setChunkGraphForChunk(chunk, this.chunkGraph);
    if (name) {
        // Register in the namedChunks Map
        this.namedChunks.set(name, chunk);
    }
    return chunk;
}
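
A small sketch of the lookup-or-create behavior of addChunk(): asking twice for the same name yields the same Chunk, while unnamed chunks are always new. `ToyCompilation` is a hypothetical reduction that keeps only the two collections involved.

```javascript
// Keeps only `chunks` and `namedChunks`; chunks are plain objects here.
class ToyCompilation {
  constructor() {
    this.chunks = new Set();
    this.namedChunks = new Map();
  }
  addChunk(name) {
    if (name) {
      const existing = this.namedChunks.get(name);
      if (existing !== undefined) return existing; // reuse by name
    }
    const chunk = { name };
    this.chunks.add(chunk);
    if (name) this.namedChunks.set(name, chunk);
    return chunk;
  }
}

const toyCompilation = new ToyCompilation();
const first = toyCompilation.addChunk("A");
const second = toyCompilation.addChunk("A");
const anonymousOne = toyCompilation.addChunk();
const anonymousTwo = toyCompilation.addChunk();

console.log(first === second); // same named chunk is reused
console.log(anonymousOne === anonymousTwo); // unnamed chunks are distinct
console.log(toyCompilation.chunks.size);
```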

2.2 this.entries.dependencies

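For example, with entry: {A: ["1.js", "2.js"]}, Chunk A contains 1.js and 2.js, and the dependencies of that entry are 1.js and 2.js.

  1. Use dep to get the corresponding NormalModule, i.e. look up the Module object for a dependency
  2. Use chunkGraph.connectChunkAndEntryModule() to connect chunk, module, and chunkGroup
  3. assignDepths() walks all dependencies of the entry modules and assigns each module a depth mark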
seal() {
    const chunkGraph = new ChunkGraph(
        this.moduleGraph,
        this.outputOptions.hashFunction
    );
    this.chunkGraph = chunkGraph;
    for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
        // Each entry gets its own new Chunk() and new ChunkGroup()
        // Connect chunkGroup and chunk

    // Connect chunk, module, and chunkGroup
    const entryModules = new Set();
    for (const dep of [...this.globalEntry.dependencies, ...dependencies]) {
        entrypoint.addOrigin(null, { name }, /** @type {any} */(dep).request);

        const module = this.moduleGraph.getModule(dep);
        if (module) {
            // const cgm = this._getChunkGraphModule(module);
            // const cgc = this._getChunkGraphChunk(chunk);
            // if (cgm.entryInChunks === undefined) {
            //     cgm.entryInChunks = new Set();
            // }
            // cgm.entryInChunks.add(chunk);
            // cgc.entryModules.set(module, entrypoint);
            chunkGraph.connectChunkAndEntryModule(chunk, module, entrypoint);
            entryModules.add(module);
            const modulesList = chunkGraphInit.get(entrypoint);
            if (modulesList === undefined) {
                chunkGraphInit.set(entrypoint, [module]);
            } else {
                modulesList.push(module);
            }
        }
    }

    // Assign a depth mark to each module
    this.assignDepths(entryModules);
    }
}
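
The idea behind the depth mark can be sketched as a breadth-first walk from the entry modules. This is a simplified illustration under the assumption that depth means "fewest import hops from any entry module"; the `graph` adjacency map and the function body are made up for the example and are not webpack's assignDepths() implementation.

```javascript
// Hypothetical module graph: module name -> modules it imports.
const graph = new Map([
  ["1.js", ["a.js", "b.js"]],
  ["2.js", ["b.js"]],
  ["a.js", ["c.js"]],
  ["b.js", ["c.js"]],
  ["c.js", []]
]);

// Breadth-first traversal: entry modules get depth 0, their imports depth 1, and so on.
function assignDepths(entryModules) {
  const depths = new Map();
  let frontier = [...entryModules];
  for (const m of frontier) depths.set(m, 0);
  let depth = 0;
  while (frontier.length > 0) {
    depth++;
    const next = [];
    for (const m of frontier) {
      for (const dep of graph.get(m) || []) {
        if (!depths.has(dep)) {
          depths.set(dep, depth);
          next.push(dep);
        }
      }
    }
    frontier = next;
  }
  return depths;
}

const depths = assignDepths(new Set(["1.js", "2.js"]));
console.log([...depths]);
```

With both 1.js and 2.js as entry modules of the same entry, a.js and b.js end up at depth 1 and c.js at depth 2.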

3. buildChunkGraph overview

As the code below shows, buildChunkGraph() has three main parts:

  • visitModules()
  • connectChunkGroups()
  • cleanupUnconnectedGroups()

Each part is fairly complex, so each is analyzed in detail below.

seal(callback) {
    const chunkGraph = new ChunkGraph(
        this.moduleGraph,
        this.outputOptions.hashFunction
    );
    this.chunkGraph = chunkGraph;
    //...

    this.logger.time("create chunks");
    /** @type {Map<Entrypoint, Module[]>} */
    const chunkGraphInit = new Map();
    for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
        const chunk = this.addChunk(name);
        const entrypoint = new Entrypoint(options);
        //...
    }
    //...
    buildChunkGraph(this, chunkGraphInit);
    //...
}
const buildChunkGraph = (compilation, inputEntrypointsAndModules) => {
    // PART ONE
    logger.time("visitModules");
    visitModules(...);
    logger.timeEnd("visitModules");

    // PART TWO
    logger.time("connectChunkGroups");
    connectChunkGroups(...);
    logger.timeEnd("connectChunkGroups");

    for (const [chunkGroup, chunkGroupInfo] of chunkGroupInfoMap) {
        for (const chunk of chunkGroup.chunks)
            chunk.runtime = mergeRuntime(chunk.runtime, chunkGroupInfo.runtime);
    }

    // Cleanup work
    logger.time("cleanup");
    cleanupUnconnectedGroups(compilation, allCreatedChunkGroups);
    logger.timeEnd("cleanup");
};

4.buildChunkGraph-1-visitModules

As the code block below shows, visitModules has three main parts:

  • Iterate over inputEntrypointsAndModules and initialize chunkGroupInfo
  • Iterate over chunkGroupsForCombining: handle chunkGroups that have a parent chunkGroup by linking the two chunkGroupInfo objects to each other
  • Process the queue data: two queues, processed in a loop until both are empty
const visitModules = (/* ...arguments omitted */) => {
    for (const [chunkGroup, modules] of inputEntrypointsAndModules) {
      // iterate over inputEntrypointsAndModules and initialize chunkGroupInfo
    }
    for (const chunkGroupInfo of chunkGroupsForCombining) {
      // handle chunkGroups that have a parent chunkGroup: link the two chunkGroupInfo objects to each other
    }
    while (queue.length || queueConnect.size) {
      processQueue(); // inner loop
      if (chunkGroupsForCombining.size > 0) {
        processChunkGroupsForCombining();
      }
      if (queueConnect.size > 0) {
        processConnectQueue();
        if (chunkGroupsForMerging.size > 0) {
          processChunkGroupsForMerging();
        }
      }
      if (outdatedChunkGroupInfo.size > 0) {
        processOutdatedChunkGroupInfo();
      }
    }
}

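4.1 visitModules flowchart

[Figure: flowchart of visitModules]

4.2 Iterating over inputEntrypointsAndModules and initializing chunkGroupInfo

As analyzed in section 2.2 above and shown in the code below, the chunkGraphInit structure is initialized with entrypoint as the key, and every Module belonging to that entry is appended to its array.

For example, with entry: {A: ["1.js", "2.js"]}, Chunk A contains 1.js and 2.js, the entry's dependencies are 1.js and 2.js, and the array that chunkGraphInit keeps for that entrypoint contains the modules for 1.js and 2.js.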

// node_modules/webpack/lib/Compilation.js
for (const [name, { dependencies, includeDependencies, options }] of this
    .entries) {
    const chunk = this.addChunk(name);
    if (options.filename) {
        chunk.filenameTemplate = options.filename;
    }
    const entrypoint = new Entrypoint(options);

    //...
    for (const dep of [...this.globalEntry.dependencies, ...dependencies]) {
        entrypoint.addOrigin(null, { name }, /** @type {any} */(dep).request);

        const module = this.moduleGraph.getModule(dep);
        if (module) {
            chunkGraph.connectChunkAndEntryModule(chunk, module, entrypoint);
            entryModules.add(module);
            const modulesList = chunkGraphInit.get(entrypoint);
            if (modulesList === undefined) {
                chunkGraphInit.set(entrypoint, [module]);
            } else {
                modulesList.push(module);
            }
        }
    }
    //...
}

As the code below shows, we iterate over inputEntrypointsAndModules, collect the NormalModules of every entry file, and push them all onto queue.

Before pushing, we check whether the entry's chunkGroup has a parent. If it does, its chunkGroupInfo goes into chunkGroupsForCombining instead of queue.

// Trimmed code, keeping only what we analyze
// inputEntrypointsAndModules = { Entrypoint: [NormalModule] }
// Since Entrypoint extends ChunkGroup,
// inputEntrypointsAndModules = { ChunkGroup: [NormalModule] }
for (const [chunkGroup, modules] of inputEntrypointsAndModules) {
    const runtime = getEntryRuntime(
        compilation,
        chunkGroup.name,
        chunkGroup.options
    );
   
    // Create the chunkGroupInfo for this entry
    const chunkGroupInfo = {
        chunkGroup,
        runtime,
        minAvailableModules: undefined, // minimal set of modules known to be available
        minAvailableModulesOwned: false,
        availableModulesToBeMerged: [],
        skippedItems: undefined,
        resultingAvailableModules: undefined,
        children: undefined,
        availableSources: undefined,
        availableChildren: undefined
    };
    if (chunkGroup.getNumberOfParents() > 0) {
        // If the chunkGroup has a parent chunkGroup, the parent may already reference it elsewhere, so it is handled separately
        chunkGroupsForCombining.add(chunkGroupInfo);
    } else {
        chunkGroupInfo.minAvailableModules = EMPTY_SET;
        const chunk = chunkGroup.getEntrypointChunk();
        for (const module of modules) {
            queue.push({
                action: ADD_AND_ENTER_MODULE,
                block: module,
                module,
                chunk,
                chunkGroup,
                chunkGroupInfo
            });
        }
    }
    chunkGroupInfoMap.set(chunkGroup, chunkGroupInfo);
    if (chunkGroup.name) {
        namedChunkGroups.set(chunkGroup.name, chunkGroupInfo);
    }
}

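4.3 Checking chunkGroupsForCombining: handling an EntryPoint that has a parent chunkGroup

Iterate over chunkGroupsForCombining and link the two chunkGroupInfo objects to each other. In essence, the child's availableSources receives the parent's chunkGroupInfo, and the parent's availableChildren receives the child's.

// Handle chunkGroups that have a parent chunkGroup: link the two chunkGroupInfo objects to each other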
for (const chunkGroupInfo of chunkGroupsForCombining) {
    const { chunkGroup } = chunkGroupInfo;
    chunkGroupInfo.availableSources = new Set();
    for (const parent of chunkGroup.parentsIterable) {
        const parentChunkGroupInfo = chunkGroupInfoMap.get(parent);
        chunkGroupInfo.availableSources.add(parentChunkGroupInfo);
        if (parentChunkGroupInfo.availableChildren === undefined) {
            parentChunkGroupInfo.availableChildren = new Set();
        }
        parentChunkGroupInfo.availableChildren.add(chunkGroupInfo);
    }
}
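
The two-way link is easier to see on a tiny data set. In the sketch below, `makeInfo`, `infoOf`, and the plain-object chunk groups are hypothetical stand-ins; the scenario (an entry whose chunkGroup already has a parent, which I believe is the case for an entry declared with dependOn) is an assumption used only to have something to link.

```javascript
// chunkGroup -> chunkGroupInfo, standing in for chunkGroupInfoMap.
const infoOf = new Map();
const makeInfo = chunkGroup => {
  const info = { chunkGroup, availableSources: undefined, availableChildren: undefined };
  infoOf.set(chunkGroup, info);
  return info;
};

// Two entry chunk groups; "page" already has "shared" as its parent.
const sharedGroup = { name: "shared", parents: [] };
const pageGroup = { name: "page", parents: [sharedGroup] };

const sharedInfo = makeInfo(sharedGroup);
const pageInfo = makeInfo(pageGroup);

// Only groups with at least one parent are collected for combining.
const chunkGroupsForCombining = new Set([pageInfo]);

for (const info of chunkGroupsForCombining) {
  info.availableSources = new Set();
  for (const parent of info.chunkGroup.parents) {
    const parentInfo = infoOf.get(parent);
    info.availableSources.add(parentInfo); // child remembers where modules can come from
    if (parentInfo.availableChildren === undefined) {
      parentInfo.availableChildren = new Set();
    }
    parentInfo.availableChildren.add(info); // parent remembers who depends on it
  }
}

console.log([...pageInfo.availableSources].map(i => i.chunkGroup.name));
console.log([...sharedInfo.availableChildren].map(i => i.chunkGroup.name));
```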

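4.4 processQueue: processing the queue

After all entry modules are pushed onto queue with the initial action ADD_AND_ENTER_MODULE, each item's action changes as it is processed and a different handler runs for each action. As processQueue() below shows, several of the cases have no unconditional break, so execution falls through in the order ADD_AND_ENTER_ENTRY_MODULE->ADD_AND_ENTER_MODULE->ENTER_MODULE->PROCESS_BLOCK

for (const [chunkGroup, modules] of inputEntrypointsAndModules) {
    // Create the chunkGroupInfo for this entry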
    const chunkGroupInfo = {
        chunkGroup,
        runtime,
        //...
    };
    chunkGroupInfo.minAvailableModules = EMPTY_SET;
    const chunk = chunkGroup.getEntrypointChunk();
    for (const module of modules) {
        queue.push({
            action: ADD_AND_ENTER_MODULE,
            block: module,
            module,
            chunk,
            chunkGroup,
            chunkGroupInfo
        });
    }
}
// Items are taken from queue with pop(), so reverse the array to preserve the visiting order
queue.reverse();

const processQueue = () => {
    while (queue.length) {
        statProcessedQueueItems++;
        const queueItem = queue.pop();
        module = queueItem.module;
        block = queueItem.block;
        chunk = queueItem.chunk;
        chunkGroup = queueItem.chunkGroup;
        chunkGroupInfo = queueItem.chunkGroupInfo;
        switch (queueItem.action) {
            case ADD_AND_ENTER_ENTRY_MODULE:
            //...
            case ADD_AND_ENTER_MODULE:
            //...
            case ENTER_MODULE:
            //...
            case PROCESS_BLOCK: {
                processBlock(block);
                break;
            }
            case PROCESS_ENTRY_BLOCK: {
                processEntryBlock(block);
                break;
            }
            case LEAVE_MODULE:
            //...
        }
    }
}

The sections below follow the order ADD_AND_ENTER_ENTRY_MODULE->ADD_AND_ENTER_MODULE->ENTER_MODULE->PROCESS_BLOCK

4.4.1 ADD_AND_ENTER_ENTRY_MODULE

Take the current entry module (entryModule) and connect chunk, module, and chunkGroup.

switch (queueItem.action) {
    case ADD_AND_ENTER_ENTRY_MODULE:
        chunkGraph.connectChunkAndEntryModule(
            chunk,
            module,
            /** @type {Entrypoint} */(chunkGroup)
        );
}
// node_modules/webpack/lib/ChunkGraph.js
connectChunkAndEntryModule(chunk, module, entrypoint) {
  const cgm = this._getChunkGraphModule(module);
  const cgc = this._getChunkGraphChunk(chunk);
  if (cgm.entryInChunks === undefined) {
    cgm.entryInChunks = new Set();
  }
  cgm.entryInChunks.add(chunk);
  cgc.entryModules.set(module, entrypoint);
}

4.4.2 ADD_AND_ENTER_MODULE

Connect chunk and module to each other.

switch (queueItem.action) {
    case ADD_AND_ENTER_ENTRY_MODULE:
        chunkGraph.connectChunkAndEntryModule(
            chunk,
            module,
            /** @type {Entrypoint} */(chunkGroup)
        );
    // fallthrough
    case ADD_AND_ENTER_MODULE: {
        if (chunkGraph.isModuleInChunk(module, chunk)) {
            // already connected, skip it
            break;
        }
        // We connect Module and Chunk
        chunkGraph.connectChunkAndModule(chunk, module);
    }
}
// node_modules/webpack/lib/ChunkGraph.js
connectChunkAndModule(chunk, module) {
    const cgm = this._getChunkGraphModule(module);
    const cgc = this._getChunkGraphChunk(chunk);
    cgm.chunks.add(chunk);
    cgc.modules.add(module);
}
isModuleInChunk(module, chunk) {
    const cgc = this._getChunkGraphChunk(chunk);
    return cgc.modules.has(module);
}

4.4.3 ENTER_MODULE

switch (queueItem.action) {
    case ADD_AND_ENTER_ENTRY_MODULE:
        chunkGraph.connectChunkAndEntryModule(
            chunk,
            module,
            /** @type {Entrypoint} */(chunkGroup)
        );
    // fallthrough
    case ADD_AND_ENTER_MODULE: {
        if (chunkGraph.isModuleInChunk(module, chunk)) {
            // already connected, skip it
            break;
        }
        // We connect Module and Chunk
        chunkGraph.connectChunkAndModule(chunk, module);
    }
    case ENTER_MODULE: {
        const index = chunkGroup.getModulePreOrderIndex(module);
        // ...logic for setting the index omitted
        queueItem.action = LEAVE_MODULE;
        queue.push(queueItem);
    }
}
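
Re-pushing the same queue item with the action LEAVE_MODULE is what lets one stack produce both a pre-order and a post-order index. The sketch below is a toy version of that pattern on a hypothetical four-module graph: the action names match the ones above, but the `deps` table, the `inChunk` set, and the index counters are simplifications invented for this example.

```javascript
const ADD_AND_ENTER_MODULE = 0;
const ENTER_MODULE = 1;
const PROCESS_BLOCK = 2;
const LEAVE_MODULE = 3;

// Hypothetical synchronous imports of each module.
const deps = {
  "index.js": ["a.js", "b.js"],
  "a.js": ["c.js"],
  "b.js": ["c.js"],
  "c.js": []
};

const inChunk = new Set(); // stands in for chunkGraph.isModuleInChunk()
const preOrder = new Map();
const postOrder = new Map();
let nextPre = 0;
let nextPost = 0;

const queue = [{ action: ADD_AND_ENTER_MODULE, module: "index.js" }];
while (queue.length) {
  const item = queue.pop();
  const { module } = item;
  switch (item.action) {
    case ADD_AND_ENTER_MODULE:
      if (inChunk.has(module)) break; // already connected, skip it
      inChunk.add(module);
    // fallthrough
    case ENTER_MODULE:
      preOrder.set(module, nextPre++);
      // Reuse the item: it is visited again after all of its children.
      item.action = LEAVE_MODULE;
      queue.push(item);
    // fallthrough
    case PROCESS_BLOCK: {
      const children = deps[module];
      // Push in reverse so that pop() visits children in source order.
      for (let i = children.length - 1; i >= 0; i--) {
        if (!inChunk.has(children[i])) {
          queue.push({ action: ADD_AND_ENTER_MODULE, module: children[i] });
        }
      }
      break;
    }
    case LEAVE_MODULE:
      postOrder.set(module, nextPost++);
      break;
  }
}

console.log([...preOrder.keys()].join(", ")); // order in which modules are entered
console.log([...postOrder.keys()].join(", ")); // order in which modules are left
```

A module is entered before its imports and left after them, so index.js is the first module entered and the last one left.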

4.4.4 PROCESS_BLOCK

Following ADD_AND_ENTER_ENTRY_MODULE->ADD_AND_ENTER_MODULE->ENTER_MODULE->PROCESS_BLOCK, processBlock() is now executed.

const processQueue = () => {
    while (queue.length) {
        statProcessedQueueItems++;
        const queueItem = queue.pop();
        module = queueItem.module;
        block = queueItem.block;
        chunk = queueItem.chunk;
        chunkGroup = queueItem.chunkGroup;
        chunkGroupInfo = queueItem.chunkGroupInfo;
        switch (queueItem.action) {
            case ADD_AND_ENTER_ENTRY_MODULE:
            //...
            case ADD_AND_ENTER_MODULE:
            //...
            case ENTER_MODULE:
            //...
            case PROCESS_BLOCK: {
                processBlock(block);
                break;
            }
            case PROCESS_ENTRY_BLOCK: {
                processEntryBlock(block);
                break;
            }
            case LEAVE_MODULE:
            //...
        }
    }
}

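processBlock() first calls getBlockModules().

For a synchronous dependency, block is the module itself (block = module); for an async dependency, a different argument is passed.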

const processBlock = block => {
    const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
}

getBlockModules() {
    //...initialization of blockModules and blockModulesMap omitted
    extractBlockModules(module, moduleGraph, runtime, blockModulesMap);
    blockModules = blockModulesMap.get(block);
    return blockModules;
}
const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
    //...many conditional checks omitted
    for (const connection of moduleGraph.getOutgoingConnections(module)) {
        const m = connection.module;
        const i = index << 2;
        modules[i] = m;
        modules[i + 1] = state;
    }
    //...handling of empty modules[t] slots omitted
    // the end result is an array of every module imported by `module`, each paired with its state
}

moduleGraph.getOutgoingConnections() should look familiar: we already met it in the make phase.

// node_modules/webpack/lib/ModuleGraph.js
getOutgoingConnections(module) {
    const connections = this._getModuleGraphModule(module).outgoingConnections;
    return connections === undefined ? EMPTY_SET : connections;
}

In the make phase, after addModule() runs, moduleGraph.setResolvedModule() is called. It involves the variables originModule, dependency, and module.

// node_modules/webpack/lib/Compilation.js
const unsafeCacheableModule =
	/** @type {Module & { restoreFromUnsafeCache: Function }} */ (
  module
);
for (let i = 0; i < dependencies.length; i++) {
  const dependency = dependencies[i];
  moduleGraph.setResolvedModule(
    connectOrigin ? originModule : null,
    dependency,
    unsafeCacheableModule
  );
  unsafeCacheDependencies.set(dependency, unsafeCacheableModule);
}
// node_modules/webpack/lib/ModuleGraph.js
setResolvedModule(originModule, dependency, module) {
    const connection = new ModuleGraphConnection(
        originModule,
        dependency,
        module,
        undefined,
        dependency.weak,
        dependency.getCondition(this)
    );
    const connections = this._getModuleGraphModule(module).incomingConnections;
    connections.add(connection);
    if (originModule) {
        const mgm = this._getModuleGraphModule(originModule);
        if (mgm._unassignedConnections === undefined) {
            mgm._unassignedConnections = [];
        }
        mgm._unassignedConnections.push(connection);
        if (mgm.outgoingConnections === undefined) {
            mgm.outgoingConnections = new SortableSet();
        }
        mgm.outgoingConnections.add(connection);
    } else {
        this._dependencyMap.set(dependency, connection);
    }
}
  • originModule: the parent Module, e.g. index.js in the example below
  • dependency: a dependency of the parent Module. In the example below, the import of "./item/index_item-parent1.js" produces 4 dependency objects in originModule
// index.js
import {getC1} from "./item/index_item-parent1.js";
var test = _.add(6, 4) + getC1(1, 3);
var test1 = _.add(6, 4) + getC1(1, 3);
var test2 =  getC1(4, 5);


sortedDependencies[0] = {
    dependencies: [
        { // HarmonyImportSideEffectDependency
            request: "./item/index_item-parent1.js",
            userRequest: "./item/index_item-parent1.js"
        },
        { // HarmonyImportSpecifierDependency
            name: "getC1",
            request: "./item/index_item-parent1.js",
            userRequest: "./item/index_item-parent1.js"
        }
        //...
    ],
    originModule: {
        userRequest: "/Users/wcbbcc/blog/Frontend-Articles/webpack-debugger/js/src/index.js",
        dependencies: [
            //...10 dependencies, including the two Dependency objects above
        ]
    }
}
  • module: in the make phase, the dependency objects go through handleModuleCreation(), which triggers NormalModuleFactory.create(). It takes the first item, dependencies[0], which is the HarmonyImportSideEffectDependency in the example above (i.e. import {getC1} from "./item/index_item-parent1.js"), and turns it into a module
// node_modules/webpack/lib/NormalModuleFactory.js
create(data, callback) {
    const dependencies = /** @type {ModuleDependency[]} */ (data.dependencies);
    const dependency = dependencies[0];
    const request = dependency.request;
    const dependencyType =
        (dependencies.length > 0 && dependencies[0].category) || "";

    const resolveData = {
        request,
        dependencies,
        dependencyType
    };
    // use resolveData for a series of resolve() and buildModule() operations...
}
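
The connection bookkeeping can be tried out in isolation. Below is a minimal sketch, assuming plain objects for modules and dependencies; `ToyModuleGraph` keeps only the incoming/outgoing sets from setResolvedModule() and drops everything else (weak flags, conditions, _unassignedConnections).

```javascript
class ToyModuleGraph {
  constructor() {
    this._map = new Map(); // module -> { incomingConnections, outgoingConnections }
  }
  _getModuleGraphModule(module) {
    let mgm = this._map.get(module);
    if (mgm === undefined) {
      mgm = { incomingConnections: new Set(), outgoingConnections: undefined };
      this._map.set(module, mgm);
    }
    return mgm;
  }
  // One connection per dependency, registered on both ends.
  setResolvedModule(originModule, dependency, module) {
    const connection = { originModule, dependency, module };
    this._getModuleGraphModule(module).incomingConnections.add(connection);
    if (originModule) {
      const mgm = this._getModuleGraphModule(originModule);
      if (mgm.outgoingConnections === undefined) {
        mgm.outgoingConnections = new Set();
      }
      mgm.outgoingConnections.add(connection);
    }
  }
  getOutgoingConnections(module) {
    const connections = this._getModuleGraphModule(module).outgoingConnections;
    return connections === undefined ? new Set() : connections;
  }
}

const moduleGraph = new ToyModuleGraph();
const indexModule = { id: "index.js" };
const parent1Module = { id: "./item/index_item-parent1.js" };

// Two dependencies of index.js that resolve to the same module.
moduleGraph.setResolvedModule(indexModule, { type: "HarmonyImportSideEffectDependency" }, parent1Module);
moduleGraph.setResolvedModule(indexModule, { type: "HarmonyImportSpecifierDependency", name: "getC1" }, parent1Module);

const targets = [...moduleGraph.getOutgoingConnections(indexModule)].map(c => c.module.id);
console.log(targets.length, new Set(targets).size);
```

Several connections can point at the same target module, which is why a consumer of getOutgoingConnections() has to deduplicate modules when it only cares about "which modules does this one import".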

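Back to processBlock(): we now know that the connection.module values are all of the current module's dependencies.

One thing to keep in mind: the current module's synchronous dependencies go into the arr of blockModulesMap.set(block, arr), where block is the current module. Its async dependencies get a separate arr, i.e. in that blockModulesMap.set(block, arr) call the block is the current module's async dependency block.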

const processBlock = block => {
    const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
}

getBlockModules() {
    //...initialization of blockModules and blockModulesMap omitted
    extractBlockModules(module, moduleGraph, runtime, blockModulesMap);
    blockModules = blockModulesMap.get(block);
    return blockModules;
}
const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
    const queue = [module];
    while (queue.length > 0) {
        const block = queue.pop();
        const arr = [];
        arrays.push(arr);
        blockModulesMap.set(block, arr);
        for (const b of block.blocks) {
            queue.push(b);
        }
    }
    for (const connection of moduleGraph.getOutgoingConnections(module)) {
        const m = connection.module;
        const i = index << 2;
        modules[i] = m;
        modules[i + 1] = state;
    }
    //...dedup logic for modules omitted
    // the end result is an array of every module imported by `module`, each paired with its state
}

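In the end extractBlockModules() produces the dependency data blockModules, and getBlockModules() uses the current module to fetch all of its synchronous dependencies (an Array(14) in the author's debugging example).

[Figure: debugger view of the blockModules array]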
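
To illustrate the "one array per block" rule, the sketch below uses a module with two synchronous imports and one async block. It is a deliberately simplified model: each block carries a hypothetical `deps` list directly, whereas the real function derives the arrays from the module's outgoing connections, and every state is simply `true`.

```javascript
// An async block, such as the one created for an import() call, with its own dependency.
const asyncBlock = { kind: "AsyncDependenciesBlock", deps: ["lazy.js"], blocks: [] };
// The module itself is also a block; it owns its synchronous dependencies.
const entryModule = { kind: "NormalModule", deps: ["a.js", "b.js"], blocks: [asyncBlock] };

function extractBlockModules(module, blockModulesMap) {
  const queue = [module];
  while (queue.length > 0) {
    const block = queue.pop();
    // Flat layout: [module, state, module, state, ...]
    const arr = [];
    for (const dep of block.deps) arr.push(dep, true);
    blockModulesMap.set(block, arr);
    // Nested async blocks get their own arrays.
    for (const b of block.blocks) queue.push(b);
  }
}

const blockModulesMap = new Map();
extractBlockModules(entryModule, blockModulesMap);

console.log(blockModulesMap.get(entryModule)); // synchronous dependencies of the module
console.log(blockModulesMap.get(asyncBlock)); // dependencies reachable only through the async block
```

Looking up the module returns only its synchronous dependencies, while lazy.js is reachable only through the async block's own entry in the map.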

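processBlock() - handling synchronous dependencies

After the analysis above, once getBlockModules() returns all synchronous dependencies of the current block, we iterate over them.

For a synchronous dependency, block is the module itself; for an async dependency a different argument is passed. In the queueBuffer items below, block and module are the same object, refModule.

The handling falls into three cases:

  • If activeState is not true, the dependency is added to skipConnectionBuffer
  • If activeState is true but minAvailableModules or minAvailableModules.plus already contains the module (i.e. the parent chunks already contain it), it is added to skipBuffer
  • If it passes both checks, the module is added to queueBuffer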
const processBlock = (block, isSrc) => {
    const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
    for (let i = 0; i < blockModules.length; i += 2) {
        const refModule = /** @type {Module} */ (blockModules[i]);
        if (chunkGraph.isModuleInChunk(refModule, chunk)) {
            // skip early if already connected
            continue;
        }
        const activeState = /** @type {ConnectionState} */ (
            blockModules[i + 1]
        );
        if (activeState !== true) {
            skipConnectionBuffer.push([refModule, activeState]);
            if (activeState === false) continue;
        }
        if (
            activeState === true &&
            (minAvailableModules.has(refModule) ||
                minAvailableModules.plus.has(refModule))
        ) {
            // already in parent chunks, skip it for now
            skipBuffer.push(refModule);
            continue;
        }
        // enqueue, then add and enter to be in the correct order
        // this is relevant with circular dependencies
        queueBuffer.push({
            action: activeState === true ? ADD_AND_ENTER_MODULE : PROCESS_BLOCK,
            block: refModule,
            module: refModule,
            chunk,
            chunkGroup,
            chunkGroupInfo
        });
    }
    // process skipConnectionBuffer
    // process skipBuffer
    // process queueBuffer
}

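The three pieces of logic are clear-cut but scattered, so below each check is shown together with its follow-up handling.

If activeState is not true, the current synchronous dependency is added to skipConnectionBuffer and then stored in skippedModuleConnections of the current module's chunkGroupInfo.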

for (let i = 0; i < blockModules.length; i += 2) {
    const activeState = /** @type {ConnectionState} */ (
        blockModules[i + 1]
    );
    if (activeState !== true) {
        skipConnectionBuffer.push([refModule, activeState]);
        if (activeState === false) continue;
    }
}
if (skipConnectionBuffer.length > 0) {
    let { skippedModuleConnections } = chunkGroupInfo;
    if (skippedModuleConnections === undefined) {
        chunkGroupInfo.skippedModuleConnections = skippedModuleConnections =
            new Set();
    }
    for (let i = skipConnectionBuffer.length - 1; i >= 0; i--) {
        skippedModuleConnections.add(skipConnectionBuffer[i]);
    }
    skipConnectionBuffer.length = 0;
}

If activeState is true but minAvailableModules or minAvailableModules.plus already contains the module (i.e. the parent chunks already contain it), it is added to skipBuffer and then stored in skippedItems of the current module's chunkGroupInfo.

for (let i = 0; i < blockModules.length; i += 2) {
    const activeState = /** @type {ConnectionState} */ (
        blockModules[i + 1]
    );
    if (
        activeState === true &&
        (minAvailableModules.has(refModule) ||
            minAvailableModules.plus.has(refModule))
    ) {
        // already in parent chunks, skip it for now
        skipBuffer.push(refModule);
        continue;
    }
}
if (skipBuffer.length > 0) {
    let {skippedItems} = chunkGroupInfo;
    if (skippedItems === undefined) {
        chunkGroupInfo.skippedItems = skippedItems = new Set();
    }
    for (let i = skipBuffer.length - 1; i >= 0; i--) {
        skippedItems.add(skipBuffer[i]);
    }
    skipBuffer.length = 0;
}

If it passes both checks, the synchronous dependency is added to queueBuffer and then pushed onto queue, so the inner loop continues processing synchronous dependencies.

for (let i = 0; i < blockModules.length; i += 2) {
    const activeState = /** @type {ConnectionState} */ (
        blockModules[i + 1]
    );
    queueBuffer.push({
        action: activeState === true ? ADD_AND_ENTER_MODULE : PROCESS_BLOCK,
        block: refModule,
        module: refModule,
        chunk,
        chunkGroup,
        chunkGroupInfo
    });
}
if (queueBuffer.length > 0) {
    for (let i = queueBuffer.length - 1; i >= 0; i--) {
        queue.push(queueBuffer[i]);
    }
    queueBuffer.length = 0;
}
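
The three checks can be run on a small hand-made blockModules array to see where each dependency ends up. Everything in the sketch is illustrative: the module names, the `TRANSITIVE_ONLY` symbol standing in for a non-boolean connection state, the string action names, and the plain Set used for minAvailableModules (without the `plus` part).

```javascript
const TRANSITIVE_ONLY = Symbol("transitive only"); // toy stand-in for a non-boolean state

// Flat [module, state, module, state, ...] layout, as returned by getBlockModules().
const blockModules = [
  "a.js", true,
  "b.js", false,
  "shared.js", true,
  "c.js", TRANSITIVE_ONLY,
  "d.js", true
];
const minAvailableModules = new Set(["shared.js"]); // already provided by parent chunks

const skipConnectionBuffer = [];
const skipBuffer = [];
const queueBuffer = [];

for (let i = 0; i < blockModules.length; i += 2) {
  const refModule = blockModules[i];
  const activeState = blockModules[i + 1];
  if (activeState !== true) {
    skipConnectionBuffer.push([refModule, activeState]);
    if (activeState === false) continue;
  }
  if (activeState === true && minAvailableModules.has(refModule)) {
    skipBuffer.push(refModule);
    continue;
  }
  queueBuffer.push({
    action: activeState === true ? "ADD_AND_ENTER_MODULE" : "PROCESS_BLOCK",
    module: refModule
  });
}

// Push in reverse, so that pop() hands the items back in their original order.
const queue = [];
for (let i = queueBuffer.length - 1; i >= 0; i--) {
  queue.push(queueBuffer[i]);
}
const visitOrder = [];
while (queue.length) visitOrder.push(queue.pop().module);

console.log(skipConnectionBuffer.map(([m]) => m)); // connections that are not plainly active
console.log(skipBuffer); // modules already available from parents
console.log(visitOrder); // modules that will actually be processed, in order
```

Note that a state that is neither true nor false lands in skipConnectionBuffer and is still queued, but with PROCESS_BLOCK instead of ADD_AND_ENTER_MODULE, so the module is traversed without being added to the chunk.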

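processBlock() - handling async dependencies

Once the synchronous dependencies are done, iteratorBlock(b) is called to handle the current module's async dependencies. As the code below shows, there are three main cases:

  • Case 1: the async dependency (NormalModule) does not have a chunkGroup yet
    • Scenario 1: Entry type. Push onto queueDelayed with the action PROCESS_ENTRY_BLOCK and build a new Chunk
    • Scenario 2: asyncChunks=false or chunkLoading=false in webpack.config.js. Keep using the current Chunk, so the async dependency is bundled into the same file as the synchronous ones
    • Scenario 3: not Entry and async chunks allowed. Use addChunkInGroup() to create a new ChunkGroup and a new Chunk, producing a new file for this async dependency
  • Case 2: the async dependency (NormalModule) already has a chunkGroup, and it is Entry type
  • Case 3: the async dependency (NormalModule) already has a chunkGroup, and it is not Entry type

Finally, Entry and non-Entry types are handled separately.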

const processBlock = (block, isSrc) => {
  // ...handle synchronous dependencies
  for (const b of block.blocks) {
      iteratorBlock(b);
  }
}

const iteratorBlock = b => {
    let cgi = blockChunkGroups.get(b);
    let c;
    let entrypoint;
    const entryOptions = b.groupOptions && b.groupOptions.entryOptions;
    if (cgi === undefined) {
        // Case 1: this async NormalModule has no chunkGroup yet
        if (entryOptions) {
            // Scenario 1: Entry type
            queueDelayed.push({
                action: PROCESS_ENTRY_BLOCK,
                block: b,
                module: module,
                chunk: entrypoint.chunks[0],
                chunkGroup: entrypoint,
                chunkGroupInfo: cgi
            });
        } else if (!chunkGroupInfo.asyncChunks || !chunkGroupInfo.chunkLoading) {
            // Scenario 2: asyncChunks=false or chunkLoading=false in webpack.config.js
            queue.push({
                action: PROCESS_BLOCK,
                block: b,
                module: module,
                chunk,
                chunkGroup,
                chunkGroupInfo
            });
        } else {
            // Scenario 3: non-Entry and async chunks allowed
            c = compilation.addChunkInGroup(
                b.groupOptions || b.chunkName,
                module,
                b.loc,
                b.request
            );
            blockConnections.set(b, []);
        }
    } else if (entryOptions) {
        // Case 2: this async NormalModule already has a chunkGroup, and it is an entry type
        entrypoint = cgi.chunkGroup;
    } else {
        // Case 3: this async NormalModule already has a chunkGroup, and it is not an entry type
        c = cgi.chunkGroup;
    }

    if (c !== undefined) {
      // handle the non-Entry type
    } else if (entrypoint !== undefined) {
      // handle the Entry type
        chunkGroupInfo.chunkGroup.addAsyncEntrypoint(entrypoint);
    }
}
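
The branching in iteratorBlock() can be restated as a small decision function. This is only a sketch: classifyAsyncBlock and its flag names are hypothetical and not part of webpack. It just mirrors the three cases listed above.

```javascript
// Hypothetical helper, not a webpack API: it only names the branch that
// iteratorBlock() would take for a given combination of conditions.
function classifyAsyncBlock({ hasChunkGroup, isEntry, asyncChunks, chunkLoading }) {
    if (!hasChunkGroup) {
        if (isEntry) return "case 1 / scenario 1: queue PROCESS_ENTRY_BLOCK";
        if (!asyncChunks || !chunkLoading) {
            return "case 1 / scenario 2: stay in the current chunk";
        }
        return "case 1 / scenario 3: create a new ChunkGroup";
    }
    return isEntry
        ? "case 2: reuse the existing entrypoint"
        : "case 3: reuse the existing ChunkGroup";
}

// A plain import() seen for the first time, with async chunks enabled:
const firstImport = classifyAsyncBlock({
    hasChunkGroup: false,
    isEntry: false,
    asyncChunks: true,
    chunkLoading: "jsonp"
});
console.log(firstImport); // case 1 / scenario 3: create a new ChunkGroup
```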

Handling non-Entry blocks: building queueConnect

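When c !== undefined, the async dependency is not an Entry type. It is recorded in queueConnect, and the async dependency itself is pushed to queueDelayed to wait for the next round. Note that chunkGroup has now become c, and c may be a new ChunkGroup created for this async dependency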

if (c !== undefined) {
    blockConnections.get(b).push({
        originChunkGroupInfo: chunkGroupInfo,
        chunkGroup: c
    });

    let connectList = queueConnect.get(chunkGroupInfo);
    if (connectList === undefined) {
        connectList = new Set();
        queueConnect.set(chunkGroupInfo, connectList);
    }
    connectList.add(cgi);

    // TODO check if this really need to be done for each traversal
    // or if it is enough when it's queued when created
    // 4. We enqueue the DependenciesBlock for traversal
    queueDelayed.push({
        action: PROCESS_BLOCK,
        block: b,
        module: module,
        chunk: c.chunks[0],
        chunkGroup: c,
        chunkGroupInfo: cgi
    });
}

processBlock() - handling asynchronous dependencies of asynchronous dependencies

These blocks are stored in the blocksWithNestedBlocks Set and handled in a later phase

const processBlock = (block, isSrc) => {
    // ...handle synchronous dependencies

    // handle asynchronous dependencies
    for (const b of block.blocks) {
        iteratorBlock(b);
    }

    if (block.blocks.length > 0 && module !== block) {
        blocksWithNestedBlocks.add(block);
    }
}

As analyzed above, an entry-type async dependency is pushed to queueDelayed with action PROCESS_ENTRY_BLOCK. What logic runs for that action?

4.4.5 PROCESS_ENTRY_BLOCK

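As the code below shows, processEntryBlock() follows the same overall logic as processBlock(): iterate over all synchronous dependencies in blockModules and push them to queueBuffer, then handle the async dependencies, then the async dependencies of those async dependencies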

const processEntryBlock = block => {
    const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
    for (let i = 0; i < blockModules.length; i += 2) {
        const refModule = /** @type {Module} */ (blockModules[i]);
        const activeState = /** @type {ConnectionState} */ (
            blockModules[i + 1]
        );
        queueBuffer.push({
            action:
                activeState === true ? ADD_AND_ENTER_ENTRY_MODULE : PROCESS_BLOCK,
            block: refModule,
            module: refModule,
            chunk,
            chunkGroup,
            chunkGroupInfo
        });
    }

    if (queueBuffer.length > 0) {
        for (let i = queueBuffer.length - 1; i >= 0; i--) {
            queue.push(queueBuffer[i]);
        }
        queueBuffer.length = 0;
    }

    for (const b of block.blocks) {
        iteratorBlock(b);
    }

    if (block.blocks.length > 0 && module !== block) {
        blocksWithNestedBlocks.add(block);
    }
}

4.4.6 LEAVE_MODULE

This is the last action. It only sets the post-order index and has no other special logic

const processQueue = () => {
    while (queue.length) {
        statProcessedQueueItems++;
        const queueItem = queue.pop();
        module = queueItem.module;
        block = queueItem.block;
        chunk = queueItem.chunk;
        chunkGroup = queueItem.chunkGroup;
        chunkGroupInfo = queueItem.chunkGroupInfo;
        switch (queueItem.action) {
            case ADD_AND_ENTER_ENTRY_MODULE:
            //...
            case ADD_AND_ENTER_MODULE:
            //...
            case ENTER_MODULE:
            //...
            case PROCESS_BLOCK: {
                processBlock(block);
                break;
            }
            case PROCESS_ENTRY_BLOCK: {
                processEntryBlock(block);
                break;
            }
            case LEAVE_MODULE: {
                const index = chunkGroup.getModulePostOrderIndex(module);
                if (index === undefined) {
                    chunkGroup.setModulePostOrderIndex(
                        module,
                        chunkGroupInfo.postOrderIndex++
                    );
                }

                if (
                    moduleGraph.setPostOrderIndexIfUnset(
                        module,
                        nextFreeModulePostOrderIndex
                    )
                ) {
                    nextFreeModulePostOrderIndex++;
                }
                break;
            }
        }
    }
}

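4.4.7 Summary

  1. Handle synchronous dependencies -> push async dependencies to the queue -> put async dependencies of async dependencies into a Set()
  2. queue -> queueBuffer (ADD_AND_ENTER_MODULE) -> queueDelayed (PROCESS_ENTRY_BLOCK or PROCESS_BLOCK)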

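4.5 Handling chunkGroupsForCombining: a chunkGroup that has parent chunkGroups

Where is chunkGroupsForCombining populated? What is its data structure? How is it eventually processed?

In the visitModules() analysis above, inputEntrypointsAndModules is iterated and each entry is either pushed to queue or added to chunkGroupsForCombining. The latter is processed only after one round of queue processing has finished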

if (chunkGroup.getNumberOfParents() > 0) {
    // minAvailableModules for child entrypoints are unknown yet, set to undefined.
    // This means no module is added until other sets are merged into
    // this minAvailableModules (by the parent entrypoints)
    const skippedItems = new Set();
    for (const module of modules) {
        skippedItems.add(module);
    }
    chunkGroupInfo.skippedItems = skippedItems;
    chunkGroupsForCombining.add(chunkGroupInfo);
} else {
    for (const module of modules) {
        queue.push({
            action: ADD_AND_ENTER_MODULE,
            block: module,
            module,
            chunk,
            chunkGroup,
            chunkGroupInfo
        });
    }
}
for (const chunkGroupInfo of chunkGroupsForCombining) {
    const { chunkGroup } = chunkGroupInfo;
    chunkGroupInfo.availableSources = new Set();
    for (const parent of chunkGroup.parentsIterable) {
        const parentChunkGroupInfo = chunkGroupInfoMap.get(parent);
        chunkGroupInfo.availableSources.add(parentChunkGroupInfo);
        if (parentChunkGroupInfo.availableChildren === undefined) {
            parentChunkGroupInfo.availableChildren = new Set();
        }
        parentChunkGroupInfo.availableChildren.add(chunkGroupInfo);
    }
}

When the inner loop of processQueue() ends, the chunkGroupsForCombining data is processed in one batch

Each time queue has been fully traversed, chunkGroupsForCombining.size is checked

while (queue.length || queueConnect.size) {
    processQueue();
    if (chunkGroupsForCombining.size > 0) {
        processChunkGroupsForCombining();
    }
    //...
    if (queue.length === 0) {
        const tempQueue = queue;
        queue = queueDelayed.reverse();
        queueDelayed = tempQueue;
    }
}
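
The swap at the end of the loop body is what defers async blocks until all synchronous work is done. A minimal sketch of that control flow, with a hypothetical asyncChildren table standing in for the real block traversal:

```javascript
// Hypothetical data: which items each module would push to queueDelayed.
const asyncChildren = { entry: ["async_A", "async_B"], async_A: ["async_C"] };

let queue = ["entry"];
let queueDelayed = [];
const order = [];

while (queue.length) {
    // Inner loop: drain the current queue (stands in for processQueue()).
    while (queue.length) {
        const item = queue.pop();
        order.push(item);
        for (const child of asyncChildren[item] || []) {
            queueDelayed.push(child);
        }
    }
    // Queue is empty: promote the delayed items. reverse() keeps their
    // insertion order, because the queue is consumed with pop().
    if (queue.length === 0) {
        const tempQueue = queue;
        queue = queueDelayed.reverse();
        queueDelayed = tempQueue;
    }
}
console.log(order); // [ 'entry', 'async_A', 'async_B', 'async_C' ]
```

The two arrays are swapped rather than reallocated, so the emptied queue array is reused as the next queueDelayed.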

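The logic of processChunkGroupsForCombining() is shown below. It relies on a method that is harder to follow, calculateResultingAvailableModules(). For now, treat it as computing the minimal set of modules that the current Chunk can reuse. A small example illustrates this:

  • The parent module is entry.js, with the synchronous dependencies a.js, b.js and c.js and the asynchronous dependency async_B.js
  • The asynchronous dependency async_B.js gets its own Chunk and ChunkGroup, and has the synchronous dependencies a.js and b.js
  • async_B.js is always loaded after the synchronous dependencies of the parent module, so it can reuse a.js and b.js from the parent instead of bundling them into its own Chunk

For the ChunkGroupInfo of async_B.js, minAvailableModules holds the NormalModules already provided by the parent, which is where a.js and b.js are found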
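
The example can be written down with plain Sets. This is only a sketch with file names as stand-ins for NormalModule objects, under the assumption that everything in the parent chunk counts as available:

```javascript
// Assumption: file names stand in for NormalModule instances.
const providedByParent = new Set(["entry.js", "a.js", "b.js", "c.js"]);
const neededByAsyncB = ["async_B.js", "a.js", "b.js"];

// Modules the async chunk can reuse from its parent.
const reused = neededByAsyncB.filter(m => providedByParent.has(m));
// Modules that still have to be bundled into the async chunk.
const bundled = neededByAsyncB.filter(m => !providedByParent.has(m));

console.log(reused);  // [ 'a.js', 'b.js' ]
console.log(bundled); // [ 'async_B.js' ]
```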

With the concept of minAvailableModules clear, the code below can be read as follows:

  • Iterate over all parent ChunkGroupInfo objects of the current ChunkGroupInfo (info.availableSources), compute the reusable modules resultingAvailableModules of each, and merge them into the local availableModules set
  • Assign the result to ChunkGroupInfo.minAvailableModules
  • Add the current ChunkGroupInfo to outdatedChunkGroupInfo

const processChunkGroupsForCombining = () => {
  for (const info of chunkGroupsForCombining) {
    for (const source of info.availableSources) {
      if (!source.minAvailableModules) {
        chunkGroupsForCombining.delete(info);
        break;
      }
    }
  }
  for (const info of chunkGroupsForCombining) {
    const availableModules = /** @type {ModuleSetPlus} */ (new Set());
    availableModules.plus = EMPTY_SET;
    const mergeSet = set => {
      if (set.size > availableModules.plus.size) {
        for (const item of availableModules.plus) availableModules.add(item);
        availableModules.plus = set;
      } else {
        for (const item of set) availableModules.add(item);
      }
    };
    // combine minAvailableModules from all resultingAvailableModules
    for (const source of info.availableSources) {
      const resultingAvailableModules =
        calculateResultingAvailableModules(source);
      mergeSet(resultingAvailableModules);
      mergeSet(resultingAvailableModules.plus);
    }
    info.minAvailableModules = availableModules;
    info.minAvailableModulesOwned = false;
    info.resultingAvailableModules = undefined;
    outdatedChunkGroupInfo.add(info);
  }
  chunkGroupsForCombining.clear();
};
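
The mergeSet helper above avoids copying the larger set: it adopts that set as plus by reference and copies only the smaller one. A standalone sketch of the same idea, where createModuleSetPlus and hasModule are hypothetical helper names:

```javascript
const EMPTY_SET = new Set();

// Hypothetical helper: a Set with an extra `plus` Set, as in the article.
function createModuleSetPlus() {
    const set = new Set();
    set.plus = EMPTY_SET;
    return set;
}

// Same logic as the mergeSet closure in the article, with an explicit target.
function mergeSet(target, set) {
    if (set.size > target.plus.size) {
        // The incoming set is larger: fold the old plus into the target
        // and adopt the incoming set as the new plus without copying it.
        for (const item of target.plus) target.add(item);
        target.plus = set;
    } else {
        for (const item of set) target.add(item);
    }
}

// Membership always has to check both parts.
const hasModule = (s, m) => s.has(m) || s.plus.has(m);

const available = createModuleSetPlus();
const big = new Set(["a.js", "b.js", "c.js"]);
const small = new Set(["d.js"]);
mergeSet(available, big);   // adopted by reference as available.plus
mergeSet(available, small); // copied into available itself

console.log(available.plus === big);       // true
console.log(hasModule(available, "a.js")); // true
console.log(hasModule(available, "d.js")); // true
```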

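4.6 Handling queueConnect and chunkGroupsForMerging

Where is queueConnect populated? What is its data structure? How is it eventually processed?

4.6.1 Populating queueConnect

As shown above, iteratorBlock() is called for each async dependency of a NormalModule. Inside iteratorBlock(), the ChunkGroupInfo of the ChunkGroup created for the async dependency is added to queueConnect, and the async dependency is queued with action PROCESS_BLOCK so that processBlock handles its own synchronous and asynchronous dependencies

As the code below shows, c is a non-entry chunkGroup, and queueConnect stores:

  • key: the current ChunkGroupInfo
  • value: a Set of the ChunkGroupInfo objects (cgi) of the non-entry chunkGroups created for its async dependencies

// handle the async dependency b of a NormalModule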
const iteratorBlock = b => {
    // If c does not exist yet it has to be created; this part is pulled out here only for readability
    c = compilation.addChunkInGroup(
        b.groupOptions || b.chunkName,
        module,
        b.loc,
        b.request
    );
    c.index = nextChunkGroupIndex++;
    if (c !== undefined) {
        // b is a non-entry async dependency
        blockConnections.get(b).push({
            originChunkGroupInfo: chunkGroupInfo,
            chunkGroup: c
        });
        let connectList = queueConnect.get(chunkGroupInfo);
        if (connectList === undefined) {
            connectList = new Set();
            queueConnect.set(chunkGroupInfo, connectList);
        }
        connectList.add(cgi);
        queueDelayed.push({
            action: PROCESS_BLOCK,
            block: b,
            module: module,
            chunk: c.chunks[0],
            chunkGroup: c,
            chunkGroupInfo: cgi
        });
    } else if (entrypoint !== undefined) {
        chunkGroupInfo.chunkGroup.addAsyncEntrypoint(entrypoint);
    }
}
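
The queueConnect bookkeeping is a get-or-create pattern on a Map whose values are Sets. A minimal sketch, where addConnection is a hypothetical wrapper and plain objects stand in for ChunkGroupInfo:

```javascript
// Map<parentInfo, Set<childInfo>>, as in the article.
const queueConnect = new Map();

// Hypothetical wrapper around the get-or-create logic in iteratorBlock().
function addConnection(parentInfo, childInfo) {
    let connectList = queueConnect.get(parentInfo);
    if (connectList === undefined) {
        connectList = new Set();
        queueConnect.set(parentInfo, connectList);
    }
    connectList.add(childInfo);
}

// Plain objects stand in for ChunkGroupInfo instances.
const entryInfo = { name: "entry" };
const asyncAInfo = { name: "async_A" };
const asyncBInfo = { name: "async_B" };

addConnection(entryInfo, asyncAInfo);
addConnection(entryInfo, asyncBInfo);
addConnection(entryInfo, asyncAInfo); // duplicate, ignored by the Set

console.log(queueConnect.size);                // 1
console.log(queueConnect.get(entryInfo).size); // 2
```

Using a Set as the value means that a block visited several times connects its chunk group only once.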

4.6.2 Processing queueConnect

After iteratorBlock() has built the queueConnect data, it is processed in one batch when the inner loop of processQueue() ends

Each time queue has been fully traversed, queueConnect.size is checked

while (queue.length || queueConnect.size) {
    processQueue();
    if (chunkGroupsForCombining.size > 0) {
        processChunkGroupsForCombining();
    }
    if (queueConnect.size > 0) {
        // calculating available modules
        processConnectQueue();

        if (chunkGroupsForMerging.size > 0) {
            // merging available modules
            processChunkGroupsForMerging();
        }
    }
    //...
    if (queue.length === 0) {
        const tempQueue = queue;
        queue = queueDelayed.reverse();
        queueDelayed = tempQueue;
    }
}

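processConnectQueue() handles the async dependencies of the current ChunkGroupInfo. Here:

  • chunkGroupInfo: the current ChunkGroupInfo
  • targets: the Set of ChunkGroupInfo objects for the non-entry ChunkGroups created by the async dependencies of the current ChunkGroupInfo

The overall flow of the code below is:

  • Add every target created for a non-entry async dependency to the current ChunkGroupInfo.children
  • Compute the reusable module set (resultingAvailableModules) of the current ChunkGroupInfo and push it to availableModulesToBeMerged of each target
  • Add every target to the chunkGroupsForMerging set for the next phase

const processConnectQueue = () => {
    // connect the <ChunkGroupInfo, Set<ChunkGroupInfo>> pairs created by async dependencies
    for (const [chunkGroupInfo, targets] of queueConnect) {
        // 1. Add new targets to the list of children
        for (const target of targets) {
            chunkGroupInfo.children.add(target);
        }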
        
        // 2. Calculate resulting available modules
        const resultingAvailableModules =
            calculateResultingAvailableModules(chunkGroupInfo);

        const runtime = chunkGroupInfo.runtime;

        // 3. Update chunk group info
        for (const target of targets) {
            target.availableModulesToBeMerged.push(resultingAvailableModules);
            chunkGroupsForMerging.add(target);
            const oldRuntime = target.runtime;
            const newRuntime = mergeRuntime(oldRuntime, runtime);
            if (oldRuntime !== newRuntime) {
                target.runtime = newRuntime;
                outdatedChunkGroupInfo.add(target);
            }
        }

        statConnectedChunkGroups += targets.size;
    }
    queueConnect.clear();
};

4.6.3 Processing chunkGroupsForMerging

After processConnectQueue() has handled queueConnect, processChunkGroupsForMerging() is called to handle the chunkGroupsForMerging data

while (queue.length || queueConnect.size) {
    processQueue();
    if (chunkGroupsForCombining.size > 0) {
        processChunkGroupsForCombining();
    }
    if (queueConnect.size > 0) {
        // calculating available modules
        processConnectQueue();

        if (chunkGroupsForMerging.size > 0) {
            // merging available modules
            processChunkGroupsForMerging();
        }
    }
    //...
    if (queue.length === 0) {
        const tempQueue = queue;
        queue = queueDelayed.reverse();
        queueDelayed = tempQueue;
    }
}

Note: processChunkGroupsForMerging() is long, so to keep things simple it is explained with one example, and only the branches that this example executes are kept

[Figure: two entries, entry1.js and entry2.js, both with the async dependency async_B.js]

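As the figure above shows, two entries both hold the async dependency async_B.js. From the processConnectQueue() analysis above, calculateResultingAvailableModules() yields the following resultingAvailableModules:

  • entry1.js: ['./src/entry1.js', './item/entry1_a.js', './item/entry1_b.js', './item/common_____g.js']
  • entry2.js: ['./src/entry2.js', './item/entry1_b.js', './item/entry2_aa', './item/common_____g.js']

target.availableModulesToBeMerged.push(resultingAvailableModules) then stores both arrays in ChunkGroupInfo.availableModulesToBeMerged, and this data is carried into processChunkGroupsForMerging()

As processChunkGroupsForMerging() below shows, cachedMinAvailableModules is empty at first, so the first resultingAvailableModules is assigned to it. After that, each further set is compared against it and the intersection is computed

As the code comments below show, the intersection logic is not hard to follow: take each module m of cachedMinAvailableModules and check whether availableModules contains it. If it does not, a new, smaller set has to be built. Finally outdatedChunkGroupInfo.add(info) is called so that the next phase handles the change

Why an intersection is needed follows from the analysis above

entry1.js can provide some reusable modules to async_B.js, and so can entry2.js

The program loads the synchronous dependencies (the reusable modules) first and async_B.js afterwards

So if async_B.js itself imports these reusable modules as synchronous dependencies, they do not have to be bundled into the Chunk created for async_B.js, because the synchronous dependencies of the parent Chunk can be used directly

But what if entry1.js and entry2.js provide different reusable modules?

Say entry1.js can provide a, b, c, entry2.js can provide b, c, d, e, and the synchronous dependencies that async_B.js needs are a and c

Since it is unknown which entry is loaded first, only the intersection of the modules provided by entry1.js and entry2.js can be relied on, which is b and c

So if async_B.js needed b and c, nothing extra would have to be bundled. It actually needs a and c, so c is reused and a still has to be bundled into async_B.js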
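
The a/b/c example can be checked with a few lines of set arithmetic. This is a plain-Set sketch with single letters standing in for modules, not webpack code:

```javascript
// Modules each entry can provide to async_B.js (letters stand in for modules).
const fromEntry1 = new Set(["a", "b", "c"]);
const fromEntry2 = new Set(["b", "c", "d", "e"]);
// Synchronous dependencies that async_B.js needs.
const needed = ["a", "c"];

// Only modules provided by every parent are guaranteed to be loaded.
const alwaysAvailable = new Set(
    [...fromEntry1].filter(m => fromEntry2.has(m))
);
// Everything else must be bundled into the chunk of async_B.js.
const mustBundle = needed.filter(m => !alwaysAvailable.has(m));

console.log([...alwaysAvailable]); // [ 'b', 'c' ]
console.log(mustBundle);           // [ 'a' ]
```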

const processChunkGroupsForMerging = () => {
    for (const info of chunkGroupsForMerging) {
        const availableModulesToBeMerged = info.availableModulesToBeMerged;
        let cachedMinAvailableModules = info.minAvailableModules;

        if (availableModulesToBeMerged.length > 1) {
            availableModulesToBeMerged.sort(bySetSize);
        }
        let changed = false;
        merge: for (const availableModules of availableModulesToBeMerged) {
            if (cachedMinAvailableModules === undefined) {
                cachedMinAvailableModules = availableModules;
                info.minAvailableModules = cachedMinAvailableModules;
                info.minAvailableModulesOwned = false;
                changed = true;
            } else {
                if (info.minAvailableModulesOwned) {
                   //...
                } else if (cachedMinAvailableModules.plus === availableModules.plus) {
                    //...
                    // !!! compute the intersection
                    for (const m of cachedMinAvailableModules) {
                        if (!availableModules.has(m)) {   
                            const newSet = /** @type {ModuleSetPlus} */ (new Set());
                            newSet.plus = availableModules.plus;
                            const iterator = cachedMinAvailableModules[Symbol.iterator]();
                    
                            let it;
                            while (!(it = iterator.next()).done) {
                                const module = it.value;
                                if (module === m) break;
                                newSet.add(module);
                            }
                            while (!(it = iterator.next()).done) {
                                const module = it.value;
                                if (availableModules.has(module)) {
                                    newSet.add(module);
                                }
                            }
                            info.minAvailableModulesOwned = true;
                            info.minAvailableModules = newSet;
                            changed = true;
                            continue merge;
                        }
                    }
                } else {
                    //...
                }
            }
        }
        if (changed) {
            info.resultingAvailableModules = undefined;
            outdatedChunkGroupInfo.add(info);
        }
    }
    chunkGroupsForMerging.clear();
};
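
The nested loops in the retained branch compute the intersection lazily: nothing is allocated unless a module is actually missing, and everything before the first missing module is copied without a second lookup. A standalone sketch of that technique, where intersectLazily is a hypothetical name and the plus handling is left out:

```javascript
// Hypothetical standalone version of the branch above, without `plus`.
function intersectLazily(cached, available) {
    for (const m of cached) {
        if (!available.has(m)) {
            // First missing module found: only now allocate a new set.
            const newSet = new Set();
            const iterator = cached[Symbol.iterator]();
            let it;
            // Everything before m was already checked to be in both sets.
            while (!(it = iterator.next()).done) {
                if (it.value === m) break;
                newSet.add(it.value);
            }
            // Everything after m still needs a membership test.
            while (!(it = iterator.next()).done) {
                if (available.has(it.value)) newSet.add(it.value);
            }
            return { set: newSet, changed: true };
        }
    }
    // Nothing missing: the cached set can be kept as it is.
    return { set: cached, changed: false };
}

const cached = new Set(["a", "b", "c"]);
const shrunk = intersectLazily(cached, new Set(["b", "c", "d", "e"]));
console.log([...shrunk.set], shrunk.changed); // [ 'b', 'c' ] true

const kept = intersectLazily(new Set(["b", "c"]), new Set(["b", "c", "d"]));
console.log(kept.changed); // false
```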

4.7 Handling outdatedChunkGroupInfo

After processQueue() -> processConnectQueue() -> processChunkGroupsForMerging(), execution finally reaches processOutdatedChunkGroupInfo()

while (queue.length || queueConnect.size) {
    processQueue();
    if (chunkGroupsForCombining.size > 0) {
        processChunkGroupsForCombining();
    }
    if (queueConnect.size > 0) {
        // calculating available modules
        processConnectQueue();

        if (chunkGroupsForMerging.size > 0) {
            // merging available modules
            processChunkGroupsForMerging();
        }
    }
    if (outdatedChunkGroupInfo.size > 0) {
        // check modules for revisit
        processOutdatedChunkGroupInfo();
    }
    if (queue.length === 0) {
        const tempQueue = queue;
        queue = queueDelayed.reverse();
        queueDelayed = tempQueue;
    }
}

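processOutdatedChunkGroupInfo() is also long, but its logic is clear. As shown below, it has 4 parts. Because minAvailableModules of the current ChunkGroupInfo has changed, some earlier decisions have to be rechecked:

  • skippedItems: these modules were not pushed to queue earlier because minAvailableModules contained them, that is, the parent Chunks could provide them for reuse. They are now rechecked, and any skipped module that is no longer in minAvailableModules is pushed back to the queue
  • skippedModuleConnections: these connections were stored in skippedModuleConnections earlier because activeState was not true. Their state is now rechecked, and if it has changed they are pushed back to the queue
  • children chunk groups: the children are added to queueConnect again so that minAvailableModules of the async dependencies is recomputed. It depends on the parent chunk, so when minAvailableModules of the parent chunk changes, that of the children has to be recomputed as well
  • availableChildren: take the child ChunkGroups of the current ChunkGroup and add them to chunkGroupsForCombining again so that their minAvailableModules is recomputed
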
const processOutdatedChunkGroupInfo = () => {
    statChunkGroupInfoUpdated += outdatedChunkGroupInfo.size;
    // Revisit skipped elements
    for (const info of outdatedChunkGroupInfo) {
        // 1. Reconsider skipped items
        if (info.skippedItems !== undefined) {
            const { minAvailableModules } = info;
            for (const module of info.skippedItems) {
                if (
                    !minAvailableModules.has(module) &&
                    !minAvailableModules.plus.has(module)
                ) {
                    queue.push({
                        action: ADD_AND_ENTER_MODULE,
                        block: module,
                        module,
                        chunk: info.chunkGroup.chunks[0],
                        chunkGroup: info.chunkGroup,
                        chunkGroupInfo: info
                    });
                    info.skippedItems.delete(module);
                }
            }
        }

        // 2. Reconsider skipped connections
        if (info.skippedModuleConnections !== undefined) {
            const { minAvailableModules } = info;
            for (const entry of info.skippedModuleConnections) {
                const [module, activeState] = entry;
                if (activeState === false) continue;
                if (activeState === true) {
                    info.skippedModuleConnections.delete(entry);
                }
                if (
                    activeState === true &&
                    (minAvailableModules.has(module) ||
                        minAvailableModules.plus.has(module))
                ) {
                    info.skippedItems.add(module);
                    continue;
                }
                queue.push({
                    action: activeState === true ? ADD_AND_ENTER_MODULE : PROCESS_BLOCK,
                    block: module,
                    module,
                    chunk: info.chunkGroup.chunks[0],
                    chunkGroup: info.chunkGroup,
                    chunkGroupInfo: info
                });
            }
        }

        // 2. Reconsider children chunk groups
        if (info.children !== undefined) {
            statChildChunkGroupsReconnected += info.children.size;
            for (const cgi of info.children) {
                let connectList = queueConnect.get(info);
                if (connectList === undefined) {
                    connectList = new Set();
                    queueConnect.set(info, connectList);
                }
                connectList.add(cgi);
            }
        }

        // 3. Reconsider chunk groups for combining
        if (info.availableChildren !== undefined) {
            for (const cgi of info.availableChildren) {
                chunkGroupsForCombining.add(cgi);
            }
        }
    }
    outdatedChunkGroupInfo.clear();
};
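
Part 1 (reconsidering skipped items) is the easiest to try in isolation. A sketch with toy data, where info is a hypothetical plain object rather than a real ChunkGroupInfo:

```javascript
// Toy ChunkGroupInfo (assumption): minAvailableModules shrank to {b, c},
// while a and c had been skipped earlier.
const info = {
    minAvailableModules: Object.assign(new Set(["b", "c"]), { plus: new Set() }),
    skippedItems: new Set(["a", "c"])
};
const queue = [];

for (const module of info.skippedItems) {
    const { minAvailableModules } = info;
    if (!minAvailableModules.has(module) && !minAvailableModules.plus.has(module)) {
        // No longer provided by the parents: it has to be processed after all.
        queue.push({ action: "ADD_AND_ENTER_MODULE", module });
        // Deleting the current entry while iterating a Set is safe in JavaScript.
        info.skippedItems.delete(module);
    }
}

console.log(queue.map(item => item.module)); // [ 'a' ]
console.log([...info.skippedItems]);         // [ 'c' ]
```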

4.8 calculateResultingAvailableModules in detail

4.8.1 Source code analysis

The flow above calls calculateResultingAvailableModules() several times. The method is short and its logic is straightforward: it chooses between two formulas by comparing the sizes of minAvailableModules and minAvailableModules.plus. resultingAvailableModules consists of two parts:

  • resultingAvailableModules = new Set(): modules of chunk
  • resultingAvailableModules.plus = new Set(): chosen by comparing minAvailableModules and minAvailableModules.plus

When the size of minAvailableModules is <= the size of minAvailableModules.plus, plus stays unchanged and minAvailableModules is merged into resultingAvailableModules. When the size of minAvailableModules is > the size of minAvailableModules.plus, plus has to grow, and minAvailableModules is merged into resultingAvailableModules.plus. The final result is therefore one of:

  • resultingAvailableModules = (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
  • resultingAvailableModules = (minAvailableModules + modules of chunk) + (minAvailableModules.plus)

The only difference is whether minAvailableModules ends up in resultingAvailableModules or in resultingAvailableModules.plus

const calculateResultingAvailableModules = chunkGroupInfo => {
		if (chunkGroupInfo.resultingAvailableModules)
			return chunkGroupInfo.resultingAvailableModules;

		const minAvailableModules = chunkGroupInfo.minAvailableModules;

		// Create a new Set of available modules at this point
		// We want to be as lazy as possible. There are multiple ways doing this:
		// Note that resultingAvailableModules is stored as "(a) + (b)" as it's a ModuleSetPlus
		// - resultingAvailableModules = (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
		// - resultingAvailableModules = (minAvailableModules + modules of chunk) + (minAvailableModules.plus)
		// We choose one depending on the size of minAvailableModules vs minAvailableModules.plus

		let resultingAvailableModules;
		if (minAvailableModules.size > minAvailableModules.plus.size) {
			// resultingAvailableModules = (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
			resultingAvailableModules =
				/** @type {Set<Module> & {plus: Set<Module>}} */ (new Set());
			for (const module of minAvailableModules.plus)
				minAvailableModules.add(module);
			minAvailableModules.plus = EMPTY_SET;
			resultingAvailableModules.plus = minAvailableModules;
			chunkGroupInfo.minAvailableModulesOwned = false;
		} else {
			// resultingAvailableModules = (minAvailableModules + modules of chunk) + (minAvailableModules.plus)
			resultingAvailableModules =
				/** @type {Set<Module> & {plus: Set<Module>}} */ (
					new Set(minAvailableModules)
				);
			resultingAvailableModules.plus = minAvailableModules.plus;
		}

		// add the modules from the chunk group to the set
		for (const chunk of chunkGroupInfo.chunkGroup.chunks) {
			for (const m of chunkGraph.getChunkModulesIterable(chunk)) {
				resultingAvailableModules.add(m);
			}
		}
		return (chunkGroupInfo.resultingAvailableModules =
			resultingAvailableModules);
	}
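
To see both branches produce concrete values, here is a reduced sketch of the same size comparison. It assumes plain Sets of strings, takes the chunk modules as an array instead of reading them from chunkGraph, and leaves out the caching; calculateResulting is a hypothetical name.

```javascript
const EMPTY_SET = new Set();

// Reduced sketch: same branching as above, without chunkGraph and caching.
function calculateResulting(minAvailableModules, chunkModules) {
    let result;
    if (minAvailableModules.size > minAvailableModules.plus.size) {
        // (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
        result = new Set();
        for (const m of minAvailableModules.plus) minAvailableModules.add(m);
        minAvailableModules.plus = EMPTY_SET;
        result.plus = minAvailableModules; // shared by reference, not copied
    } else {
        // (minAvailableModules + modules of chunk) + (minAvailableModules.plus)
        result = new Set(minAvailableModules);
        result.plus = minAvailableModules.plus;
    }
    for (const m of chunkModules) result.add(m);
    return result;
}

// Branch 1: the own set is larger than plus.
const minA = Object.assign(new Set(["a", "b", "c"]), { plus: new Set(["d"]) });
const r1 = calculateResulting(minA, ["x"]);
console.log([...r1], [...r1.plus]); // [ 'x' ] [ 'a', 'b', 'c', 'd' ]

// Branch 2: plus is at least as large as the own set.
const minB = Object.assign(new Set(["a"]), { plus: new Set(["b", "c"]) });
const r2 = calculateResulting(minB, ["x"]);
console.log([...r2], [...r2.plus]); // [ 'a', 'x' ] [ 'b', 'c' ]
```

In both branches the larger of the two sets is reused by reference, and only the smaller one is iterated and copied.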

4.8.2 Illustrated example

[Figure: worked example of calculateResultingAvailableModules()]

5.buildChunkGraph-2-connectChunkGroups

5.1 Collecting blockConnections data

blockConnections is initialized in iteratorBlock() while async dependencies are handled

Handling non-Entry blocks: building queueConnect

const processBlock = (block, isSrc) => {
    // ...handle synchronous dependencies
    for (const b of block.blocks) {
        iteratorBlock(b);
    }
}

const iteratorBlock = b => {

    if (c !== undefined) {
        blockConnections.get(b).push({
            originChunkGroupInfo: chunkGroupInfo,
            chunkGroup: c
        });

        let connectList = queueConnect.get(chunkGroupInfo);
        if (connectList === undefined) {
            connectList = new Set();
            queueConnect.set(chunkGroupInfo, connectList);
        }
        connectList.add(cgi);

        // TODO check if this really need to be done for each traversal
        // or if it is enough when it's queued when created
        // 4. We enqueue the DependenciesBlock for traversal
        queueDelayed.push({
            action: PROCESS_BLOCK,
            block: b,
            module: module,
            chunk: c.chunks[0],
            chunkGroup: c,
            chunkGroupInfo: cgi
        });
    }
}

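5.2 Processing blockConnections and binding ChunkGroups

As the code below shows, areModulesAvailable() checks whether all modules of the async chunkGroup are already in resultingAvailableModules of the parent chunkGroup, that is, whether the synchronous dependencies of the parent chunkGroup already cover every module the async dependency needs

If so, the async dependency simply uses the synchronous dependencies of the parent chunkGroup, and no connection needs to be created

  • connectBlockAndChunkGroup(): binds the async dependency AsyncDependenciesBlock to its newly created ChunkGroup
  • connectChunkGroupParentAndChild(): binds the ChunkGroup of the async dependency to its parent ChunkGroup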

const connectChunkGroups = (compilation, blocksWithNestedBlocks, blockConnections, chunkGroupInfoMap) => {
    const { chunkGraph } = compilation;
    // Example: parent chunkA has the async dependency chunkB, and chunkB has the sync dependency chunkC.
    // If chunkC is also a sync dependency of chunkA, chunkB skips the connection for chunkC
    for (const [block, connections] of blockConnections) {
        if (
            !blocksWithNestedBlocks.has(block) &&
            connections.every(({ chunkGroup, originChunkGroupInfo }) =>
              // originChunkGroupInfo already contains all modules of this chunkGroup,
              // so the chunk of the async block is already covered by its parent chunk
                areModulesAvailable(
                    chunkGroup,
                    originChunkGroupInfo.resultingAvailableModules
                )
            )
        ) {
            continue;
        }

        for (let i = 0; i < connections.length; i++) {
            const { chunkGroup, originChunkGroupInfo } = connections[i];
            // connect this AsyncDependenciesBlock to the chunkGroup
            chunkGraph.connectBlockAndChunkGroup(block, chunkGroup);
            // connect this chunkGroup to its parent chunkGroup
            connectChunkGroupParentAndChild(originChunkGroupInfo.chunkGroup, chunkGroup);
        }
    }
};

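The analysis above may be confusing, but a concrete example makes the logic of connectChunkGroups() clear:

  • If entry1.js did not depend on async_B.js synchronously, then because it depends on async_B.js asynchronously, async_B.js would get its own Chunk and ChunkGroup
  • But entry1.js already has async_B.js as a synchronous dependency, so there is no need for a separate Chunk and ChunkGroup: entry1.js has already bundled async_B.js into its own Chunk. areModulesAvailable() in the code above performs exactly this check. If originChunkGroupInfo contains all modules of the chunkGroup, the async ChunkGroup can be removed

The removal logic is covered in the next section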
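
The article does not show the body of areModulesAvailable(). The following is a sketch of what such a check could look like, assuming toy chunks of the form { modules: [...] } and an available set with a plus part. It is an illustration of the idea, not the function from webpack's source.

```javascript
// Toy version (assumption): chunks expose their modules as a plain array,
// and availableModules is a Set with an extra `plus` Set.
function areModulesAvailableSketch(chunkGroup, availableModules) {
    for (const chunk of chunkGroup.chunks) {
        for (const module of chunk.modules) {
            if (!availableModules.has(module) && !availableModules.plus.has(module)) {
                return false; // at least one module must come from this group
            }
        }
    }
    return true; // the parent already provides everything
}

const parentAvailable = Object.assign(
    new Set(["entry1.js", "async_B.js"]),
    { plus: new Set(["common.js"]) }
);
const asyncGroupB = { chunks: [{ modules: ["async_B.js", "common.js"] }] };
const asyncGroupC = { chunks: [{ modules: ["async_C.js"] }] };

const skipB = areModulesAvailableSketch(asyncGroupB, parentAvailable);
const skipC = areModulesAvailableSketch(asyncGroupC, parentAvailable);
console.log(skipB); // true: no connection needed for async_B
console.log(skipC); // false: async_C has to be connected to its parent
```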

[Figure: entry1.js depends on async_B.js both synchronously and asynchronously]

6.buildChunkGraph-3-cleanupUnconnectedGroups

Removes all chunkGroups that are not connected

6.1 Collecting allCreatedChunkGroups data

allCreatedChunkGroups is also populated in iteratorBlock() while async dependencies are handled

const processBlock = (block, isSrc) => {
  // ...handle synchronous dependencies
  for (const b of block.blocks) {
      iteratorBlock(b);
  }
}

const iteratorBlock = b => {
    let cgi = blockChunkGroups.get(b);
    const entryOptions = b.groupOptions && b.groupOptions.entryOptions;
    if (cgi === undefined) {
        // Case 1: this async NormalModule has no chunkGroup yet
        if (entryOptions) {
            // Scenario 1: Entry type
        } else if (!chunkGroupInfo.asyncChunks || !chunkGroupInfo.chunkLoading) {
            // Scenario 2: asyncChunks=false or chunkLoading=false in webpack.config.js
        } else {
            // Scenario 3: non-Entry and async chunks allowed
            c = compilation.addChunkInGroup(
                b.groupOptions || b.chunkName,
                module,
                b.loc,
                b.request
            );
            blockConnections.set(b, []);
            allCreatedChunkGroups.add(c);
        }
    } else if (entryOptions) {
        // Case 2: this async NormalModule already has a chunkGroup, and it is an entry type
        entrypoint = cgi.chunkGroup;
    } else {
        // Case 3: this async NormalModule already has a chunkGroup, and it is not an entry type
        c = cgi.chunkGroup;
    }

    if (c !== undefined) {
      // handle the non-Entry type
    } else if (entrypoint !== undefined) {
      // handle the Entry type
        chunkGroupInfo.chunkGroup.addAsyncEntrypoint(entrypoint);
    }
}

6.2 Processing allCreatedChunkGroups data

chunkGroup.getNumberOfParents() is used to check whether an async ChunkGroup is connected to a parent ChunkGroup. If it is not, the ChunkGroup is removed

const cleanupUnconnectedGroups = (compilation, allCreatedChunkGroups) => {
	const { chunkGraph } = compilation;

	for (const chunkGroup of allCreatedChunkGroups) {
		// Cleanup: zero parents means this chunkGroup is not connected, so remove it
		if (chunkGroup.getNumberOfParents() === 0) {
			for (const chunk of chunkGroup.chunks) {
				compilation.chunks.delete(chunk);
				chunkGraph.disconnectChunk(chunk);
			}
			chunkGraph.disconnectChunkGroup(chunkGroup);
			chunkGroup.remove();
		}
	}
};

As in the example above, when entry1.js already has async_B.js as a synchronous dependency, async_B.js does not need its own Chunk and ChunkGroup. connectChunkGroups() therefore never calls connectChunkGroupParentAndChild(originChunkGroupInfo.chunkGroup, chunkGroup) for it, so ChunkGroup.getNumberOfParents() === 0 holds for the ChunkGroup of async_B.js. This triggers the removal logic and deletes that ChunkGroup
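
The effect of the cleanup can be reproduced with a toy model. ToyChunkGroup below is a hypothetical stand-in that only tracks its parents; the real ChunkGroup and ChunkGraph do much more.

```javascript
// Toy stand-in (assumption): a chunk group that only knows its parents.
class ToyChunkGroup {
    constructor(name) {
        this.name = name;
        this.parents = new Set();
    }
    getNumberOfParents() {
        return this.parents.size;
    }
}

const entry1 = new ToyChunkGroup("entry1");
const asyncB = new ToyChunkGroup("async_B"); // skipped by connectChunkGroups()
const asyncC = new ToyChunkGroup("async_C");
asyncC.parents.add(entry1);                  // connected to its parent

const chunkGroups = new Set([entry1, asyncB, asyncC]);
const allCreatedChunkGroups = new Set([asyncB, asyncC]);

// Only groups created for async blocks are candidates for removal.
for (const group of allCreatedChunkGroups) {
    if (group.getNumberOfParents() === 0) {
        chunkGroups.delete(group);
    }
}

const names = [...chunkGroups].map(g => g.name);
console.log(names); // [ 'entry1', 'async_C' ]
```

entry1 also has no parents, but it survives because entry chunk groups are not part of allCreatedChunkGroups.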

7.hooks.optimizeChunks

while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {
  /* empty */
}

After visitModules() has run, hooks.optimizeChunks.call() is invoked to optimize the chunks. As the figure below shows, several plugins are triggered, the most familiar of which is SplitChunksPlugin
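
The while loop keeps calling the hook for as long as it returns a truthy value. The sketch below imitates that with a hand-written hook, assuming bail semantics (the first plugin that returns something other than undefined ends the call). createBailHook and DedupePlugin are hypothetical; this is not tapable's implementation.

```javascript
// Hypothetical mini hook with bail semantics, not tapable itself.
function createBailHook() {
    const taps = [];
    return {
        tap(name, fn) {
            taps.push({ name, fn });
        },
        call(...args) {
            for (const { fn } of taps) {
                const result = fn(...args);
                if (result !== undefined) return result; // bail out early
            }
            return undefined;
        }
    };
}

const optimizeChunks = createBailHook();
const chunks = ["a", "b", "b"];
let passes = 0;

// Hypothetical plugin: removes one duplicate per pass and asks for a re-run.
optimizeChunks.tap("DedupePlugin", list => {
    passes++;
    const dup = list.findIndex((c, idx) => list.indexOf(c) !== idx);
    if (dup === -1) return undefined; // nothing changed, the loop may stop
    list.splice(dup, 1);
    return true; // something changed, request another pass
});

while (optimizeChunks.call(chunks)) {
    /* empty */
}

console.log(chunks, passes); // [ 'a', 'b' ] 2
```

Returning true after a change lets every plugin see the updated chunks again until a full pass makes no further changes.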

[Figure: plugins triggered by hooks.optimizeChunks, including SplitChunksPlugin]

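For reasons of length, the detailed analysis continues in the next article, 《「Webpack5源码」seal阶段分析(二)》 (seal phase analysis, part 2)

References

Other articles in the engineering column series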