Pitfall notes: building a Docker image for wasm-pack

Author: 站长

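Background

I wanted to build a wasm-pack image on top of the official Rust image so that we could use it in our GitLab pipelines. I tried two ways of installing wasm-pack, and both hit the same problem at run time.

  1. Copying the prebuilt wasm-pack binary from its GitHub releases into the image
  2. Running cargo install wasm-pack

The symptom was the same in both cases: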

wasm-pack build --target web
[INFO]: Checking for the Wasm target...
[INFO]: Compiling to Wasm...
    Finished release [optimized] target(s) in 0.02s
[WARN]: :-) origin crate has no README
[INFO]: Installing wasm-bindgen...

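After that, nothing more was printed. The process appeared to hang.

The pkg directory had already been generated, though, with the wasm and js artifacts all present. I tested them and they worked.

Troubleshooting

I pulled the wasm-pack source from GitHub and saw that there is a --dev option (it is in the documentation too; I just had not noticed it). I tried it hoping to get some logs, and to my surprise the build finished almost instantly. The only difference was that the generated files were larger.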

wasm-pack build --dev --target web
[INFO]: Checking for the Wasm target...
[INFO]: Compiling to Wasm...
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
[WARN]: :-) origin crate has no README
[INFO]: Installing wasm-bindgen...
[INFO]: Optional fields missing from Cargo.toml: 'description', 'repository', and 'license'. These are not necessary, but recommended
[INFO]: :-) Done in 0.22s
[INFO]: :-) Your wasm pkg is ready to publish at /udolphin/wasm_test/pkg.

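So my guess was that wasm-pack does some extra, time-consuming work in release mode.

I ran the release build again (release is the default profile, so this is equivalent to wasm-pack build --target web) and came back a while later. This time the process had actually finished, which was a surprise, because the day before I had waited more than ten minutes with no progress: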

wasm-pack build --release --target web
[INFO]: Checking for the Wasm target...
[INFO]: Compiling to Wasm...
   Compiling rust_wasm v0.1.0 (/udolphin/wasm_test)
    Finished release [optimized] target(s) in 0.15s
[WARN]: :-) origin crate has no README
[INFO]: License key is set in Cargo.toml but no LICENSE file(s) were found; Please add the LICENSE file(s) to your project directory
[INFO]: Installing wasm-bindgen...
[INFO]: Optimizing wasm binaries with `wasm-opt`...
[INFO]: :-) Done in 1m 27s
[INFO]: :-) Your wasm pkg is ready to publish at /udolphin/wasm_test/pkg.

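Every subsequent run of this command finished instantly.

So in release mode it was most likely downloading files.

Reading the source (v0.11.1), I eventually found the relevant code (I have to say, code search on GitHub is much more convenient than it used to be).

Here are the relevant snippets: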

/// Install a cargo CLI tool
///
/// Prefers an existing local install, if any exists. Then checks if there is a
/// global install on `$PATH` that fits the bill. Then attempts to download a
/// tarball from the GitHub releases page, if this target has prebuilt
/// binaries. Finally, falls back to `cargo install`.
pub fn download_prebuilt_or_cargo_install(
    tool: Tool,
    cache: &Cache,
    version: &str,
    install_permitted: bool,
) -> Result<Status> {
    // If the tool is installed globally and it has the right version, use
    // that. Assume that other tools are installed next to it.
    //
    // This situation can arise if the tool is already installed via
    // `cargo install`, for example.
    if let Ok(path) = which(tool.to_string()) {
        debug!("found global {} binary at: {}", tool, path.display());
        if check_version(&tool, &path, version)? {
            let download = Download::at(path.parent().unwrap());
            return Ok(Status::Found(download));
        }
    }

    // This is where the message we saw gets printed
    let msg = format!("{}Installing {}...", emoji::DOWN_ARROW, tool);
    PBAR.info(&msg);

    let dl = download_prebuilt(&tool, cache, version, install_permitted);
    match dl {
        Ok(dl) => return Ok(dl),
        Err(e) => {
            warn!(
                "could not download pre-built `{}`: {}. Falling back to `cargo install`.",
                tool, e
            );
        }
    }

    cargo_install(tool, cache, version, install_permitted)
}

/// Downloads a precompiled copy of the tool, if available.
pub fn download_prebuilt(
    tool: &Tool,
    cache: &Cache,
    version: &str,
    install_permitted: bool,
) -> Result<Status> {
    let url = match prebuilt_url(tool, version) {
        Ok(url) => url,
        Err(e) => bail!(
            "no prebuilt {} binaries are available for this platform: {}",
            tool,
            e,
        ),
    };
    match tool {
        Tool::WasmBindgen => {
            let binaries = &["wasm-bindgen", "wasm-bindgen-test-runner"];
            match cache.download(install_permitted, "wasm-bindgen", binaries, &url)? {
                Some(download) => Ok(Status::Found(download)),
                None => bail!("wasm-bindgen v{} is not installed!", version),
            }
        }
        Tool::CargoGenerate => {
            let binaries = &["cargo-generate"];
            match cache.download(install_permitted, "cargo-generate", binaries, &url)? {
                Some(download) => Ok(Status::Found(download)),
                None => bail!("cargo-generate v{} is not installed!", version),
            }
        }
        Tool::WasmOpt => {
            let binaries: &[&str] = match Os::get()? {
                Os::MacOS => &["bin/wasm-opt", "lib/libbinaryen.dylib"],
                Os::Linux => &["bin/wasm-opt"],
                Os::Windows => &["bin/wasm-opt.exe"],
            };
            match cache.download(install_permitted, "wasm-opt", binaries, &url)? {
                Some(download) => Ok(Status::Found(download)),
                // TODO(ag_dubs): why is this different? i forget...
                None => Ok(Status::CannotInstall),
            }
        }
    }
}

/// Get the download URL for some tool at some version, architecture and operating system
pub fn prebuilt_url_for(tool: &Tool, version: &str, arch: &Arch, os: &Os) -> Result<String> {
    let target = match (os, arch, tool) {
        (Os::Linux, Arch::X86_64, Tool::WasmOpt) => "x86_64-linux",
        (Os::Linux, Arch::X86_64, _) => "x86_64-unknown-linux-musl",
        (Os::MacOS, Arch::X86_64, Tool::WasmOpt) => "x86_64-macos",
        (Os::MacOS, Arch::X86_64, _) => "x86_64-apple-darwin",
        (Os::MacOS, Arch::AArch64, Tool::CargoGenerate) => "aarch64-apple-darwin",
        (Os::MacOS, Arch::AArch64, Tool::WasmOpt) => "arm64-macos",
        (Os::Windows, Arch::X86_64, Tool::WasmOpt) => "x86_64-windows",
        (Os::Windows, Arch::X86_64, _) => "x86_64-pc-windows-msvc",
        _ => bail!("Unrecognized target!"),
    };
    match tool {
        Tool::WasmBindgen => {
            Ok(format!(
                "https://github.com/rustwasm/wasm-bindgen/releases/download/{0}/wasm-bindgen-{0}-{1}.tar.gz",
                version,
                target
            ))
        },
        Tool::CargoGenerate => {
            Ok(format!(
                "https://github.com/cargo-generate/cargo-generate/releases/download/v{0}/cargo-generate-v{0}-{1}.tar.gz",
                "0.17.3",
                target
            ))
        },
        Tool::WasmOpt => {
            Ok(format!(
        "https://github.com/WebAssembly/binaryen/releases/download/{vers}/binaryen-{vers}-{target}.tar.gz",
        vers = "version_111",
        target = target,
            ))
        }
    }
}
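
To make the download targets concrete, here is a minimal sketch that reuses the format strings from `prebuilt_url_for` above for the Linux x86_64 case only. The helper name `linux_x86_64_urls` is hypothetical, and the wasm-bindgen version `0.2.84` is just an assumed example value; per the changelog quoted later, the version actually used is taken from your project's Cargo.lock.

```rust
/// Hypothetical helper (not part of wasm-pack): rebuilds the URLs that the
/// v0.11.1 templates above produce on Linux x86_64.
fn linux_x86_64_urls(wasm_bindgen_version: &str) -> (String, String) {
    // wasm-bindgen uses the musl target triple on Linux x86_64.
    let wasm_bindgen = format!(
        "https://github.com/rustwasm/wasm-bindgen/releases/download/{0}/wasm-bindgen-{0}-{1}.tar.gz",
        wasm_bindgen_version, "x86_64-unknown-linux-musl"
    );
    // wasm-opt comes from a binaryen release pinned to version_111.
    let wasm_opt = format!(
        "https://github.com/WebAssembly/binaryen/releases/download/{vers}/binaryen-{vers}-{target}.tar.gz",
        vers = "version_111",
        target = "x86_64-linux"
    );
    (wasm_bindgen, wasm_opt)
}

fn main() {
    // "0.2.84" is an assumed example version, not taken from the article.
    let (bindgen_url, opt_url) = linux_x86_64_urls("0.2.84");
    println!("{}", bindgen_url);
    println!("{}", opt_url);
}
```

Both URLs point at github.com release assets, which is why the first release build stalls on a slow link to GitHub.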

I also found the downloaded artifacts inside the Docker container:

~/.cache/.wasm-pack# ls
wasm-bindgen-bf5e6b635dbd98f1  wasm-opt-fc03871f1779aa83

~/.cache/.wasm-pack/wasm-bindgen-bf5e6b635dbd98f1# ls -l
total 14408
-rwxr-xr-x 1 root root 5223592 May 16 06:41 wasm-bindgen
-rwxr-xr-x 1 root root 9526968 May 16 06:41 wasm-bindgen-test-runner

~/.cache/.wasm-pack/wasm-opt-fc03871f1779aa83/bin# ls -l
total 11060
-rwxr-xr-x 1 root root 11324160 Nov 22  2022 wasm-opt

~/.cache/.wasm-pack# du -sh
25M     .

These two directories are the culprit: about 25 MB of downloads, all fetched from GitHub, which is painfully slow from networks inside mainland China.
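
If you want to check whether an image already has both tools cached, something like the following sketch could be run inside the container. It assumes the cache layout shown in the listing above (directories named `wasm-bindgen-<hash>` and `wasm-opt-<hash>` under `~/.cache/.wasm-pack`); the helper `missing_tools` is hypothetical and not part of wasm-pack.

```rust
use std::{env, fs};

/// Directory-name prefixes seen in the cache listing above.
const EXPECTED_PREFIXES: [&str; 2] = ["wasm-bindgen-", "wasm-opt-"];

/// Hypothetical helper: returns the prefixes with no matching cache entry.
fn missing_tools(entries: &[String]) -> Vec<&'static str> {
    EXPECTED_PREFIXES
        .iter()
        .copied()
        .filter(|prefix| !entries.iter().any(|e| e.starts_with(*prefix)))
        .collect()
}

fn main() {
    // Cache directory: first CLI argument, else $HOME/.cache/.wasm-pack.
    let dir = env::args().nth(1).unwrap_or_else(|| {
        format!("{}/.cache/.wasm-pack", env::var("HOME").unwrap_or_default())
    });
    match fs::read_dir(&dir) {
        Ok(read) => {
            let entries: Vec<String> = read
                .filter_map(|e| e.ok())
                .map(|e| e.file_name().to_string_lossy().into_owned())
                .collect();
            println!("missing from {}: {:?}", dir, missing_tools(&entries));
        }
        Err(err) => println!("cannot read {}: {}", dir, err),
    }
}
```

An empty list means both tools are already cached, so a release build should not need to fetch them again.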

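I was curious why wasm-pack does this, since, as noted above, the artifacts generated before the download finished were already usable. Searching the project history, I found this passage in CHANGELOG.md: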

 - **coordinating wasm-bindgen versions and installing from binaries for improved speed - [datapup], [issue/146] [pull/244] [pull/324]**

      This is the true gem of this release. Have you been frustrated by how long `wasm-pack` takes to
      run? Overusing `--mode no-install`? This is the release you're looking for.

      Many releases back we realized that folks were struggling to keep the `wasm-bindgen` library
      that their project used in sync with the `wasm-bindgen` CLI application which `wasm-pack`
      runs for you. This became such an issue that we opted to force install `wasm-bindgen` to ensure
      that every `wasm-pack` user had the latest version.

      Like many technical solutions, this solved our original problem, but caused a new one. Now, we
      we are forcing a `cargo install` of `wasm-bindgen` on every run, and that means downloading
      and compiling `wasm-bindgen` everytime you want to run `wasm-pack`. That's unacceptable!

      We're happy to announce that we have a pretty great solution, and several more planned for
      future releases. As of this release, we will read your `Cargo.lock` to find the version of
      `wasm-bindgen` you are using in your local project. We will attempt to fetch a binary version
      of `wasm-bindgen` that matches your local version. We place that binary local to your project,
      and use it when you run `wasm-pack build`. The next time you run `wasm-pack build` we'll use
      that binary, instead of fetching a new one. We still fall back to `cargo install` for
      less common architectures but this is a huge speed improvement. Check out these benchmarks!

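It sounds impressive, so I asked ChatGPT to explain it:

This passage describes the headline feature of the release: coordinating wasm-bindgen versions and installing from prebuilt binaries to improve speed. In earlier releases, wasm-pack users struggled to keep the wasm-bindgen library used by their project in sync with the wasm-bindgen CLI that wasm-pack runs. To address that, the developers chose to force-install wasm-bindgen so that every wasm-pack user had the latest version.

That fix created a new problem: every wasm-pack run forced a cargo install of wasm-bindgen, which meant downloading and compiling wasm-bindgen each time, an unacceptable performance cost.

This release introduces a better solution, with more planned for future releases. wasm-pack now reads the wasm-bindgen version recorded in the project's Cargo.lock and tries to fetch a prebuilt binary matching that version. It places the binary local to the project and uses it when running wasm-pack build. The next wasm-pack build reuses that binary instead of downloading and compiling again. For less common architectures, wasm-pack still falls back to cargo install. The speedup is substantial, and the maintainers point to benchmarks to back it up.

Long story short: it is a pitfall.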

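Solution

From here things were straightforward. There were a few options:

  1. Commit the current container as an image. The downside is that it is a one-off: upgrading Rust or wasm-pack later means doing it all over again.
  2. Fork the source and make the GitHub URLs above configurable, so that they can point at our own CDN for faster downloads. Feasible, but something for later.
  3. Use GitHub Actions to build the image, since it pulls resources from GitHub itself very quickly.

On balance I preferred the third option, so I kept tinkering.

wasm-pack does not officially provide a way to download these resources ahead of time: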
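
For reference, option 2 could look roughly like the sketch below. It is not something wasm-pack supports: the environment variable `WASM_PACK_RELEASE_MIRROR`, the helpers `release_base` and `wasm_opt_url`, and the assumption that the mirror keeps the same path layout as GitHub are all hypothetical. Only the URL path itself is copied from the v0.11.1 source quoted earlier.

```rust
use std::env;

/// Host used by the upstream URL templates.
const DEFAULT_BASE: &str = "https://github.com";

/// Picks the download host: the mirror if one is configured, else GitHub.
/// Blank values are ignored and a trailing slash is dropped.
fn release_base(mirror: Option<&str>) -> String {
    match mirror.map(str::trim).filter(|m| !m.is_empty()) {
        Some(m) => m.trim_end_matches('/').to_string(),
        None => DEFAULT_BASE.to_string(),
    }
}

/// Builds the wasm-opt (binaryen) URL against the chosen host.
/// Assumes the mirror reproduces GitHub's release path layout.
fn wasm_opt_url(base: &str, target: &str) -> String {
    format!(
        "{base}/WebAssembly/binaryen/releases/download/{vers}/binaryen-{vers}-{target}.tar.gz",
        base = base,
        vers = "version_111",
        target = target
    )
}

fn main() {
    // WASM_PACK_RELEASE_MIRROR is a made-up variable name for this sketch.
    let mirror = env::var("WASM_PACK_RELEASE_MIRROR").ok();
    let base = release_base(mirror.as_deref());
    println!("{}", wasm_opt_url(&base, "x86_64-linux"));
}
```

Keeping the host selection in one small function would leave the rest of a fork close to upstream, which matters if you plan to rebase onto later wasm-pack releases.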

$ wasm-pack --help
wasm-pack 0.11.1
The various kinds of commands that `wasm-pack` can execute

USAGE:
    wasm-pack [FLAGS] [OPTIONS] <SUBCOMMAND>

FLAGS:
    -h, --help       Prints help information
    -q, --quiet      No output printed to stdout
    -V, --version    Prints version information
    -v, --verbose    Log verbosity is based off the number of v used

OPTIONS:
        --log-level <log-level>    The maximum level of messages that should be logged by wasm-pack. [possible values:
                                   info, warn, error] [default: info]

SUBCOMMANDS:
    build      🏗️  build your npm package!
    help       Prints this message or the help of the given subcommand(s)
    login      👤  Add an npm registry user account! (aliases: adduser, add-user)
    new        🐑 create a new project with a template
    pack       🍱  create a tar of your npm package but don't publish!
    publish    🎆  pack up your npm package and publish!
    test       👩‍🔬  test your wasm!

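There is a new subcommand, though, which gave me an idea: use it to create a throwaway project, build it so that the downloads get cached, and then delete the project. Perfect! The whole workflow finished in under 4 minutes, including pushing the image to our own registry, so the experience was quite good.

Used in a GitLab pipeline, the first run took 3 minutes 50 seconds.

The second run took 26 seconds.

Image code

Feel free to take whatever you need.

Dockerfile: it also includes sccache and mold to speed up builds; remove them if you do not need them.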

FROM rust:1.68-slim

WORKDIR /app

# System packages needed for building and for fetching the release tarballs
RUN apt-get update \
&& apt-get -y install clang pkg-config libssl-dev wget make ca-certificates curl

# Fetch mold, sccache and wasm-pack from their GitHub releases, then clean up
RUN mkdir download && cd download \
&& wget https://github.com/rui314/mold/releases/download/v1.11.0/mold-1.11.0-x86_64-linux.tar.gz && tar xzvf mold-1.11.0-x86_64-linux.tar.gz && mv mold-1.11.0-x86_64-linux ../mold \
&& wget https://github.com/mozilla/sccache/releases/download/v0.5.0/sccache-v0.5.0-x86_64-unknown-linux-musl.tar.gz && tar xzvf sccache-v0.5.0-x86_64-unknown-linux-musl.tar.gz && mv sccache-v0.5.0-x86_64-unknown-linux-musl/sccache ../ && chmod +x /app/sccache \
&& wget https://github.com/rustwasm/wasm-pack/releases/download/v0.11.1/wasm-pack-v0.11.1-x86_64-unknown-linux-musl.tar.gz && tar -xf wasm-pack-v0.11.1-x86_64-unknown-linux-musl.tar.gz -C /usr/local/bin --strip-components=1 \
&& apt-get remove -y wget \
&& rm -rf /var/lib/apt/lists/* && rm -rf ../download

ENV USER=wasm
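# Warm the wasm-pack cache: create a throwaway project, run a release build so
# that wasm-bindgen and wasm-opt are downloaded, then delete the project
RUN wasm-pack new test && cd test \
&& wasm-pack build \
&& cd .. && rm -rf test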

ENV TZ="Asia/Shanghai"
ENV CARGO_HOME="/rust-cache/cargo"
ENV RUSTC_WRAPPER="/app/sccache"
ENV SCCACHE_DIR="/rust-cache/sccache"

This is my .github/workflows/deploy-wasm.yml:

name: wasm-pack
on:
  push:
    tags:
      - 'wasm-pack*'

jobs:
  # Build and push the Docker image
  build:
    runs-on: ubuntu-latest # runner environment
    steps:
      - uses: actions/checkout@v2
      - name: Extract Tag
        id: extract_tag
        run: echo "tag=${GITHUB_REF#refs/tags/wasm-pack}" >> "$GITHUB_OUTPUT"
      - name: Build Image
        run: |
          docker build -t wasm-pack -f Dockerfile_wasm_pack .
      - name: Tag Image
        run: docker tag wasm-pack xx/runner/wasm-pack:${{ steps.extract_tag.outputs.tag }}
      - name: Login to Registry
        run: docker login --username='${{ secrets.DOCKER_USERNAME }}' --password ${{ secrets.DOCKER_PASSWORD }} xx
      - name: Push Image
        run: docker push xx/runner/wasm-pack:${{ steps.extract_tag.outputs.tag }}

Pushing a tag such as wasm-pack0.0.2 triggers the build and produces the image wasm-pack:0.0.2.
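
The tag-to-version mapping done by the Extract Tag step can be illustrated with a small sketch. The function name `image_tag` is hypothetical. One deliberate difference from the shell expansion `${GITHUB_REF#refs/tags/wasm-pack}` is that the shell returns the ref unchanged when the prefix does not match, whereas this sketch returns `None`.

```rust
/// Hypothetical helper: derives the image tag from a Git ref by removing
/// the "refs/tags/wasm-pack" prefix used in the workflow above.
fn image_tag(github_ref: &str) -> Option<&str> {
    github_ref
        .strip_prefix("refs/tags/wasm-pack")
        // Treat a bare "wasm-pack" tag with no version suffix as no tag.
        .filter(|tag| !tag.is_empty())
}

fn main() {
    for git_ref in ["refs/tags/wasm-pack0.0.2", "refs/heads/main"] {
        match image_tag(git_ref) {
            Some(tag) => println!("{} -> wasm-pack:{}", git_ref, tag),
            None => println!("{} -> not a wasm-pack tag", git_ref),
        }
    }
}
```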

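Summary

Building the wasm-pack image took several rounds of trial and error, but it worked out in the end. By reading the wasm-pack source and its log output, we found that the apparent hang was caused by downloads, which can be very slow from mainland China because of network restrictions. In the end we built the image with GitHub Actions, so the downloads happen once at image build time rather than in every pipeline run. That sped up our builds and gave us a wasm-pack image that works in our GitLab pipelines. It is not the simplest approach, but it has been effective for us, and I hope it helps readers with the same need.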