Converting Word to PNG on the Front End

Site admin

April 14, 2024, 17:56

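Until now, our Word-to-image conversion was done on the back end with the iText library. iText is dual-licensed (AGPL or a commercial license), so closed-source commercial use requires a paid license. Our company's security team recently issued an infringement warning and required us to fix this.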

So we moved the Word-to-image conversion to the front end.

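1. Requirements

The user uploads a .docx file on the page
The front end receives the file, parses it, and generates a .png image
The image is uploaded to the file server, and its URL is stored in the thumbnail field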

2. The difficulty

As far as we can tell, the front end currently has no way to convert a .docx document directly into an image

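3. Solution

Since a direct conversion is not possible, we take a detour:

First convert the document to HTML (using the docx-preview library)
Then convert the HTML to a canvas (using the html2canvas library)
Finally convert the canvas to PNG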
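
The three steps above can be sketched as one small orchestration function. This is only an illustration: `docxToPngDataUrl`, `renderDocx`, and `renderCanvas` are hypothetical names that do not come from either library, and the two rendering steps are passed in as functions so the flow itself stays library-agnostic.

```javascript
// Hypothetical helper: runs the three steps in order.
// renderDocx and renderCanvas are injected by the caller (names are assumptions).
async function docxToPngDataUrl(file, container, { renderDocx, renderCanvas }) {
  // Step 1: .docx -> HTML, rendered into the container
  await renderDocx(file, container);
  // Step 2: HTML -> canvas
  const canvas = await renderCanvas(container);
  // Step 3: canvas -> PNG, returned as a base64 data URL
  return canvas.toDataURL('image/png');
}

// Possible wiring in the browser (assumes docx-preview and html2canvas are installed):
// import * as docx from 'docx-preview';
// import html2canvas from 'html2canvas';
// const dataUrl = await docxToPngDataUrl(file, document.querySelector('#htmlContent'), {
//   renderDocx: (f, el) => docx.renderAsync(f, el),
//   renderCanvas: (el) => html2canvas(el),
// });
```

Injecting the two steps also makes the flow easy to exercise with stubs, without a real document or a browser canvas.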

4. Implementation steps

Convert the .docx file to HTML and insert it into the target node

Install the docx-preview dependency: pnpm add docx-preview

import { useEffect } from 'react';
import * as docx from 'docx-preview';

export default ({ file }) => {
  useEffect(() => {
    // file is the uploaded .docx file
    docx2Html(file);
  }, [file]);

  /**
   * @description Render a .docx file as HTML
   * @param {File} file the .docx file to render
   * @return {void}
   */
  const docx2Html = file => {
    if (!file) {
      return;
    }
    // Only handle .docx files (slice replaces the deprecated substr)
    const suffix = file.name?.slice(file.name.lastIndexOf('.') + 1).toLowerCase();
    if (suffix !== 'docx') {
      return;
    }
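    // DOM node that the generated HTML is mounted on
    const htmlContentDom = document.querySelector('#htmlContent');
    // Copy the defaults instead of mutating docx.defaultOptions
    const docxOptions = {
      ...docx.defaultOptions,
      debug: true,
      experimental: true,
    };
    docx
      .renderAsync(file, htmlContentDom, null, docxOptions)
      .then(() => {
        console.log('docx to html finished');
      })
      .catch(err => {
        console.error('docx to html failed', err);
      });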
  };

  return <div id='htmlContent' />;
};

At this point, the converted HTML appears under the node whose id is htmlContent (add width, height, and other CSS for the htmlContent node as needed).
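
The extension check in the component can also be pulled out into a small standalone function. The sketch below is one possible version: `isDocxFile` is a hypothetical name, and the extra MIME-type comparison is an assumption added here, not something the original code does. Because some browsers report an empty `file.type`, the MIME type is only used to reject files that explicitly declare a different type.

```javascript
// MIME type registered for .docx (Office Open XML word-processing documents)
const DOCX_MIME =
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document';

// Hypothetical helper: returns true only for files that look like .docx
function isDocxFile(file) {
  if (!file || typeof file.name !== 'string') {
    return false;
  }
  const dotIndex = file.name.lastIndexOf('.');
  if (dotIndex < 0) {
    return false; // no extension at all
  }
  const ext = file.name.slice(dotIndex + 1).toLowerCase();
  if (ext !== 'docx') {
    return false; // .doc and every other format is rejected
  }
  // An empty type is tolerated; a different explicit type is not
  return !file.type || file.type === DOCX_MIME;
}
```

Inside the component, `if (!isDocxFile(file)) return;` could then stand in for the inline suffix check.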

Convert the HTML to a canvas

Install the html2canvas dependency: pnpm add html2canvas

import html2canvas from 'html2canvas';

/**
 * @description Render the DOM element to a canvas
 * @return {Promise<HTMLCanvasElement>}
 */
const handleDom2Img = async () => {
  // DOM node that the generated HTML is mounted on
  const htmlContentDom = document.querySelector('#htmlContent');
  // First <section> generated by docx-preview
  const htmlContent = htmlContentDom?.querySelector('.docx-wrapper > section');
  if (!htmlContent) {
    throw new Error('No rendered docx content found');
  }
  // Create the canvas element
  const canvasDom = document.createElement('canvas');
  // Read the width of the DOM element
  const w = parseInt(window.getComputedStyle(htmlContent).width, 10);
  // const h = parseInt(window.getComputedStyle(htmlContent).height, 10);

  // Set the canvas size to the DOM size * device pixel ratio
  const scale = window.devicePixelRatio; // scale factor
  canvasDom.width = w * scale; // use the document width
  canvasDom.height = w * scale; // the thumbnail is square, so the height matches the width

  // Raise the resolution by scaling the drawn content by the same factor
  const canvas = await html2canvas(htmlContent, {
    canvas: canvasDom,
    scale,
    useCORS: true,
  });
  return canvas;
};

Convert the generated canvas object to a .png file and download it

// canvas is the object returned by handleDom2Img()
// Convert the canvas to a base64 PNG data URL
const base64Str = canvas.toDataURL('image/png');

// Download the image
const imgName = `image_${new Date().valueOf()}`;
const aElement = document.createElement('a');
aElement.href = base64Str;
aElement.download = `${imgName}.png`;
document.body.appendChild(aElement);
aElement.click();
document.body.removeChild(aElement);
// A data URL is not an object URL, so there is nothing to revoke with revokeObjectURL
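
The requirements call for uploading the image rather than downloading it. A minimal sketch of that variant follows, under several assumptions: the endpoint `/api/upload`, the form field name `file`, and the `{ url }` response shape are all hypothetical and need to be adapted to the actual file server. The helper names are illustrative as well.

```javascript
// Wrap canvas.toBlob in a Promise so it can be awaited
function canvasToPngBlob(canvas) {
  return new Promise((resolve, reject) => {
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error('canvas export failed'))),
      'image/png'
    );
  });
}

// Build the multipart body; the field name 'file' is an assumption
function buildUploadForm(blob, fileName) {
  const form = new FormData();
  form.append('file', blob, fileName);
  return form;
}

// Upload the canvas as a PNG and return the image URL.
// '/api/upload' and the { url } response shape are hypothetical.
async function uploadThumbnail(canvas, uploadUrl = '/api/upload') {
  const blob = await canvasToPngBlob(canvas);
  const form = buildUploadForm(blob, `thumbnail_${Date.now()}.png`);
  const res = await fetch(uploadUrl, { method: 'POST', body: form });
  if (!res.ok) {
    throw new Error(`upload failed with status ${res.status}`);
  }
  const { url } = await res.json();
  return url;
}
```

Using `toBlob` instead of `toDataURL` avoids base64-encoding the image only to decode it again on the server, and the returned `url` can then be stored in the thumbnail field.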

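5. Summary

The front end cannot convert a .docx document directly into an image, so the document is first converted to HTML and inserted into a node on the page, then a canvas object is generated from that HTML, and finally the canvas object is converted to a .png file.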

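This approach has two drawbacks:

It only handles Word documents in .docx format; .doc is not supported for now.
It cannot automatically pick out the first page of the document for the image. All pages of the Word document have to be rendered as HTML first, and the image width and height are then determined by cropping manually on a canvas.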
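
The manual cropping mentioned in the second drawback could look roughly like the sketch below. It assumes the full page has already been rendered to a canvas and that the thumbnail should be the top-aligned, horizontally centered square of that canvas; `squareCrop` and `cropToSquare` are hypothetical helper names.

```javascript
// Compute the largest top-aligned, horizontally centered square inside a width x height area
function squareCrop(width, height) {
  const size = Math.min(width, height);
  return { sx: Math.floor((width - size) / 2), sy: 0, size };
}

// Copy that square region from the source canvas into a new canvas (browser only)
function cropToSquare(sourceCanvas) {
  const { sx, sy, size } = squareCrop(sourceCanvas.width, sourceCanvas.height);
  const out = document.createElement('canvas');
  out.width = size;
  out.height = size;
  // drawImage(source, sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight)
  out.getContext('2d').drawImage(sourceCanvas, sx, sy, size, size, 0, 0, size, size);
  return out;
}
```

For a portrait page this keeps the top of the page, which is usually where the title sits. A different crop origin only requires changing `squareCrop`.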

Reposted from: https://juejin.cn/post/7331799381896151067