
Hot-Reloading Configuration Files in LangChain-SearXNG

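👋 Hi everyone. The previous two articles introduced LangChain-SearXNG, the search engine I recently open-sourced.

Starting with this post, I will dig into the details of the LangChain-SearXNG project and share some of the experience gained along the way. Today's topic is the recently added configuration file hot-reload feature and how it is implemented.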

I. uvicorn hot reload

Uvicorn is a fast ASGI (Asynchronous Server Gateway Interface) server for Python, used to serve asynchronous web applications. Written by the author of the Starlette framework, it relies on asyncio for asynchronous I/O, supports the HTTP and WebSocket protocols, and works with ASGI application frameworks such as FastAPI, Django and Starlette.

LangChain-SearXNG uses uvicorn + FastAPI to provide its web server. uvicorn supports hot reload triggered by file modifications, which makes it convenient to debug an application quickly.

uvicorn main:app --reload --host 192.XXX.XXX --port 8001

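Starting the application with the --reload flag enables hot reload: when a change to a file in the current project is detected, the application is reloaded.

A closer look at uvicorn's hot reload

Reading the uvicorn source, two pieces of code matter most.

1. The entry point: uvicorn.run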

## Implementation of config.should_reload
    @property
    def should_reload(self) -> bool:
        return isinstance(self.app, str) and self.reload

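As this shows, uvicorn.run enables reload only when the application is passed as an import string (for example "main:app") and reload is set.

⚠️ Passing the ASGI object directly, as in the following snippet, does not support hot reload: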

# start a fastapi server with uvicorn

import uvicorn

from langchain_deepread.main import app
from langchain_deepread.settings.settings import settings

uvicorn.run(app, host="0.0.0.0", port=settings().server.port, log_config=None)
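
One way to see why the import-string form matters: a reloader has to start a fresh process that imports the application again, and that only works if it knows where to import it from. The sketch below is not uvicorn's code. `resolve_app` and `should_reload` are hypothetical helpers that mimic the idea with the standard library only, and `json:dumps` merely stands in for something like `main:app`.

```python
import importlib
from typing import Any


def resolve_app(import_string: str) -> Any:
    """Resolve a "module:attribute" string to the object it names."""
    module_name, _, attr_name = import_string.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {import_string!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def should_reload(app: Any, reload: bool) -> bool:
    """Same check as the should_reload property quoted above."""
    return isinstance(app, str) and reload


# A string can be resolved again in a new process; a live object cannot.
app_by_name = "json:dumps"  # stand-in for "main:app"
app_object = resolve_app(app_by_name)

reload_with_string = should_reload(app_by_name, reload=True)
reload_with_object = should_reload(app_object, reload=True)
```

With the string, the check passes and a new process could call `resolve_app` again after each change. With the already-imported object, the check fails and reload is effectively disabled.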

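2. The ChangeReload implementation

In uvicorn.run, ChangeReload is the entry point that ultimately launches the app when reload is enabled. Let's look at how it is implemented.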

from typing import TYPE_CHECKING, Type

from uvicorn.supervisors.basereload import BaseReload
from uvicorn.supervisors.multiprocess import Multiprocess

if TYPE_CHECKING:
    ChangeReload: Type[BaseReload]
else:
    try:
        from uvicorn.supervisors.watchfilesreload import (
            WatchFilesReload as ChangeReload,
        )
    except ImportError:  # pragma: no cover
        try:
            from uvicorn.supervisors.watchgodreload import (
                WatchGodReload as ChangeReload,
            )
        except ImportError:
            from uvicorn.supervisors.statreload import StatReload as ChangeReload

__all__ = ["Multiprocess", "ChangeReload"]

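The source shows that uvicorn's hot reload is driven by its supervisors. ChangeReload resolves to one of three classes, tried in this order depending on which packages are installed: WatchFilesReload, WatchGodReload, StatReload. All three inherit from BaseReload, and each provides its own implementation of the should_restart() method.

For details, see the article "Analyzing the hot-reload mechanisms of uvicorn and hypercorn".

In short, in terms of file-watching capability: WatchFilesReload > WatchGodReload > StatReload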
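
To make the role of should_restart() concrete, here is a minimal sketch of the simplest strategy, polling modification times, in the spirit of StatReload. It is not uvicorn's implementation: `MtimePoller` is a hypothetical class, and the temporary file exists only for the demonstration.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


class MtimePoller:
    """Polls modification times under a directory (hypothetical helper)."""

    def __init__(self, root: Path, pattern: str = "*.py") -> None:
        self.root = root
        self.pattern = pattern
        self.mtimes: dict[Path, float] = {}

    def should_restart(self) -> Optional[list[Path]]:
        changed = []
        for path in self.root.rglob(self.pattern):
            try:
                mtime = path.stat().st_mtime
            except OSError:
                continue  # file vanished between listing and stat
            old = self.mtimes.get(path)
            self.mtimes[path] = mtime
            # The first sighting only records a baseline; it is not a change.
            if old is not None and mtime > old:
                changed.append(path)
        return changed or None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    target = root / "main.py"
    target.write_text("print('hello')\n")

    poller = MtimePoller(root)
    first_poll = poller.should_restart()  # baseline only

    # Bump the mtime without touching the content.
    stat = target.stat()
    os.utime(target, (stat.st_atime, stat.st_mtime + 10))
    second_poll = poller.should_restart()
    third_poll = poller.should_restart()  # nothing new since the last poll
    changed_names = [p.name for p in second_poll or []]
```

Note that the second poll reports `main.py` even though its content never changed. Any strategy that looks only at timestamps or file-system events behaves this way.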

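3. The problem

Back to the original question: how do we hot-reload the system after a configuration file is updated? Taking WatchFilesReload as an example, we found in practice that a reload is triggered even when the content of the configuration file has not changed (the file's MD5 is identical).

This suggests that the WatchFilesReload implementation does not account for the case where the content is actually unchanged.

So, having understood how uvicorn reloads, we can simply implement a CustomReload class of our own.

II. The LangChain-SearXNG hot-reload approach

A custom uvicorn reload class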

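from __future__ import annotations

import hashlib
from pathlib import Path
from socket import socket
from typing import Callable

from uvicorn.config import Config
from uvicorn.supervisors.watchfilesreload import WatchFilesReload
from watchfiles import Change


class CustomWatchFilesReload(WatchFilesReload):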

    def __init__(
        self,
        config: Config,
        target: Callable[[list[socket] | None], None],
        sockets: list[socket],
    ) -> None:
        super().__init__(config, target, sockets)
        self.file_hashes = {}  # Store file hashes

        # Calculate and store hashes for initial files
        for directory in self.reload_dirs:
            for file_path in directory.rglob("*"):
                if file_path.is_file() and self.watch_filter(file_path):
                    self.file_hashes[str(file_path)] = self.calculate_file_hash(
                        file_path
                    )

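    def should_restart(self) -> list[Path] | None:
        self.pause()

        # Next batch of (event_type, path) changes reported by watchfiles (may be empty)
        changes = next(self.watcher)
        if changes:
            changed_paths = []
            for event_type, path in changes:
                # Only modifications are checked; added or deleted files are ignored here
                if event_type == Change.modified:
                    file_hash = self.calculate_file_hash(path)
                    # Treat the file as changed only if its content hash differs
                    if (
                        path not in self.file_hashes
                        or self.file_hashes[path] != file_hash
                    ):
                        changed_paths.append(Path(path))
                        self.file_hashes[path] = file_hash

            if changed_paths:
                # Apply the reload_includes / reload_excludes filter
                return [p for p in changed_paths if self.watch_filter(p)]

        return None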

    def calculate_file_hash(self, file_path: str | Path) -> str:
        # MD5 serves only as a cheap content fingerprint here, not for security
        with open(file_path, "rb") as file:
            file_contents = file.read()
            return hashlib.md5(file_contents).hexdigest()

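There are two key points:

  1. On initialization, collect the paths of the configuration files to watch together with their hashes
  2. In should_restart, once a file change is reported, check whether the content has really changed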
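
The two points above can be tried in isolation, without uvicorn or watchfiles. The following is a minimal sketch under stated assumptions, not the project's code: `ContentChangeFilter` and `file_md5` are hypothetical names, and the list of reported paths is passed in by hand instead of coming from a file watcher.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    """Content fingerprint; MD5 is acceptable because it is not used for security."""
    return hashlib.md5(path.read_bytes()).hexdigest()


class ContentChangeFilter:
    """Keeps path -> hash and drops events whose content is unchanged."""

    def __init__(self, root: Path, pattern: str = "*.yaml") -> None:
        # Point 1: record a baseline hash for every watched file.
        self.hashes = {
            str(p): file_md5(p) for p in root.rglob(pattern) if p.is_file()
        }

    def really_changed(self, reported: list[str]) -> list[Path]:
        # Point 2: keep only paths whose hash differs from the stored one.
        changed = []
        for raw in reported:
            path = Path(raw)
            try:
                digest = file_md5(path)
            except OSError:
                continue  # deleted or unreadable: skipped in this sketch
            if self.hashes.get(raw) != digest:
                self.hashes[raw] = digest
                changed.append(path)
        return changed


with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "settings.yaml"
    config_file.write_text("port: 8002\n")
    tracker = ContentChangeFilter(Path(tmp))

    config_file.write_text("port: 8002\n")  # rewritten with the same content
    after_same_write = tracker.really_changed([str(config_file)])

    config_file.write_text("port: 9000\n")  # a real edit
    after_real_edit = [p.name for p in tracker.really_changed([str(config_file)])]
```

Rewriting the file with identical content yields an empty list, so no restart would be requested. Only the real edit is reported.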

Starting the program

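import glob

from uvicorn import Config, Server

# PROJECT_ROOT_PATH, settings() and CustomWatchFilesReload are defined in the project itself.
# glob.glob(..., root_dir=...) requires Python 3.10 or newer.
non_yaml_files = [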
    f
    for f in glob.glob("**", root_dir=PROJECT_ROOT_PATH, recursive=True)
    if not f.lower().endswith((".yaml", ".yml"))
]
try:
    config = Config(
        app="langchain_searxng.main:app",
        host="0.0.0.0",
        port=settings().server.port,
        reload=True,
        reload_dirs=str(PROJECT_ROOT_PATH),
        reload_excludes=non_yaml_files,
        reload_includes="*.yaml",
        log_config=None,
    )

    server = Server(config=config)

    sock = config.bind_socket()
    CustomWatchFilesReload(config, target=server.run, sockets=[sock]).run()
except KeyboardInterrupt:
    ...

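non_yaml_files collects everything in the project that is not a YAML configuration file.

The Config class builds the startup parameters. The important ones are reload, reload_dirs, reload_excludes and reload_includes: they turn reloading on and define the directories to watch, the files to exclude, and the files to include.

CustomWatchFilesReload is the class through which the application is finally launched.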

Final result

 python -m langchain_searxng
23:19:20.394 [INFO ] [settings_loader.py][line:55] - Starting application with profiles=['default', 'pro']
23:19:20.450 [INFO ] [config.py   ][line:316] - Will watch for changes in these directories: ['/Users/cfd/workstation/Python/LangChain-SearXNG']
23:19:20.450 [INFO ] [config.py   ][line:522] - Uvicorn running on http://0.0.0.0:8002 (Press CTRL+C to quit)
23:19:20.480 [INFO ] [basereload.py][line:79] - Started reloader process [46168] using WatchFiles
23:19:20.616 [INFO ] [settings_loader.py][line:55] - Starting application with profiles=['default', 'pro']
23:19:20.780 [INFO ] [auth.py     ][line:59] - Defining the given authentication mechanism for the API
23:19:21.316 [INFO ] [server.py   ][line:82] - Started server process [46171]
23:19:21.316 [INFO ] [on.py       ][line:48] - Waiting for application startup.
23:19:21.316 [INFO ] [on.py       ][line:62] - Application startup complete.
23:19:23.551 [INFO ] [main.py     ][line:297] - 1 change detected
23:19:23.552 [WARNING] [basereload.py][line:54] - WatchFiles detected changes in 'settings-pro.yaml'. Reloading...
23:19:23.643 [INFO ] [server.py   ][line:258] - Shutting down
23:19:23.744 [INFO ] [on.py       ][line:67] - Waiting for application shutdown.
23:19:23.744 [INFO ] [on.py       ][line:76] - Application shutdown complete.
23:19:23.744 [INFO ] [server.py   ][line:92] - Finished server process [46171]
23:19:23.889 [INFO ] [settings_loader.py][line:55] - Starting application with profiles=['default', 'pro']
23:19:24.051 [INFO ] [auth.py     ][line:59] - Defining the given authentication mechanism for the API
23:19:24.113 [INFO ] [main.py     ][line:297] - 1 change detected
23:19:24.553 [INFO ] [server.py   ][line:82] - Started server process [46202]
23:19:24.553 [INFO ] [on.py       ][line:48] - Waiting for application startup.
23:19:24.553 [INFO ] [on.py       ][line:62] - Application startup complete.
23:19:37.410 [INFO ] [llm_component.py][line:19] - Initializing the LLM in mode=openai+zhipuai
23:19:37.603 [INFO ] [embedding_component.py][line:18] - Initializing the embedding in mode=zhipuai
23:19:37.621 [INFO ] [h11_impl.py ][line:477] - 127.0.0.1:54992 - "POST /v1/search/sse HTTP/1.1" 200

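The change to the settings YAML file (settings-pro.yaml in the log above) is detected as expected, the project restarts, and the new configuration is loaded.

For more details, feel free to visit our project page or get in touch with the author ⬇️⬇️⬇️

🍓 Open-source repository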


Reposted from: https://juejin.cn/post/7362527672616910863