Spring Boot Graceful Shutdown, Explained Through JUC
Straight to the point.
Concepts involved
Spring
- SmartLifecycle
- DefaultLifecycleProcessor
- WebServerGracefulShutdownLifecycle
- WebServerStartStopLifecycle
- WebServerManager
- TomcatWebServer implements WebServer
Java
- java.util.concurrent.CountDownLatch
- java.lang.Runtime
- java.lang.ApplicationShutdownHooks
- java.lang.Shutdown
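In short: Spring Boot's graceful shutdown is built on the JVM ShutdownHook mechanism.
A few things about the Spring shutdown hook
- When the hook is registered
- When the hook is triggered
- What happens after the hook is triggered
Registering the hook thread and triggering the hook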
SpringApplication#run()
- org.springframework.boot.SpringApplication#refreshContext()
- org.springframework.context.support.AbstractApplicationContext#registerShutdownHook()
@Override
public void registerShutdownHook() {
    if (this.shutdownHook == null) {
        // No shutdown hook registered yet.
        this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
            @Override
            public void run() {
                synchronized (startupShutdownMonitor) {
                    doClose();
                }
            }
        };
        Runtime.getRuntime().addShutdownHook(this.shutdownHook);
    }
}
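- As the code above shows, Spring registers a shutdownHook thread with Runtime while refreshing the context. According to the Javadoc of the Runtime API, the JVM runs this thread once it starts shutting down in response to a termination signal. Some signals cannot be handled at all, for example kill -9 (SIGKILL), and in that case the hook never runs.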
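To see the hook mechanism in isolation, here is a minimal sketch that uses only the JDK. The class name ShutdownHookDemo, the thread names, and the printed messages are made up for illustration; only Runtime#addShutdownHook and Runtime#removeShutdownHook come from the JDK. It registers a hook and removes it again, which is roughly the pairing Spring uses in registerShutdownHook() and close().

```java
final class ShutdownHookDemo {

    // Registers a hook and immediately removes it again.
    // removeShutdownHook returns true only if the hook had been registered.
    static boolean registerAndRemove() {
        Thread hook = new Thread(
                () -> System.out.println("hook running: release resources here"),
                "demo-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        // This hook stays registered and runs when the JVM exits normally
        // or receives a signal such as SIGTERM (kill -15). It does not run on kill -9.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("JVM is shutting down"), "demo-exit-hook"));
        System.out.println("registered and removed: " + registerAndRemove());
    }
}
```

Running main and letting it exit should be enough to see the remaining hook fire; sending kill -9 to a longer-running variant is a quick way to confirm that the hook is skipped.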
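What happens after the hook is triggered
Core entry point
- The hook registration shows that the JVM callback runs doClose(), so this method is the core entry point for closing the container.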
- org.springframework.context.support.AbstractApplicationContext#doClose()
Simulating a shutdown
public static void main(String[] args) {
    ConfigurableApplicationContext context = SpringApplication.run(MvcApplication.class, args);
    // Simulate a shutdown call
    context.close();
}
context.close() is implemented in AbstractApplicationContext:
@Override
public void close() {
    synchronized (this.startupShutdownMonitor) {
        // The real close logic is invoked here
        doClose();
        if (this.shutdownHook != null) {
            try {
                Runtime.getRuntime().removeShutdownHook(this.shutdownHook);
            }
            catch (IllegalStateException ex) {
                // ignore - VM is already shutting down
            }
        }
    }
}
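protected void doClose() {
    // ... code outside the scope of this article is omitted; see the source if you are interested
    // Stop all Lifecycle beans, to avoid delays during individual destruction.
    if (this.lifecycleProcessor != null) {
        try {
            // Stop the beans that implement Lifecycle
            this.lifecycleProcessor.onClose();
        }
        catch (Throwable ex) {
            logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
        }
    }
    // ...
}
- The code above can be skimmed; it is only the outer shell of Spring Boot's shutdown. In DefaultLifecycleProcessor, onClose() delegates to stopBeans():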
private void stopBeans() {
    Map<String, Lifecycle> lifecycleBeans = getLifecycleBeans();
    Map<Integer, LifecycleGroup> phases = new HashMap<>();
    lifecycleBeans.forEach((beanName, bean) -> {
        int shutdownPhase = getPhase(bean);
        LifecycleGroup group = phases.get(shutdownPhase);
        if (group == null) {
            group = new LifecycleGroup(shutdownPhase, this.timeoutPerShutdownPhase, lifecycleBeans, false);
            phases.put(shutdownPhase, group);
        }
        group.add(beanName, bean);
    });
    if (!phases.isEmpty()) {
        List<Integer> keys = new ArrayList<>(phases.keySet());
        keys.sort(Collections.reverseOrder());
        for (Integer key : keys) {
            // Key point: stop each phase group in turn
            phases.get(key).stop();
        }
    }
}
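- stopBeans does two things: it groups the beans by phase and sorts the phases in descending order, so the highest phase stops first. Neither step is the important part.
- What matters is that lifecycle beans with the same phase end up in the same LifecycleGroup. Each group holds several lifecycle members, and when the group stops, it walks through its members in a for loop and stops them one by one.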
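The grouping and ordering can be tried out without Spring. The sketch below is only an illustration under assumptions: PhaseOrderDemo, stopOrder, and the bean names and phase values in main are invented, and a plain list of names stands in for LifecycleGroup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PhaseOrderDemo {

    // beanPhases maps a (hypothetical) bean name to its phase value.
    // Returns the bean names in the order in which their groups would be stopped.
    static List<String> stopOrder(Map<String, Integer> beanPhases) {
        // Step 1: group bean names by phase.
        Map<Integer, List<String>> groups = new HashMap<>();
        beanPhases.forEach((name, phase) ->
                groups.computeIfAbsent(phase, p -> new ArrayList<>()).add(name));

        // Step 2: sort phases in descending order, so the highest phase stops first.
        List<Integer> keys = new ArrayList<>(groups.keySet());
        keys.sort(Collections.reverseOrder());

        List<String> order = new ArrayList<>();
        for (Integer key : keys) {
            order.addAll(groups.get(key));
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Integer> beans = new LinkedHashMap<>();
        beans.put("cache", 0);              // hypothetical phase values
        beans.put("gracefulShutdown", 20);
        beans.put("webServer", 10);
        System.out.println(stopOrder(beans));
    }
}
```

With distinct phases the result depends only on the phase values, which is why a bean that must drain requests before the server stops needs the higher phase.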
// LifecycleGroup
public void stop() {
    if (this.members.isEmpty()) {
        return;
    }
    if (logger.isDebugEnabled()) {
        logger.debug("Stopping beans in phase " + this.phase);
    }
    this.members.sort(Collections.reverseOrder());
    // Countdown latch; the count is the number of SmartLifecycle members in this group
    CountDownLatch latch = new CountDownLatch(this.smartMemberCount);
    Set<String> countDownBeanNames = Collections.synchronizedSet(new LinkedHashSet<>());
    // Snapshot of the names still in lifecycleBeans; doStop removes stopped beans from that map
    Set<String> lifecycleBeanNames = new HashSet<>(this.lifecycleBeans.keySet());
    for (LifecycleGroupMember member : this.members) {
        if (lifecycleBeanNames.contains(member.name)) {
            doStop(this.lifecycleBeans, member.name, latch, countDownBeanNames);
        }
        else if (member.bean instanceof SmartLifecycle) {
            // Already removed: must have been a dependent bean from another phase
            latch.countDown();
        }
    }
    try {
        // Wait with a timeout: if countDown is never called above, the timeout is the fallback that lets this thread move on
        latch.await(this.timeout, TimeUnit.MILLISECONDS);
        if (latch.getCount() > 0 && !countDownBeanNames.isEmpty() && logger.isInfoEnabled()) {
            logger.info("Failed to shut down " + countDownBeanNames.size() + " bean" +
                    (countDownBeanNames.size() > 1 ? "s" : "") + " with phase value " +
                    this.phase + " within timeout of " + this.timeout + "ms: " + countDownBeanNames);
        }
    }
    catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
}
private void doStop(Map<String, ? extends Lifecycle> lifecycleBeans, final String beanName,
        final CountDownLatch latch, final Set<String> countDownBeanNames) {
    // Remove this bean from the map and get its instance
    Lifecycle bean = lifecycleBeans.remove(beanName);
    if (bean != null) {
        // Beans that depend on this one are stopped first, recursively
        String[] dependentBeans = getBeanFactory().getDependentBeans(beanName);
        for (String dependentBean : dependentBeans) {
            doStop(lifecycleBeans, dependentBean, latch, countDownBeanNames);
        }
        try {
            if (bean.isRunning()) {
                if (bean instanceof SmartLifecycle) {
                    if (logger.isTraceEnabled()) {
                        logger.trace("Asking bean '" + beanName + "' of type [" +
                                bean.getClass().getName() + "] to stop");
                    }
                    countDownBeanNames.add(beanName);
                    // Core: call stop; the callback counts down once the stop has finished
                    ((SmartLifecycle) bean).stop(() -> {
                        latch.countDown();
                        countDownBeanNames.remove(beanName);
                        if (logger.isDebugEnabled()) {
                            logger.debug("Bean '" + beanName + "' completed its stop procedure");
                        }
                    });
                }
                else {
                    if (logger.isTraceEnabled()) {
                        logger.trace("Stopping bean '" + beanName + "' of type [" +
                                bean.getClass().getName() + "]");
                    }
                    bean.stop();
                    if (logger.isDebugEnabled()) {
                        logger.debug("Successfully stopped bean '" + beanName + "'");
                    }
                }
            }
            else if (bean instanceof SmartLifecycle) {
                // Don't wait for beans that aren't running...
                latch.countDown();
            }
        }
        catch (Throwable ex) {
            if (logger.isWarnEnabled()) {
                logger.warn("Failed to stop bean '" + beanName + "'", ex);
            }
        }
    }
}
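- The two snippets above boil down to one use of CountDownLatch.
- The latch count is the number of SmartLifecycle members in the LifecycleGroup (smartMemberCount). Each member that finishes stopping counts down once, until the count reaches zero.
- latch.await(this.timeout, TimeUnit.MILLISECONDS)
- While the count is still above zero, the hook thread blocks here.
- The timeout is the fallback: if stopping has not finished when it elapses, the thread stops waiting. This timeout is the value we configure in the YAML file (spring.lifecycle.timeout-per-shutdown-phase, 30s by default).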
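The await-with-timeout behavior is easy to reproduce with the JDK alone. In this minimal sketch, LatchTimeoutDemo, awaitAll, and its parameters are invented for illustration; a worker that never counts down plays the role of a bean whose stop never completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class LatchTimeoutDemo {

    // Creates a latch for `members` participants, but only `finishing` of them
    // ever call countDown. Returns true if the count reached zero in time,
    // false if the timeout elapsed first.
    static boolean awaitAll(int members, int finishing, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(members);
        for (int i = 0; i < finishing; i++) {
            Thread worker = new Thread(latch::countDown, "member-" + i);
            worker.setDaemon(true);
            worker.start();
        }
        try {
            // Blocks until the count is zero or the timeout elapses.
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("all finished: " + awaitAll(3, 3, 2000));
        System.out.println("one member stuck: " + awaitAll(3, 2, 200));
    }
}
```

Note that Spring ignores the boolean returned by await and checks latch.getCount() afterwards to decide whether to log the beans that failed to stop; either way the waiting thread moves on once the timeout elapses.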
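The SmartLifecycle callback
default void stop(Runnable callback) {
    stop();
    callback.run();
}
((SmartLifecycle) bean).stop(() -> {
    latch.countDown();
    countDownBeanNames.remove(beanName);
    if (logger.isDebugEnabled()) {
        logger.debug("Bean '" + beanName + "' completed its stop procedure");
    }
});
Looking at this SmartLifecycle method, we see that it takes a callback. The function we pass in, which is where latch.countDown() happens, runs only after the stop has completed. The default implementation runs it synchronously right after stop(); an implementation can override stop(Runnable) and invoke the callback later, from another thread.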
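The difference between the default, synchronous callback and an overriding, asynchronous one can be sketched without Spring. Everything here is a stand-in: Stoppable is a hypothetical interface that copies only the two stop methods of SmartLifecycle, and SyncBean, AsyncBean, and stopWithin are invented names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class AsyncStopDemo {

    // Hypothetical stand-in for SmartLifecycle, reduced to its two stop methods.
    interface Stoppable {
        void stop();

        default void stop(Runnable callback) {
            stop();
            callback.run();
        }
    }

    // Keeps the default: the callback runs on the caller's thread right after stop().
    static final class SyncBean implements Stoppable {
        @Override
        public void stop() {
        }
    }

    // Overrides stop(Runnable): returns at once and calls back later from another thread.
    static final class AsyncBean implements Stoppable {
        private final long workMillis;

        AsyncBean(long workMillis) {
            this.workMillis = workMillis;
        }

        @Override
        public void stop() {
        }

        @Override
        public void stop(Runnable callback) {
            Thread worker = new Thread(() -> {
                try {
                    Thread.sleep(this.workMillis); // simulated in-flight work
                }
                catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
                callback.run();
            }, "async-stop");
            worker.setDaemon(true);
            worker.start();
        }
    }

    // Plays the role of LifecycleGroup#stop for a single member.
    static boolean stopWithin(Stoppable bean, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        bean.stop(latch::countDown);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("sync bean: " + stopWithin(new SyncBean(), 100));
        System.out.println("async bean, fast: " + stopWithin(new AsyncBean(50), 2000));
        System.out.println("async bean, slow: " + stopWithin(new AsyncBean(2000), 100));
    }
}
```

The slow asynchronous bean is the interesting case: stop(Runnable) returns immediately, so only the latch timeout bounds how long the caller waits.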
When does stop take a long time to finish?
- org.springframework.boot.web.reactive.context.WebServerGracefulShutdownLifecycle#stop(java.lang.Runnable)
- org.springframework.boot.web.reactive.context.WebServerManager#shutDownGracefully
- org.springframework.boot.web.embedded.tomcat.TomcatWebServer#shutDownGracefully
- org.springframework.boot.web.embedded.tomcat.GracefulShutdown#shutDownGracefully
- org.springframework.boot.web.embedded.tomcat.TomcatWebServer#shutDownGracefully
- org.springframework.boot.web.reactive.context.WebServerManager#shutDownGracefully
void shutDownGracefully(GracefulShutdownCallback callback) {
    logger.info("Commencing graceful shutdown. Waiting for active requests to complete");
    new Thread(() -> doShutdown(callback), "tomcat-shutdown").start();
}
private void doShutdown(GracefulShutdownCallback callback) {
    List<Connector> connectors = getConnectors();
    connectors.forEach(this::close);
    try {
        for (Container host : this.tomcat.getEngine().findChildren()) {
            for (Container context : host.findChildren()) {
                while (isActive(context)) {
                    if (this.aborted) {
                        logger.info("Graceful shutdown aborted with one or more requests still active");
                        callback.shutdownComplete(GracefulShutdownResult.REQUESTS_ACTIVE);
                        return;
                    }
                    Thread.sleep(50);
                }
            }
        }
    }
    catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
    logger.info("Graceful shutdown complete");
    callback.shutdownComplete(GracefulShutdownResult.IDLE);
}
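That is a fair amount of code, but since you have made it this far, here is the call stack in detail.
shutDownGracefully(callback)
- It starts a new thread and hands all the work over to it asynchronously (remember that the parameter is a callback).
- That thread then calls doShutdown(callback).
doShutdown(callback) (the key part)
- It closes every Connector. Anyone familiar with Tomcat knows that a Connector manages the socket connections, so closing the Connectors means no new requests are accepted.
- The loop keeps running while isActive(context) returns true. Its source shows that it checks the work Tomcat is still processing and returns true as long as there is at least one such request. This method is the heart of graceful shutdown: a request that has not finished is allowed to run to completion.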
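The "stop accepting, then poll until idle or aborted" pattern can be sketched with the JDK alone. This is not Tomcat's API: DrainDemo, tryBegin, end, and abort are hypothetical names, an AtomicInteger stands in for Tomcat's bookkeeping of in-flight requests, and the check in tryBegin is deliberately simplified and not race-free.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class DrainDemo {

    enum Result { IDLE, REQUESTS_ACTIVE }

    private final AtomicInteger active = new AtomicInteger();
    private volatile boolean accepting = true;
    private volatile boolean aborted = false;

    // Called when a request arrives; rejected once shutdown has begun.
    boolean tryBegin() {
        if (!this.accepting) {
            return false;
        }
        this.active.incrementAndGet();
        return true;
    }

    // Called when a request has finished.
    void end() {
        this.active.decrementAndGet();
    }

    // Gives up waiting for the remaining requests.
    void abort() {
        this.aborted = true;
    }

    // Stops accepting new requests, then polls every 50 ms until no request
    // is active or the shutdown is aborted.
    Result shutDownGracefully() {
        this.accepting = false;
        try {
            while (this.active.get() > 0) {
                if (this.aborted) {
                    return Result.REQUESTS_ACTIVE;
                }
                Thread.sleep(50);
            }
        }
        catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return Result.IDLE;
    }

    public static void main(String[] args) {
        DrainDemo server = new DrainDemo();
        server.tryBegin();
        Thread request = new Thread(() -> {
            try {
                Thread.sleep(200); // simulated request processing
            }
            catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            server.end();
        }, "request");
        request.start();
        System.out.println(server.shutDownGracefully());
    }
}
```

In Spring Boot the abort flag is what makes the lifecycle timeout effective for the web server: once waiting is abandoned, the polling thread reports REQUESTS_ACTIVE instead of looping forever.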
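Summary
- A CountDownLatch blocks the hook thread; its count is the number of SmartLifecycle beans in the current phase.
- Each lifecycle bean is stopped in a loop, and countDownLatch.countDown() is called when its stop completes.
- The outermost code calls countDownLatch.await with a timeout. Once the timeout elapses, the hook thread is no longer blocked, the rest of the hook runs normally, and the process exits.
Writing this took effort; please credit the source when reposting.
Reposted from: https://juejin.cn/post/7197292579057221693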