ConcurrentHashMap#computeIfAbsent 性能问题和解决方案
我报名参加金石计划1期挑战——瓜分10万奖池,这是我的第2篇文章,点击查看活动详情
JDK代码代码给人的感觉就是简洁、高性能。但是凡是代码都有可能有问题,今天就来聊聊JDK1.8中 ConcurrentHashMap#computeIfAbsent
的性能问题以及如何避免和解决。
1. 现象重现
测试代码:
package com.github.mxsm.dledger.benchmark;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class ConcurrentHashMapTest {
public static Map<String,Object> objectMap = new ConcurrentHashMap<>();
public static String key = "key";
public static void main(String[] args) {
int nThreads = Runtime.getRuntime().availableProcessors();
ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
for(int i = 1; i <=nThreads;++i){
executorService.execute(() -> {
for (;;){
objectMap.computeIfAbsent(key,key->0);
}
});
}
}
}
编译完成后用命令运行:
$ java com.github.mxsm.dledger.benchmark.ConcurrentHashMapTest
然后使用 async-profiler
工具监控,监控命令如下:
$ ./profiler.sh -e lock --lock 1ms -d 240 -o jfr -f output.jfr $PID
然后将生成的 output.jfr 用IDEA的Profiler查看。
会发现有很多的Java Monitor Blocked事件。
Tips: async-profiler使用参照《Java 性能分析工具 async-profiler》
其实这个是一个 ConcurrentHashMap#computeIfAbsent 的一个性能问题。下面来进行逐一分析。
2. 原因分析
ConcurrentHashMap#computeIfAbsent
方法从语义上理解:当Key对应的Value不存在的时候将mappingFunction的值放入Map,存在就直接获取出来。但是从上面的代码运行的现象来看似乎存在的时候进入了 synchronized
代码块。下面来分析一下代码(JDK1.8):
public V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
if (key == null || mappingFunction == null) {
throw new NullPointerException();
}
int h = spread(key.hashCode());
V val = null;
int binCount = 0;
for (Node<K, V>[] tab = table; ; ) {
Node<K, V> f;
int n, i, fh;
if (tab == null || (n = tab.length) == 0)
//初始化
{
tab = initTable();
} else if ((f = tabAt(tab, i = (n - 1) & h)) == null) {
Node<K, V> r = new ReservationNode<K, V>();
//value为空的情况
synchronized (r) {
//省略代码
}
if (binCount != 0) {
break;
}
} else if ((fh = f.hash) == MOVED)
//数据转移
{
tab = helpTransfer(tab, f);
} else {
boolean added = false;
//value存在的时候
synchronized (f) {
if (tabAt(tab, i) == f) {
//省略代码
}
}
if (binCount != 0) {
if (binCount >= TREEIFY_THRESHOLD) {
treeifyBin(tab, i);
}
if (!added) {
return val;
}
break;
}
}
}
if (val != null) {
addCount(1L, binCount);
}
return val;
}
从上面代码可以看出来,获取key对应的存在的value的时候会进入 synchronized
代码块。
3. 测试性能验证
使用JMH来进行测试,代码如下:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;
@Warmup(iterations = 2, time = 5)
@Measurement(iterations = 3, time = 5)
@State(Scope.Benchmark)
@Fork(2)
public class HashMapBenchmark {
private static final String KEY = "mxsm";
private final Map<String, Object> concurrentMap = new ConcurrentHashMap<>();
@Setup(Level.Iteration)
public void setup() {
concurrentMap.clear();
}
@Benchmark
@Threads(16)
public Object benchmarkGetBeforeComputeIfAbsent() {
Object result = concurrentMap.get(KEY);
if (null == result) {
result = concurrentMap.computeIfAbsent(KEY, key -> 1);
}
return result;
}
@Benchmark
@Threads(16)
public Object benchmarkComputeIfAbsent() {
return concurrentMap.computeIfAbsent(KEY, key -> 1);
}
}
然后打包通过命令运行jar
java -jar target/dledger-benchmarks.jar com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark
运行结果:
jdk version:
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~18.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
root@GZYZFA00088591:/mnt/d/develop/github/benchmark/dledger-benchmark# java -jar target/dledger-benchmarks.jar com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark
# JMH version: 1.35
# VM version: JDK 1.8.0_312, OpenJDK 64-Bit Server VM, 25.312-b07
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent
# Run progress: 0.00% complete, ETA 00:01:40
# Fork: 1 of 2
# Warmup Iteration 1: 42984830.445 ops/s
# Warmup Iteration 2: 38416032.086 ops/s
Iteration 1: 39423217.883 ops/s
Iteration 2: 41176380.793 ops/s
Iteration 3: 41146374.512 ops/s
# Run progress: 25.00% complete, ETA 00:01:15
# Fork: 2 of 2
# Warmup Iteration 1: 42000026.386 ops/s
# Warmup Iteration 2: 37801042.365 ops/s
Iteration 1: 38829603.637 ops/s
Iteration 2: 39873697.470 ops/s
Iteration 3: 39745609.268 ops/s
Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent":
40032480.594 ±(99.9%) 2652852.116 ops/s [Average]
(min, avg, max) = (38829603.637, 40032480.594, 41176380.793), stdev = 946032.620
CI (99.9%): [37379628.478, 42685332.710] (assumes normal distribution)
# JMH version: 1.35
# VM version: JDK 1.8.0_312, OpenJDK 64-Bit Server VM, 25.312-b07
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent
# Run progress: 50.00% complete, ETA 00:00:50
# Fork: 1 of 2
# Warmup Iteration 1: 867041643.912 ops/s
# Warmup Iteration 2: 879285059.514 ops/s
Iteration 1: 946381466.173 ops/s
Iteration 2: 829405048.842 ops/s
Iteration 3: 904068960.001 ops/s
# Run progress: 75.00% complete, ETA 00:00:25
# Fork: 2 of 2
# Warmup Iteration 1: 801375284.635 ops/s
# Warmup Iteration 2: 1023301849.860 ops/s
Iteration 1: 864155832.112 ops/s
Iteration 2: 939504012.429 ops/s
Iteration 3: 913166819.166 ops/s
Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent":
899447023.121 ±(99.9%) 126458224.528 ops/s [Average]
(min, avg, max) = (829405048.842, 899447023.121, 946381466.173), stdev = 45096221.050
CI (99.9%): [772988798.593, 1025905247.649] (assumes normal distribution)
# Run complete. Total time: 00:01:41
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent thrpt 6 40032480.594 ± 2652852.116 ops/s
ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent thrpt 6 899447023.121 ± 126458224.528 ops/s
从结果发现:在JDK8的版本两种方式的性能相差了一个数量级。所以性能有问题。
更多的问题详情可以查看:OpenJDK bug report
4. 解决方法
4.1 通过代码调整来解
从上面的代码分析可以知道,问题主要是由于在存在key的时候,进入了synchronized
代码块。我们就从这个方面入手。解决方式代码如下:
Object result = concurrentMap.get(KEY);
if (null == result) {
result = concurrentMap.computeIfAbsent(KEY, key -> 1);
}
上面的性能测试已经验证了代码的可行性,对性能有较大的优化。
4.2 升级JDK版本
这个性能问题已经JDK9+版本解决了。运行上面的测试代码:
jdk11 version:
openjdk version "11.0.14.1" 2022-02-08
OpenJDK Runtime Environment (build 11.0.14.1+1-Ubuntu-0ubuntu1.18.04)
OpenJDK 64-Bit Server VM (build 11.0.14.1+1-Ubuntu-0ubuntu1.18.04, mixed mode, sharing)
root@GZYZFA00088591:/mnt/d/develop/github/benchmark/dledger-benchmark# java -jar target/dledger-benchmarks.jar com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.openjdk.jmh.util.Utils (file:/mnt/d/develop/github/benchmark/dledger-benchmark/target/dledger-benchmarks.jar) to method java.io.Console.encoding()
WARNING: Please consider reporting this to the maintainers of org.openjdk.jmh.util.Utils
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
# JMH version: 1.35
# VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+1-Ubuntu-0ubuntu1.18.04
# VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent
# Run progress: 0.00% complete, ETA 00:01:40
# Fork: 1 of 2
# Warmup Iteration 1: 734790850.730 ops/s
# Warmup Iteration 2: 723040919.893 ops/s
Iteration 1: 641529840.743 ops/s
Iteration 2: 711719044.738 ops/s
Iteration 3: 683114525.450 ops/s
# Run progress: 25.00% complete, ETA 00:01:16
# Fork: 2 of 2
# Warmup Iteration 1: 658869896.647 ops/s
# Warmup Iteration 2: 634635199.677 ops/s
Iteration 1: 682154939.057 ops/s
Iteration 2: 684056273.846 ops/s
Iteration 3: 684466801.465 ops/s
Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent":
681173570.883 ±(99.9%) 63060380.600 ops/s [Average]
(min, avg, max) = (641529840.743, 681173570.883, 711719044.738), stdev = 22487939.188
CI (99.9%): [618113190.283, 744233951.483] (assumes normal distribution)
# JMH version: 1.35
# VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+1-Ubuntu-0ubuntu1.18.04
# VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent
# Run progress: 50.00% complete, ETA 00:00:50
# Fork: 1 of 2
# Warmup Iteration 1: 913175624.693 ops/s
# Warmup Iteration 2: 889999077.476 ops/s
Iteration 1: 755423064.811 ops/s
Iteration 2: 819973081.401 ops/s
Iteration 3: 819069238.312 ops/s
# Run progress: 75.00% complete, ETA 00:00:25
# Fork: 2 of 2
# Warmup Iteration 1: 755788304.744 ops/s
# Warmup Iteration 2: 749372817.128 ops/s
Iteration 1: 840515776.127 ops/s
Iteration 2: 872478443.625 ops/s
Iteration 3: 848340384.022 ops/s
Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent":
825966664.716 ±(99.9%) 111714396.868 ops/s [Average]
(min, avg, max) = (755423064.811, 825966664.716, 872478443.625), stdev = 39838430.078
CI (99.9%): [714252267.848, 937681061.585] (assumes normal distribution)
# Run complete. Total time: 00:01:41
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent thrpt 6 681173570.883 ± 63060380.600 ops/s
ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent thrpt 6 825966664.716 ± 111714396.868 ops/s
虽然性能有差距但是不大。
5. 学以致用-RocketMQ性能优化
在研究代码过程中发现RocketMQ中也有相对应的代码,对于RocketMQ这样的消息中间件性能是尤为重要。那么就提个ISSUE来一波性能优化。
笔者在RocketMQ提的ISSUE#4999
6. 总结
如果对性能有着很高的追求对于computeIfAbsent的优化是必须的,如果性能可能都达不到这个水平不优化也是可以的。优化的两种方式:
- 开发者自身代码解决-需要开发人员自身处理
- 升级JDK11或者JDK17-无效修改代码
我是蚂蚁背大象,文章对你有帮助点赞关注我,文章有不正确的地方请您斧正留言评论~谢谢
转载自:https://juejin.cn/post/7140227160454725646