likes
comments
collection
share

ConcurrentHashMap#computeIfAbsent 性能问题和解决方案

作者站长头像
站长
· 阅读数 9

我报名参加金石计划1期挑战——瓜分10万奖池,这是我的第2篇文章,点击查看活动详情

JDK代码代码给人的感觉就是简洁、高性能。但是凡是代码都有可能有问题,今天就来聊聊JDK1.8中 ConcurrentHashMap#computeIfAbsent 的性能问题以及如何避免和解决。

1. 现象重现

测试代码:

package com.github.mxsm.dledger.benchmark;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrentHashMapTest {
    public static Map<String,Object> objectMap = new ConcurrentHashMap<>();

    public static String key = "key";

    public static void main(String[] args) {

        int nThreads = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
        for(int i = 1; i <=nThreads;++i){
            executorService.execute(() -> {
                for (;;){
                    objectMap.computeIfAbsent(key,key->0);
                }
            });
        }
    }
}

编译完成后用命令运行:

$ java com.github.mxsm.dledger.benchmark.ConcurrentHashMapTest

然后使用 async-profiler 工具监控,监控命令如下:

$ ./profiler.sh -e lock --lock 1ms -d 240 -o jfr -f output.jfr $PID

然后将生成的 output.jfr 用IDEA的Profiler查看。

ConcurrentHashMap#computeIfAbsent 性能问题和解决方案

会发现有很多的Java Monitor Blocked事件。

Tips: async-profiler使用参照《Java 性能分析工具 async-profiler》

其实这个是一个 ConcurrentHashMap#computeIfAbsent 的一个性能问题。下面来进行逐一分析。

2. 原因分析

ConcurrentHashMap#computeIfAbsent 方法从语义上理解:当Key对应的Value不存在的时候将mappingFunction的值放入Map,存在就直接获取出来。但是从上面的代码运行的现象来看似乎存在的时候进入了 synchronized 代码块。下面来分析一下代码(JDK1.8):

    public V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
        if (key == null || mappingFunction == null) {
            throw new NullPointerException();
        }
        int h = spread(key.hashCode());
        V val = null;
        int binCount = 0;
        for (Node<K, V>[] tab = table; ; ) {
            Node<K, V> f;
            int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
            //初始化
            {
                tab = initTable();
            } else if ((f = tabAt(tab, i = (n - 1) & h)) == null) {
                Node<K, V> r = new ReservationNode<K, V>();
                //value为空的情况
                synchronized (r) {
                    //省略代码
                }
                if (binCount != 0) {
                    break;
                }
            } else if ((fh = f.hash) == MOVED)
            //数据转移
            {
                tab = helpTransfer(tab, f);
            } else {
                boolean added = false;
                //value存在的时候
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        //省略代码
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD) {
                        treeifyBin(tab, i);
                    }
                    if (!added) {
                        return val;
                    }
                    break;
                }
            }
        }
        if (val != null) {
            addCount(1L, binCount);
        }
        return val;
    }

从上面代码可以看出来,获取key对应的存在的value的时候会进入 synchronized 代码块。

3. 测试性能验证

使用JMH来进行测试,代码如下:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;


@Warmup(iterations = 2, time = 5)
@Measurement(iterations = 3, time = 5)
@State(Scope.Benchmark)
@Fork(2)
public class HashMapBenchmark {

    private static final String KEY = "mxsm";

    private final Map<String, Object> concurrentMap = new ConcurrentHashMap<>();

    @Setup(Level.Iteration)
    public void setup() {
        concurrentMap.clear();
    }

    @Benchmark
    @Threads(16)
    public Object benchmarkGetBeforeComputeIfAbsent() {
        Object result = concurrentMap.get(KEY);
        if (null == result) {
            result = concurrentMap.computeIfAbsent(KEY, key -> 1);
        }
        return result;
    }

    @Benchmark
    @Threads(16)
    public Object benchmarkComputeIfAbsent() {
        return concurrentMap.computeIfAbsent(KEY, key -> 1);
    }

}

然后打包通过命令运行jar

java -jar target/dledger-benchmarks.jar com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark

运行结果:

jdk version:
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~18.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)

root@GZYZFA00088591:/mnt/d/develop/github/benchmark/dledger-benchmark# java -jar target/dledger-benchmarks.jar com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark
# JMH version: 1.35
# VM version: JDK 1.8.0_312, OpenJDK 64-Bit Server VM, 25.312-b07
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent

# Run progress: 0.00% complete, ETA 00:01:40
# Fork: 1 of 2
# Warmup Iteration   1: 42984830.445 ops/s
# Warmup Iteration   2: 38416032.086 ops/s
Iteration   1: 39423217.883 ops/s
Iteration   2: 41176380.793 ops/s
Iteration   3: 41146374.512 ops/s

# Run progress: 25.00% complete, ETA 00:01:15
# Fork: 2 of 2
# Warmup Iteration   1: 42000026.386 ops/s
# Warmup Iteration   2: 37801042.365 ops/s
Iteration   1: 38829603.637 ops/s
Iteration   2: 39873697.470 ops/s
Iteration   3: 39745609.268 ops/s


Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent":
  40032480.594 ±(99.9%) 2652852.116 ops/s [Average]
  (min, avg, max) = (38829603.637, 40032480.594, 41176380.793), stdev = 946032.620
  CI (99.9%): [37379628.478, 42685332.710] (assumes normal distribution)


# JMH version: 1.35
# VM version: JDK 1.8.0_312, OpenJDK 64-Bit Server VM, 25.312-b07
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent

# Run progress: 50.00% complete, ETA 00:00:50
# Fork: 1 of 2
# Warmup Iteration   1: 867041643.912 ops/s
# Warmup Iteration   2: 879285059.514 ops/s
Iteration   1: 946381466.173 ops/s
Iteration   2: 829405048.842 ops/s
Iteration   3: 904068960.001 ops/s

# Run progress: 75.00% complete, ETA 00:00:25
# Fork: 2 of 2
# Warmup Iteration   1: 801375284.635 ops/s
# Warmup Iteration   2: 1023301849.860 ops/s
Iteration   1: 864155832.112 ops/s
Iteration   2: 939504012.429 ops/s
Iteration   3: 913166819.166 ops/s


Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent":
  899447023.121 ±(99.9%) 126458224.528 ops/s [Average]
  (min, avg, max) = (829405048.842, 899447023.121, 946381466.173), stdev = 45096221.050
  CI (99.9%): [772988798.593, 1025905247.649] (assumes normal distribution)


# Run complete. Total time: 00:01:41

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                                      Mode  Cnt          Score           Error  Units
ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent           thrpt    6   40032480.594 ±   2652852.116  ops/s
ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent  thrpt    6  899447023.121 ± 126458224.528  ops/s

从结果发现:在JDK8的版本两种方式的性能相差了一个数量级。所以性能有问题。

更多的问题详情可以查看:OpenJDK bug report

4. 解决方法

4.1 通过代码调整来解

从上面的代码分析可以知道,问题主要是由于在存在key的时候,进入了synchronized 代码块。我们就从这个方面入手。解决方式代码如下:

 Object result = concurrentMap.get(KEY);
if (null == result) {
    result = concurrentMap.computeIfAbsent(KEY, key -> 1);
}

上面的性能测试已经验证了代码的可行性,对性能有较大的优化。

4.2 升级JDK版本

这个性能问题已经JDK9+版本解决了。运行上面的测试代码:

jdk11 version:
openjdk version "11.0.14.1" 2022-02-08
OpenJDK Runtime Environment (build 11.0.14.1+1-Ubuntu-0ubuntu1.18.04)
OpenJDK 64-Bit Server VM (build 11.0.14.1+1-Ubuntu-0ubuntu1.18.04, mixed mode, sharing)

root@GZYZFA00088591:/mnt/d/develop/github/benchmark/dledger-benchmark# java -jar target/dledger-benchmarks.jar com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.openjdk.jmh.util.Utils (file:/mnt/d/develop/github/benchmark/dledger-benchmark/target/dledger-benchmarks.jar) to method java.io.Console.encoding()
WARNING: Please consider reporting this to the maintainers of org.openjdk.jmh.util.Utils
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
# JMH version: 1.35
# VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+1-Ubuntu-0ubuntu1.18.04
# VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent

# Run progress: 0.00% complete, ETA 00:01:40
# Fork: 1 of 2
# Warmup Iteration   1: 734790850.730 ops/s
# Warmup Iteration   2: 723040919.893 ops/s
Iteration   1: 641529840.743 ops/s
Iteration   2: 711719044.738 ops/s
Iteration   3: 683114525.450 ops/s

# Run progress: 25.00% complete, ETA 00:01:16
# Fork: 2 of 2
# Warmup Iteration   1: 658869896.647 ops/s
# Warmup Iteration   2: 634635199.677 ops/s
Iteration   1: 682154939.057 ops/s
Iteration   2: 684056273.846 ops/s
Iteration   3: 684466801.465 ops/s


Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent":
  681173570.883 ±(99.9%) 63060380.600 ops/s [Average]
  (min, avg, max) = (641529840.743, 681173570.883, 711719044.738), stdev = 22487939.188
  CI (99.9%): [618113190.283, 744233951.483] (assumes normal distribution)


# JMH version: 1.35
# VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+1-Ubuntu-0ubuntu1.18.04
# VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 5 s each
# Measurement: 3 iterations, 5 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent

# Run progress: 50.00% complete, ETA 00:00:50
# Fork: 1 of 2
# Warmup Iteration   1: 913175624.693 ops/s
# Warmup Iteration   2: 889999077.476 ops/s
Iteration   1: 755423064.811 ops/s
Iteration   2: 819973081.401 ops/s
Iteration   3: 819069238.312 ops/s

# Run progress: 75.00% complete, ETA 00:00:25
# Fork: 2 of 2
# Warmup Iteration   1: 755788304.744 ops/s
# Warmup Iteration   2: 749372817.128 ops/s
Iteration   1: 840515776.127 ops/s
Iteration   2: 872478443.625 ops/s
Iteration   3: 848340384.022 ops/s


Result "com.github.mxsm.dledger.benchmark.ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent":
  825966664.716 ±(99.9%) 111714396.868 ops/s [Average]
  (min, avg, max) = (755423064.811, 825966664.716, 872478443.625), stdev = 39838430.078
  CI (99.9%): [714252267.848, 937681061.585] (assumes normal distribution)


# Run complete. Total time: 00:01:41

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                                      Mode  Cnt          Score           Error  Units
ConcurrentHashMapBenchmark.benchmarkComputeIfAbsent           thrpt    6  681173570.883 ±  63060380.600  ops/s
ConcurrentHashMapBenchmark.benchmarkGetBeforeComputeIfAbsent  thrpt    6  825966664.716 ± 111714396.868  ops/s

虽然性能有差距但是不大。

5. 学以致用-RocketMQ性能优化

在研究代码过程中发现RocketMQ中也有相对应的代码,对于RocketMQ这样的消息中间件性能是尤为重要。那么就提个ISSUE来一波性能优化。

笔者在RocketMQ提的ISSUE#4999

6. 总结

如果对性能有着很高的追求对于computeIfAbsent的优化是必须的,如果性能可能都达不到这个水平不优化也是可以的。优化的两种方式:

  • 开发者自身代码解决-需要开发人员自身处理
  • 升级JDK11或者JDK17-无效修改代码

我是蚂蚁背大象,文章对你有帮助点赞关注我,文章有不正确的地方请您斧正留言评论~谢谢