likes
comments
collection
share

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

作者站长头像
站长
· 阅读数 70

一、测试结果

测试结论1:2台64核256g阿里云服务器组成的2节点集群,hadoop3.2.2 作为分布式存储,Cloudwave4.0在 SSB30g 国际标准测试集下,整体性能优于Starrocks3.0近0.4倍。 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

数据库数据集响应时间(ms)CPU 最大占用率存储压缩比数据导入时间
Cloudwave4.0ssb3074810.8%(696%/6400%)56.7%(10.2g/18g)82秒
Starrocks3.0ssb30105733.3%(2131%/6400%)46.1%(8.3g/18g)95秒

测试结论2:2台64核256g阿里云服务器组成的2节点集群,hadoop3.2.2 作为分布式存储,Cloudwave4.0在 SSB100g 国际标准测试集下,整体性能优于Starrocks3.0近0.9倍。 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

数据库数据集响应时间(ms)CPU 最大占用率存储压缩比数据导入时间
Cloudwave4.0ssb100112844.3%(2834%/6400%)58.6%(34.6g/59g)9分钟24秒
Starrocks3.0ssb100219153.7%(3437%/6400%)49.15%(29g/59g)6分钟

二、评测环境

  • 硬件环境:2台 64核256g 云服务器(组成2节点的集群),essd pl1 高效云盘
  • 软件环境:jdk19(Cloudwave4.0官方推荐版本,官方基于jdk19版本里头的的vector api,实现全面向量化引擎)、jdk8(starrocks安装推荐jdk版本,主要用于fe,亦可少踩坑)、mysql8(作为starrocks的客户端)、hadoop 3.2.2(作为cloudwave 和 starrocks 共同的分布式存储,副本数=2)
  • 软件版本:Cloudwave 4.0(最新版在2023年5月份发版),Starrocks 3.0(最新版在2023年4月份发版)
  • 评测数据集:ssb30,ssb100

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

三、评测方法

  • 执行19轮测试脚本,每轮执行13条标准测试sql,去除第1轮的测试数据(由于IO原因,第1次查询两边的性能均受IO影响,本测试主要测数据库引擎的算法在同等计算资源的条件下的优劣,因此去除第一轮测试数据),将余下的18轮测试数据做平均,获得每条sql的平均耗时
  • 观察最大CPU占用
  • 观察存储压缩比
  • 观察数据加载时间

四、开始测试

  1. 启动hadoop 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
hdfs namenode -format
start-dfs.sh
jps

2. [cloudwave]启动 Cloudwave4.0 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

./start-all-server.sh
jps

3. [cloudwave]创建 cloudwave 数据上传目录

dfs dfs -mkdir -p /cloudwave/uploads

4. [cloudwave]上传 ssb30 数据 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

hdfs dfs -put ssb30 /cloudwave/uploads/

5. [cloudwave]加载数据 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 执行数据导入命令 loaddata ssb30
./cplus_go.bin -s 'loaddata ssb30'
  • [cloudwave]查看数据导入情况 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点) 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
  • 可以看到30g的数据,82s就导入完成了
  • 通过 hdfs 命令,可以看到cloudwave做了数据压缩,ssb30数据的原始大小是18G,导入cloudwave数据库之后,压缩到了10.2g(图中的20.3G 表示hdfs两个数据副本的总大小) 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
  1. [cloudwave]开始测试
  • 执行测试脚本./test_ssb.sh,七镜观察到cloudwave 的2节点集群测ssb30 CPU最大占用是696%/6400% 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
  • 执行分析脚本./analysis.sh cloudwave "$(ls n*txt)" + 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
  1. [cloudwave]按上述步骤测试ssb100
  • 上传数据到hdfs 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 查看上传的数据 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 执行导入数据命令 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点) 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • ssb100数据的原始大小是59G,导入cloudwave数据库之后,压缩到了34.6g 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • cloudwave 的2节点集群测ssb100 CPU最大占用是2834%/6400% 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 分析测试结果 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  1. [starrocks] 启动 starrocks3.0 fe 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
./start_fe.sh --daemon

9. [starrocks] 添加starrocks3.0 be 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

mysql -uroot -h127.0.0.1 -P9030
ALTER SYSTEM ADD BACKEND "172.17.161.33:9050"; 
ALTER SYSTEM ADD BACKEND "172.17.161.30:9050"; 

10. [starrocks] 启动 starrocks 3.0 be 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

./start_be.sh --daemon
ps -ef | grep starrocks

11. [starrocks] 验证集群状态,两个节点的 Alive=true 即可。 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  1. [starrocks] 创建表 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  2. [starrocks] 开始导入数据,ssb30导入时间是95s 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

date && ./bin/stream_load.sh data_dir/ssb30 && date

14. [starrocks] 查看ssb30 压缩比,ssb30数据的原始大小是18G,导入starrocks数据库之后,压缩到了8.3g

678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  1. [starrocks] 开始测试
  • 执行测试脚本./test_ssb.sh,七镜观察到 starrocks 的2节点集群测ssb30 CPU最大占用是2131%/6400% 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
  • 执行分析脚本./analysis.sh starrocks "$(ls n*txt)" + 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)
  1. [starrocks]按上述步骤测试ssb100
  • 创建表 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 导数据,6分钟导入完成ssb100g数据 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点) 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 查看压缩比,ssb100数据的原始大小是59G,导入starrocks数据库之后,压缩到了29g 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 执行测试,starrocks 的2节点集群测ssb100 CPU最大占用是3437%/6400% 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

  • 分析测试结果 678. 【数据库评测】Cloudwave 4.0 集群版(2节点) VS Starrocks 3.0 集群版(2节点)

五、附加

  1. Cloudwave 测试脚本
#!/bin/bash
# Program:
#       test ssb
# History:
# 2023/03/17    junfenghe.cloud@qq.com  version:0.0.1

rm -rf ./n*txt
for ((i=1; i<20; i++))
do

    cat sql_ssb.sql |./cplus.sh > n${i}.txt

done

  1. Starrocks 测试脚本
#!/bin/bash
# Program:
#       test ssb
# History:
# 2023/03/17    junfenghe.cloud@qq.com  version:0.0.1

rm -rf ./n*txt
for ((i=1; i<20; i++))
do

    cat sql_ssb.sql | mysql -uroot -P 9030 -h 127.0.0.1 -v -vv -vvv >n${i}.txt

done

  1. 分析脚本
#!/bin/bash
#Program:
#       analysis cloudwave/starrocks logs of base compute
#History:
#2023/02/20     junfenghe.cloud@qq.com  version:0.0.1

path=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/sbin:/usr/local/bin:~/bin
export path



suff="(s)#####"

if [ -z "${1}" ]
then

        echo "Please input database'name"
        exit -1

fi

if [ -z "$2" ]
then

        echo "Please input times of scanner"
        exit -f
fi

if [ -n "${3}" ]
then
        suff=${3}
fi

for current in ${2}
do
        result_time=""

        if [ "${1}" == "starrocks" ]
        then
            for time in $( cat ${current} | grep sec  | awk -F '('  '{print $2}' | awk -F ' ' '{print $1}' )
            do
                result_time="${result_time}${time}${suff}"
            done
        elif [ "${1}" == "cloudwave" ]
        then
            for time in $( cat ${current} | grep Elapsed | awk '{print $2}'| sed 's/:/*60+/g'| sed 's/+00\*60//g ; s/+0\*60//g ; s/^0\*60+//g' )
            do
                result_time="${result_time}${time}${suff}"
            done
        fi

        echo ${result_time%${suff}*}

done


exit 0

4. sql_ssb.sql

use ssb100;
select sum(lo_revenue) as revenue from lineorder,dates where lo_orderdate = d_datekey and d_year = 1993 and lo_discount between 1 and 3 and lo_quantity < 25;
select sum(lo_revenue) as revenue from lineorder,dates where lo_orderdate = d_datekey and d_yearmonthnum = 199401 and lo_discount between 4 and 6 and lo_quantity between 26 and 35;
select sum(lo_revenue) as revenue from lineorder,dates where lo_orderdate = d_datekey and d_weeknuminyear = 6 and d_year = 1994 and lo_discount between 5 and 7 and lo_quantity between 26 and 35;
select sum(lo_revenue) as lo_revenue, d_year, p_brand from lineorder ,dates,part,supplier where lo_orderdate = d_datekey and lo_partkey = p_partkey and lo_suppkey = s_suppkey and p_category = 'MFGR#12' and s_region = 'AMERICA' group by d_year, p_brand order by d_year, p_brand;
select sum(lo_revenue) as lo_revenue, d_year, p_brand from lineorder,dates,part,supplier where lo_orderdate = d_datekey and lo_partkey = p_partkey and lo_suppkey = s_suppkey and p_brand between 'MFGR#2221' and 'MFGR#2228' and s_region = 'ASIA' group by d_year, p_brand order by d_year, p_brand;
select sum(lo_revenue) as lo_revenue, d_year, p_brand from lineorder,dates,part,supplier where lo_orderdate = d_datekey and lo_partkey = p_partkey and lo_suppkey = s_suppkey and p_brand = 'MFGR#2239' and s_region = 'EUROPE' group by d_year, p_brand order by d_year, p_brand;
select c_nation, s_nation, d_year, sum(lo_revenue) as lo_revenue from lineorder,dates,customer,supplier where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and c_region = 'ASIA' and s_region = 'ASIA'and d_year >= 1992 and d_year <= 1997 group by c_nation, s_nation, d_year order by d_year asc, lo_revenue desc;
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue from lineorder,dates,customer,supplier where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and  c_nation = 'UNITED STATES' and s_nation = 'UNITED STATES' and d_year >= 1992 and d_year <= 1997 group by c_city, s_city, d_year order by d_year asc, lo_revenue desc;
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue from lineorder,dates,customer,supplier where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and (c_city='UNITED KI1' or c_city='UNITED KI5') and (s_city='UNITED KI1' or s_city='UNITED KI5') and d_year >= 1992 and d_year <= 1997 group by c_city, s_city, d_year order by d_year asc, lo_revenue desc;
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue from lineorder,dates,customer,supplier where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and (c_city='UNITED KI1' or c_city='UNITED KI5') and (s_city='UNITED KI1' or s_city='UNITED KI5') and d_yearmonth  = 'Dec1997' group by c_city, s_city, d_year order by d_year asc, lo_revenue desc;
select d_year, c_nation, sum(lo_revenue) - sum(lo_supplycost) as profit from lineorder,dates,customer,supplier,part where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and lo_partkey = p_partkey and c_region = 'AMERICA' and s_region = 'AMERICA' and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2') group by d_year, c_nation order by d_year, c_nation;
select d_year, s_nation, p_category, sum(lo_revenue) - sum(lo_supplycost) as profit from lineorder,dates,customer,supplier,part where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and lo_partkey = p_partkey and c_region = 'AMERICA'and s_region = 'AMERICA' and (d_year = 1997 or d_year = 1998) and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2') group by d_year, s_nation, p_category order by d_year, s_nation, p_category;
select d_year, s_city, p_brand, sum(lo_revenue) - sum(lo_supplycost) as profit from lineorder,dates,customer,supplier,part where lo_orderdate = d_datekey and lo_custkey = c_custkey and lo_suppkey = s_suppkey and lo_partkey = p_partkey and c_region = 'AMERICA'and s_nation = 'UNITED STATES' and (d_year = 1997 or d_year = 1998) and p_category = 'MFGR#14' group by d_year, s_city, p_brand order by d_year, s_city, p_brand;

七镜还将带来Cloudwave 4.0 集群版 VS Starrocks 3.0 集群版 在 1T SSB数据集上的评测。

转载自:https://juejin.cn/post/7243607462008979493
评论
请登录