Setting up a StarRocks development environment

StarRocks is an MPP database.

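StarRocks is a next-generation, high-speed MPP (Massively Parallel Processing) database for all analytics scenarios, designed for high-performance data analysis. Its architecture combines ideas from MPP databases and distributed systems, and has the following characteristics:

1. **High-performance queries**:
   - StarRocks uses a fully vectorized engine and a cost-based optimizer (CBO) to reach sub-second query latency, and it performs especially well on multi-table joins.

2. **Real-time analytics**:
   - Data can be updated in real time and queried efficiently, which suits real-time data warehouses and real-time metric monitoring.

3. **Flexible data modeling**:
   - Several data models are supported, including wide tables, star schemas, and snowflake schemas, to cover complex analysis needs.

4. **Lakehouse**:
   - It combines the flexibility of a data lake with the analytical power of a data warehouse, providing one platform that simplifies data storage, processing, and analysis.

5. **High-concurrency queries**:
   - Optimized query scheduling and resource allocation keep the system stable and responsive when many users access it at the same time.

6. **Compatibility**:
   - It is compatible with the MySQL protocol and supports standard SQL, so it integrates easily with common BI tools such as Tableau and Power BI.

StarRocks aims to make data analysis simpler and more agile. It fits a range of enterprise analytics needs, including OLAP multidimensional analysis, real-time analytics, and high-concurrency queries. To learn more, see the [official documentation](https://docs.starrocks.io/zh/docs/introduction/what_is_starrocks/) or the [community resources](https://docs.starrocks.io/zh/docs/introduction/StarRocks_intro/).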


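Shared-data (storage-compute separation)

The StarRocks shared-data architecture separates storage from compute in order to improve resource utilization, elasticity, and scalability. Its main characteristics are:

  1. Storage and compute are separated

    • Data lives in a remote storage system, for example Amazon S3, Google Cloud Storage, Azure Blob Storage, or S3-compatible storage such as MinIO.
    • Compute nodes (CN) execute queries and do not store data.
  2. Local cache

    • Hot data is cached on local disk. When a query hits the cache, performance is comparable to the shared-nothing architecture.
    • Cache warm-up is supported, so required data can be loaded ahead of time to speed up queries.
  3. Elastic scaling

    • Compute nodes can be scaled out or in on demand, within seconds.
    • Storage costs are lower, while resource isolation is preserved.
  4. Multiple storage types

    • Including HDFS, Azure Blob, AWS S3, and other storage services.
  5. Use cases

    • The shared-data architecture is especially suited to cloud environments, where it lowers storage cost and improves resource isolation.

For more details, see the related tutorials in the official documentation. Next, I create a shared-data deployment that is started with docker-compose: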

services:
  minio:
    container_name: minio
    environment:
      MINIO_ROOT_USER: miniouser
      MINIO_ROOT_PASSWORD: miniopassword
    image: minio/minio:latest
    ports:
      - "9001:9001"
      - "9000:9000"
    entrypoint: sh
    command: '-c ''mkdir -p /minio_data/starrocks && minio server /minio_data --console-address ":9001"'''
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 5s
      timeout: 5s
      retries: 5

  minio_mc:
    # This service is short lived, it does this:
    # - starts up
    # - checks to see if the MinIO service `minio` is ready
    # - creates a MinIO Access Key that the StarRocks services will use
    # - exits
    image: minio/mc:latest
    entrypoint:
      - sh
      - -c
      - |
        until mc ls minio > /dev/null 2>&1; do
          sleep 0.5
        done

        mc alias set myminio http://minio:9000 miniouser miniopassword
        mc admin user svcacct add --access-key AAAAAAAAAAAAAAAAAAAA \
        --secret-key BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB \
        myminio \
        miniouser        
    depends_on:
        minio:
          condition: service_healthy

  starrocks-fe:
    image: starrocks/fe-ubuntu:3.3-latest
    hostname: starrocks-fe
    container_name: starrocks-fe
    user: root
    command:
      - /bin/bash
      - -c
      - |
        echo "# enable shared data, set storage type, set endpoint" >> /opt/starrocks/fe/conf/fe.conf
        echo "run_mode = shared_data" >> /opt/starrocks/fe/conf/fe.conf
        echo "cloud_native_storage_type = S3" >> /opt/starrocks/fe/conf/fe.conf
        echo "aws_s3_endpoint = minio:9000" >> /opt/starrocks/fe/conf/fe.conf

        echo "# set the path in MinIO" >> /opt/starrocks/fe/conf/fe.conf
        echo "aws_s3_path = starrocks" >> /opt/starrocks/fe/conf/fe.conf

        echo "# credentials for MinIO object read/write" >> /opt/starrocks/fe/conf/fe.conf
        echo "aws_s3_access_key = AAAAAAAAAAAAAAAAAAAA" >> /opt/starrocks/fe/conf/fe.conf
        echo "aws_s3_secret_key = BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB" >> /opt/starrocks/fe/conf/fe.conf
        echo "aws_s3_use_instance_profile = false" >> /opt/starrocks/fe/conf/fe.conf
        echo "aws_s3_use_aws_sdk_default_behavior = false" >> /opt/starrocks/fe/conf/fe.conf

        echo "# Set this to false if you do not want default" >> /opt/starrocks/fe/conf/fe.conf
        echo "# storage created in the object storage using" >> /opt/starrocks/fe/conf/fe.conf
        echo "# the details provided above" >> /opt/starrocks/fe/conf/fe.conf
        echo "enable_load_volume_from_conf = true" >> /opt/starrocks/fe/conf/fe.conf

        /opt/starrocks/fe/bin/start_fe.sh --host_type FQDN        
    ports:
      - 8030:8030
      - 9020:9020
      - 9030:9030
    healthcheck:
      test: 'mysql -u root -h starrocks-fe -P 9030 -e "show frontends\G" |grep "Alive: true"'
      interval: 10s
      timeout: 5s
      retries: 3
    depends_on:
        minio:
            condition: service_healthy

  starrocks-cn:
    image: starrocks/cn-ubuntu:3.3-latest
    command:
      - /bin/bash
      - -c
      - |
        sleep 15s;
        ulimit -u 65535;
        ulimit -n 65535;
        mysql --connect-timeout 2 -h starrocks-fe -P9030 -uroot -e "ALTER SYSTEM ADD COMPUTE NODE \"starrocks-cn:9050\";"
        /opt/starrocks/cn/bin/start_cn.sh        
    environment:
      - HOST_TYPE=FQDN
    ports:
      - 8040:8040
    hostname: starrocks-cn
    container_name: starrocks-cn
    user: root
    depends_on:
      starrocks-fe:
        condition: service_healthy
        restart: true
      minio:
        condition: service_healthy
    healthcheck:
      test: 'mysql -u root -h starrocks-fe -P 9030 -e "SHOW COMPUTE NODES\G" |grep "Alive: true"'
      interval: 10s
      timeout: 5s
      retries: 3
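
Both StarRocks healthchecks in the compose file above pipe `\G`-style output from the `mysql` client through `grep "Alive: true"`. As a minimal sketch, the same check could be written in Python, for instance in a script that waits for the cluster to come up. The function name and the abbreviated sample output are made up for illustration; real output contains many more columns.

```python
def any_node_alive(vertical_output: str) -> bool:
    """Return True if any record in vertical (one field per line) output has Alive set to true."""
    for line in vertical_output.splitlines():
        # Each field line looks like "   Alive: true"; split on the first colon only.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Alive" and value.strip().lower() == "true":
            return True
    return False


# Hypothetical, abbreviated sample of vertical output for one compute node.
sample = """\
*************************** 1. row ***************************
ComputeNodeId: 10001
           IP: starrocks-cn
        Alive: true
"""

print(any_node_alive(sample))
print(any_node_alive("Alive: false\n"))
```

Unlike the plain `grep`, this matches only a field whose name is exactly `Alive`, so a value that merely contains the text elsewhere would not count.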

Start the CN first:

 nerdctl run -p 9060:9060 -p 8040:8040 -p 9050:9050 -p 8060:8060 -p 9070:9070 -it   --name cn -e "TZ=Asia/Shanghai" starrocks/cn-ubuntu:3.4-latest

Enter the CN container:

 nerdctl exec -it cn /bin/bash
cd cn/conf
echo "priority_networks = 10.7.10.190/24" >>cn.properties
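
The value `10.7.10.190/24` is a host address with a prefix length, so it denotes the whole `10.7.10.0/24` network. The sketch below uses Python's standard `ipaddress` module to show which addresses fall inside such a range. It assumes StarRocks interprets the value as a CIDR range in the same way; the helper name is made up for illustration.

```python
import ipaddress


def matches_priority_network(host_ip: str, priority_network: str) -> bool:
    """Check whether host_ip falls inside the CIDR range given by priority_network."""
    # strict=False accepts a host address with a prefix length, e.g. 10.7.10.190/24.
    network = ipaddress.ip_network(priority_network, strict=False)
    return ipaddress.ip_address(host_ip) in network


print(ipaddress.ip_network("10.7.10.190/24", strict=False))       # 10.7.10.0/24
print(matches_priority_network("10.7.10.190", "10.7.10.190/24"))  # True
print(matches_priority_network("172.17.0.2", "10.7.10.190/24"))   # False
```

If the node has several interfaces, a check like this helps confirm that only one of its addresses matches the configured range.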

Next, restart the service:

# kill the running CN process first
bin/start_cn.sh --daemon

Next, start the FE in IDEA. This requires changing python to Python3.

Then install Protobuf.

After all of the steps above, compile with `mvn clean install -DskipTests=true`. If it finishes without errors, the build is fine.

Next, start the FE locally

Run the following commands in the starrocks directory:

cp -r conf fe/conf  
cp -r bin fe/bin  
cp -r webroot fe/webroot  
  
cd fe    
mkdir log    
mkdir meta
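
One caveat with the commands above: if `fe/conf` already exists, `cp -r conf fe/conf` copies into it and produces `fe/conf/conf`, and `mkdir` fails on a second run. A repeatable variant can be sketched in Python with the standard library. The function name is made up, and the demo runs against a throwaway directory tree that only stands in for the repository root.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_fe_dirs(repo_root: Path) -> None:
    """Copy conf, bin and webroot into fe/ and create fe/log and fe/meta."""
    fe_home = repo_root / "fe"
    for name in ("conf", "bin", "webroot"):
        # dirs_exist_ok merges into an existing target instead of nesting or failing.
        shutil.copytree(repo_root / name, fe_home / name, dirs_exist_ok=True)
    for name in ("log", "meta"):
        (fe_home / name).mkdir(parents=True, exist_ok=True)


# Demo on a temporary tree with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("conf", "bin", "webroot"):
        (root / name).mkdir()
        (root / name / "placeholder.txt").write_text("x")
    prepare_fe_dirs(root)
    prepare_fe_dirs(root)  # a second run must not nest directories or raise
    nested = (root / "fe" / "conf" / "conf").exists()
    copied = (root / "fe" / "conf" / "placeholder.txt").is_file()
    log_ready = (root / "fe" / "log").is_dir()

print(nested, copied, log_ready)
```

In a real checkout this would be called as `prepare_fe_dirs(Path("."))` from the starrocks directory. `dirs_exist_ok` requires Python 3.8 or later.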

The main class to launch is `com.starrocks.StarRocksFE`. Add the following environment variables to the run configuration:

# change these to your own directories
export PID_DIR=/Users/hxf/CodeSpace/starrocks/fe/bin  
export STARROCKS_HOME=/Users/hxf/CodeSpace/starrocks/fe  
export LOG_DIR=/Users/hxf/CodeSpace/starrocks/fe/log
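
All three variables derive from one FE home directory. The sketch below builds them from that path and prints them as a single `NAME=value;NAME=value` string. That format is an assumption about what the IDE's environment-variable field accepts, so adjust it if your version expects something else. The function name is made up for illustration.

```python
from pathlib import PurePosixPath


def fe_run_env(fe_home: str) -> dict:
    """Derive the three variables used above from the FE home directory."""
    home = PurePosixPath(fe_home)
    return {
        "PID_DIR": str(home / "bin"),
        "STARROCKS_HOME": str(home),
        "LOG_DIR": str(home / "log"),
    }


env = fe_run_env("/Users/hxf/CodeSpace/starrocks/fe")
# Assumed format for pasting into a run configuration: NAME=value pairs joined by ';'.
env_line = ";".join(f"{name}={value}" for name, value in env.items())
print(env_line)
```

Deriving the values from one path keeps them consistent when the checkout moves to another directory.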

Next, edit fe.conf:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

#####################################################################
## The uppercase properties are read and exported by bin/start_fe.sh.
## To see all Frontend configurations,
## see fe/fe-core/src/main/java/com/starrocks/common/Config.java

# the output dir of stderr/stdout/gc
LOG_DIR = ${STARROCKS_HOME}/log

DATE = "$(date +%Y%m%d-%H%M%S)"
JAVA_OPTS="-Dlog4j2.formatMsgNoLookups=true -Xmx8192m -XX:+UseG1GC -Xlog:gc*:${LOG_DIR}/fe.gc.log.$DATE:time -XX:ErrorFile=${LOG_DIR}/hs_err_pid%p.log -Djava.security.policy=${STARROCKS_HOME}/conf/udf_security.policy"

##
## the lowercase properties are read by main program.
##

# DEBUG, INFO, WARN, ERROR, FATAL
sys_log_level = INFO

# store metadata, create it if it is not exist.
# Default value is ${STARROCKS_HOME}/meta
# meta_dir = ${STARROCKS_HOME}/meta

http_port = 8030
rpc_port = 9020
query_port = 9030
edit_log_port = 9010
mysql_service_nio_enabled = true

# Enable jaeger tracing by setting jaeger_grpc_endpoint
# jaeger_grpc_endpoint = http://localhost:14250
run_mode = shared_data
cloud_native_storage_type = S3
aws_s3_endpoint = 10.7.10.190:9000
# set the path in MinIO
aws_s3_path = starrocks
# credentials for MinIO object read/write
# the key here is the access key created earlier
aws_s3_access_key = AAAAAAAAAAAAAAAAAAAA
aws_s3_secret_key = BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
aws_s3_use_instance_profile = false
aws_s3_use_aws_sdk_default_behavior = false
# Set this to false if you do not want default
# storage created in the object storage using
# the details provided above
enable_load_volume_from_conf = true

# Choose one if there are more than one ip except loopback address. 
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
priority_networks = 10.7.10.190/24

# Advanced configurations 
# log_roll_size_mb = 1024
# sys_log_dir = ${STARROCKS_HOME}/log
# sys_log_roll_num = 10
# sys_log_verbose_modules = 
# audit_log_dir = ${STARROCKS_HOME}/log
# audit_log_modules = slow_query, query
# audit_log_roll_num = 10
# meta_delay_toleration_second = 10
# qe_max_connection = 1024
# max_conn_per_user = 100
# qe_query_timeout_second = 300
# qe_slow_log_ms = 5000
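
A missing or misspelled shared-data key in fe.conf is easy to overlook. As a rough sketch, a small parser can read the `key = value` lines and report which of the shared-data settings used above are absent. The list of required keys simply mirrors the fe.conf above and is not an official list. The function names and the shortened sample are made up for illustration.

```python
# Keys taken from the shared-data section of the fe.conf above (not an official list).
REQUIRED_KEYS = (
    "cloud_native_storage_type",
    "aws_s3_endpoint",
    "aws_s3_path",
    "aws_s3_access_key",
    "aws_s3_secret_key",
)


def parse_fe_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def shared_data_problems(settings: dict) -> list:
    """List human-readable problems with the shared-data settings."""
    problems = []
    if settings.get("run_mode") != "shared_data":
        problems.append("run_mode is not shared_data")
    problems += [f"missing {key}" for key in REQUIRED_KEYS if not settings.get(key)]
    return problems


# Shortened sample that deliberately omits the two credential keys.
sample = """\
# excerpt
run_mode = shared_data
cloud_native_storage_type = S3
aws_s3_endpoint = 10.7.10.190:9000
aws_s3_path = starrocks
"""

print(shared_data_problems(parse_fe_conf(sample)))
# ['missing aws_s3_access_key', 'missing aws_s3_secret_key']
```

Against the real file, read its contents with `Path("fe/conf/fe.conf").read_text()` and expect an empty list.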

Start the FE in IDEA and check the console output.

Next, try connecting to the service from DBeaver.

Then connect the CN node to the FE. It is enough that lastStartTime shows a value.

Now run a quick test:

create database test;
use test;
admin set frontend config("tablet_create_timeout_second"="100");

CREATE TABLE IF NOT EXISTS par_tbl1
(
    datekey DATETIME,
    k1 INT,
    item_id STRING,
    v2 INT
) PRIMARY KEY (`datekey`, `k1`)
PARTITION BY date_trunc('day', `datekey`)
PROPERTIES (
    "compression" = "LZ4",
    "datacache.enable" = "true",
    "enable_async_write_back" = "false",
    "enable_persistent_index" = "true",
    "persistent_index_type" = "LOCAL",
    "replication_num" = "1",
    "storage_volume" = "builtin_storage_volume"
);

Note

The statement `admin set frontend config("tablet_create_timeout_second"="100")` is there so that the CREATE TABLE statement can complete. Without it, table creation fails with a timeout. Once it succeeds, the newly created table is visible.

Let's insert a row manually and see.
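
The original screenshots are not reproduced here, so the following is only a sketch of what a manual insert could look like. The row values are placeholders, not the data from the original post. The helper takes any DB-API 2.0 cursor, so it does not depend on a particular client library.

```python
# Hypothetical sample row; the values are placeholders.
INSERT_SQL = (
    "INSERT INTO test.par_tbl1 (datekey, k1, item_id, v2) "
    "VALUES ('2025-04-29 08:00:00', 1, 'item-1', 10)"
)
SELECT_SQL = "SELECT datekey, k1, item_id, v2 FROM test.par_tbl1"


def insert_and_read_back(cursor):
    """Run the insert, then return every row currently in the table.

    `cursor` is any DB-API 2.0 cursor connected to the FE over the MySQL protocol.
    """
    cursor.execute(INSERT_SQL)
    cursor.execute(SELECT_SQL)
    return cursor.fetchall()
```

Because the FE speaks the MySQL protocol, a MySQL client library such as PyMySQL should be able to supply the cursor, connecting to `query_port` 9030 as `root` as in the configuration above. The two SQL strings can also be pasted directly into DBeaver.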

Reference:

https://crossoverjie.top/2025/02/26/ob/StarRocks-dev-shard-data-build/

Licensed under CC BY-NC-SA 4.0
Last updated on Apr 29, 2025 08:25 UTC