ClickHouse Cluster Setup
Table of Contents
- ClickHouse Cluster Setup
 - Downloading the Packages
 - Single-Node Installation
 - Default Installation
 - Default Data Directory
 - Changing the Default Data Directory

- 2-Shard, 1-Replica, 3-Node Cluster Setup
 - 1. Configure hosts
 - 2. Set each host's hostname
 - 3. Upload the configuration files
 - Configuration file layout
 - chnode1 configuration files
 - chnode2 configuration files
 - chnode3 configuration files
 - 4. Restart clickhouse-server
 - 5. Test the cluster

- Reference List

Downloading the Packages
ClickHouse can be installed from deb/rpm packages or from tgz archives; for offline installation the tgz archives are the most convenient. This article covers offline installation.
tgz downloads: https://packages.clickhouse.com/tgz/lts
Single-Node Installation
Building the cluster requires a single-node installation on every host. Upload the tgz archives and install.sh to each server and run:
bash install.sh
 
- install.sh
 
#!/bin/bash
# Version and CPU architecture of the tgz packages to install
# (this host is ARM; use ARCH=amd64 on x86_64 machines).
export LATEST_VERSION=22.3.19.6
export ARCH=arm64
# clickhouse-common-static: the server/client binaries
tar -xzvf "clickhouse-common-static-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-$LATEST_VERSION/install/doinst.sh"
# clickhouse-common-static-dbg: debug symbols (optional)
tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh"
# clickhouse-server: config files and service scripts; the configure step
# may prompt to set a password for the default user
tar -xzvf "clickhouse-server-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-server-$LATEST_VERSION.tgz"
sudo "clickhouse-server-$LATEST_VERSION/install/doinst.sh" configure
sudo /etc/init.d/clickhouse-server start
# clickhouse-client: the command-line client
tar -xzvf "clickhouse-client-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-client-$LATEST_VERSION.tgz"
sudo "clickhouse-client-$LATEST_VERSION/install/doinst.sh"
 
Default Installation
Default Data Directory
- Default path: the data directory defaults to /var/lib/clickhouse.
- Re-initialization: delete this directory and restart ClickHouse with systemctl restart clickhouse-server to initialize a fresh data directory.
Changing the Default Data Directory
To change the default data directory, edit /etc/clickhouse-server/config.xml:
    <!-- Path to data directory, with trailing slash. -->
    <path>/var/lib/clickhouse/</path>
 
The config file is installed read-only, so to change the data directory:
- Make it writable:

chmod 755 /etc/clickhouse-server/config.xml

- Replace every occurrence of /var/lib/clickhouse/ with the new data directory.
 - Delete everything under the old default data directory.
 - Restart clickhouse-server so the new data directory gets initialized (see the sketch below).
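
A minimal sketch of the whole change, assuming a hypothetical new directory /data/clickhouse/ (any path owned by the clickhouse user works):

sudo mkdir -p /data/clickhouse
sudo chown -R clickhouse:clickhouse /data/clickhouse
chmod 755 /etc/clickhouse-server/config.xml
# rewrite every /var/lib/clickhouse/ path in the config to the new location
sudo sed -i 's|/var/lib/clickhouse/|/data/clickhouse/|g' /etc/clickhouse-server/config.xml
sudo systemctl restart clickhouse-server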
 
2-Shard, 1-Replica, 3-Node Cluster Setup
In a ClickHouse cluster, every replica of every shard must run on its own instance: 2 shards with 2 replicas needs 4 machines, 3 shards with 2 replicas needs 6 machines.
Cluster topology:
| Node | Role | Description |
|---|---|---|
| chnode1 | Data + ClickHouse Keeper | data node + coordinator |
| chnode2 | Data + ClickHouse Keeper | data node + coordinator |
| chnode3 | ClickHouse Keeper | coordinator only |
Shard and replica distribution:
| Node | Role |
|---|---|
| chnode1 | shard 1, replica 1 |
| chnode2 | shard 2, replica 1 |
| chnode3 | no shard, no replica |
1. Configure hosts
On every host, append the following to /etc/hosts:
10.55.134.82 chnode1     
10.55.134.93 chnode2
10.55.134.99 chnode3
 
2. Set each host's hostname
On each host, set the hostname to match /etc/hosts (chnode1 shown; use chnode2 and chnode3 on the other nodes):
hostnamectl set-hostname chnode1
 
3. Upload the configuration files
Configuration file layout
Files in /etc/clickhouse-server/config.d/ override the default configuration, so the official docs recommend:
- Put server configuration in /etc/clickhouse-server/config.d/
- Put user configuration in /etc/clickhouse-server/users.d/
- Do not modify /etc/clickhouse-server/config.xml
- Do not modify /etc/clickhouse-server/users.xml
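
After the server is restarted (step 4), you can verify which values were actually merged from config.d/ by looking at the preprocessed configuration that ClickHouse writes under its data directory (default path shown; adjust if you moved the data directory):

sudo less /var/lib/clickhouse/preprocessed_configs/config.xml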
Let's first look at the configuration files present on each host:
- chnode1
 
[root@chnode1 clickhouse]# ll /etc/clickhouse-server/config.d/
total 24
-rw-r--r-- 1 root       root       985 Feb 13 15:41 enable-keeper.xml
-rw-r--r-- 1 clickhouse clickhouse  66 Feb 13 16:19 listen.xml
-rw-r--r-- 1 root       root       104 Feb 13 15:41 macros.xml
-rw-r--r-- 1 root       root       574 Feb 13 16:52 network-and-logging.xml
-rw-r--r-- 1 root       root       599 Feb 13 15:41 remote-servers.xml
-rw-r--r-- 1 root       root       386 Feb 13 15:41 use-keeper.xml
 
- chnode2
 
[root@chnode2 clickhouse]# ll /etc/clickhouse-server/config.d/
total 24
-rw-r--r-- 1 root       root       985 Feb 13 15:41 enable-keeper.xml
-rw-r--r-- 1 clickhouse clickhouse  66 Feb 13 16:19 listen.xml
-rw-r--r-- 1 root       root       104 Feb 13 15:41 macros.xml
-rw-r--r-- 1 root       root       574 Feb 13 16:52 network-and-logging.xml
-rw-r--r-- 1 root       root       599 Feb 13 15:41 remote-servers.xml
-rw-r--r-- 1 root       root       386 Feb 13 15:41 use-keeper.xml
 
- chnode3
 
[root@chnode3 clickhouse]# ll /etc/clickhouse-server/config.d/
total 12
-rw-r--r-- 1 clickhouse clickhouse 985 Feb 13 15:45 enable-keeper.xml
-rw-r--r-- 1 clickhouse clickhouse  61 Feb 13 16:06 listen.xml
-rw-r--r-- 1 clickhouse clickhouse 566 Feb 13 15:45 network-and-logging.xml
 
chnode1 configuration files
- network-and-logging.xml
 - Logs roll over at 1000M, and up to 3 files (3000M of logs) are kept.
 - ClickHouse listens on port 8123 (HTTP) and port 9000 (native TCP).
 - Inter-server communication uses port 9009.
 
<clickhouse>
        <logger>
                <level>debug</level>
                <log>/var/log/clickhouse-server/clickhouse-server.log</log>
                <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
                <size>1000M</size>
                <count>3</count>
        </logger>
        <display_name>clickhouse</display_name>
        <listen_host>0.0.0.0</listen_host>
        <http_port>8123</http_port>
        <tcp_port>9000</tcp_port>
        <interserver_http_port>9009</interserver_http_port>
</clickhouse>
 
- enable-keeper.xml
 - server_id is set to 1 on chnode1; every node must use a different id.
 - The remaining settings are identical across all three nodes.
 
 
<clickhouse>
  <keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>
    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>chnode1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>chnode2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>chnode3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
  </keeper_server>
</clickhouse>
 
- macros.xml
 - shard is set to 1: this node stores shard 1, replica 1. On chnode2 the shard value becomes 2.
 - Defining these macros keeps DDL simpler: you don't have to spell out which node a shard goes to when creating tables.
 
 
<clickhouse>
  <macros>
    <shard>1</shard>
    <replica>replica_1</replica>
  </macros>
</clickhouse>
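
For illustration, these macros get expanded in DDL, for example in a ReplicatedMergeTree definition (the table name and ZooKeeper path below are hypothetical; with only one replica per shard, the test later in this article simply uses MergeTree):

CREATE TABLE db1.example_local ON CLUSTER cluster_2S_1R
(
    `id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/example_local', '{replica}')
ORDER BY id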
 
- remote-servers.xml
 - The remote_servers section defines all clusters; replace="true" replaces the clusters from the default configuration.
 - It defines a single cluster named cluster_2S_1R.
 - secret is used to authenticate the queries that the servers of this cluster send to each other.
 - cluster_2S_1R has two shards with one replica each; internal_replication set to true means a write goes to the first healthy replica that is found.

The remote-servers.xml on chnode1 and chnode2 is identical.
<clickhouse>
  <remote_servers replace="true">
    <cluster_2S_1R>
    <secret>mysecretphrase</secret>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode2</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_2S_1R>
  </remote_servers>
</clickhouse>
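
Once the cluster is up (after step 4), the same shard/replica layout can be read back from the system.clusters table; a quick sketch:

SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'cluster_2S_1R';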
 
- use-keeper.xml
 - Points ClickHouse at the three Keeper nodes (via the zookeeper section) on port 9181.
 
<clickhouse>
    <zookeeper>
        <node index="1">
            <host>chnode1</host>
            <port>9181</port>
        </node>
        <node index="2">
            <host>chnode2</host>
            <port>9181</port>
        </node>
        <node index="3">
            <host>chnode3</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>
 
chnode2 configuration files
- network-and-logging.xml
 
Same as on chnode1:
<clickhouse>
        <logger>
                <level>debug</level>
                <log>/var/log/clickhouse-server/clickhouse-server.log</log>
                <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
                <size>1000M</size>
                <count>3</count>
        </logger>
        <display_name>clickhouse</display_name>
        <listen_host>0.0.0.0</listen_host>
        <http_port>8123</http_port>
        <tcp_port>9000</tcp_port>
        <interserver_http_port>9009</interserver_http_port>
</clickhouse>
 
- enable-keeper.xml
 - server_id is set to 2 on chnode2; everything else is the same.
 
 
<clickhouse>
  <keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>2</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>
    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>chnode1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>chnode2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>chnode3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
  </keeper_server>
</clickhouse>
 
- macros.xml
 - shard is set to 2: this node stores shard 2, replica 1.
 
 
<clickhouse>
  <macros>
    <shard>2</shard>
    <replica>replica_1</replica>
  </macros>
</clickhouse>
 
- remote-servers.xml
 
The remote-servers.xml on chnode1 and chnode2 is identical:
<clickhouse>
  <remote_servers replace="true">
    <cluster_2S_1R>
    <secret>mysecretphrase</secret>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode2</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_2S_1R>
  </remote_servers>
</clickhouse>
 
- use-keeper.xml
 
Same as on chnode1:
<clickhouse>
    <zookeeper>
        <node index="1">
            <host>chnode1</host>
            <port>9181</port>
        </node>
        <node index="2">
            <host>chnode2</host>
            <port>9181</port>
        </node>
        <node index="3">
            <host>chnode3</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>
 
chnode3 configuration files
- network-and-logging.xml
 
Same as on chnode1:
<clickhouse>
        <logger>
                <level>debug</level>
                <log>/var/log/clickhouse-server/clickhouse-server.log</log>
                <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
                <size>1000M</size>
                <count>3</count>
        </logger>
        <display_name>clickhouse</display_name>
        <listen_host>0.0.0.0</listen_host>
        <http_port>8123</http_port>
        <tcp_port>9000</tcp_port>
        <interserver_http_port>9009</interserver_http_port>
</clickhouse>
 
- enable-keeper.xml
 - server_id is set to 3 on chnode3; everything else is the same.
 
 
<clickhouse>
  <keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>3</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>
    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>chnode1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>chnode2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>chnode3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
  </keeper_server>
</clickhouse>
 
4. Restart clickhouse-server
Run on all three servers:
systemctl restart clickhouse-server
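
Optionally, verify that ClickHouse Keeper has formed a quorum before moving on. A minimal check using Keeper's four-letter commands (ruok and mntr are in the default whitelist; nc must be available):

echo ruok | nc chnode1 9181                            # should print: imok
echo mntr | nc chnode1 9181 | grep zk_server_state     # leader on one node, follower on the others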
 
5. Test the cluster
Now create a sample distributed table and check that everything works:
- Connect to chnode1 and run SHOW CLUSTERS:
[root@chnode1 clickhouse]# clickhouse-client --password -h 127.0.0.1
ClickHouse client version 22.3.19.6 (official build).
Password for user (default): 
Connecting to 127.0.0.1:9000 as user default.
Connected to ClickHouse server version 22.3.19 revision 54455.
clickhouse :) SHOW CLUSTERS
SHOW CLUSTERS
Query id: 0ff16c63-3c1e-438d-8fad-1e8e25c42235
┌─cluster───────┐
│ cluster_2S_1R │
└───────────────┘
 
- Create a database
 
CREATE DATABASE db1 ON CLUSTER cluster_2S_1R
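
- Create the local table

The distributed table defined in the next step routes rows to a local table db1.table1 on each shard, so that table has to exist as well. A minimal sketch following the referenced horizontal-scaling guide (plain MergeTree ordered by id; switch to a replicated engine if you add replicas later):

CREATE TABLE db1.table1 ON CLUSTER cluster_2S_1R
(
    `id` UInt64,
    `column1` String
)
ENGINE = MergeTree
ORDER BY id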
 
- Create a distributed table
 
CREATE TABLE db1.table1_dist ON CLUSTER cluster_2S_1R
(
    `id` UInt64,
    `column1` String
)
ENGINE = Distributed('cluster_2S_1R', 'db1', 'table1', rand())
 
- Connect to chnode1 and chnode2 with clickhouse-client and insert into the local table on each node.

On chnode1:
INSERT INTO db1.table1 (id, column1) VALUES (1, 'abc');
 
On chnode2:
INSERT INTO db1.table1 (id, column1) VALUES (2, 'def');
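
As an aside (not part of the original walk-through), rows can also be written through the distributed table itself, in which case the rand() sharding key decides which shard each row lands on; if you run this, the extra row will also appear in the query below:

INSERT INTO db1.table1_dist (id, column1) VALUES (3, 'ghi');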
 
- Query the data
 
clickhouse :) SELECT * FROM db1.table1_dist;
SELECT *
FROM db1.table1_dist
Query id: 8ce26016-f923-472e-894d-a7a3025a8927
┌─id─┬─column1─┐
│  1 │ abc     │
└────┴─────────┘
┌─id─┬─column1─┐
│  2 │ def     │
└────┴─────────┘
 
Reference List
- https://clickhouse.com/docs/en/architecture/horizontal-scaling
 
