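PureFlash Distributed Storage: Study Task List
Goal
Get familiar with the pfconductor module in order to maintain and update it and to implement later requirement changes. (The conductor is currently written in Java and may later need to be rewritten in Go.)
Tasks (2025.09.28)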
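- The service startup log is in pfconductor.log.bk. It records a single process starting up (entry point: netbric/s5/conductor/Main.java, public static void main(String[] args)). An HTTP server should come up after startup. Walk through the log to see how the service starts; drawing a flowchart may help.
- Run pfcli commands and walk through the log to see how create volume works. [select_store strategy]
- Inspect each database table and work out roughly what each field means. [Analysis of database field meanings]
- Read com/netbric/s5/conductor/handler/S5RestfulHandler.java to see which other commands exist. For any command that pfcli does not implement, think about how it should be implemented.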
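Key Concepts
select_store strategy
The "store strategy" decides which hosts hold the replicas: each shard of a volume can have several replicas, and the strategy determines where each of them is placed.

A volume is made up of shards, and each shard has several replicas. The job of select_store is to assign a storage node to every replica. If a host is specified for the primary replica, that host is preferred by default; otherwise the location is chosen according to remaining free space.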
select_store(trans, v.rep_count, volume_size, store_name, tray_ids, store_idx, hostId);
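- When the user specifies a storage node:
  - Query the database for the information of the specified storage node.
  - If trays are also specified, verify that those trays have enough space to hold the volume data.
  - Use the user-specified storage node and trays directly.
- When the user does not specify a storage node:
  - Call the select_suitable_store_tray method to pick a suitable storage node and tray automatically.
  - Allocate according to the remaining free space of the storage nodes.
When a host is specified (hostId != -1, i.e. hostId is a valid positive integer), the user has explicitly chosen the host for the primary replica: the primary replica is placed on that host first, and the other replicas are spread across other nodes. When hostId == -1, no host is specified for the primary replica, and the system falls back to its default distribution strategy (load balancing) to choose storage nodes.
In short, select_store chooses storage nodes in one of two ways: if a host is specified for the primary replica, that host is preferred; otherwise placement is decided by remaining free space.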
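The two cases above can be sketched in a few lines of Java. This is only an illustration of the decision described in these notes, not the conductor's actual code: StoreInfo, chooseStores and the sample sizes are hypothetical names and values, and tray-level checks are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the replica-placement decision described above.
// StoreInfo and chooseStores are illustrative names, not PureFlash's real API.
class SelectStoreSketch {
    // One storage node and its remaining free space in bytes.
    static class StoreInfo {
        final int id;
        final long freeSize;
        StoreInfo(int id, long freeSize) { this.id = id; this.freeSize = freeSize; }
    }

    // Returns repCount distinct store ids; index 0 is treated as the primary replica.
    static List<Integer> chooseStores(List<StoreInfo> stores, int repCount, long volumeSize, int hostId) {
        // Keep only stores that can hold the volume, largest free space first.
        List<StoreInfo> candidates = new ArrayList<>();
        for (StoreInfo s : stores) {
            if (s.freeSize >= volumeSize) candidates.add(s);
        }
        candidates.sort((a, b) -> Long.compare(b.freeSize, a.freeSize));

        List<Integer> chosen = new ArrayList<>();
        // hostId != -1: place the primary replica on the requested host if it has room.
        if (hostId != -1) {
            for (StoreInfo s : candidates) {
                if (s.id == hostId) chosen.add(s.id);
            }
        }
        // Fill the remaining replicas by free space, never reusing a store.
        for (StoreInfo s : candidates) {
            if (chosen.size() >= repCount) break;
            if (!chosen.contains(s.id)) chosen.add(s.id);
        }
        if (chosen.size() < repCount) {
            throw new IllegalStateException("not enough stores with free space");
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo(1, 500L << 30),
            new StoreInfo(2, 900L << 30),
            new StoreInfo(3, 700L << 30));
        System.out.println(chooseStores(stores, 2, 100L << 30, -1)); // [2, 3]
        System.out.println(chooseStores(stores, 2, 100L << 30, 1));  // [1, 2]
    }
}
```

With hostId == -1 the two stores with the most free space are picked; with hostId == 1 store 1 becomes the primary even though it has the least free space.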
Shards
public static final long DEFAULT_SHARD_SIZE=64L<<30;
This defines a default shard size of 64 GiB (64L << 30 bytes).
Note that a single volume can contain multiple shards.
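As a quick check of the arithmetic, the sketch below computes how many shards a volume would need. It assumes shards are fixed-size and that the last shard may be only partly used; shardCount is a hypothetical helper, not a method from the conductor.

```java
// Hypothetical helper: number of shards needed for a volume, assuming
// fixed-size shards with the last one possibly only partly used.
class ShardCountSketch {
    static final long DEFAULT_SHARD_SIZE = 64L << 30; // 64 GiB, same value as the source constant

    static long shardCount(long volumeSize, long shardSize) {
        if (volumeSize <= 0 || shardSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (volumeSize + shardSize - 1) / shardSize; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(shardCount(64L << 30, DEFAULT_SHARD_SIZE));  // 1
        System.out.println(shardCount(100L << 30, DEFAULT_SHARD_SIZE)); // 2
        System.out.println(shardCount(1L << 40, DEFAULT_SHARD_SIZE));   // 16
    }
}
```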

Inspecting the database fields
Commands:
mysql -u root -p
SHOW DATABASES;
use s5;
root@flyslice-PowerEdge-R730xd:/home/flyslice/yangxiao/cocalele/jconductor# mysql -u root -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 593
Server version: 10.6.22-MariaDB-0ubuntu0.22.04.1 Ubuntu 22.04
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| s5 |
| sys |
+--------------------+
5 rows in set (0.001 sec)
MariaDB [(none)]> use s5;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [s5]> show tables;
+--------------------+
| Tables_in_s5 |
+--------------------+
| seq_gen |
| t_port |
| t_quotaset |
| t_replica |
| t_shard |
| t_shared_disk |
| t_snapshot |
| t_store |
| t_tenant |
| t_tray |
| t_volume |
| v_id |
| v_replica_ext |
| v_store_alloc_size |
| v_store_free_size |
| v_store_total_size |
| v_tray_alloc_size |
| v_tray_free_size |
| v_tray_total_size |
+--------------------+
View all rows of each table:
show tables;
select * from seq_gen;
select * from t_port;
select * from t_quotaset;
select * from t_replica;
select * from t_shard;
select * from t_shared_disk;
select * from t_snapshot;
select * from t_store;
select * from t_tenant;
select * from t_tray;
select * from t_volume;
View the column definitions:
DESC t_tray;
DESC seq_gen;
# etc
Result 1 (tables):
MariaDB [s5]> DESC t_tray;
+--------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| uuid | varchar(64) | NO | PRI | NULL | |
| device | varchar(96) | NO | | NULL | |
| status | varchar(16) | NO | | NULL | |
| raw_capacity | bigint(20) | NO | | NULL | |
| object_size | bigint(20) | NO | | NULL | |
| store_id | int(11) | NO | MUL | NULL | |
+--------------+-------------+------+-----+---------+-------+
6 rows in set (0.001 sec)
MariaDB [s5]> DESC seq_gen;
+-----------------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+---------------------+------+-----+---------+-------+
| next_not_cached_value | bigint(21) | NO | | NULL | |
| minimum_value | bigint(21) | NO | | NULL | |
| maximum_value | bigint(21) | NO | | NULL | |
| start_value | bigint(21) | NO | | NULL | |
| increment | bigint(21) | NO | | NULL | |
| cache_size | bigint(21) unsigned | NO | | NULL | |
| cycle_option | tinyint(1) unsigned | NO | | NULL | |
| cycle_count | bigint(21) | NO | | NULL | |
+-----------------------+---------------------+------+-----+---------+-------+
8 rows in set (0.001 sec)
MariaDB [s5]> DESC t_port
-> ;
+----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| ip_addr | varchar(16) | NO | PRI | NULL | |
| store_id | int(11) | YES | | NULL | |
| purpose | int(11) | NO | PRI | NULL | |
| status | varchar(16) | NO | | NULL | |
+----------+-------------+------+-----+---------+-------+
4 rows in set (0.001 sec)
MariaDB [s5]> DESC t_quotaset;
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| car_id | int(11) | NO | | NULL | |
| name | varchar(96) | NO | | NULL | |
| iops | int(11) | NO | | NULL | |
| cbs | int(11) | NO | | NULL | |
| bw | int(11) | NO | | NULL | |
| tenant_id | int(11) | NO | MUL | NULL | |
+-----------+-------------+------+-----+---------+-------+
7 rows in set (0.001 sec)
MariaDB [s5]> DESC t_replica;
+---------------+-------------+------+-----+---------------------+-------------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------------------+-------------------------------+
| id | bigint(20) | NO | PRI | NULL | |
| replica_index | int(11) | NO | | NULL | |
| volume_id | bigint(20) | NO | | NULL | |
| shard_id | bigint(20) | NO | | NULL | |
| store_id | int(11) | NO | | NULL | |
| tray_uuid | varchar(64) | NO | | NULL | |
| status_time | datetime | NO | | current_timestamp() | on update current_timestamp() |
| status | varchar(16) | YES | | NULL | |
+---------------+-------------+------+-----+---------------------+-------------------------------+
8 rows in set (0.001 sec)
MariaDB [s5]> DESC t_shard;
+-------------------+---------------------+------+-----+---------------------+-------------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------------------+------+-----+---------------------+-------------------------------+
| id | bigint(20) unsigned | NO | PRI | NULL | |
| volume_id | bigint(20) unsigned | NO | MUL | NULL | |
| shard_index | bigint(20) unsigned | NO | | NULL | |
| primary_rep_index | bigint(20) unsigned | NO | | NULL | |
| status | char(16) | YES | | NULL | |
| status_time | datetime | NO | | current_timestamp() | on update current_timestamp() |
+-------------------+---------------------+------+-----+---------------------+-------------------------------+
6 rows in set (0.001 sec)
MariaDB [s5]> DESC t_shared_disk;
+--------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| uuid | varchar(64) | NO | PRI | NULL | |
| status | varchar(16) | NO | | NULL | |
| raw_capacity | bigint(20) | NO | | NULL | |
| object_size | bigint(20) | NO | | NULL | |
| coowner | varchar(64) | YES | | NULL | |
+--------------+-------------+------+-----+---------+-------+
5 rows in set (0.001 sec)
MariaDB [s5]> DESC t_snapshot;
+-----------+-------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------------------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| volume_id | bigint(20) | NO | | NULL | |
| snap_seq | int(11) | NO | | NULL | |
| name | varchar(96) | NO | | NULL | |
| size | bigint(20) | NO | | NULL | |
| created | datetime | NO | | current_timestamp() | |
+-----------+-------------+------+-----+---------------------+----------------+
6 rows in set (0.001 sec)
MariaDB [s5]> DESC t_store;
+---------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| name | varchar(96) | YES | | NULL | |
| sn | varchar(128) | YES | | NULL | |
| model | varchar(128) | YES | | NULL | |
| mngt_ip | varchar(32) | NO | | NULL | |
| status | varchar(16) | NO | | NULL | |
+---------+--------------+------+-----+---------+-------+
6 rows in set (0.001 sec)
MariaDB [s5]> DESC t_tenant;
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| car_id | int(11) | YES | | NULL | |
| name | varchar(96) | NO | UNI | NULL | |
| pass_wd | varchar(256) | NO | | NULL | |
| auth | int(11) | NO | | NULL | |
| size | bigint(20) | NO | | NULL | |
| iops | int(11) | NO | | NULL | |
| cbs | int(11) | NO | | NULL | |
| bw | int(11) | NO | | NULL | |
+---------+--------------+------+-----+---------+----------------+
9 rows in set (0.001 sec)
MariaDB [s5]> DESC t_volume;
+-------------+-------------+------+-----+---------------------+-------------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------------------+-------------------------------+
| id | bigint(20) | NO | PRI | NULL | |
| name | varchar(96) | NO | | NULL | |
| size | bigint(20) | NO | | NULL | |
| iops | int(11) | NO | | NULL | |
| cbs | int(11) | NO | | NULL | |
| bw | int(11) | NO | | NULL | |
| tenant_id | int(11) | NO | MUL | NULL | |
| quotaset_id | int(11) | YES | | NULL | |
| status | varchar(16) | YES | | NULL | |
| meta_ver | int(11) | YES | | 0 | |
| features | int(11) | YES | | 0 | |
| exposed | int(11) | YES | | 0 | |
| rep_count | int(11) | YES | | 1 | |
| snap_seq | int(11) | YES | | 0 | |
| shard_size | bigint(20) | YES | | (64 << 30) | |
| status_time | datetime | NO | | current_timestamp() | on update current_timestamp() |
+-------------+-------------+------+-----+---------------------+-------------------------------+
16 rows in set (0.001 sec)
The Java classes that map these database tables are under jconductor/src/com/netbric/s5/orm:
Port.java, Replica.java, S5Database.java, Shard.java, SharedDisk.java, Snapshot.java, Status.java, StoreNode.java, Tenant.java, Tray.java, Volume.java.
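To make the table-to-class mapping concrete, here is a minimal sketch of what a plain Java class for t_tray could look like, built only from the DESC t_tray output above. It is an assumption for illustration: the real Tray.java may use different field names, types or ORM annotations, and the sample values in main are made up.

```java
// Hypothetical plain-Java mirror of the t_tray columns shown by DESC t_tray.
// The real Tray.java in the orm package may differ in names and annotations.
class TraySketch {
    final String uuid;       // varchar(64), primary key
    final String device;     // varchar(96)
    final String status;     // varchar(16)
    final long rawCapacity;  // bigint(20) raw_capacity
    final long objectSize;   // bigint(20) object_size
    final int storeId;       // int(11) store_id, indexed (MUL)

    TraySketch(String uuid, String device, String status,
               long rawCapacity, long objectSize, int storeId) {
        this.uuid = uuid;
        this.device = device;
        this.status = status;
        this.rawCapacity = rawCapacity;
        this.objectSize = objectSize;
        this.storeId = storeId;
    }

    @Override
    public String toString() {
        return "Tray{uuid=" + uuid + ", device=" + device + ", status=" + status
            + ", raw_capacity=" + rawCapacity + ", object_size=" + objectSize
            + ", store_id=" + storeId + "}";
    }

    public static void main(String[] args) {
        // Sample values are invented for the example.
        TraySketch t = new TraySketch("uuid-1", "/dev/sdb", "OK", 1L << 40, 64L << 20, 1);
        System.out.println(t);
    }
}
```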
Result 2 (views):
MariaDB [s5]> DESC v_id;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id | bigint(20) | NO | | 0 | |
+-------+------------+------+-----+---------+-------+
1 row in set (0.001 sec)
MariaDB [s5]> DESC v_replica_ext;
+---------------+---------------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------------+------+-----+---------------------+-------+
| volume_id | bigint(20) | NO | | NULL | |
| volume_name | varchar(96) | NO | | NULL | |
| shard_id | bigint(20) unsigned | NO | | NULL | |
| shard_index | bigint(20) unsigned | NO | | NULL | |
| replica_id | bigint(20) | NO | | NULL | |
| tenant_id | int(11) | NO | | 0 | |
| replica_index | int(11) | NO | | NULL | |
| status_time | datetime | NO | | current_timestamp() | |
| is_primary | int(1) | NO | | 0 | |
| store_id | int(11) | NO | | NULL | |
| tray_uuid | varchar(64) | NO | | NULL | |
| status | varchar(16) | YES | | NULL | |
| data_ports | mediumtext | YES | | NULL | |
| rep_ports | mediumtext | YES | | NULL | |
+---------------+---------------------+------+-----+---------------------+-------+
14 rows in set (0.002 sec)
MariaDB [s5]> DESC v_store_alloc_size;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id | int(11) | NO | | NULL | |
| alloc_size | decimal(41,0) | YES | | NULL | |
+------------+---------------+------+-----+---------+-------+
2 rows in set (0.001 sec)
MariaDB [s5]> DESC v_store_free_size;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id | int(11) | NO | | NULL | |
| total_size | decimal(41,0) | YES | | NULL | |
| alloc_size | decimal(41,0) | YES | | NULL | |
| free_size | decimal(42,0) | YES | | NULL | |
| status | varchar(16) | NO | | NULL | |
+------------+---------------+------+-----+---------+-------+
5 rows in set (0.001 sec)
MariaDB [s5]> DESC v_store_total_size;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id | int(11) | NO | | NULL | |
| total_size | decimal(41,0) | YES | | NULL | |
+------------+---------------+------+-----+---------+-------+
2 rows in set (0.001 sec)
MariaDB [s5]> DESC v_tray_alloc_size;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id | int(11) | NO | | NULL | |
| tray_uuid | varchar(64) | NO | | NULL | |
| alloc_size | decimal(41,0) | YES | | NULL | |
+------------+---------------+------+-----+---------+-------+
3 rows in set (0.001 sec)
MariaDB [s5]> DESC v_tray_free_size;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id | int(11) | NO | | NULL | |
| tray_uuid | varchar(64) | NO | | NULL | |
| total_size | bigint(20) | NO | | NULL | |
| alloc_size | decimal(41,0) | YES | | NULL | |
| free_size | decimal(42,0) | YES | | NULL | |
| status | varchar(16) | NO | | NULL | |
+------------+---------------+------+-----+---------+-------+
6 rows in set (0.001 sec)
MariaDB [s5]> DESC v_tray_total_size;
+------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| store_id | int(11) | NO | | NULL | |
| tray_uuid | varchar(64) | NO | | NULL | |
| total_size | bigint(20) | NO | | NULL | |
| status | varchar(16) | NO | | NULL | |
+------------+-------------+------+-----+---------+-------+
4 rows in set (0.001 sec)
Database UML class diagram (not standard UML)
Views UML class diagram (not standard UML)

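The tray-level size views are used by the storage-node selection strategy, while the store-level size views are used when checking the total remaining capacity.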
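The free-size views expose total_size, alloc_size and free_size side by side. The sketch below shows one plausible relationship between them, assuming free_size is total_size minus alloc_size and that a NULL alloc_size means nothing has been allocated yet. How the real views are defined is not shown in these notes, so treat freeSize as a hypothetical helper.

```java
// Hypothetical sketch: derive free space from the total_size and alloc_size
// columns. Whether the real views compute free_size this way is an assumption.
class FreeSizeSketch {
    // allocSize may be null because alloc_size is nullable in the views.
    static long freeSize(long totalSize, Long allocSize) {
        long alloc = (allocSize == null) ? 0L : allocSize;
        return totalSize - alloc; // negative when more is allocated than exists
    }

    public static void main(String[] args) {
        System.out.println(freeSize(1000L, null)); // 1000
        System.out.println(freeSize(1000L, 400L)); // 600
        System.out.println(freeSize(100L, 400L));  // -300
    }
}
```

A negative result corresponds to a node or tray whose allocated space exceeds its total space.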
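Tasks (25.10.28)
- Task 1: Add the missing conductor commands.
- Task 2 (done, awaiting review, on the 143 environment): Modify pfstore to add the "delete existing trays & shard_disk" strategy. See "Analysis of the t_tray table redundancy problem"; reference: https://github.com/cocalele/PureFlash/pull/102
- Task 3: Modify the v_store_alloc_size logic to include the space that is actually allocated, fixing the case where total space is smaller than allocated space.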
Task: conductor command development (25.10.31)
Progress so far: completed open_volume/open_aof, update_volume, list_port (restricted to connecting to the local host, i.e. -i 1).
Cannot complete yet: sanity_check (processes such as s5bd do not exist), expose_volume (s5bd is missing).
To be developed: check_volume_exists, deep_scrub_volume, ls_children, query_task (apparently used by cmd_scrub_volume and cmd_recovery_volume), move_volume, unexpose_volume, create_tenant, list_tenant, add_store, delete_store