Yingyu's Magic World

PureFlash Distributed Storage: Learning Task List

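Goal

Get familiar with the pfconductor module so that I can maintain and update it and implement future requirement changes. (The conductor is currently written in Java; it may later need to be rewritten in Go.)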

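Tasks (2025.09.28)

  1. The service startup log is in pfconductor.log.bk. It records a single process starting up (entry point netbric/s5/conductor/Main.java: public static void main(String[] args)). Once started, there should be an HTTP server. Walk through the log to see how the service starts; drawing a flowchart may help. 20251010-image2

  2. Run pfcli commands and walk through the log to see how create volume works. [select_store strategy]

  3. Inspect each table in the database and work out roughly what each field means. [Analysis of the database fields]

  4. Read com/netbric/s5/conductor/handler/S5RestfulHandler.java to see which other commands exist. For any command that pfcli does not implement, think about how it should be implemented.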
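
As a rough picture of what tasks 1 and 4 are looking for, the sketch below starts a small HTTP server and routes each request to a handler chosen by an `op` query parameter. It uses the JDK's built-in `com.sun.net.httpserver.HttpServer` purely for illustration. The class name `ConductorSketch`, the `op` parameter, the JSON reply format, and the handler bodies are all assumptions; none of this is taken from Main.java or S5RestfulHandler.java.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class ConductorSketch {
    // Hypothetical command table: op name -> handler that receives the raw query string.
    static final Map<String, Function<String, String>> HANDLERS = new LinkedHashMap<>();
    static {
        HANDLERS.put("create_volume", q -> "{\"ret_code\":0,\"op\":\"create_volume\"}");
        HANDLERS.put("list_port", q -> "{\"ret_code\":0,\"op\":\"list_port\"}");
    }

    // Extract the value of "op" from a query string such as "op=list_port&x=1".
    static String parseOp(String query) {
        if (query == null) return null;
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("op")) return pair.substring(eq + 1);
        }
        return null;
    }

    // Route a query to its handler; unknown or missing ops get an error reply.
    static String dispatch(String query) {
        String op = parseOp(query);
        Function<String, String> handler = (op == null) ? null : HANDLERS.get(op);
        if (handler == null) return "{\"ret_code\":-1,\"reason\":\"unknown op\"}";
        return handler.apply(query);
    }

    // Write the dispatch result back as the HTTP response body.
    static void handle(HttpExchange ex) throws IOException {
        byte[] body = dispatch(ex.getRequestURI().getQuery()).getBytes(StandardCharsets.UTF_8);
        ex.sendResponseHeaders(200, body.length);
        try (OutputStream os = ex.getResponseBody()) {
            os.write(body);
        }
    }

    // Start the server; passing port 0 lets the OS pick a free port.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", ConductorSketch::handle);
        server.start();
        return server;
    }
}
```

With this shape, adding a command that the CLI lacks would mean registering one more entry in the table and writing its handler, which is a useful mental model when reading the real handler class.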

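Key concepts

select_store strategy

Each shard of a volume can have multiple replicas. The "storage strategy" decides which hosts those replicas are placed on. 20251010-image1

A volume is divided into shards, and each shard has multiple replicas. The job of select_store is to assign a storage node to each replica. If a host is specified for the primary replica, that host is preferred; otherwise the location is chosen according to remaining free space.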

select_store(trans, v.rep_count, volume_size, store_name, tray_ids, store_idx, hostId);
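  • When the user specifies storage nodes:
    1. Query the database for the specified storage nodes.
    2. If trays are also specified, verify that those trays have enough space to hold the volume data.
    3. Use the user-specified storage nodes and trays directly.
  • When the user does not specify storage nodes:
    1. Call the select_suitable_store_tray method to choose suitable storage nodes and trays automatically.
    2. Allocate according to the remaining free space of the storage nodes.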
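
The two branches above can be sketched as follows. This is a simplified illustration, not the conductor's code: the `StoreFree` class, the `selectStores` name, and the "most free space first" ordering are assumptions standing in for whatever select_suitable_store_tray actually does, and tray handling is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SelectStoreSketch {
    // Hypothetical record of one store and its free space in bytes.
    static final class StoreFree {
        final int storeId;
        final String name;
        final long freeSize;
        StoreFree(int storeId, String name, long freeSize) {
            this.storeId = storeId;
            this.name = name;
            this.freeSize = freeSize;
        }
    }

    // Returns the store ids chosen for repCount replicas of a volume.
    // requested == null means the user did not name any store.
    static List<Integer> selectStores(List<StoreFree> stores, int repCount,
                                      long volumeSize, List<String> requested) {
        List<StoreFree> candidates = new ArrayList<>();
        if (requested != null) {
            // User-specified branch: look up each named store and keep the user's order.
            for (String name : requested) {
                StoreFree found = null;
                for (StoreFree s : stores) {
                    if (s.name.equals(name)) found = s;
                }
                if (found == null) throw new IllegalArgumentException("unknown store: " + name);
                candidates.add(found);
            }
        } else {
            // Automatic branch (assumed policy): prefer stores with the most free space.
            candidates.addAll(stores);
            candidates.sort(Comparator.comparingLong((StoreFree s) -> s.freeSize).reversed());
        }
        List<Integer> chosen = new ArrayList<>();
        for (StoreFree s : candidates) {
            if (chosen.size() == repCount) break;
            if (s.freeSize < volumeSize) {
                // A named store that is too small is an error; an automatic one is skipped.
                if (requested != null) throw new IllegalStateException("not enough space on " + s.name);
                continue;
            }
            chosen.add(s.storeId);
        }
        if (chosen.size() < repCount) throw new IllegalStateException("not enough stores");
        return chosen;
    }
}
```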

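When hostId is specified (hostId != -1, that is, hostId is a valid positive integer), the user has explicitly chosen which host should hold the primary replica: the primary replica is placed on that host first, and the other replicas are spread across other nodes. When hostId == -1, no host has been specified for the primary replica, and the system falls back to its default distribution strategy (load balancing) to select storage nodes.

In summary, storage nodes are chosen by the select_store method in one of two ways: if a host is specified for the primary replica, that host is preferred; otherwise the location is chosen according to remaining free space.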
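
To make the hostId rule concrete, here is a minimal sketch. It assumes the candidate stores are already in default preference order (for example, most free space first) and that "placing the primary on the host" means moving that store to index 0 of the result. The method name `place` and the convention that slot 0 is the primary are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class PrimaryPlacementSketch {
    static final int NO_HOST = -1; // mirrors the hostId == -1 convention described above

    // candidates: store ids in default preference order.
    // Returns repCount store ids; if hostId is given and is a candidate, it comes first.
    static List<Integer> place(List<Integer> candidates, int repCount, int hostId) {
        List<Integer> ordered = new ArrayList<>(candidates);
        if (hostId != NO_HOST && ordered.remove(Integer.valueOf(hostId))) {
            ordered.add(0, hostId); // assumed: slot 0 holds the primary replica
        }
        if (ordered.size() < repCount) throw new IllegalStateException("not enough stores");
        return new ArrayList<>(ordered.subList(0, repCount));
    }
}
```

For example, with candidates [2, 3, 1] and two replicas, hostId 1 yields [1, 2], while hostId -1 leaves the default order and yields [2, 3].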

Shards

public static final long DEFAULT_SHARD_SIZE=64L<<30;

This defines the default shard size as 64 GiB (64 × 2^30 = 68,719,476,736 bytes).

Note that a single volume contains multiple shards.
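
For a sense of scale, the number of shards in a volume can be estimated by dividing the volume size by the shard size and rounding up. Only DEFAULT_SHARD_SIZE comes from the source; the rounding-up rule and the helper name `shardCount` are assumptions for illustration.

```java
final class ShardMath {
    public static final long DEFAULT_SHARD_SIZE = 64L << 30; // 64 GiB in bytes

    // Number of shards needed to cover volumeSize bytes (assumed to round up).
    static long shardCount(long volumeSize, long shardSize) {
        if (volumeSize <= 0 || shardSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (volumeSize + shardSize - 1) / shardSize;
    }

    public static void main(String[] args) {
        long size = 100L << 30; // a 100 GiB volume
        System.out.println(shardCount(size, DEFAULT_SHARD_SIZE)); // prints 2
    }
}
```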

20251010-image3

Inspecting the database fields

Commands:

mysql -u root -p
SHOW DATABASES;
use s5;
root@flyslice-PowerEdge-R730xd:/home/flyslice/yangxiao/cocalele/jconductor# mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 593
Server version: 10.6.22-MariaDB-0ubuntu0.22.04.1 Ubuntu 22.04

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| s5                 |
| sys                |
+--------------------+
5 rows in set (0.001 sec)

MariaDB [(none)]> use s5;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

MariaDB [s5]> show tables;
+--------------------+
| Tables_in_s5       |
+--------------------+
| seq_gen            |
| t_port             |
| t_quotaset         |
| t_replica          |
| t_shard            |
| t_shared_disk      |
| t_snapshot         |
| t_store            |
| t_tenant           |
| t_tray             |
| t_volume           |
| v_id               |
| v_replica_ext      |
| v_store_alloc_size |
| v_store_free_size  |
| v_store_total_size |
| v_tray_alloc_size  |
| v_tray_free_size   |
| v_tray_total_size  |
+--------------------+

Show all rows in each table:

show tables;

select * from seq_gen;
select * from t_port;
select * from t_quotaset;
select * from t_replica;
select * from t_shard;
select * from t_shared_disk;
select * from t_snapshot;
select * from t_store;
select * from t_tenant;
select * from t_tray;
select * from t_volume;

Show the column definitions:

DESC t_tray;
DESC seq_gen;
  # etc

Result 1 (tables):

MariaDB [s5]> DESC t_tray;
+--------------+-------------+------+-----+---------+-------+
| Field        | Type        | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| uuid         | varchar(64) | NO   | PRI | NULL    |       |
| device       | varchar(96) | NO   |     | NULL    |       |
| status       | varchar(16) | NO   |     | NULL    |       |
| raw_capacity | bigint(20)  | NO   |     | NULL    |       |
| object_size  | bigint(20)  | NO   |     | NULL    |       |
| store_id     | int(11)     | NO   | MUL | NULL    |       |
+--------------+-------------+------+-----+---------+-------+
6 rows in set (0.001 sec)

MariaDB [s5]> DESC seq_gen;
+-----------------------+---------------------+------+-----+---------+-------+
| Field                 | Type                | Null | Key | Default | Extra |
+-----------------------+---------------------+------+-----+---------+-------+
| next_not_cached_value | bigint(21)          | NO   |     | NULL    |       |
| minimum_value         | bigint(21)          | NO   |     | NULL    |       |
| maximum_value         | bigint(21)          | NO   |     | NULL    |       |
| start_value           | bigint(21)          | NO   |     | NULL    |       |
| increment             | bigint(21)          | NO   |     | NULL    |       |
| cache_size            | bigint(21) unsigned | NO   |     | NULL    |       |
| cycle_option          | tinyint(1) unsigned | NO   |     | NULL    |       |
| cycle_count           | bigint(21)          | NO   |     | NULL    |       |
+-----------------------+---------------------+------+-----+---------+-------+
8 rows in set (0.001 sec)

MariaDB [s5]> DESC t_port
    -> ;
+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| ip_addr  | varchar(16) | NO   | PRI | NULL    |       |
| store_id | int(11)     | YES  |     | NULL    |       |
| purpose  | int(11)     | NO   | PRI | NULL    |       |
| status   | varchar(16) | NO   |     | NULL    |       |
+----------+-------------+------+-----+---------+-------+
4 rows in set (0.001 sec)

MariaDB [s5]> DESC t_quotaset;
+-----------+-------------+------+-----+---------+-------+
| Field     | Type        | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| id        | int(11)     | NO   | PRI | NULL    |       |
| car_id    | int(11)     | NO   |     | NULL    |       |
| name      | varchar(96) | NO   |     | NULL    |       |
| iops      | int(11)     | NO   |     | NULL    |       |
| cbs       | int(11)     | NO   |     | NULL    |       |
| bw        | int(11)     | NO   |     | NULL    |       |
| tenant_id | int(11)     | NO   | MUL | NULL    |       |
+-----------+-------------+------+-----+---------+-------+
7 rows in set (0.001 sec)

MariaDB [s5]> DESC t_replica;
+---------------+-------------+------+-----+---------------------+-------------------------------+
| Field         | Type        | Null | Key | Default             | Extra                         |
+---------------+-------------+------+-----+---------------------+-------------------------------+
| id            | bigint(20)  | NO   | PRI | NULL                |                               |
| replica_index | int(11)     | NO   |     | NULL                |                               |
| volume_id     | bigint(20)  | NO   |     | NULL                |                               |
| shard_id      | bigint(20)  | NO   |     | NULL                |                               |
| store_id      | int(11)     | NO   |     | NULL                |                               |
| tray_uuid     | varchar(64) | NO   |     | NULL                |                               |
| status_time   | datetime    | NO   |     | current_timestamp() | on update current_timestamp() |
| status        | varchar(16) | YES  |     | NULL                |                               |
+---------------+-------------+------+-----+---------------------+-------------------------------+
8 rows in set (0.001 sec)

MariaDB [s5]> DESC t_shard;
+-------------------+---------------------+------+-----+---------------------+-------------------------------+
| Field             | Type                | Null | Key | Default             | Extra                         |
+-------------------+---------------------+------+-----+---------------------+-------------------------------+
| id                | bigint(20) unsigned | NO   | PRI | NULL                |                               |
| volume_id         | bigint(20) unsigned | NO   | MUL | NULL                |                               |
| shard_index       | bigint(20) unsigned | NO   |     | NULL                |                               |
| primary_rep_index | bigint(20) unsigned | NO   |     | NULL                |                               |
| status            | char(16)            | YES  |     | NULL                |                               |
| status_time       | datetime            | NO   |     | current_timestamp() | on update current_timestamp() |
+-------------------+---------------------+------+-----+---------------------+-------------------------------+
6 rows in set (0.001 sec)

MariaDB [s5]> DESC t_shared_disk;
+--------------+-------------+------+-----+---------+-------+
| Field        | Type        | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| uuid         | varchar(64) | NO   | PRI | NULL    |       |
| status       | varchar(16) | NO   |     | NULL    |       |
| raw_capacity | bigint(20)  | NO   |     | NULL    |       |
| object_size  | bigint(20)  | NO   |     | NULL    |       |
| coowner      | varchar(64) | YES  |     | NULL    |       |
+--------------+-------------+------+-----+---------+-------+
5 rows in set (0.001 sec)

MariaDB [s5]> DESC t_snapshot;
+-----------+-------------+------+-----+---------------------+----------------+
| Field     | Type        | Null | Key | Default             | Extra          |
+-----------+-------------+------+-----+---------------------+----------------+
| id        | bigint(20)  | NO   | PRI | NULL                | auto_increment |
| volume_id | bigint(20)  | NO   |     | NULL                |                |
| snap_seq  | int(11)     | NO   |     | NULL                |                |
| name      | varchar(96) | NO   |     | NULL                |                |
| size      | bigint(20)  | NO   |     | NULL                |                |
| created   | datetime    | NO   |     | current_timestamp() |                |
+-----------+-------------+------+-----+---------------------+----------------+
6 rows in set (0.001 sec)

MariaDB [s5]> DESC t_store;
+---------+--------------+------+-----+---------+-------+
| Field   | Type         | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| id      | int(11)      | NO   | PRI | NULL    |       |
| name    | varchar(96)  | YES  |     | NULL    |       |
| sn      | varchar(128) | YES  |     | NULL    |       |
| model   | varchar(128) | YES  |     | NULL    |       |
| mngt_ip | varchar(32)  | NO   |     | NULL    |       |
| status  | varchar(16)  | NO   |     | NULL    |       |
+---------+--------------+------+-----+---------+-------+
6 rows in set (0.001 sec)

MariaDB [s5]> DESC t_tenant;
+---------+--------------+------+-----+---------+----------------+
| Field   | Type         | Null | Key | Default | Extra          |
+---------+--------------+------+-----+---------+----------------+
| id      | int(11)      | NO   | PRI | NULL    | auto_increment |
| car_id  | int(11)      | YES  |     | NULL    |                |
| name    | varchar(96)  | NO   | UNI | NULL    |                |
| pass_wd | varchar(256) | NO   |     | NULL    |                |
| auth    | int(11)      | NO   |     | NULL    |                |
| size    | bigint(20)   | NO   |     | NULL    |                |
| iops    | int(11)      | NO   |     | NULL    |                |
| cbs     | int(11)      | NO   |     | NULL    |                |
| bw      | int(11)      | NO   |     | NULL    |                |
+---------+--------------+------+-----+---------+----------------+
9 rows in set (0.001 sec)

MariaDB [s5]> DESC t_volume;
+-------------+-------------+------+-----+---------------------+-------------------------------+
| Field       | Type        | Null | Key | Default             | Extra                         |
+-------------+-------------+------+-----+---------------------+-------------------------------+
| id          | bigint(20)  | NO   | PRI | NULL                |                               |
| name        | varchar(96) | NO   |     | NULL                |                               |
| size        | bigint(20)  | NO   |     | NULL                |                               |
| iops        | int(11)     | NO   |     | NULL                |                               |
| cbs         | int(11)     | NO   |     | NULL                |                               |
| bw          | int(11)     | NO   |     | NULL                |                               |
| tenant_id   | int(11)     | NO   | MUL | NULL                |                               |
| quotaset_id | int(11)     | YES  |     | NULL                |                               |
| status      | varchar(16) | YES  |     | NULL                |                               |
| meta_ver    | int(11)     | YES  |     | 0                   |                               |
| features    | int(11)     | YES  |     | 0                   |                               |
| exposed     | int(11)     | YES  |     | 0                   |                               |
| rep_count   | int(11)     | YES  |     | 1                   |                               |
| snap_seq    | int(11)     | YES  |     | 0                   |                               |
| shard_size  | bigint(20)  | YES  |     | (64 << 30)          |                               |
| status_time | datetime    | NO   |     | current_timestamp() | on update current_timestamp() |
+-------------+-------------+------+-----+---------------------+-------------------------------+
16 rows in set (0.001 sec)

The code that defines these database tables is in jconductor/src/com/netbric/s5/orm: Port.java, Replica.java, S5Database.java, Shard.java, SharedDisk.java, Snapshot.java, Status.java, StoreNode.java, Tenant.java, Tray.java, Volume.java.
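
As an illustration of how a table might map onto one of these classes, here is a plain Java object that mirrors the columns of t_tray from the DESC output above. It is only a sketch: the actual Tray.java may use different field names, types, or ORM annotations, and the class name `TraySketch` is hypothetical.

```java
final class TraySketch {
    // Fields follow the t_tray columns shown by DESC t_tray.
    final String uuid;       // uuid, varchar(64), primary key
    final String device;     // device, varchar(96)
    final String status;     // status, varchar(16)
    final long rawCapacity;  // raw_capacity, bigint(20)
    final long objectSize;   // object_size, bigint(20)
    final int storeId;       // store_id, int(11), indexed

    TraySketch(String uuid, String device, String status,
               long rawCapacity, long objectSize, int storeId) {
        this.uuid = uuid;
        this.device = device;
        this.status = status;
        this.rawCapacity = rawCapacity;
        this.objectSize = objectSize;
        this.storeId = storeId;
    }

    @Override
    public String toString() {
        return "Tray{uuid=" + uuid + ", device=" + device + ", status=" + status
                + ", rawCapacity=" + rawCapacity + ", objectSize=" + objectSize
                + ", storeId=" + storeId + "}";
    }
}
```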

Result 2 (views):

MariaDB [s5]> DESC v_id;
+-------+------------+------+-----+---------+-------+
| Field | Type       | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id    | bigint(20) | NO   |     | 0       |       |
+-------+------------+------+-----+---------+-------+
1 row in set (0.001 sec)

MariaDB [s5]> DESC v_replica_ext;
+---------------+---------------------+------+-----+---------------------+-------+
| Field         | Type                | Null | Key | Default             | Extra |
+---------------+---------------------+------+-----+---------------------+-------+
| volume_id     | bigint(20)          | NO   |     | NULL                |       |
| volume_name   | varchar(96)         | NO   |     | NULL                |       |
| shard_id      | bigint(20) unsigned | NO   |     | NULL                |       |
| shard_index   | bigint(20) unsigned | NO   |     | NULL                |       |
| replica_id    | bigint(20)          | NO   |     | NULL                |       |
| tenant_id     | int(11)             | NO   |     | 0                   |       |
| replica_index | int(11)             | NO   |     | NULL                |       |
| status_time   | datetime            | NO   |     | current_timestamp() |       |
| is_primary    | int(1)              | NO   |     | 0                   |       |
| store_id      | int(11)             | NO   |     | NULL                |       |
| tray_uuid     | varchar(64)         | NO   |     | NULL                |       |
| status        | varchar(16)         | YES  |     | NULL                |       |
| data_ports    | mediumtext          | YES  |     | NULL                |       |
| rep_ports     | mediumtext          | YES  |     | NULL                |       |
+---------------+---------------------+------+-----+---------------------+-------+
14 rows in set (0.002 sec)

MariaDB [s5]> DESC v_store_alloc_size;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id   | int(11)       | NO   |     | NULL    |       |
| alloc_size | decimal(41,0) | YES  |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
2 rows in set (0.001 sec)

MariaDB [s5]> DESC v_store_free_size;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id   | int(11)       | NO   |     | NULL    |       |
| total_size | decimal(41,0) | YES  |     | NULL    |       |
| alloc_size | decimal(41,0) | YES  |     | NULL    |       |
| free_size  | decimal(42,0) | YES  |     | NULL    |       |
| status     | varchar(16)   | NO   |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
5 rows in set (0.001 sec)

MariaDB [s5]> DESC v_store_total_size;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id   | int(11)       | NO   |     | NULL    |       |
| total_size | decimal(41,0) | YES  |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
2 rows in set (0.001 sec)

MariaDB [s5]> DESC v_tray_alloc_size;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id   | int(11)       | NO   |     | NULL    |       |
| tray_uuid  | varchar(64)   | NO   |     | NULL    |       |
| alloc_size | decimal(41,0) | YES  |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
3 rows in set (0.001 sec)

MariaDB [s5]> DESC v_tray_free_size;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| store_id   | int(11)       | NO   |     | NULL    |       |
| tray_uuid  | varchar(64)   | NO   |     | NULL    |       |
| total_size | bigint(20)    | NO   |     | NULL    |       |
| alloc_size | decimal(41,0) | YES  |     | NULL    |       |
| free_size  | decimal(42,0) | YES  |     | NULL    |       |
| status     | varchar(16)   | NO   |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
6 rows in set (0.001 sec)

MariaDB [s5]> DESC v_tray_total_size;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| store_id   | int(11)     | NO   |     | NULL    |       |
| tray_uuid  | varchar(64) | NO   |     | NULL    |       |
| total_size | bigint(20)  | NO   |     | NULL    |       |
| status     | varchar(16) | NO   |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+
4 rows in set (0.001 sec)

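Database UML class diagram (informal)

Database UML

Views UML class diagram (informal)

View UML

The tray-level size views are used by the storage node selection strategy, while the store-level size views are used when checking the total remaining capacity.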
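
The relationship between the size views can be sketched as below. It assumes that free_size = total_size - alloc_size for each tray and that the store-level figure is the sum over that store's trays. Both rules are inferred from the column names only, not from the view definitions, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FreeSizeSketch {
    // Hypothetical row shaped like v_tray_free_size (status column omitted).
    static final class TraySize {
        final int storeId;
        final String trayUuid;
        final long totalSize;
        final long allocSize;
        TraySize(int storeId, String trayUuid, long totalSize, long allocSize) {
            this.storeId = storeId;
            this.trayUuid = trayUuid;
            this.totalSize = totalSize;
            this.allocSize = allocSize;
        }
        // Assumed rule: free space is total minus allocated.
        long freeSize() {
            return totalSize - allocSize;
        }
    }

    // Sum tray free sizes per store (assumed analogue of a store-level free size view).
    static Map<Integer, Long> storeFree(List<TraySize> trays) {
        Map<Integer, Long> out = new LinkedHashMap<>();
        for (TraySize t : trays) {
            out.merge(t.storeId, t.freeSize(), Long::sum);
        }
        return out;
    }
}
```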

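Tasks (25.10.28)

  • Task 1: add the missing conductor commands.
  • Task 2 (done, awaiting review, on the 143 environment): modify pfstore to add a "delete existing trays & shard_disk" strategy. See "Analysis of the t_tray table redundancy problem"; reference: https://github.com/cocalele/PureFlash/pull/102
  • Task 3: change the v_store_alloc_size logic to account for the space actually allocated, fixing the problem where the total space is smaller than the allocated space.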

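Task: conductor command development (25.10.31)

Progress so far: completed open_volume/open_aof, update_volume, and list_port (restricted to connecting to the local machine, i.e. -i 1). Cannot be completed yet: sanity_check (processes such as s5bd do not exist) and expose_volume (s5bd is missing).

Still to be developed: check_volume_exists, deep_scrub_volume, ls_children, query_task (apparently used by cmd_scrub_volume and cmd_recovery_volume), move_volume, unexpose_volume, create_tenant, list_tenant, add_store, delete_store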