Create a highly available distributed database with Apache ShardingSphere

Follow this example of ShardingSphere's high availability and dynamic read/write splitting as the basis for your own configurations. 
No readers like this yet.
Databases as a service

Jason Baker. CC BY-SA 4.0.

Modern business systems must be highly available, reliable, and stable in the digital age. As the cornerstone of the current business system, databases are supposed to embrace high availability.

High availability (HA) allows databases to switch services between primary and secondary database nodes. HA automatically selects a primary, picking the best node when the previous one crashes.

MySQL high availability

There are plenty of MySQL high availability options, each with pros and cons. Below are several common high availability options:

  • Orchestrator is a MySQL HA and replication topology management tool written in Go. Its advantage lies in its support for manual adjustment of the primary-secondary topology, automatic failover, and automatic or manual recovery of primary nodes through a graphical web console. However, the program needs to be deployed separately and has a steep learning curve due to its complex configurations.
  • MHA is another mature solution. It provides primary/secondary switching and failover capabilities. The good thing about it is that it can ensure the least data loss in the switching process and works with semi-synchronous and asynchronous replication frameworks. However, only the primary node is monitored after MHA starts, and MHA doesn't provide the load balancing feature for the read database.
  • MGR implements group replication based on the distributed Paxos protocol to ensure data consistency. It is an official HA component provided by MySQL, and no extra deployment program is required. Instead, users only need to install the MGR plugin on each data source node. The tool features high consistency, fault tolerance, scalability, and flexibility.

Apache ShardingSphere high availability

Apache ShardingSphere's architecture actually separates storage from computing. The storage node represents the underlying database, such as MySQL, PostgreSQL, openGauss, etc., while compute node refers to ShardingSphere-JDBC or ShardingSphere-Proxy.

Accordingly, the high availability solutions for storage nodes and compute nodes are different. Stateless compute nodes need to perceive the changes in storage nodes. They also need to set up separate load balancers and have the capabilities of service discovery and request distribution. Stateful storage nodes must provide data synchronization, connection testing, primary node election, and so on.

Although ShardingSphere doesn't provide a database with high availability, it can help users integrate database HA solutions such as primary-secondary switchover, faults discovery, traffic switching governance, and so on with the help of the database HA and through its capabilities of database discovery and dynamic perception.

When combined with the primary-secondary flow control feature in distributed scenarios, ShardingSphere can provide better high availability read/write splitting solutions. It will be easier to operate and manage ShardingSphere clusters using DistSQL's dynamic high availability adjustment rules to get primary/secondary nodes' information.

Best practices

Apache ShardingSphere adopts a plugin-oriented architecture so that you can use all its enhanced capabilities independently or together. Its high availability function is often used with read/write splitting to distribute query requests to the secondary databases according to the load balancing algorithm to ensure system HA, relieve primary database pressure, and improve business system throughput.

Note that ShardingSphere HA implementation leans on its distributed governance capability. Therefore, it can only be used under the cluster mode for the time being. Meanwhile, read/write splitting rules are revised in ShardingSphere 5.1.0. Please refer to the official documentation about read/write splitting for details.

Consider the following HA+read/write splitting configuration with ShardingSphere DistSQL RAL statements as an example. The example begins with the configuration, requirements, and initial SQL.

Configuration

schemaName: database_discovery_db

dataSources:
  ds_0:
    url: jdbc:mysql://127.0.0.1:1231/demo_primary_ds?serverTimezone=UTC&useSSL=false
    username: root
    password: 123456
    connectionTimeoutMilliseconds: 3000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 50
    minPoolSize: 1
  ds_1:
    url: jdbc:mysql://127.0.0.1:1232/demo_primary_ds?serverTimezone=UTC&useSSL=false
    username: root
    password: 123456
    connectionTimeoutMilliseconds: 3000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 50
    minPoolSize: 1
  ds_2:
    url: jdbc:mysql://127.0.0.1:1233/demo_primary_ds?serverTimezone=UTC&useSSL=false
    username: root
    password: 123456
    connectionTimeoutMilliseconds: 3000
    idleTimeoutMilliseconds: 50000
    maxLifetimeMilliseconds: 1300000
    maxPoolSize: 50
    minPoolSize: 1

rules:
  - !READWRITE_SPLITTING
    dataSources:
      replication_ds:
        type: Dynamic
        props:
          auto-aware-data-source-name: mgr_replication_ds
  - !DB_DISCOVERY
    dataSources:
      mgr_replication_ds:
        dataSourceNames:
          - ds_0
          - ds_1
          - ds_2
        discoveryHeartbeatName: mgr-heartbeat
        discoveryTypeName: mgr
    discoveryHeartbeats:
      mgr-heartbeat:
        props:
          keep-alive-cron: '0/5 * * * * ?'
    discoveryTypes:
      mgr:
        type: MGR
        props:
          group-name: b13df29e-90b6-11e8-8d1b-525400fc3996

Requirements

  • ShardingSphere-Proxy 5.1.0 (Cluster mode + HA + dynamic read/write splitting rule)
  • Zookeeper 3.7.0
  • MySQL MGR cluster

SQL script

CREATE TABLE `t_user` (
  `id` int(8) NOT NULL,
  `mobile` char(20) NOT NULL,
  `idcard` varchar(18) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

First, view the primary-secondary relationship:

mysql> SHOW READWRITE_SPLITTING RULES;
+----------------+-----------------------------+------------------------+------------------------+--------------------+---------------------+
| name           | auto_aware_data_source_name | write_data_source_name | read_data_source_names | load_balancer_type | load_balancer_props |
+----------------+-----------------------------+------------------------+------------------------+--------------------+---------------------+
| replication_ds | mgr_replication_ds          | ds_0                   | ds_1,ds_2              | NULL               |                     |
+----------------+-----------------------------+------------------------+------------------------+--------------------+---------------------+
1 row in set (0.09 sec)

You can also look at the secondary database state:

mysql> SHOW READWRITE_SPLITTING READ RESOURCES;
+----------+---------+
| resource | status  |
+----------+---------+
| ds_1     | enabled |
| ds_2     | enabled |
+----------+---------+

The results above show that the primary database is currently ds_0, while secondary databases are ds_1 and ds_2.

Next, test INSERT:

mysql> INSERT INTO t_user(id, mobile, idcard) value (10000, '13718687777', '141121xxxxx');
Query OK, 1 row affected (0.10 sec)

View the ShardingSphere-Proxy log and see if the route node is the primary database ds_0.

[INFO ] 2022-02-28 15:28:21.495 [ShardingSphere-Command-2] ShardingSphere-SQL - Logic SQL: INSERT INTO t_user(id, mobile, idcard) value (10000, '13718687777', '141121xxxxx')
[INFO ] 2022-02-28 15:28:21.495 [ShardingSphere-Command-2] ShardingSphere-SQL - SQLStatement: MySQLInsertStatement(setAssignment=Optional.empty, onDuplicateKeyColumns=Optional.empty)
[INFO ] 2022-02-28 15:28:21.495 [ShardingSphere-Command-2] ShardingSphere-SQL - Actual SQL: ds_0 ::: INSERT INTO t_user(id, mobile, idcard) value (10000, '13718687777', '141121xxxxx')

Now test SELECT (repeat it twice):

mysql> SELECT id, mobile, idcard FROM t_user WHERE id = 10000;

View the ShardingSphere-Proxy log and see if the route node is ds_1 or ds_2.

[INFO ] 2022-02-28 15:34:07.912 [ShardingSphere-Command-4] ShardingSphere-SQL - Logic SQL: SELECT id, mobile, idcard FROM t_user WHERE id = 10000
[INFO ] 2022-02-28 15:34:07.913 [ShardingSphere-Command-4] ShardingSphere-SQL - SQLStatement: MySQLSelectStatement(table=Optional.empty, limit=Optional.empty, lock=Optional.empty, window=Optional.empty)
[INFO ] 2022-02-28 15:34:07.913 [ShardingSphere-Command-4] ShardingSphere-SQL - Actual SQL: ds_1 ::: SELECT id, mobile, idcard FROM t_user WHERE id = 10000
[INFO ] 2022-02-28 15:34:21.501 [ShardingSphere-Command-4] ShardingSphere-SQL - Logic SQL: SELECT id, mobile, idcard FROM t_user WHERE id = 10000
[INFO ] 2022-02-28 15:34:21.502 [ShardingSphere-Command-4] ShardingSphere-SQL - SQLStatement: MySQLSelectStatement(table=Optional.empty, limit=Optional.empty, lock=Optional.empty, window=Optional.empty)
[INFO ] 2022-02-28 15:34:21.502 [ShardingSphere-Command-4] ShardingSphere-SQL - Actual SQL: ds_2 ::: SELECT id, mobile, idcard FROM t_user WHERE id = 10000

Switch to the primary database

Close the primary database ds_0:

Close primary database

(Zhao Jinchao, CC BY-SA 4.0)

View whether the primary database has changed and if the secondary database state is correct through DistSQL:

[INFO ] 2022-02-28 15:34:07.912 [ShardingSphere-Command-4] ShardingSphere-SQL - Logic SQL: SELECT id, mobile, idcard FROM t_user WHERE id = 10000
[INFO ] 2022-02-28 15:34:07.913 [ShardingSphere-Command-4] ShardingSphere-SQL - SQLStatement: MySQLSelectStatement(table=Optional.empty, limit=Optional.empty, lock=Optional.empty, window=Optional.empty)
[INFO ] 2022-02-28 15:34:07.913 [ShardingSphere-Command-4] ShardingSphere-SQL - Actual SQL: ds_1 ::: SELECT id, mobile, idcard FROM t_user WHERE id = 10000
[INFO ] 2022-02-28 15:34:21.501 [ShardingSphere-Command-4] ShardingSphere-SQL - Logic SQL: SELECT id, mobile, idcard FROM t_user WHERE id = 10000
[INFO ] 2022-02-28 15:34:21.502 [ShardingSphere-Command-4] ShardingSphere-SQL - SQLStatement: MySQLSelectStatement(table=Optional.empty, limit=Optional.empty, lock=Optional.empty, window=Optional.empty)
[INFO ] 2022-02-28 15:34:21.502 [ShardingSphere-Command-4] ShardingSphere-SQL - Actual SQL: ds_2 ::: SELECT id, mobile, idcard FROM t_user WHERE id = 10000

Now, INSERT another line of data:

mysql> INSERT INTO t_user(id, mobile, idcard) value (10001, '13521207777', '110xxxxx');
Query OK, 1 row affected (0.04 sec)

View the ShardingSphere-Proxy log and see if the route node is the primary database ds_1:

[INFO ] 2022-02-28 15:40:26.784 [ShardingSphere-Command-6] ShardingSphere-SQL - Logic SQL: INSERT INTO t_user(id, mobile, idcard) value (10001, '13521207777', '110xxxxx')
[INFO ] 2022-02-28 15:40:26.784 [ShardingSphere-Command-6] ShardingSphere-SQL - SQLStatement: MySQLInsertStatement(setAssignment=Optional.empty, onDuplicateKeyColumns=Optional.empty)
[INFO ] 2022-02-28 15:40:26.784 [ShardingSphere-Command-6] ShardingSphere-SQL - Actual SQL: ds_1 ::: INSERT INTO t_user(id, mobile, idcard) value (10001, '13521207777', '110xxxxx')

Finally, test SELECT(repeat it twice):

mysql> SELECT id, mobile, idcard FROM t_user WHERE id = 10001;

View the ShardingSphere-Proxy log and see if the route node is ds_2:

[INFO ] 2022-02-28 15:42:00.651 [ShardingSphere-Command-7] ShardingSphere-SQL - Logic SQL: SELECT id, mobile, idcard FROM t_user WHERE id = 10001
[INFO ] 2022-02-28 15:42:00.651 [ShardingSphere-Command-7] ShardingSphere-SQL - SQLStatement: MySQLSelectStatement(table=Optional.empty, limit=Optional.empty, lock=Optional.empty, window=Optional.empty)
[INFO ] 2022-02-28 15:42:00.651 [ShardingSphere-Command-7] ShardingSphere-SQL - Actual SQL: ds_2 ::: SELECT id, mobile, idcard FROM t_user WHERE id = 10001
[INFO ] 2022-02-28 15:42:02.148 [ShardingSphere-Command-7] ShardingSphere-SQL - Logic SQL: SELECT id, mobile, idcard FROM t_user WHERE id = 10001
[INFO ] 2022-02-28 15:42:02.149 [ShardingSphere-Command-7] ShardingSphere-SQL - SQLStatement: MySQLSelectStatement(table=Optional.empty, limit=Optional.empty, lock=Optional.empty, window=Optional.empty)
[INFO ] 2022-02-28 15:42:02.149 [ShardingSphere-Command-7] ShardingSphere-SQL - Actual SQL: ds_2 ::: SELECT id, mobile, idcard FROM t_user WHERE id = 10001

Release the secondary databases

Release the secondary database

(Zhao Jinchao, CC BY-SA 4.0)

View the latest primary-secondary relationship changes through DistSQL. The state of the ds_0 node is recovered as enabled, while ds_0 is integrated to read_data_source_names:

mysql> SHOW READWRITE_SPLITTING RULES;
+----------------+-----------------------------+------------------------+------------------------+--------------------+---------------------+
| name           | auto_aware_data_source_name | write_data_source_name | read_data_source_names | load_balancer_type | load_balancer_props |
+----------------+-----------------------------+------------------------+------------------------+--------------------+---------------------+
| replication_ds | mgr_replication_ds          | ds_1                   | ds_0,ds_2              | NULL               |                     |
+----------------+-----------------------------+------------------------+------------------------+--------------------+---------------------+
1 row in set (0.01 sec)

mysql> SHOW READWRITE_SPLITTING READ RESOURCES;
+----------+---------+
| resource | status  |
+----------+---------+
| ds_0     | enabled |
| ds_2     | enabled |
+----------+---------+
2 rows in set (0.00 sec)

Wrap up

Database high availability is critical in today's business environments, and Apache ShardingSphere can help provide the necessary reliability. Based on the above example, you now know more about ShardingSphere's high availability and dynamic read/write splitting. Use this example as the basis for your own configurations. 

Tags

Comments are closed.

Creative Commons LicenseThis work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.