Why Broker Fail-over is not Enough? Broker Clustering With WSO2 Message Broker

First, to learn what is broker fail-over and how it is working with WSO2 Message Broker, read article "Configure Broker Fail-over in a WSO2 Message Broker and WSO2 ESB setup".

Why Broker Clustering?

Broker clustering is somewhat related to fail-over, but there are differences. In the above described fail-over case messages sent to the first broker is not visible to the second broker, as they are completely independent broker instances.

Thus when first broker goes down and it fails over to the second one, messages sent to the first broker is lost or message delivering order is broken.

For an example think of a system that in order message delivery is a mandatory requirement. If you receive messages, process it and insert processed information to a database, and for the second set of data to be inserted if it is mandatory first set of data is inserted, that has the above requirement. On the other hand, it cannot be neglected that HA must be there to get rid of single-point-of-failure.

There you need a Message Broker with proper clustering support.

Another scenario you might want clustering is to deal with "high messaging rates". When using one message broker instance if a particular "queue" gets many incoming messages and there are many subscribers for that queue, subscribers might starve waiting for a message to be delivered by the broker as the broker is hot with dumping incoming messages to the message store and also trying to deliver messages across all subscribers.

Scalable solution to above problem would be clustering. Some subscribers and publishers will be handled by a certain MB instance in the cluster. Whole load will be distributed among the nodes and as a result there will be no (or very little) net slowing down in publisher perspective or subscriber perspective.

If subscribers in a particular node consume messages fast, more messages should be dynamically routed to that node for delivery.

Further more, there should be a distributed algorithm to keep in order message delivery across the cluster.

WSO2 Message Broker 2.1.0 supports all above Fail-over and clustering scenarios and requirements.

A Broker Clustering Scenario with WSO2 MB

Setting up WSO2 MB Cluster

This is very straightforward and easy. There are various deployment patterns how WSO2 Message Broker can be clustered described here.

In this article we will follow simplest deployment pattern i.e "Start External Cassandra and ZooKeeper Servers and Point all Broker Nodes to Them"

Setup MB instances as described in the post "Configure Broker Fail-over in a WSO2 Message Broker and WSO2 ESB setup".
Now we will change the setup so that those 3 instances are members in a broker cluster rather than three standalone instances.
Start a Apache Cassandra Server on some machine as described here.
Start a Apache ZooKeeper Serveron some machine as described here.

As depicted in the same article, navigate to MBx/repository/conf/advanced/qpid-config.xml file and there you need to do two things.

Enable clustering
Change the server of the Zookeeper

<clustering>
 
          <enabled>true</enabled>
          <OnceInOrderSupportEnabled>false</OnceInOrderSupportEnabled>
          <externalCassandraServerRequired>true</externalCassandraServerRequired>
          <externalZookeeperServerRequired>true</externalZookeeperServerRequired>
          <coordination>
              <!-- Apache Zookeeper Address -->
              <ZooKeeperConnection>ZK_IP:2181</ZooKeeperConnection>
              <!-- Format yyyy-MM-dd HH:mm:ss -->
              <ReferenceTime>2012-02-29 08:08:08</ReferenceTime>
          </coordination>
....

Then we need to configure the message store. At each Message Broker node, modify $CARBON_HOME/repository/conf/advanced/qpid-virtualhosts.xml file as follows.

  <store>
     <class>org.wso2.andes.server.store.CassandraMessageStore</class>
     <username>admin</username>
     <password>admin</password>
     <cluster>ClusterOne</cluster>
     <idGenerator>org.wso2.andes.server.cluster.coordination.TimeStampBasedMessageIdGenerator</idGenerator>
     <connectionString>CASSANDRA_IP:9160</connectionString>
  </store>

Now as Cassandra server and zooKeeper server is running with necessary credentials we can start all three Message Broker servers.

Setting up WSO2 ESB

This is same as described at post "Configure Broker Fail-over in a WSO2 Message Broker and WSO2 ESB setup".

Testing the Setup

This is same as described in above article.
Now when MB1 is killed you will notice after the fail-over messages sent to MB1 is not lost. They are delivered back at some time later.

ESB Cluster with a MB Cluster

This is also now possible.

Single point-of-failure at Cassandra or Zookeeper??

Is problem back again? Maybe. But both Cassandra and ZooKeeper can be clustered, or connected in a ring model.

Where is in-order message delivery?

This is at an experimental level still. You can enable it and play with it. Navigate to MBx/repository/conf/advanced/qpid-config.xml file and enable it in all nodes. You need a restart of all nodes.

<clustering>
   <OnceInOrderSupportEnabled>true</OnceInOrderSupportEnabled>
</clustering>

Hasitha's Tech Blessings