Do We Actually Need JMS XA Transactions (Two-Phase Commit)?


A common use of JMS is to consume messages from a queue or topic, process them using a database or EJB, and then acknowledge/commit the message.

Say you are using WSO2 ESB to listen to messages in a queue of WSO2 MB, and you then need to write those messages to a database:

1. Reliably - without any message loss
2. Without Duplication - records cannot be duplicated



One way to achieve this is to use the message store and message processor implementations inside the ESB. Although that works, if you are dealing with a large volume of messages and do not care about message order, it is better to use raw JMS, because the message store-processor implementation has to do a lot of message conversions along the way.

As described in the post here, you can use JMS proxies to send messages to and receive messages from queues or topics.

One might think that JMS local transactions together with the transaction mediator of the ESB will solve the problem (with the necessary configuration on the Message Broker side for message redelivery and acknowledgment wait times). The issue is that these are two independent transactions that are not related to each other. In our case, if the DB transaction fails, the JMS transaction should fail as well.

Within the ESB flow, however, we can link the two as follows.

If all goes well:

1. The DB transaction is committed using <transaction action="commit"/>.
2. The JMS transaction is committed automatically.

If the DB transaction fails:

1. Catch the SOAP fault (for an endpoint timeout, a fault is raised only once the endpoint is suspended) and do <transaction action="rollback"/>.
2. Whenever you do the above, you should roll back the JMS transaction as well by setting <property name="SET_ROLLBACK_ONLY" value="true" scope="axis2"/>.

Even though the above is possible, there are edge cases in which it can go wrong.


The proper way to deal with it


The proper way to deal with the above is to use distributed transactions. The Transaction Mediator was added to support distributed transactions using the Java Transaction API (JTA). JTA allows applications to perform distributed transactions, that is, transactions that access and update data on two or more networked computer resources (for example, two databases, or a database and a message queue such as JMS).

If you are using more than one resource, e.g. reading a JMS message and writing to a database, you really should use XA: its purpose is to provide atomic transactions across multiple transactional resources. Without it, there is a small window between the point at which your database changes are committed and the point at which you commit/acknowledge the message. If there is a network, hardware or process failure inside that window, the message will be redelivered and you may end up processing duplicates.
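The failure window described above can be illustrated with a small simulation. This is only a sketch using in-memory stand-ins: the class `TwoCommitWindowDemo`, the `database` list and the `Deque` acting as a queue are hypothetical and are not part of JMS, JDBC or WSO2 ESB. It just shows how a crash between the two independent commits ends up as a duplicate row after redelivery.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory stand-ins; not a real JMS or JDBC API.
class TwoCommitWindowDemo {
    /** Rows committed to the pretend database. */
    static final List<String> database = new ArrayList<>();

    /**
     * Consumes one message using two independent commits.
     * If crashInWindow is true, the process "dies" after the database
     * commit but before the JMS commit, so the message stays on the queue.
     */
    static void consume(Deque<String> queue, boolean crashInWindow) {
        String msg = queue.peek();   // received in a transacted session, not yet removed
        database.add(msg);           // step 1: the database commit succeeds
        if (crashInWindow) {
            return;                  // failure: the JMS commit never happens
        }
        queue.poll();                // step 2: the JMS commit removes the message
    }

    /** Runs the scenario and returns how many rows exist for a single message. */
    static int rowsAfterCrashAndRedelivery() {
        database.clear();
        Deque<String> queue = new ArrayDeque<>();
        queue.add("order-42");
        consume(queue, true);        // first attempt fails inside the window
        consume(queue, false);       // the broker redelivers; second attempt succeeds
        return database.size();
    }
}
```

In this sketch one message produces two rows, which is exactly the duplication that XA (or a duplicate check in the consumer) is meant to prevent.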

Well, what is the problem?


The problem with XA is that it can be a bit slow, because the XA protocol requires multiple syncs to disk to ensure it can always recover properly under every possible failure scenario. This adds significant cost in terms of latency, throughput, resources and complexity. Also, quite a few EJB servers and databases don't actually support XA properly!


OK, then what is the suggestion?




So a good optimisation is to use regular JMS transactions - with no XA - and just perform some duplicate message detection in your code to check you have not already processed the message.


Or in pseudocode you could use something like the following...

-------------------------------------------------------

onMessage
try {
  if I have not processed this message successfully before {
    do some stuff in the database / with EJBs etc
    jdbc.commit() (unless auto-commit is enabled on the JDBC)
  }
  jms.commit()
}
catch (Exception e) {
  jms.rollback()
}
-----------------------------------------------------

This leads to much better performance, since you are not performing many slow syncs to disk (per transaction!) for the XA protocol. The only downside of this approach is that you have to use some application-specific logic to detect whether you have processed the message before. However, it's quite common to have some way of detecting this, e.g. if the message contains a purchase order and version, have you stored that purchase order and version in the database yet?
So provided the message has some kind of ID and version, you can often detect duplicates yourself and so avoid paying the performance cost of XA.

You can also use the Message.getJMSRedelivered() method (of course, with WSO2 ESB you can inspect headers) to detect whether a message has been redelivered, and perform the duplicate detection check only if it has been.
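As a rough illustration of that idea, here is a minimal sketch of a consumer that runs the duplicate check only for redelivered messages. The classes `OrderMessage` and `IdempotentConsumer` are hypothetical stand-ins invented for this example. With real JMS the flag would come from `Message.getJMSRedelivered()`, the set would be a lookup in your database, and the caller would still call commit on the JMS session whether or not the message was skipped.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a JMS message carrying a purchase order.
class OrderMessage {
    final String orderId;
    final int version;
    final boolean redelivered;   // with real JMS: Message.getJMSRedelivered()

    OrderMessage(String orderId, int version, boolean redelivered) {
        this.orderId = orderId;
        this.version = version;
        this.redelivered = redelivered;
    }

    /** Business key used to recognise a message that was already stored. */
    String key() {
        return orderId + ":" + version;
    }
}

class IdempotentConsumer {
    // Stands in for the orders table; a real consumer would query the database.
    private final Set<String> storedOrders = new HashSet<>();
    private int duplicateChecks = 0;

    /** Returns true if the order was written, false if it was skipped as a duplicate. */
    boolean onMessage(OrderMessage m) {
        if (m.redelivered) {
            duplicateChecks++;   // the check is only paid for on redelivery
            if (storedOrders.contains(m.key())) {
                return false;    // already stored: skip the database work
            }
        }
        storedOrders.add(m.key());   // "do some stuff in the database" and commit it
        return true;
    }

    int rowCount() {
        return storedOrders.size();
    }

    int duplicateChecks() {
        return duplicateChecks;
    }
}
```

Messages delivered for the first time go straight to the database write, so the extra lookup is only needed in the comparatively rare redelivery case.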

Hasitha Hiranya
