
For me, 2007 looks like the year of messaging backplanes. As an architect, I think that's a wonderful thing. Messaging backplanes have long been a staple of enterprise messaging architectures, such as financial transaction applications, and I believe they are a very valuable addition to the standard telecom architecture. I'm doing two independent designs at the moment based on enterprise messaging technologies. The first is a gateway between two messaging architectures, designed to help customers migrate legacy messaging applications. It's an excellent project for me: if you want to truly understand a protocol, build a gateway that converts it to something else.
The second project is an internal project we'll announce sometime later in the year. For this one, we are using a hosted messaging backplane from Amazon called Simple Queue Service. (In fact, we are using nearly all of the Amazon APIs in the hosted version of this project, including the Elastic Compute Cloud, Simple Storage Service and, of course, Mechanical Turk.) Simple Queue Service, or SQS for short, is a simple but extraordinarily powerful idea. SQS allows software to send messages between applications in a reliable and scalable way, using Amazon's hosted service. Messages are created by message producers, stored in the queue, and read by message consumers. Many different message producers may add to the same queue, and many different consumers may read from it. Amazon guarantees that a message is read at least once, and will hold a message for at least fifteen days. In practice, messages tend to be consumed nearly instantaneously, but it's good to know you can go get a cup of coffee and not worry too much about missing a message. Messages can hold any data and, when combined with Amazon's Simple Storage Service (for example, by storing the payload there and passing a pointer to it in the message), can carry data of arbitrary size.
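To make "read at least once" a bit more concrete, here is a minimal sketch of how a queue with that guarantee typically behaves. It is a toy in-memory model, not the SQS API: the class `ToyQueue`, its method names, and the 30-second "visibility timeout" are all assumptions I picked for illustration. The idea is that a message a consumer has read is hidden for a while, and comes back if the consumer never acknowledges it.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a hosted queue (hypothetical, not the SQS API)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}              # message id -> [body, time it becomes visible]
        self._ids = itertools.count(1)

    def send(self, body, now=0.0):
        """Producer side: store a message that is visible immediately."""
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, now]
        return msg_id

    def receive(self, now):
        """Return (id, body) of one visible message, or None, and hide it for a while."""
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Acknowledge a message so it is never delivered again."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send("hello", now=0.0)

first = queue.receive(now=1.0)           # consumer A reads, then crashes before deleting
hidden = queue.receive(now=2.0)          # nothing visible while A "holds" the message
second = queue.receive(now=40.0)         # timeout has passed, so the message comes back
queue.delete(second[0])                  # consumer B finishes and acknowledges
after_delete = queue.receive(now=100.0)  # the queue is now empty
```

The practical consequence is that a consumer may occasionally see the same message twice, so it is safest to write consumers that tolerate a repeat.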
An example might help here. Imagine you are designing a billing system for a large telecom carrier. If you have a switch creating call detail records, you could store those CDRs as XML records in a local database. You could also take that XML data and put it into an SQS queue to be read by a far-end billing system. The billing system would wait for the XML record, and when it arrives, record it, bill it, invoice it, whatever. The calls to SQS are very simple and straightforward, and the service itself is quite inexpensive. Today's alternative implementation would probably use a standard such as RADIUS, which, while supported in the billing world, has no traction in the Web world outside of ISP billing. SQS is a simple and straightforward alternative.
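Here is a rough sketch of what the switch side and the billing side of that example might look like. Everything specific in it is an assumption of mine: the XML element names, the flat per-minute rate in cents, the placeholder queue URL, and the `FakeSqs` class, which is only a local stand-in so the code runs without an Amazon account. The three calls on the client (`send_message`, `receive_message`, `delete_message`) and their keyword arguments follow the shape of the SQS client in the boto3 Python SDK as I understand it.

```python
import xml.etree.ElementTree as ET


def cdr_to_xml(caller, callee, seconds):
    """Serialize one call detail record; the element names are made up."""
    cdr = ET.Element("cdr")
    ET.SubElement(cdr, "caller").text = caller
    ET.SubElement(cdr, "callee").text = callee
    ET.SubElement(cdr, "seconds").text = str(seconds)
    return ET.tostring(cdr, encoding="unicode")


def publish_cdr(sqs, queue_url, xml_body):
    """Switch side: put one CDR on the queue."""
    sqs.send_message(QueueUrl=queue_url, MessageBody=xml_body)


def bill_pending(sqs, queue_url, cents_per_minute=5):
    """Billing side: read a batch, rate each call, delete only after handling it."""
    response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)
    charges = []
    for message in response.get("Messages", []):
        cdr = ET.fromstring(message["Body"])
        minutes = -(-int(cdr.findtext("seconds")) // 60)   # round up to whole minutes
        charges.append((cdr.findtext("caller"), minutes * cents_per_minute))
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return charges


class FakeSqs:
    """Tiny local stand-in (hypothetical) so the sketch runs offline."""

    def __init__(self):
        self.stored = {}     # receipt handle -> message body
        self.counter = 0

    def send_message(self, QueueUrl, MessageBody):
        self.counter += 1
        self.stored[str(self.counter)] = MessageBody

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1):
        batch = list(self.stored.items())[:MaxNumberOfMessages]
        return {"Messages": [{"Body": b, "ReceiptHandle": h} for h, b in batch]}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.stored.pop(ReceiptHandle, None)


sqs = FakeSqs()
url = "https://example.invalid/cdr-queue"    # placeholder, not a real queue
publish_cdr(sqs, url, cdr_to_xml("+15550100", "+15550199", 125))
charges = bill_pending(sqs, url)             # one 125-second call, billed as 3 minutes
```

Deleting the message only after the charge has been recorded is the design choice that matters here: if the billing process dies halfway through, the CDR stays on the queue and is read again rather than lost.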
The advantages are numerous. First, since many producers can add to the same queue, you could use SQS to aggregate information from several sources. So, to extend our example, you could aggregate CDR data from many switches into a single outbound stream. Since SQS carries data only, different manufacturers can submit their XML into the queue without any impact on the billing system that consumes the data. Let's stay with the CDR example, and imagine that one of your twenty switches uses a different XML format than the other nineteen. That's pretty simple to handle with SQS. Create a piece of software that reads all of the XML from the aggregated stream, looks for the messages that need to be translated, translates them, and puts them back on the aggregated queue. Worried about the bottleneck? Don't be: you can have as many instances of the translator as you wish, since SQS guarantees that a message is read at least once, and if you wish, only once. The same argument holds for redundancy. You can have as many consumers or producers as you wish. If one fails, your throughput does drop, but the rest of the system is unaffected. (By the way, this is where the Elastic Compute Cloud just rocks. Throughput dropping? Take 60 seconds to spin up a hundred or so servers to take care of it until you catch up, then give the servers back. Since a server costs about ten cents an hour... well, you do the math.) This also makes components that hang off the message stream safer to deploy in production environments, as you can fractionally deploy components without affecting the core system.
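The translator idea might look something like the sketch below. It is deliberately simplified: a `deque` stands in for the aggregated queue, and the odd switch's format (a `callRecord` element with `from`, `to` and `duration` attributes) is something I invented for the example, as is the common `cdr` format it gets mapped onto.

```python
import xml.etree.ElementTree as ET
from collections import deque


def needs_translation(xml_body):
    """The odd switch is assumed to use a <callRecord> root instead of <cdr>."""
    return ET.fromstring(xml_body).tag == "callRecord"


def translate(xml_body):
    """Map the hypothetical vendor format onto the common <cdr> format."""
    src = ET.fromstring(xml_body)
    cdr = ET.Element("cdr")
    ET.SubElement(cdr, "caller").text = src.get("from")
    ET.SubElement(cdr, "callee").text = src.get("to")
    ET.SubElement(cdr, "seconds").text = src.get("duration")
    return ET.tostring(cdr, encoding="unicode")


def translator_pass(aggregated):
    """Read each queued message once and put it back, translated if needed."""
    translated = 0
    for _ in range(len(aggregated)):
        body = aggregated.popleft()
        if needs_translation(body):
            body = translate(body)
            translated += 1
        aggregated.append(body)
    return translated


# One message in the common format, one from the switch that does things differently.
aggregated = deque([
    "<cdr><caller>+15550100</caller><callee>+15550199</callee><seconds>125</seconds></cdr>",
    '<callRecord from="+15550123" to="+15550145" duration="42"/>',
])
count = translator_pass(aggregated)
roots = [ET.fromstring(body).tag for body in aggregated]
```

Because each translator instance only pulls a message, rewrites it, and puts it back, running several copies against a real hosted queue would need no coordination between them beyond the queue itself.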
The downside? Messaging backplanes do introduce a bit of delay into the system, and are probably inappropriate for communication between components where real-time operation and response times must be guaranteed. As I think about the heuristics of real-time design for communications infrastructures based on web services, my gut tells me that the logical place to draw components is where real-time communication is critical, where bottlenecks would occur, or where linear scaling is critical. Those components would then be best implemented in a Web Services, or SOA, architecture, with messaging backplanes handling scaling and communication between them.
We've been talking about Amazon SQS, which is undeniably the only reasonable hosted choice we have today. In a platform environment, there are a number of good choices, ranging from legacy offerings from Tibco down to the open source ActiveMQ offering. But honestly, the Amazon web services suite is so easy and inexpensive to use that it would take a lot for me to deploy my own platform. I suppose if you're Verizon, you could match the Amazon platform. Maybe. Either way, I expect to see communications architectures move from the inherently fragile legacy telecom designs towards new, service-oriented designs, all on the backs of messaging solutions.