BLOBs and Messaging

BLOB stands for Binary Large OBject: typically an image, video, or audio file, though it can be any binary data. I'm often asked how Jabber handles BLOBs and why it doesn't encode and transfer MIME objects within the protocol, so I'd like to explain clearly why I think BLOBs are inappropriate in a diverse messaging infrastructure.

First, let's forget about email. Jabber is a new architecture and doesn't need to carry forward the baggage that email has accumulated. Email was designed at a time when the network was a very different place and very few protocols existed for transferring data via a standard method. So let's clear the slate and not incorrectly assume that we must operate like email.

The problem, quite simply, is that from a human point of view it is easy to think that a message is a message, whether that be simple text, marked-up text, a word processing document, a picture, an audio clip, or any other data. But from a technical point of view, there are some distinct differences.

Textual data is a base common denominator: it can inherently be presented on any human medium (viewed on any video display and spoken on any audio medium). Textual data is also the base common denominator across software, since every programming environment supports characters at its most fundamental level. Any data beyond simple text varies in support across human mediums and software environments.

Let me also define the "diverse" in diverse messaging infrastructure. By diverse, I mean that any two entities can communicate without any predefined or agreed-upon format or protocol for that conversation, while the infrastructure manages the differences between their environments and merges the conversation.

So with that background, what are the reasons for not sending BLOBs across a diverse messaging infrastructure?

  • Bandwidth and disk storage carry non-negligible costs when dealing with BLOBs, and when a BLOB is part of the message, the recipient must bear those costs without prior consent or knowledge of what the BLOB is. An architecture that enforces this is inappropriate; recipients should always be given the choice of whether or not to incur the costs associated with a BLOB.

  • Although illegal data can exist as small text, the majority of it exists as BLOBs in the form of images, audio, and large word processing documents. When BLOBs are allowed to be part of a message, the recipient may unknowingly or unwillingly receive illegal data, and would then possess a copy and have to dispose of it after having received it. Technologies such as messaging should not forcibly create a liability for their users.

  • The "last leg", or the path directly to the recipient user, may be restricted by bandwidth, medium, or environment (a public terminal, a workplace, etc.). The last leg should not be burdened with BLOBs it cannot handle, and should be given the choice of which ones to handle.

  • The characteristics of a diverse messaging infrastructure include routing logic, addressing, multi-hop delivery, medium and format translation, storage and redistribution, and handling of envelope data. None of these is BLOB-friendly or intrinsically adds value to a BLOB.

The appropriate way to handle BLOBs in this environment is by reference. Jabber passes a reference to a BLOB around as part of a message, and the BLOB is retrieved out of band via HTTP on demand at the will of the recipient. This model solves the issues I've pointed out above: all data is textual, the recipient doesn't incur the costs associated with the BLOB until they choose to, the recipient isn't forced to accept a copy of the BLOB before knowing what it is, and the receiving application only retrieves the BLOB if it can understand or present that data type.
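To make the by-reference model concrete, here is a minimal sketch in Python. It is only an illustration: the `blobref` element and its `type` and `size` attributes are names invented for this example, not Jabber's actual wire format, and the URL is a placeholder. The sender puts a short textual reference in the message, and the recipient decides whether to retrieve the BLOB before any of its bytes are transferred.

```python
import xml.etree.ElementTree as ET


def build_message(to, body, url, mime_type, size_bytes):
    """Build a text-only message that refers to a BLOB instead of carrying it.

    The <blobref> element and its attributes are hypothetical names chosen
    for this sketch.
    """
    msg = ET.Element("message", {"to": to})
    ET.SubElement(msg, "body").text = body
    ref = ET.SubElement(
        msg, "blobref", {"type": mime_type, "size": str(size_bytes)}
    )
    ref.text = url
    return ET.tostring(msg, encoding="unicode")


def reference_to_fetch(message_xml, supported_types, max_bytes):
    """Return the BLOB's URL if the recipient is able and willing to fetch it.

    Returns None when there is no reference, the data type cannot be
    presented, or the BLOB is larger than the recipient accepts.
    """
    ref = ET.fromstring(message_xml).find("blobref")
    if ref is None:
        return None
    if ref.get("type") not in supported_types:
        return None
    if int(ref.get("size", "0")) > max_bytes:
        return None
    return ref.text


wire = build_message(
    to="friend@example.com",
    body="Here is the photo I mentioned.",
    url="http://example.com/photo.png",  # placeholder URL
    mime_type="image/png",
    size_bytes=120_000,
)

# The recipient can display PNG and JPEG images and accepts up to 500 kB.
url = reference_to_fetch(wire, {"image/png", "image/jpeg"}, 500_000)
print(url)  # http://example.com/photo.png
```

Only when `reference_to_fetch` returns a URL would the client go on to retrieve the data over HTTP, for instance with `urllib.request.urlopen(url)`. A client on a restricted last leg can pass a small `max_bytes` or an empty set of supported types and never pay for the BLOB at all.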

Theoretically, any binary data can be encoded into textual data, which could be used to obscure the difference between a textual message and a BLOB. XML is an excellent divider in the grey area between a BLOB and textual data, since it is text-oriented and BLOBs are difficult to encode in it. Using XML as the structure for the envelope and message builds an inherent logic into the infrastructure: if the data is not text intended to be read by a human, does XML add value to it? If the answer is no, it should be passed by reference.
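As a rough illustration of what encoding binary data as text costs, the sketch below uses Base64, one common binary-to-text encoding (chosen here as an example; nothing above prescribes it). Every 3 bytes of input become 4 text characters, so the encoded form is about a third larger than the original, and the result is still opaque to anything that reads the surrounding XML.

```python
import base64


def base64_size(n_bytes: int) -> int:
    """Length in characters of the padded Base64 encoding of n_bytes bytes."""
    return 4 * ((n_bytes + 2) // 3)


blob = bytes(range(256)) * 12       # 3072 bytes of arbitrary binary data
encoded = base64.b64encode(blob)    # ASCII-only bytes that could be embedded as text

print(len(blob), len(encoded))      # 3072 4096
```

The encoded text is valid inside an XML element, but XML adds nothing to it: no router, translator, or human reader along the path can do anything with it except carry it.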

So, in a diverse messaging infrastructure such as Jabber, BLOBs are most appropriately handled by passing a reference to them within the protocol and allowing the endpoints to manage them.