Saturday, May 05, 2012

Base64 Encoding in Platypus


Base64 encoding was originally designed for transmitting binary data over SMTP.  When email servers were first created, the SMTP protocol restricted message content to 7-bit ASCII characters. Among other things, this prevented binary files from being sent over email.

Then came MIME and with it, base64 encoding, which converted binary files into readable text suitable for sending over email. With base64, every 3 bytes of binary input produce 4 characters of encoded text. Compared with hex (base16) encoding, the other dominant encoding mechanism at the time, base64 is considerably more space-efficient: it lowers the ratio of encoded size to original size for attachments from 2:1 (for hex) to 4:3 (for base64).  Since its creation, base64 encoding has been adopted in a variety of internet protocols and formats, including SMTP authentication, HTTP authentication, and XML, and it is used in various subsystems of the Platypus Billing System.
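To make the 2:1 versus 4:3 comparison concrete, here is a minimal sketch using Python's standard library rather than anything from Platypus. The helper name encoded_sizes is made up for this illustration.

```python
import base64
import binascii


def encoded_sizes(payload):
    """Return (hex_length, base64_length) for the given bytes."""
    hex_text = binascii.hexlify(payload)   # 2 characters per input byte
    b64_text = base64.b64encode(payload)   # 4 characters per 3 input bytes
    return len(hex_text), len(b64_text)


# Three input bytes become exactly four base64 characters.
print(base64.b64encode(b"Man"))        # b'TWFu'

# 768 bytes: hex doubles the size, base64 adds only one third.
print(encoded_sizes(bytes(768)))       # (1536, 1024)
```

When the input length is not a multiple of 3, base64 pads the final group with "=" characters, so the encoded length is always a multiple of 4.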

While Visual FoxPro - the base language for the Platypus Billing System - supports base64 encoding and decoding through STRCONV(), that function does not follow the line length requirement in the RFC 2045 specification, which limits encoded lines to a maximum of 76 characters. For example, if data were encoded into 100 characters, RFC 2045 calls for two lines of encoded data: the first with 76 characters, followed by CRLF, and then the remaining 24 characters.
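The line length rule is easy to picture in code. The following is only a sketch in Python, not the Visual FoxPro or C++ code used in Platypus, and the names encode_wrapped and MAX_LINE are hypothetical. It encodes 75 bytes, which yields the 100 encoded characters from the example above, and splits the output into CRLF-separated lines.

```python
import base64

MAX_LINE = 76  # RFC 2045 limit on the length of an encoded line


def encode_wrapped(payload, width=MAX_LINE):
    """Base64-encode payload and break the text into CRLF-separated lines."""
    text = base64.b64encode(payload).decode("ascii")
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    return "\r\n".join(lines)


wrapped = encode_wrapped(bytes(75))    # 75 bytes -> 100 encoded characters
print([len(line) for line in wrapped.split("\r\n")])   # [76, 24]
```

An encoder that skips the wrapping step would emit all 100 characters on a single line, which is the behavior that stricter mail servers reject.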

At first, this limitation was not a problem.  We used third-party ActiveX/COM libraries from Mabry Software, named EncodeX and SmtpX, for encoding email attachments and sending emails, respectively.  The other use of base64 within Platypus - our own attachment feature first included in Platypus v3 - did not have to interact with any external systems, so RFC 2045 compliance was not a requirement.

Eventually, we began work on creating a shared library that could wrap all our SMTP functionality for Platypus v5.  Up until this point, there were separate classes and libraries for the Platypus client and API.  For one, the Platypus client used ActiveX forms of the libraries, while the Platypus API used the COM forms.  Of course, this made maintaining that code doubly difficult and it was prone to inconsistent behavior between the Platypus client and API.  Because of that inconsistency, limitations in Visual FoxPro, and compatibility problems, we dropped the ActiveX form of SmtpX and completely dropped the EncodeX libraries.

Unfortunately, because FoxPro's STRCONV() function did not strictly follow MIME, sending attachments encoded using this function would often generate errors.  The strange thing was that these errors were not universal across all mail servers.  Some were more strict than others.  Anyway, to move forward, we decided to develop our own library for base64 encoding.

This was my first C++ project with Platypus.  Up until this point, my development experience - excluding school - was limited to Visual FoxPro, Visual Basic, and SQL.  After an excessive amount of research, we found a decent example in the public domain which performed encoding quickly enough and adapted it into a COM library using Visual C++ 6.0.  That library has been in use ever since, for a majority of Platypus v4 and all of Platypus v5 and v6.  Not bad for a piece of code 10 years old.  With the release of Platypus v7, that COM library has been phased out in favor of the Mailbee SMTP COM library, which handles encoding internally, but the old base64 COM library is still included in our installation sets as part of a fallback feature.

The original C++ source for the encoder has been phased out of our other components as well, replaced with a much improved library.  This new library takes advantage of Microsoft's Secure CRT (the security-enhanced functions of the C runtime) to prevent buffer overflows, now includes an efficient decoder, and has been fuzz tested to ensure stability.  It is currently in use within the updated Mailpopper for re-encoding email attachments that have been decoded by the Mailbee POP3 COM library, which pulls emails into the Helpdesk features in Platypus.
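For readers unfamiliar with fuzz testing an encoder and decoder pair, the basic idea is a round trip over random inputs. The sketch below shows it in Python with the standard library. It is not the harness used for the Platypus C++ library, and the function names, iteration count, and size limit are all assumptions chosen for illustration.

```python
import base64
import random


def encode_wrapped(payload, width=76):
    """Base64-encode payload with CRLF line breaks every `width` characters."""
    text = base64.b64encode(payload).decode("ascii")
    return "\r\n".join(text[i:i + width] for i in range(0, len(text), width))


def decode_wrapped(text):
    """Strip the CRLF line breaks and decode back to the original bytes."""
    return base64.b64decode(text.replace("\r\n", ""))


def fuzz_round_trip(iterations=1000, max_len=512, seed=0):
    """Check that random payloads survive an encode/decode round trip."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(iterations):
        size = rng.randint(0, max_len)
        payload = bytes(rng.getrandbits(8) for _ in range(size))
        encoded = encode_wrapped(payload)
        # No encoded line may exceed the RFC 2045 limit.
        assert all(len(line) <= 76 for line in encoded.split("\r\n"))
        # Decoding must return exactly what was encoded.
        assert decode_wrapped(encoded) == payload
    return iterations


fuzz_round_trip()
```

A round trip like this only exercises well-formed input. Fuzzing a decoder written in C++ would also feed it malformed text, such as truncated groups or characters outside the base64 alphabet, since that is where buffer handling mistakes tend to surface.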
