Unfortunately, it turned out that the base85x encoding proposed in the last post is damn slow. To encode a 400 KB picture from memory to memory, the PHP implementation needs 0.2 seconds and the Java implementation needs 0.1 seconds. That’s only 2 MB/s for PHP and 4 MB/s for Java.
A lot of this time is spent on alphabet table lookups, so I hope to be able to reduce the number of lookups. The original base85 just adds 33 to every value 0,…,84 in order to get the right ASCII character. That is fast, but it uses every ASCII character between 33 and 117, including “<” and “>”.
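As a small illustration of the plain offset mapping described above, here is a minimal sketch. The class and method names are made up for this example and are not taken from any base85 library.

```java
final class PlainBase85Offset {
    // Original base85 mapping: digit 0..84 -> ASCII 33..117, no table needed.
    // (Hypothetical helper name, used only for this illustration.)
    static char encodeDigit(int digit) {
        return (char) (digit + 33);
    }

    public static void main(String[] args) {
        // Digits 27 and 29 land on two of the characters base85x wants to avoid.
        System.out.println(encodeDigit(27)); // ASCII 60, '<'
        System.out.println(encodeDigit(29)); // ASCII 62, '>'
    }
}
```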
So the question was: what is the simplest function that maps 0,…,84 to a subset of the 94 printable ASCII characters excluding < > " '? The best I could find was this one: 0 maps to 33, the values 1,…,20 are shifted by 39, and the values 21,…,84 are shifted by 42.
In Java I could reduce the encoding time from 0.1 s to 0.075 s by using the following formula instead of a lookup table:
out = (char) ((c1 >= 21) ? c1 + 42 : ((c1 != 0) ? c1 + 39 : 33));
I think some more improvements are possible, especially in C/C++, where you can do better bit operations and use unsigned variables.
Overall this leads to the following new base85x encoding alphabet:
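The alphabet can be regenerated directly from the formula above. The following is a minimal sketch, assuming that formula is the final mapping; the class and method names are invented for this example.

```java
final class Base85xAlphabet {
    // Maps a base-85 digit (0..84) to its ASCII character using the
    // branch-based formula from the post instead of a lookup table.
    static char encodeDigit(int c1) {
        return (char) ((c1 >= 21) ? c1 + 42 : ((c1 != 0) ? c1 + 39 : 33));
    }

    // Builds the full 85-character alphabet by applying the formula to 0..84.
    static String alphabet() {
        StringBuilder sb = new StringBuilder(85);
        for (int i = 0; i < 85; i++) {
            sb.append(encodeDigit(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(alphabet());
    }
}
```

With this formula the alphabet is "!" followed by the ASCII range "(" to ";" (codes 40 to 59) and then the range "?" to "~" (codes 63 to 126), which gives 1 + 20 + 64 = 85 characters.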
But why is base85 so much slower than base64? Well, 64 is a power of 2 (2^6 = 64), which means that a multiplication by 64 is equal to a binary shift of 6 positions to the left and a division by 64 is equal to a right shift of 6 positions. Additionally, the modulo operation a%64 is the same as keeping only the lower 6 bits of a (a & 63). All these essential operations are slower in base 85.
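To make the difference concrete, here is a rough sketch that splits one input group into digits for each base. It is not the code from either implementation; the names are hypothetical, and a long holds the 32-bit group so that it stays non-negative in Java.

```java
final class RadixOps {
    // Splits a 24-bit group into four base-64 digits using only shifts and masks.
    static int[] base64Digits(int v) {
        return new int[] {
            (v >>> 18) & 63,
            (v >>> 12) & 63,
            (v >>> 6) & 63,
            v & 63
        };
    }

    // Splits a 32-bit group into five base-85 digits (most significant first).
    // This needs a real modulo and a real division for every digit.
    static int[] base85Digits(long v) {
        int[] d = new int[5];
        for (int i = 4; i >= 0; i--) {
            d[i] = (int) (v % 85);
            v /= 85;
        }
        return d;
    }
}
```

For example, base85Digits(0xFFFFFFFFL) yields the digits 82, 23, 54, 12, 0, and each of them costs a division, while base64Digits gets by with a shift and a mask per digit.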