Traum-Altbaubüro in Wiesbaden am Kranzplatz

Das Team vom Kullo vermietet ihr Büro zur Zwischenmiete unter.

Ein geschlossener Raum in Büroetage zusammen mit 2 netten Grafikern und einer Anwältin.

Aufteilung

Raum A: 2 Grafiker
Raum B: Anwältin
Raum C: zu vermieten, 4 Arbeitsplätze

Raum C

Zeitraum: April 2014 – März 2015 (12 Monate, ggf. verhandelbar)
Lage: Kranzplatz Wiesbaden
Arbeitsplätze: 4
Internet: 64 MBit inklusive
Küche: vorhanden

Kontakt

Alle Details bei Simon Warta
simon@kullo.net
0175 1930541
Ruft einfach jederzeit durch!

DSC00585 DSC00587 DSC00591

3 der 4 Schreibtische

3 der 4 Schreibtische

SONY DSC

4m hohe Wände

Beste Couch

Beste Couch

Second draft of the Kullo Letter Model

Kullo Letter Structure

letter receiver address
symmetric (RSA encrypted) algo string key string iv string
messageContainer (encrypted)
message (compressed)
header type enum to string cc string bcc string from string subject string date datetime letterId string[0-9a-Z] inReplyTo string
body
footer
imprint name string rank string company string logo file comments text
attachmentsIndex
item name string mime string start int length int
messageDigest string messageSignature string
attachmentsContainer (encrypted) attachedTo MessageId attachments List<Attachment> attachmentsDigest string attachmentsSignature string

Kullo Attachment Structure used within a Letter

Attachment data blob name string mime string length int

Datetime format in Kullo letters

When dealing with dates and times you can easily find a handful of formats representing the exact same time. For example

  • Timestamp: 1349697903
  • German human: 8.10.12 14:05:03
  • MySQL: 2012-10-08 14:05:03
  • E-Mail: Mon, 08 Oct 2012 14:05:03 +0200
  • ISO 8601: 2012-10-08T14:05:03+02:00

The granularity of all these five formats is 1 second, which is fine for online actions. For our daily workflow it would be enough to see the minute of a message in our inbox, but sometimes we need to know the order of a bunch of messages that might arrive within a single minute.

The date format for Kullo should be human readable. Developers have to love it for testing and debugging because they create our software! So the timestamp is not an option. A timestamp might perform well when doing a lot of calculations time differences and time ranges since it is an integer. But we don’t need that.

Local formats do not make a lot of sense as well. They will come into play for client software later.

The information about the time zone is helpful when sending messages around the globe. They allow to refer to your local time without calculating into UTC, which would reduce the readability. But a reply message sent at nine o’clock from New York City will be sorted behind the original message sent at ten o’clock from Berlin.

We are not interested in the day of the week, since that is a redundant information. Client software will look it up if necessary.

Another aspect is sortability. When sorting German dates, you have to split and convert the date string to a vector ((int) year, (int) month, (int) day, (int) hour, (int) minute, (int) second), sort it lexicographically, and convert it back again. With MySQL and ISO 8601 you can do simple lexicographical string sorting because we have a correct order and always have leading zeros.

Thus ISO 8601 is in my opinion the best overall format for international date and time fields. It’s strict structure make it easy to convert. There is no white space, nothing optional et cetera. Since ISO 8601 defines a bunch of different formats for different purposes, we only use the profile specified the NOTE-datetime by the W3C:

YYYY-MM-DDThh:mm:ssTZD

where

YYYY = four-digit year
MM = two-digit month (01=January, etc.)
DD = two-digit day of month (01 through 31)
hh = two digits of hour (00 through 23) (am/pm NOT allowed)
mm = two digits of minute (00 through 59)
ss = two digits of second (00 through 59)
TZD = time zone designator (Z or +hh:mm or -hh:mm)

Base85x-java available on SourceForge

I am thrilled to announce that the first base85x code is available online on SourceForge for free! During the last couple of weeks I tried differen approaches to implement base85x in java with a good performance.

The project as well as the source code is available here: https://sourceforge.net/projects/base85xjava/. Downloads will be packed as soon as there are more tests and better comments in the code.

Next step will be a simple but less efficient PHP implementation in order to be able to provider an online encoder and decoder for base85x.

Cheers!

Minor correction of base85x encoding

In the transformation between base 85 values and ASCII characters there was a minor mistake last time. 40 minus 2 is 38 of course, not 39.

The corrected formulars are

From base 85 to ACSII
out = (c >= 51 ? c+42 : (c >= 22 ? c+41 : (c >= 2 ? c+38 : c+36)));

From ASCII to base 85
c = ((in >= 93) ? in-42 : ((in >= 63) ? in-41 : ((in >= 40) ? in-38 : in-36)));

Getting rid of the backslash \ in the base85x alphabet

Escaping sucks, right? Last version of the base85x alphabet included the backslash character which is used in a bunch of popular programming languages like for example C, Java and PHP.

But we want to be able to store small binary files inside of sourcecode without taking care about escaping. Just like

$image_data = '
?%AvZOxjj)vwXPQ_*hhA2Sj~vME^X32O]TMX/0+P41T2AtZ4;ER04^WhSDqym%r;
ei,%N}}O|3b3w{WawiL4twR51zu4CEQ{OvNL/ebH:wK)KWF{x5IC.C2[Hlpq|vh9
9g/]vUEa6/,lF[YjpYrV%d2zqr5nQlp)PXzqXbWtE}]ZcN*FKK2$~YcoYK6S:W4G
^u9Yw:h^m6d.ukMTGEiCt^8?(ybvbCr,ij;QDoHTjuV(,$niJj7,);/3Mx9oVygq
%w9q*i0_pBy)Np~(/oYnO32H8t~3u+gAxF6M.YutDQ+v]Ege$UDFCb6lnAWjJ2un
DfCV.(h%f}N4YL(knk+dcy{bc%4,.],7}Uo{ox(*lRH,r.AoHi,614GxIF7?8Oy[
g_u3hyf5e?/s?/cPbvuP3VSmpIRiWgT})HmnZ*,pFE[%y*mH^rQZo]iF3P)bg0wY
Ko%){H.qQhA/GdlS^E,X}M/6Nvt{Vti2Z{k9A|C(35mc1rkml+IP,Jva6.P1Qs[f
nwAY.m}i?cyimPKMl%9Xq](WEabr77
';

Thus the base85x alphabet changed as follows:

In formular style that’s the same as

out = (char) (c1 >= 51 ? c1 + 42 : (c1 >= 22 ? c1 + 41 : (c1 >= 2 ? c1 + 39 : c1 + 36)));

0
$
1
%
2
(
3
)
4
*
5
+
6
,
7
-
8
.
9
/
10
0
11
1
12
2
13
3
14
4
15
5
16
6
17
7
18
8
19
9
20
:
21
;
22
?
23
@
24
A
25
B
26
C
27
D
28
E
29
F
30
G
31
H
32
I
33
J
34
K
35
L
36
M
37
N
38
O
39
P
40
Q
41
R
42
S
43
T
44
U
45
V
46
W
47
X
48
Y
49
Z
50
[
51
]
52
^
53
_
54
`
55
a
56
b
57
c
58
d
59
e
60
f
61
g
62
h
63
i
64
j
65
k
66
l
67
m
68
n
69
o
70
p
71
q
72
r
73
s
74
t
75
u
76
v
77
w
78
x
79
y
80
z
81
{
82
|
83
}
84
~

New alphabet for the base85x encoding

Unfortunately it turned out that the base85x encoding proposed in the last post is damn slow. To encode a 400 KB picture from memory to memory, the PHP implementation needs 0.2 seconds and the Java implementation needs 0.1 seconds. That’s only 4 MB/s.

A lot of this time is spent on alphabet table lookups. So I have the hope to be able to reduce the number of lookups. The original base85 just adds 33 to every value 0,…,84 in order to get the right ASCII character. That is fast but leads to all ASCII characters between 33 and 117 including “<” and “>”.

So the question was, what is the simplest function to map 0,…,84 to a subset of the 94 printable ASCII values excluding: < > ” ‘. The best I could find was this one:

In Java I could reduce the encoding time from 0.1 s to 0.075 s by using the following formula instead of a lookup table:

out = (char) ((c1 >= 21) ? c1 + 42 : ((c1 != 0) ? c1 + 39 : 33)));

I think some more enhancements are possible, especially in c/c++ where you can do better binary operations and use unsigned variables.

Overall this leads to the following new base85x encoding alphabet:

0
!
1
(
2
)
3
*
4
+
5
,
6
-
7
.
8
/
9
0
10
1
11
2
12
3
13
4
14
5
15
6
16
7
17
8
18
9
19
:
20
;
21
?
22
@
23
A
24
B
25
C
26
D
27
E
28
F
29
G
30
H
31
I
32
J
33
K
34
L
35
M
36
N
37
O
38
P
39
Q
40
R
41
S
42
T
43
U
44
V
45
W
46
X
47
Y
48
Z
49
[
50
\
51
]
52
^
53
_
54
`
55
a
56
b
57
c
58
d
59
e
60
f
61
g
62
h
63
i
64
j
65
k
66
l
67
m
68
n
69
o
70
p
71
q
72
r
73
s
74
t
75
u
76
v
77
w
78
x
79
y
80
z
81
{
82
|
83
}
84
~

But why is base85 so much slower compared to base64? Well, 64 is a power of 2 (2^6 = 64) which means that a multiplication by 64 is equal to a binary shift of 6 positions to the left and a division by 64 is equal to a right shift of 6 positions. Additionally the modulo operation a%64 is the same as cutting the lower 5 bits of a. All these essential operations are slower in base 85.

The base85x encoding

How to store binary data within XML files

Kullo will use a slightly modified form of the base64/base85 encoding to store binary files inside a xml file: base85x.

The general idea is to encode 4 bytes of binary data into 5 bytes of printable characters, which means that on the one hand, file size increases by 25 % but on the other hand binary data can be displayed, copy’n’pasted and stored in text files and source code.

Given four binary bytes b_1, …, b_4 out of [0, 255], we use the following base transformation to get the values c_1, …, c_5 in the range of [0, 84]:
7c0d52b0582cf30c85e6fe72bf2f2455

Then we use another alphabet than Adobe’s base85, since that includes the crucial XML characters < and >. The alphabet we’re using is the following:

0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
A
11
B
12
C
13
D
14
E
15
F
16
G
17
H
18
I
19
J
20
K
21
L
22
M
23
N
24
O
25
P
26
Q
27
R
28
S
29
T
30
U
31
V
32
W
33
X
34
Y
35
Z
36
a
37
b
38
c
39
d
40
e
41
f
42
g
43
h
44
i
45
j
46
k
47
l
48
m
49
n
50
o
51
p
52
q
53
r
54
s
55
t
56
u
57
v
58
w
59
x
60
y
61
z
62
?
63
!
64
.
65
:
66
-
67
_
68
,
69
;
70
(
71
)
72
[
73
]
74
=
75
|
76
{
77
}
78
*
79
+
80
@
81
#
82
%
83
~
84
^

There are 94 printable characters in ACSII (plus the space ” “). The spare characters we are not using are: "$&'/<>\`

All these 94 ASCII chars are represented as 1-Byte characters in UTF-8, thus the size will not increase when storing base85x blocks in XML files.

In order to get consistent results, we need to use a standardized padding if the input data is not divisible by 4: We add as many ^ characters as we need to get a full 4-bytes block. And to store the number of tildes which were added, we trim the resulting 5-bytes output by exactly this number.

If we had for example 42 bytes of binary data, ending on “n!”, the padding rule expands this block to “n!^^” before encoding.

base85xEncode(“n!^^”) = 110*256³ + 33*256² + 94*256 + 94 = (35,33,54,28,76)85 = ZXsS{

In this Example, we cut the last 2 characters and get “ZXs”

Note: Since the first version of this encoding there have been incompatible updates in the alphabet. See all related posts.