Parsing OpenPGP Key Export Format with JavaScript
If you tuned in last time to my post about scalar numbers and multiprecision integers, you might think that I'm just writing a book report for the OpenPGP spec. That’s not actually the case. My goal is to write a JavaScript app that allows encryption and decryption of OpenPGP messages, and I’m trying to document my progress as it happens. Reading the spec and obsessing over every definition will only get us so far. I’m tired of reading, so now it’s time to write some code.
In this post, I will be walking through the process of reading OpenPGP public key files with JavaScript. Before we can get on to the more glamorous work of encrypting and decrypting OpenPGP messages, we need a way to get public and private keys into our app. And even before we can do this, we need a way of getting JavaScript to simply read data from the files themselves.
OpenPGP data files and JavaScript
OpenPGP data files, which can include public keys, private keys, or encrypted messages (among other things), can be stored as raw binary or as what the spec calls ASCII Armored data. “ASCII Armor” (see section 6.2 in the spec) is a fancy way of saying that the binary data is Base64-encoded and saved as ASCII along with a checksum and some particular headers. When you decode an ASCII Armored data file, you should get the exact same binary data as if you had opened an equivalent binary file. (That’s part of what we’re going to prove today.)
Unfortunately, JavaScript makes it a bitch to work with binary files, so that’s why I’m devoting a post solely to reading files into JavaScript and Base64 decoding ASCII Armor into binary. We’ll actually parse the binary into meaningful data structures at a later time. In order to properly decode ASCII Armored data, we’ll need JavaScript functions to Base64 decode/encode as well as a CRC24 checksum function. The spec helpfully describes how all of this will work, and we’re going to implement it!
Export your Public Key files with GnuPG
GnuPG is what all the cool kids use for PGP encryption these days, and ensuring interoperability with their export files and messages is a big priority for my project (although they can keep their license). If you don’t already have it set up, check out this excellent intro to OpenPGP by Zachary Voase, which includes installation instructions and usage tips for GnuPG. Once you have it set up and you have at least one entry in your public keyring (probably your own personal public key/subkey), export your keyring to both binary and ASCII Armored files using the following commands:
gpg2 --export > pubkeys.gpg
gpg2 --armor --export > pubkeys.asc
Opening ASCII Armored keyfiles with JavaScript’s File API
The previous commands exported two files: pubkeys.gpg, which is binary, and pubkeys.asc, which is ASCII Armored. Using a relatively new “HTML5” feature in JavaScript (namely the File API), we can read the data from these files directly in the browser without first posting them to a server or anything ugly like that. This might not work on older browsers, so everyone using those browsers can cry me a river.
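If you'd rather fail politely on those older browsers, a feature check is cheap. This is only a sketch: supports_file_api is a name I made up for illustration, and it just looks for the three constructors the File API is built on.

```javascript
// Hypothetical helper: report whether the given global object exposes
// the pieces of the File API that the rest of this post relies on.
function supports_file_api(scope) {
  return !!(scope.File && scope.FileReader && scope.FileList);
}

// In a browser, pass the window object. Outside a browser this is just false.
var can_read_files = typeof window != 'undefined' && supports_file_api(window);
```

If can_read_files comes back false, you could hide the file input and show a message instead.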
In order to open a file with JavaScript, we need one of those file uploader form input elements. I shall give it the id ascii_keyfile.
<input type="file" id="ascii_keyfile"/>
Next we use JavaScript to bind an event to the file input so we can call a function when it changes. Note that I’m using MooTools for the event binding and document.getElementById shortcutting, but you can easily substitute your own favorite JavaScript framework as needed.
window.addEvent('domready', function() {
  $('ascii_keyfile').addEvent('change', open_ascii_keyfile);
});
var open_ascii_keyfile = function() {
  var file = $('ascii_keyfile').files[0];
  read_file(file, parse_ascii_keyfile);
};
var read_file = function(file, callback) {
  var reader = new FileReader();
  reader.onload = function(evt) {
    var binary = evt.target.result;
    callback(binary);
  };
  reader.readAsBinaryString(file);
};
var parse_ascii_keyfile = function(data) {
  // ... ACTUALLY DECODE THE FILE HERE ...
};
This code binds a change event to the ascii_keyfile input. When you open your file via the uploader button, this event fires and triggers the open_ascii_keyfile function. This pulls the File object for the attachment out of the input and sends it into the read_file function along with the parse_ascii_keyfile callback function. Since reading data from the file may take a while, it happens asynchronously. The read_file function starts this process and, when it completes, it sends the file’s data into the parse_ascii_keyfile callback.
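One caveat on read_file: readAsBinaryString is flagged as deprecated in current browser documentation, with readAsArrayBuffer as the suggested replacement. Here's a rough sketch of the same flow on top of readAsArrayBuffer. The names bytes_to_binary_string and read_file_as_binary_string are mine, not part of any library, and the chunk size is an arbitrary choice.

```javascript
// Convert raw bytes into a "binary string": one character per byte (0-255).
function bytes_to_binary_string(bytes) {
  var chunks = [];
  var CHUNK = 8192; // arbitrary; keeps the apply() argument list small
  for (var i = 0; i < bytes.length; i += CHUNK) {
    chunks.push(String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK)));
  }
  return chunks.join('');
}

// Hypothetical stand-in for read_file (browser only, needs FileReader).
function read_file_as_binary_string(file, callback) {
  var reader = new FileReader();
  reader.onload = function(evt) {
    // evt.target.result is an ArrayBuffer here, not a string.
    callback(bytes_to_binary_string(new Uint8Array(evt.target.result)));
  };
  reader.readAsArrayBuffer(file);
}
```

The callback still receives a string with one character per byte, so the rest of the code in this post shouldn't need to change.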
If all goes well, the file will be read and passed into parse_ascii_keyfile as ASCII text. It should look something like this:
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v2.0.19 (Darwin)

mQENBFEN6EoBCADChZ+c6Q84tJ+WLTKYfhdN49OTUlxmoZD8cou6Bdi/EKXvpciA
ydnD+SmlYf4pjAOwEiEsKJ6swLORAam4q0pnW9gAALbclhwDf9J4sLwUkh4F4D9P
6TJX2vPEk4WRkudkj2TW3H2Wn1d7fQ3zlwLtK/bC5YeajuAIAk1m5zCtMbeZoYGc
FWU+Max2G4Xr1/5JmUzfVtVSlxdJj7SX1FtJ/zj/eWklKNtl05yBWA+NyFpkgkzR
DP+oJYBPdNoyS5mqNNIEnIIjDAUiufhGzkk2+865gIOH9X2WWCB5p0EGsR8ZzZA6
H379WPca+GTlu5JncEi7lLcg+eQRwxQu9S6XABEBAAG0QEplZmYgTCAoaHR0cDov
L3J1YmJpbmdhbGNvaG9saWMuY29tKSA8amVmZkBydWJiaW5nYWxjb2hvbGljLmNv
bT6JATgEEwECACIFAlEN6EoCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJ
EJmhufkmy7JniVYH/3Mjgo2gDDsc8tTPaIsBbYacB40pMOMX7+KxSQktrUZkGqwJ
TlGnfBB4R8jz+32dBjX/OmeGYFTl9xFMBx+MuQlHW9Sl0ffV+Gpk9YbebBZaPn7Y
5OpinF9e7zuFH7MyI72SIM7S1CvvfP3QrYj7viBitddJ+eW3Vx3ANgpkr8Bj9auH
oT043dlfm/xpqozOLwbVM0BADJge0zQvNKGpoZjoHU2mNSSGhWhXAPCRp9wVOCCE
SxbL+2Wi++ZUQUXO9DIxAQJy6HJfPx+PvBeedGAisfovNwtT0tDfJvyyPnRTKtEz
TWiWDYwNUY80A6o/KkmSxSs5OeSh4t08phBIzWu5AQ0EUQ3oSgEIAMZf+w8pVqj/
ZUQtacxzDe52kz+HtljJq4ltxulxQtoln5VkP5vWGq3uF1RFBoLVZ0OE/61yZixG
8pOPMiGzHWJtidtQk7GxT/Z/b34voeTeruZjpfm3ty14sQvmApaRpjEQaNFTPy7d
DiJKqGkD7teb/Mx8rtWJpN60hTiww1cOP5VjBvC82mn6uZ9DU2vJ6VwBTmwYnZMa
XLiGRIpEAOqtLag4XwYrHS04H7No3asxSGhlyVN2KnxvlIMwoTZ+bTVaOr2ivCIC
el1dY2kC5LsfMa04z2Ne7fme+pnGM62ufC+l/T9H58vsw1VFl5vanYmJugtFzxHF
HzU3atdbHzEAEQEAAYkBHwQYAQIACQUCUQ3oSgIbDAAKCRCZobn5JsuyZ138CACu
mdutchMDVE7V8ewhzsOCHgSMQjnmkB0HFCll2RxbhLz6x8SmzcQK107XbHQwFCdF
A5v4JgFtwb6b9W9WShemNvC7tNx/loo2C+EiUKA9tURo/rJORu6S1jR79BaaOUUj
MsB/jxxF2eRzE86SzgWXj34pYyoqJeMaiLSdXcCNW8eyN1i3gf8XpMlM7Ldv0Bq7
vqbU2sDXBQvPDbNyhVIZjqfjTOBJl54NWHYRXlybFaSrXb7Qg/9ac+54TPpgCBTs
1kR/HSZDujWE891NqlKGpSN4MDyi3WRL2RVbW0s5+8f8odNJuswIo1tWiNXBHVXs
2/eCtlrSbyoTGYj0ErY0
=a7UR
-----END PGP PUBLIC KEY BLOCK-----
There are two parts to this data that we care about. First we have the big block of ASCII characters grouped 64 to a line (the spec allows lines of up to 76). That’s the actual data encoded as Base64. Then, the second to last line is 4 more Base64 characters preceded by an equal sign; in this case it’s =a7UR. That’s our CRC24 checksum (stay tuned for more on that).
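To make that layout concrete, here's a little sketch that walks an armored block line by line and labels each part. It's only an illustration of the structure, not a validator: describe_armor is a name I invented, and it assumes the armor headers end at the first blank line, as the spec describes.

```javascript
// Hypothetical helper: label the parts of an ASCII Armored block.
// Returns null if the text never reaches an END line.
function describe_armor(text) {
  var lines = text.replace(/\r\n?/g, '\n').split('\n');
  var parts = { type: null, headers: [], body: [], checksum: null };
  var state = 'before';
  for (var n = 0; n < lines.length; n++) {
    var line = lines[n];
    if (state == 'before') {
      var begin = /^-----BEGIN (.+)-----$/.exec(line);
      if (begin) { parts.type = begin[1]; state = 'headers'; }
    } else if (state == 'headers') {
      // Armor headers (like "Version:") run until the first blank line.
      if (line.trim() == '') state = 'body';
      else parts.headers.push(line);
    } else if (state == 'body') {
      if (line.charAt(0) == '=') {
        parts.checksum = line.slice(1, 5); // the 4 characters after "="
        state = 'tail';
      } else if (/^-----END /.test(line)) {
        state = 'done';
      } else if (line != '') {
        parts.body.push(line);
      }
    } else if (state == 'tail' && /^-----END /.test(line)) {
      state = 'done';
    }
  }
  return state == 'done' ? parts : null;
}
```

Feeding it the export above should give you a type of PGP PUBLIC KEY BLOCK, one Version header, the Base64 lines in body, and a7UR as the checksum.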
Isolating and decoding our Base64 data
The first thing to do is to isolate the Base64-encoded payload data from the block above. We can do that with regular expressions:
var parse_ascii_keyfile = function(data) {
  // Normalize line endings so CRLF and CR files behave like LF files.
  data = data.replace(/\r\n?/g, '\n');
  // Our data begins right after the first blank line.
  var blank_line_index = data.search(/^\n/m);
  // Our data ends right before the checksum line which starts with an "="
  var body_end_index = data.search(/^=/m);
  // Both of these indexes need to exist for the file to be readable.
  // (Check before adding 1, or a failed search would look like index 0.)
  if (blank_line_index == -1 || body_end_index == -1) {
    alert('This is not a valid ASCII-Armored OpenPGP export file.');
    return false;
  }
  var body_begin_index = blank_line_index + 1;
  // Pull the body out of the data and strip all newlines from it
  var body = data.substring(body_begin_index, body_end_index);
  body = body.replace(/\n/g, '');
  // Grab the checksum while we're at it...
  var body_checksum = data.substr(body_end_index + 1, 4);
...
Now we’ve isolated our Base64-encoded data into the body variable. It’s time to decode it! Some browsers natively support Base64 encoding and decoding via the btoa(data) and atob(text) functions, respectively, but I don’t trust them. I’ve seen Base64 implementations that try to “helpfully” UTF-8-encode/decode the incoming/outgoing data, and this will actually break our binary data. Plus I was geeking out on this awesome guide to Base64 in JavaScript (if you don’t know what Base64 is or roughly how it works, check it out), so I decided to write my own:
var base_64 = {
  chars: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/',
  // Encode a binary string (one character per byte) as Base64 text.
  encode: function(data) {
    var output = '';
    // Declare i and c with var so they don't leak into the global scope.
    for (var i = 0, c = data.length; i < c; i += 3) {
      // Past the end of the string charCodeAt gives NaN, which the
      // bitwise operators treat as 0; the padding is fixed up below.
      var char1 = data.charCodeAt(i) >> 2;
      var char2 = ((data.charCodeAt(i) & 3) << 4) | data.charCodeAt(i+1) >> 4;
      var char3 = ((data.charCodeAt(i+1) & 15) << 2) | data.charCodeAt(i+2) >> 6;
      var char4 = data.charCodeAt(i+2) & 63;
      output += this.chars.charAt(char1)
        + this.chars.charAt(char2)
        + this.chars.charAt(char3)
        + this.chars.charAt(char4);
    }
    // Replace the filler characters with "=" padding.
    if (c % 3 == 1)
      output = output.substr(0, output.length - 2) + '==';
    else if (c % 3 == 2)
      output = output.substr(0, output.length - 1) + '=';
    return output;
  },
  // Decode Base64 text into a binary string (one character per byte).
  decode: function(str) {
    var data = '';
    for (var i = 0, c = str.length; i < c; i += 4) {
      // The "=" padding character isn't in chars, so indexOf gives -1.
      var char1 = this.chars.indexOf(str.charAt(i));
      var char2 = this.chars.indexOf(str.charAt(i+1));
      var char3 = this.chars.indexOf(str.charAt(i+2));
      var char4 = this.chars.indexOf(str.charAt(i+3));
      data += String.fromCharCode(char1 << 2 | char2 >> 4);
      if (char3 != -1)
        data += String.fromCharCode((char2 & 15) << 4 | char3 >> 2);
      if (char4 != -1)
        data += String.fromCharCode((char3 & 3) << 6 | char4);
    }
    return data;
  }
};
Now, getting back to our parse_ascii_keyfile function, Base64-decoding the data is as simple as:
var decoded_body = base_64.decode(body);
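If the bit-shifting in the decoder feels opaque, it can help to run a single group of four characters through it by hand. The sketch below does that for mQEN, the first four characters of the sample key above. BASE64_CHARS and decode_group are names I made up for this standalone example, and it only handles a full group with no padding.

```javascript
var BASE64_CHARS =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one full group of four Base64 characters into three byte values.
// Each character carries 6 bits, so 4 characters = 24 bits = 3 bytes.
function decode_group(group) {
  var v = [];
  for (var n = 0; n < 4; n++) {
    v.push(BASE64_CHARS.indexOf(group.charAt(n)));
  }
  return [
    (v[0] << 2 | v[1] >> 4) & 255,        // 6 bits of v0 + top 2 bits of v1
    ((v[1] & 15) << 4 | v[2] >> 2) & 255, // low 4 bits of v1 + top 4 of v2
    ((v[2] & 3) << 6 | v[3]) & 255        // low 2 bits of v2 + 6 bits of v3
  ];
}

// 'm' = 38, 'Q' = 16, 'E' = 4, 'N' = 13 in the alphabet above.
var first_bytes = decode_group('mQEN'); // [0x99, 0x01, 0x0d]
```

So the decoded file should start with the bytes 0x99, 0x01, 0x0d, which is an easy thing to eyeball against a hex dump of pubkeys.gpg.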
Computing the CRC24 checksum
The previous code should return a string of 8-bit binary characters. But how do we know this is the correct string of 8-bit binary characters? That’s where our CRC24 checksum function comes into play. The CRC24 checksum is sort of like a cheap hash function. Given a certain input (in our case the binary data), it produces a 24-bit output that is always the same for the same input and is very likely to change if the input gets corrupted. If we then Base64 encode the output, we should get 4 characters which precisely match the CRC24 checksum near the bottom of our ASCII Armored keyfile. If you’re confused, check out Section 6 of the OpenPGP spec, which explains this pretty well.
The spec also includes a very helpful example of a CRC24 function written in C, which translates very easily to JavaScript:
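var crc24 = function(data) {
  var crc = 0xb704ce; // initial value from the spec
  var len = data.length;
  while (len--) {
    // Walk the string front to back, one byte per character.
    crc ^= (data.charCodeAt((data.length-1) - len)) << 16;
    // Declare i with var so it doesn't leak into the global scope.
    for (var i = 0; i < 8; i++) {
      crc <<= 1;
      if (crc & 0x1000000)
        crc ^= 0x1864cfb; // generator polynomial from the spec
    }
  }
  return number_to_binstring(crc, 24);
};
var number_to_binstring = function(bin, bits) {
  bits || (bits = 32);
  var text = [];
  var i = (bits < 32 && bits > 0 && bits % 8 == 0) ? (bits / 8) : 4;
  // Emit every byte, leading zeros included. Skipping them would turn a
  // checksum like 0x00abcd into 2 bytes, and the Base64 comparison would fail.
  while (i--) {
    text.push(String.fromCharCode((bin >> (i * 8)) & 255));
  }
  return text.join('');
};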
The major difference between this and the spec is my addition of the number_to_binstring function. In the crc24 function, JavaScript is performing a bunch of bitwise operations on the crc integer variable as it iterates over the 8-bit character codes associated with each byte of the input data. These operations are performed numerically, even though the data itself is in the form of a string (JavaScript is kind of janky here with its spotty support for ByteArrays). The number_to_binstring function simply converts the resulting 24-bit crc number value back to a string composed of three 8-bit bytes.
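Here's the same pipeline as a standalone sketch that works on plain arrays of byte values instead of strings, which makes it easy to poke at in a console. The names crc24_bytes and crc24_to_armor are mine. The one thing I'd call out is that the checksum has to be written as exactly three bytes, zeros included, or it won't Base64-encode to four characters.

```javascript
var BASE64_CHARS =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// CRC-24 over an array of byte values, following the C example in the spec.
function crc24_bytes(bytes) {
  var crc = 0xB704CE;                        // initial value from the spec
  for (var n = 0; n < bytes.length; n++) {
    crc ^= bytes[n] << 16;
    for (var bit = 0; bit < 8; bit++) {
      crc <<= 1;
      if (crc & 0x1000000) crc ^= 0x1864CFB; // generator polynomial
    }
  }
  return crc & 0xFFFFFF;
}

// Split the 24-bit value into exactly three bytes (high byte first),
// then Base64-encode them into the four characters used in the armor.
function crc24_to_armor(crc) {
  var b = [(crc >> 16) & 255, (crc >> 8) & 255, crc & 255];
  return BASE64_CHARS.charAt(b[0] >> 2)
    + BASE64_CHARS.charAt((b[0] & 3) << 4 | b[1] >> 4)
    + BASE64_CHARS.charAt((b[1] & 15) << 2 | b[2] >> 6)
    + BASE64_CHARS.charAt(b[2] & 63);
}

// With no input the CRC is just the initial value 0xB704CE.
var empty_checksum = crc24_to_armor(crc24_bytes([])); // "twTO"
```

That twTO value is handy to remember: if you ever see =twTO in an armored file, the checksum was computed over zero bytes.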
We previously read the ASCII Armor checksum into the body_checksum variable. Now we can take our decoded data and compute its checksum, then Base64-encode that checksum and compare to body_checksum. If it’s a match then we can do a victory dance, because the data was not corrupted and it decoded properly!
  var decoded_checksum = base_64.encode(crc24(decoded_body));
  if (body_checksum != decoded_checksum) {
    alert('Checksum mismatch! (Expected '+body_checksum+', got '+decoded_checksum+')');
    return false;
  }
  // Our data decoded successfully
...
Opening binary files is easier. The checksums should match!
We can repeat the same basic steps as far as the read_file function to read a binary file into JavaScript. There’s no special decoding required once you get the file data, but one useful result is to compute the checksum. If you’re working with the binary version of the same key export as you opened previously in ASCII Armored format, then the checksum on the data should be the same. This is useful for proving this whole process is sane.
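If the checksums ever disagree, it's nice to know where the two versions diverge. Here's a tiny sketch for that, assuming you have both the binary file's contents and the decoded armor body as binary strings. The name first_difference is just something I picked for illustration.

```javascript
// Return the index of the first byte where two binary strings differ,
// or -1 if they are identical.
function first_difference(a, b) {
  var shortest = Math.min(a.length, b.length);
  for (var n = 0; n < shortest; n++) {
    if (a.charCodeAt(n) != b.charCodeAt(n)) return n;
  }
  // Same prefix: either identical, or one string is longer than the other.
  return a.length == b.length ? -1 : shortest;
}
```

A result of -1 means the two files carry the same bytes. Anything else is the offset to go look at in a hex editor.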
After all this work, all we got was some lousy binary data in JavaScript. Actually reading the data and doing something useful with it is a whole ‘nother can of worms. But not to despair—we have to crawl before we can walk, and walk before we can run, and I really need some whiskey so I’m done for now.