How do you secure email? (Part Deux)

My last posting on securing email ended on a heart-stopping cliff-hanger, and I know the suspense has had you reaching for the soothing medication. But rest easy, dear reader, as our tale may now resume.

As you may recall, Toby and Violet’s plans to exchange secure email have run aground. Neither Toby nor Violet can figure out how to trust each other’s public keys. The security of secure email is built on the assumption that public keys are genuine – no trust, no secure email.

And so Violet and Toby do the simple thing. They meet at Tully’s for coffee and physically exchange public keys. Violet knows Toby’s public key is genuine because he gave it to her. She can safely verify his digital signatures and accept his email.

But what works for two doesn’t work for 637. For Toby wants to send secure email to 637 of his closest Facebook friends (the remaining 1039 worthy only of the Wall), many of whom aren’t even in the same city. Will Toby travel the world having coffee with each of his friends? Toby is eager to sip cappuccino in Florence and Lisbon, but his financial advisor (me) mandates otherwise.

Certificates of Authenticity

Toby recognizes that his public key needs a proper certificate of authenticity. Yes, sort of like the one that came with your prized William & Kate commemorative plate (I know you have one).

Toby heads to Certificates-R-Us, a well known Certificate Authority. Certificates-R-Us issues X509 Certificates – standardized digital packages that contain a public key, a statement about who the key belongs to, and evidence of the key’s authenticity.

Toby’s certificate acquisition process goes something like this:

  • Certificates-R-Us asks Toby for his papers. Toby produces his Government issued ID (Driver’s License) as evidence that he is the famed canine.
  • Certificates-R-Us has a higher bar for identity proofing than its competitor Certificate-Mart. It demands more information from Toby, such as his bank account or credit card number. Certificates-R-Us uses Toby’s personal information to conduct a series of validation checks:
    • It hires a credit agency to verify Toby’s identity by confirming his financial credentials.
    • It verifies Toby’s home address (taken from his license) by sending him snail mail
    • It calls Toby’s home phone number and listens to his melodious bark
  • Fortunately for Toby, his identity checks out.
  • Certificates-R-Us now proceeds with the certificate issuance:
    • It creates a new (public, private) key pair for Toby
    • It very carefully transfers the private key to Toby. Toby stores this key in a very secure store (Attila the Hound is always on the prowl, after all) and uses it to sign his email.
    • It issues an X509 Certificate containing Toby’s public key and a statement that he is the Subject to whom the certificate was issued.
    • It collects processing fees from Toby (darn capitalism).

X509 Certificates in email

Toby signs his email to Violet with his spiffy new private key, then attaches his shiny new certificate to it. The certificate bears his name, and he is very proud of it. He also posts his X509 Certificate in a public directory so that his many friends can download it and use it to encrypt the email they send him.

But how does any of this help? How does Violet know this isn’t another of Attila the Hound’s forgeries?

You guess rightly. Digital signatures to the rescue once again. What works for secure email, and documents, also works for certificates.

Certificates-R-Us signs Toby’s X509 Certificate with its own private key. The signature certifies the authenticity of Toby’s public key. Certificates-R-Us can confidently certify Toby’s public key because Certificates-R-Us created both Toby’s (public, private) key pair and the certificate.

And we’re done, right? Nope. I still have half my weekly quota of words to write.

To trust Toby’s certificate, Violet checks that the Subject of the certificate matches the sender (Toby’s) name. She validates the digital signature of the certificate issuer. She decrypts the signature with the issuer’s public key so that she can compare the issuer’s hash……and behold, we’re right back to where we started.

Because while verifying the issuer’s signature, Violet encounters yet another public key she cannot trust – the issuer’s!

Who do you trust?

In the real world, you trust someone because:

  • You chose to trust them explicitly
    • You know the person
    • The person has an honest face (more likely, good looking face – trust isn’t always rational)
    • You take their word for it (possibly because they honestly have a good looking face)
    • Etc.
  • You trust them because you explicitly trust somebody else who trusts them. Or attests to their trustworthiness. Or issues them a Photo ID. Or “likes” them on that website. Somebody vouches for somebody who vouches for somebody who vouches for…. This is called trust delegation.

You always end up trusting somebody explicitly. You can delegate and federate (and complicate) trust all day long – but in the end, that chain of delegation must end somewhere. The trust chain has to terminate at somebody you explicitly trust.

What is true in real life is also true for certificates. Toby’s certificate also has a certification path or trust chain.

Chains of Trust

Toby’s certificate issuer – Certificates-R-Us – also has a certificate; an intermediate or subordinate certificate issued by the well regarded Certificate Authority Woo Hoo Corp. This certificate contains – yes – the public key for Certificates-R-Us.

The Woo Hoo Corp Root Authority also has a certificate. This root certificate is special because no other authority certifies the public key the certificate contains. But since all X509 Certificates must be signed by someone, a root certificate is self-attested or self-signed. Woo Hoo Corp issues certificates to multiple subordinate Certificate Authorities – such as “Certificates-R-Us Asia” or “Certificates-For-The-Masses”.

How do you trust?

Violet can trust Toby’s certificate in 3 ways:

  • She can trust Toby’s certificate explicitly
  • She can trust the certificate for the authority that issued Toby’s certificate – Certificates-R-Us.
  • She can trust the certificate for the authority (Woo Hoo Corp) that issued the certificate for the authority (Certificates-R-Us) that issued Toby’s certificate (say that really quickly – thrice)

In the Direct Project parlance, Violet’s trusted certificates are her Trust Anchors. Violet picks the anchors she trusts, and stores them in her trust anchor store, or anchor list.
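
Violet's walk up the chain can be sketched in a few lines of Python. This is a toy under loud assumptions: the "certificates" are plain records rather than real X509 structures, the RSA keys are built from well-known Mersenne primes (hopeless for real security), signatures have no padding, and the names `toy_keypair`, `issue` and `is_trusted` are invented for this illustration.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by every toy key in this sketch


def toy_keypair(a: int, b: int):
    """Toy RSA key from the Mersenne primes 2**a - 1 and 2**b - 1 (illustration only)."""
    p, q = 2**a - 1, 2**b - 1
    return p * q, pow(E, -1, (p - 1) * (q - 1))  # (public modulus, private exponent)


@dataclass(frozen=True)
class Certificate:
    subject: str
    issuer: str
    public_modulus: int
    signature: int = 0

    def digest(self) -> int:
        """Fingerprint of the statement 'this key belongs to this subject'."""
        body = f"{self.subject}|{self.issuer}|{self.public_modulus}".encode()
        return int.from_bytes(hashlib.sha256(body).digest(), "big")


def issue(subject, subject_modulus, issuer, issuer_modulus, issuer_private):
    """The issuer signs the digest of the certificate with its private key."""
    unsigned = Certificate(subject, issuer, subject_modulus)
    signature = pow(unsigned.digest(), issuer_private, issuer_modulus)
    return Certificate(subject, issuer, subject_modulus, signature)


def signed_by(cert: Certificate, issuer_cert: Certificate) -> bool:
    """Check the signature on cert using the public key in issuer_cert."""
    return pow(cert.signature, E, issuer_cert.public_modulus) == cert.digest()


def is_trusted(chain, anchors) -> bool:
    """Walk from the leaf toward the root; succeed at the first trust anchor."""
    for i, cert in enumerate(chain):
        if cert in anchors:
            return True
        # The last certificate in the chain is a root, which signs itself.
        issuer_cert = chain[i + 1] if i + 1 < len(chain) else cert
        if issuer_cert.subject != cert.issuer or not signed_by(cert, issuer_cert):
            return False
    return False


root_n, root_d = toy_keypair(107, 1279)
ca_n, ca_d = toy_keypair(89, 607)
toby_n, _ = toy_keypair(127, 521)
attila_n, attila_d = toy_keypair(61, 2203)

root = issue("Woo Hoo Corp", root_n, "Woo Hoo Corp", root_n, root_d)
ca = issue("Certificates-R-Us", ca_n, "Woo Hoo Corp", root_n, root_d)
toby = issue("Toby", toby_n, "Certificates-R-Us", ca_n, ca_d)
# Attila claims to be Toby but can only sign with his own private key.
forged = issue("Toby", attila_n, "Certificates-R-Us", attila_n, attila_d)

print(is_trusted([toby, ca, root], {root}))    # anchored at Woo Hoo Corp
print(is_trusted([toby, ca, root], {ca}))      # anchored at Certificates-R-Us
print(is_trusted([forged, ca, root], {root}))  # forgery fails the signature check
```

The anchor store is just a set here. Whichever certificate Violet puts in it decides how far up the chain she has to walk before she can stop.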

Your browser and your computer come pre-installed with the certificates of several highly trusted large Certificate Authorities, managed by vendors such as VeriSign. Every time you create an SSL connection to order your prescription from drugstore.com, your browser verifies the server’s identity, by going through a procedure very similar to Violet’s. There are a few other details involved, naturally, so how SSL works is best saved for another post.

Circles of Trust

The higher up the trust chain you decide to anchor your trust at, the wider the circle of trust you belong to. By trusting a Certificate Authority (CA), you make it simple for yourself to exchange secure email with an entire trust community – the community of individuals and organizations who hold certificates issued by the CA. The community admits only trusted parties – only those who meet the security and privacy policies outlined by the community members. The benefit of admission is an admission card – your X509 Certificate issued by the community’s Certificate Authority. Use the keys associated with member X509 Certificates to sign and encrypt email and have it accepted and trusted anywhere in the community.

You don’t have to pay Certificates-R-Us to set up a community or to issue certificates to your members – you can do it for free (now I have your attention – see below). Your community can set up its own Certificate Authority by running one of many commercial and open source certificate management servers. It doesn’t really matter as long as your community members install your CA certificate in their trust store.

The beauty and simplicity of this model is the powerful premise behind the Direct Project. It’s not a new idea. We owe it to the geniuses who invented modern crypto & asymmetric encryption. And the designers of S/MIME.

Trusting a Certificate Authority has its costs. If you trust Certificates-R-Us, you trust any and all certificates issued by Certificates-R-Us – a possibly large community. If you trust Woo Hoo Corp, you trust half the planet. If you trust too broadly, you become too trusting. So trust wisely.

Creating your own Circle of Trust

To create a trust community, you must first create a Certificate Authority (public, private) key pair and X509 Certificate. You can then start issuing certificates (and keys) to your members. You can do this very cheaply by using the free tools: OpenSSL (open source) or MakeCert.exe (free in Windows).

Learn how to use makecert.exe using these sample batch files.

  • genca.bat: Create a new certificate authority certificate (root).
  • gencert_exchange.bat: Issue a new certificate to use for secure email – signed by the authority you created in genca.bat

You do need to be careful with all the private keys you generate for your community members. After securely transferring them to the owning member using a secure out of band mechanism (such as password protected PFX files) – do ensure they are wiped from the machine you created them on. For Attila the Hound prowls endlessly.

You can find detailed documentation on makecert and PFX files on MSDN and the Web.

How do you secure email?

Wait, didn’t we already cover encrypted email in an earlier post? We did. But encrypted email is not secure email. Encryption solves but part of the secure email conundrum.

An Identity Crisis

Toby, a Cairn Terrier of Distinction, dutifully encrypts all the email he sends his pal Violet. Violet is thrilled to receive email that she believes isn’t junk, but her joy is short lived. Toby keeps wanting her assistance in moving $1,000,000 through various Caribbean bank accounts! Or insists that she claim the $5,000,000 that a newly discovered uncle, Indian Buffet, has left her. What is going on?

There is mischief afoot. For Attila the Hound has also learnt how to encrypt email and is masquerading as Toby. Attila pulls Violet’s public key (it is public!) from a directory, and then sends her (encrypted) email that looks like this:

From: <TOBY@caninegenius.woof>
To: <violet@mathiscool.xyz>

Subject: Funtestic Finencial Oppurchoonity!

Dear Mr. Violet,
Sir, I beseechfully writes to tell you...

[No, you can’t do this by using Outlook. I think. But you can with Notepad and a few lines of code].

SMTP Servers (tuned and debugged for nearly 30 years) use the venerable SMTP protocol to reliably push billions of email messages around the planet 24 hours a day. SMTP is simple, insanely successful, eminently spoof-able and rather insecure. SMTP has no concept of verifiable sender identity. Attila the Hound can send Violet email (encrypted or not) claiming to be Iron Man and she has no way of knowing him from Robert Downey Jr. Encryption keeps your email from prying eyes, but it can’t save you from actors.
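
To see why the sender's name proves nothing, here is a minimal sketch using Python's standard `email` package. It only builds the message text and sends nothing. The point is that the From line is ordinary text that the author of the message fills in.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "TOBY@caninegenius.woof"   # just a header: whatever the author types
msg["To"] = "violet@mathiscool.xyz"
msg["Subject"] = "Funtestic Finencial Oppurchoonity!"
msg.set_content("Dear Mr. Violet,\nSir, I beseechfully writes to tell you...")

# Nothing in the message itself ties the From header to whoever wrote it.
print(msg.as_string())
```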

What are Violet and Toby to do? Attila is putting a strain on their friendship. Toby & Violet face an identity crisis.

Something you have, Something you know

Toby asks Grandma Asha for help. He regularly applies her practical everyday wisdom to difficult engineering problems.

In my experience, says Grandma, you prove or assert your identity using:

  • Something you have: Your driver’s license, your passport [token]
  • Something you know: Your social security number, your mother’s maiden name [secret]
  • Your signature.

And the lights blaze in Toby’s ingenious head.

The Digital Signature

To prove and assert his identity, Toby uses a blend of Grandma’s suggestions.

First, he creates a (public, private) key pair (see previous post for an overview of key pairs and asymmetric encryption). The private key is a secret that only Toby knows (and has). If he can prove to Violet that he knows this secret, he can prove to her that he is Toby!

Toby demonstrates his knowledge of his private key by using it to encrypt data both he and Violet have access to – the email he is about to send her. If Violet successfully decrypts the email using Toby’s public key, then Violet knows that Toby must have encrypted the email. This is because the only data Violet can decrypt using Toby’s public key – is data encrypted using Toby’s private key! Asymmetric encryption is genius.

But wait, isn’t asymmetric encryption slow? Not a problem, growls the canine cryptographer.

  • Toby creates a cryptographic hash or digital fingerprint of his email.
  • He encrypts the hash with his private key. This will prove to Violet that the email is really from him. Encrypting the practically unique digital fingerprint of the email is as good as encrypting the email itself.
  • He attaches the encrypted hash to the email.
  • He names his creation a digital signature.
  • Toby has signed the email with his private key.

Violet verifies Toby’s identity by verifying his digital signature:

  • Violet creates a cryptographic hash of the email she receives.
  • She decrypts Toby’s digital signature using his public key. This gives her Toby’s version of the hash.
  • She compares her version of the hash with Toby’s
  • If the two match, she can confidently state that:
    • Toby sent her the email
    • Nobody tampered with or altered the email after Toby signed it. If they had, her version of the cryptographic hash  – the digital fingerprint – would be different from what came in the digital signature.
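
The sign and verify steps above can be sketched in Python. Treat this as a toy, not the real thing: the RSA key is built from two well-known Mersenne primes purely so the example fits on a page, there is no signature padding, and the function names are invented here. Real S/MIME software uses large random primes and a padded signature scheme.

```python
import hashlib

# Toy RSA key pair (assumption for illustration only).
P, Q = 2**521 - 1, 2**607 - 1
N = P * Q                            # modulus, shared by both halves of the key
E = 65537                            # public exponent: Toby gives (E, N) to the world
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent: Toby's secret


def fingerprint(message: bytes) -> int:
    """SHA-256 digest of the message, as an integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes, private_exponent: int, modulus: int) -> int:
    """'Encrypt' the fingerprint with the private key."""
    return pow(fingerprint(message), private_exponent, modulus)


def verify(message: bytes, signature: int, public_exponent: int, modulus: int) -> bool:
    """'Decrypt' the signature with the public key and compare fingerprints."""
    return pow(signature, public_exponent, modulus) == fingerprint(message)


email = b"Dear Violet, fancy a walk in the park? Toby"
signature = sign(email, D, N)

print(verify(email, signature, E, N))                        # True: genuine email
print(verify(email + b" P.S. send cash", signature, E, N))   # False: tampered email
```

Only the short fingerprint goes through the slow asymmetric step, which is exactly the trick Toby uses.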

But where does Violet get Toby’s public key? Violet could look it up in a directory, but does not have to. The performance conscious Toby saves her the extra round trip by sending his public key along with the email itself. Public keys are designed for broad dissemination, so this is safe.

How do you send secure email?

To send secure email, you:

  • Sign it with your private key [so the recipient knows you sent it, and nobody else tampered with it]
  • THEN encrypt it with the recipient’s public key [so nobody but the recipient can read it].

And you are done, right? Wrong.

Spoofing Public Keys

For the cunning Attila can also generate his own (public, private) key pair. He uses this pair to continue pretending that he is Toby:

  • Like before, Attila creates an email that claims to be from Toby.
  • He signs the email with his (Attila’s) private key
  • Then he attaches his (Attila’s) public key to the email

Violet receives Attila’s email and runs through her validation procedure. As Attila expected, everything checks out. The digital signature matches! Violet accepts Attila’s email as what it claims to be – an email from her pal Toby.

Then, Attila’s mentor, Prof. Moriarty, joins the fun. Moriarty figures out that he can intercept Toby’s emails to Violet, but is frustrated because they are encrypted. So, the wily Professor hacks into the public directory that hosts Violet’s public key, and replaces Violet’s public key with his own. Toby is none the wiser as he downloads what he believes to be Violet’s public key. He encrypts email he is sure is for Violet’s eyes only, but will in reality be read by Prof. Moriarty.

Prof. Moriarty reads Toby’s insightful commentary on support vector machines with great interest. Then he re-encrypts the email using Violet’s public key (which he has kept), and forwards it to Violet.

And so we arrive at our next conundrum:

  • How does Violet know that the public key she used to verify Toby’s digital signature on his email – is really Toby’s?
  • How does Toby know that the public key he used to encrypt his email to Violet – is really Violet’s?

Anybody can generate a public, private key pair. Directories can be hacked and spoofed.

In this cruel, untrusting world, who attests that a proffered public key is the genuine public half of a subject’s (public, private) key pair? Who do you trust? How do you trust?

Unfortunately, tonight’s episode must end on that cliffhanging note. Tune in next time for the exciting tale of two X509 Certificates.

To be continued….

What is a cryptographic hash?

The other day, I was in a meeting where somebody said, “….and then you take a SHA-256 hash of the document, which is unique…”.

Not quite. It would be more accurate to say practically unique.

Cryptographic hash functions are astounding. They take arbitrary binary data: document, image, movie, message, bytes.. and crunch over every bit to produce a short, fixed length summary called a hash value or digest. E.g. the SHA1 hash function creates a 160 bit digest out of any source input, no matter what its size or content. Cryptographic digests have some very important properties.

Say you create a cryptographic digest of a document. You will find it practically impossible to:

  • Find or create a second document (or any other data) that will produce the same digest
  • Change or tamper with the source document – even a single bit – without also altering the digest.
  • Reverse engineer the document from the digest – i.e. by hashing randomly generated documents until you find one that has a matching digest.

These properties make the digest unique for all practical purposes. You can take any binary data and derive a big number that represents that data and that data alone. The digest serves as a digital fingerprint for the data. This property makes cryptographic digests the basis of Digital Signatures and their close cousins, HMACs (we’ll cover both in upcoming posts).
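
Here is a small sketch of the fixed length property, using Python's standard `hashlib`. The helper name `digest_bits` is made up for this example.

```python
import hashlib


def digest_bits(data: bytes) -> int:
    """Size of the SHA-1 digest of data, in bits."""
    return len(hashlib.sha1(data).digest()) * 8


# Zero bytes, four bytes or four million bytes: the digest is always 160 bits.
for data in (b"", b"woof", b"woof" * 1_000_000):
    print(len(data), "bytes in,", digest_bits(data), "bits out:",
          hashlib.sha1(data).hexdigest())
```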

But the fact remains: the cryptographic hash is not actually unique. Where there is a hash value, there will always be a collision: two or more arbitrary pieces of data that reduce down to the same digest.

Collisions

Why are collisions inevitable? Most of you know how a hash table works. If you don’t, then consider the following: Say you were given 5 balls and asked to place them in 3 buckets. It doesn’t take an engineer to realize that you must put at least 2 balls in buckets that already contain at least 1 other ball. Collisions!

The famous SHA-1 function, the one time champion of cryptographic hashing, produces 160 bit hash values. 160 bits represent 2^160 (2 raised to 160) possible unique values. 2^160 is a very big number: approx. 1.46 x 10^48 – i.e. a 49 digit number. By comparison, the Earth has an estimated 1.33 x 10^50 atoms.

The number of possible inputs (balls) to the hash function is infinite. The number of possible hash values (buckets) is fixed. Collisions! Multiple balls will land in the same bucket. Eventually. But it may take a long while because there are so many buckets!

In fact, the laws of probability tell us that you have a roughly 50-50 chance of getting a collision once you have hashed about 2^80 inputs. Which is a smaller but still scarily big number. Why? For the same reason that a room of 23 random people has a 50-50 chance of containing two who share a birthday (but not birth year!).
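
The birthday arithmetic is easy to check for yourself. The sketch below computes the exact chance for small numbers, and uses the standard approximation p ≈ 1 - e^(-n²/2N) for numbers too big to loop over. The function names are made up for this example.

```python
import math


def collision_probability(items: int, buckets: int) -> float:
    """Exact chance that at least two of `items` random picks share a bucket."""
    p_all_distinct = 1.0
    for i in range(items):
        p_all_distinct *= (buckets - i) / buckets
    return 1.0 - p_all_distinct


def approx_collision_probability(items: int, buckets: int) -> float:
    """Birthday approximation, good when buckets is much larger than items."""
    return 1.0 - math.exp(-items**2 / (2 * buckets))


print(collision_probability(23, 365))                 # a little over 0.5
print(approx_collision_probability(2**80, 2**160))    # about 0.39
```

So at exactly 2^80 inputs the odds are nearer 39%. You reach an even 50% at about 1.18 x 2^80 inputs, which is the same ballpark.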

Finding Collisions

So how do you go find collisions and why? The why is obvious: imagine if you found somebody who had the same fingerprint as you – unlikely though it may be. If you were of the miscreant persuasion, you might take advantage of this knowledge. The same holds for digital fingerprints. If you could tamper with or create digital data in such a way that its digital fingerprint matched (collided) with that of the “real data”, you could cause some mischief. Since the digital fingerprint of the bad data matched the one people expected from good data, they would have little reason to be suspicious.

The simplest way to find collisions – brute force – is also the hardest – primarily because of how long it takes. You could brute force collision detection by calculating the hashes of a bazillion (all) inputs using gazillions of computers and then watching a lot of TV as you… wait……for ever. Or you could alter various bits of the original input, and try computing hashes and see if any of them stick. You could do cryptanalysis – which these days is a highly sophisticated version of how the British famously hacked the German Enigma machine. There are other techniques and they all are more involved than this short paragraph may let on (you think?), but you get the general idea.
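
Brute force becomes practical the moment the hash is short enough. The sketch below invents a deliberately feeble hash, `tiny_hash`, which keeps only the first 24 bits of SHA-256, and then hashes numbered documents until two of them collide. Both the function and the document names are assumptions made for this demonstration.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes, bits: int = 24) -> int:
    """First `bits` bits of SHA-256: a deliberately weak, made-up hash."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") >> (256 - bits)


def find_collision(bits: int = 24):
    """Hash numbered documents until two different ones share a digest."""
    seen = {}
    for i in count():
        data = f"document #{i}".encode()
        h = tiny_hash(data, bits)
        if h in seen:
            return seen[h], data, i + 1
        seen[h] = data


first, second, tries = find_collision()
print(first, "and", second, "collide after", tries, "tries")
```

With 24 bits there are about 16.7 million buckets, yet the birthday effect means a collision usually turns up after only a few thousand tries. Every extra pair of bits doubles the work, which is why 256 bit digests are out of reach for this approach.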

To find collisions using the hottest cryptographic hash function in town (the SHA2 family) is (currently) practically impossible. In cryptography, practically impossible means computationally infeasible. Which is a fancy way of saying that even if you used all the current computers, algorithms and known mathematics, it would take you so long to solve the problem that it wouldn’t matter any more. You could use all of Azure and Amazon EC2 to crunch your algorithms, but you would die before you succeeded, as would all of humanity and possibly the Earth too. Of course, a brainy breakthrough that exploited a fundamental flaw in the hash function, or a quantum computing revolution might give you a fighting chance, but until then..

You could also invent some new smarty pants Math that lets you find a collision in feasible time. Cryptographers live in abject terror of computational feasibility – the Freddy Krueger of their dreams.

Avoiding Collisions

Cryptographic hash functions are painstakingly designed to reduce the probability of collisions. If you peek at the code for a hash function, you will find it replete with bit operations like xors, bitwise and/ors, shifts and rotations. They operate on each bit, shoving and pushing and twisting the data with seemingly arbitrary, but carefully chosen and massively tested steps. A software blender using every bit in the original data to make a digital smoothie, with each smoothie having a taste of its own that incorporates the flavor of everything that went in. With values distributed more or less randomly (evenly) across all buckets. Small changes in the original triggering an avalanche of changes in the computed digest.
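
You can watch the avalanche happen. The sketch below flips a single bit of a message and counts how many of the 256 bits in the SHA-256 digest change. The message and the helper name `bit_difference` are made up for this example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bits that differ between the SHA-256 digests of a and b."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


original = b"Meet me at Tully's at 5pm"
altered = bytes([original[0] ^ 0x01]) + original[1:]   # flip one bit of the first byte

print(bit_difference(original, altered), "of 256 digest bits changed")
```

For a well mixed hash you should expect somewhere around half of the bits to flip, as if the second digest had been picked at random.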

The mathematics behind why or how any of this works is way over my balding head. The mathematics are actually so subtle and clever that they may include hidden flaws – either mistakes or deliberate weaknesses that a clever chap may exploit at a later time. This is why cryptographic hash functions are few and far between. Rock stars that hold sway for a while even as they are taken apart by brainiacs. Until one of them discovers a weakness. And so went MD4 and MD5, SHA0. And not so long ago, SHA1 also met its fate, even though it was compromised only in theory. Cryptographers are a paranoid bunch, which is just fine with me!

How encrypted email works

I’ve been working on the Direct Project for the past year or more. The Direct Project is a federally sponsored initiative that uses secure email as the foundation for the ubiquitous nationwide exchange of health information.

To secure an email, you have to, among other things, encrypt the message content. It is no surprise that many newcomers to Direct want to know how encrypted email works. Others, who are comfortable with classic message security, notice that unlike point to point messaging (one sender, one receiver), email is inherently multicast (one sender, many receivers). They ask: how do you encrypt email sent to multiple recipients?

In this inaugural posting for my new blog, I will try to answer both questions in plain English.

Encryption Basics

First, a quick refresher on encryption concepts:

  1. Key: An array of carefully generated bits, used to encrypt and decrypt data.
  2. Encryption: You use a key (secret) and a precise series of complicated steps (encryption algorithm or cipher) to mangle (encrypt) data into undecipherable gibberish.
  3. Decryption: You use a key (secret – hopefully the right one) and a precise series of complicated steps (decryption algorithm or cipher) to un-mangle (decrypt) gibberish back into your original data. If you use the wrong key, or the wrong algorithm, you turn the source gibberish into more gibberish.
  4. Symmetric Encryption: You use the same key to both encrypt and decrypt the data. Both the sender and the receiver have a copy of the same key – a shared secret. To share the secret, the sender and receiver must exchange their shared key securely – without an attacker getting a peek. If an attacker can somehow (silently) intercept an inadequately protected secret as it moves from sender to receiver (steaming open the envelope, so to speak), the attacker can also decrypt your encrypted data.
  5. Asymmetric Encryption: You use one key (public) to encrypt the data and an associated but different key (private) to decrypt the data. Data encrypted with your public key can only be decrypted with your associated private key. You boldly give the public part of your key pair to anybody you want to receive encrypted data from. You keep your private key secret and use it to decrypt data that people send you. Unlike symmetric encryption, there is no shared secret to exchange. You can distribute your public key to the entire world without fear. Data encrypted with your public key is truly for your eyes only – because only you can decrypt it with the secret private key that only you have. The reverse is also true. Data encrypted with your private key can only be decrypted using your public key. This property has important implications for digital signatures (more in future posts).

Symmetric and Asymmetric encryption work differently – they use different types of keys and different encryption/decryption algorithms.

Symmetric encryption is fast. Asymmetric encryption is slow.
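
The two flavors can be contrasted with a toy sketch in Python. Nothing here is production crypto. The symmetric `xor_cipher` is a made-up stand-in for a real cipher such as AES, and the RSA key pair is built from two well-known Mersenne primes with no padding, purely to keep the example short.

```python
import hashlib
import secrets


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a keystream derived from the key.

    Running it a second time with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


message = b"Squirrel sighted at the park"

# Symmetric: one shared secret both encrypts and decrypts.
shared_key = secrets.token_bytes(32)
gibberish = xor_cipher(shared_key, message)
same_key_result = xor_cipher(shared_key, gibberish)                 # the message again
wrong_key_result = xor_cipher(secrets.token_bytes(32), gibberish)   # more gibberish

# Asymmetric: a toy RSA key pair (assumption for illustration only).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537                   # public key: hand it to the whole world
D = pow(E, -1, (P - 1) * (Q - 1))     # private key: keep it to yourself

as_number = int.from_bytes(message, "big")
encrypted = pow(as_number, E, N)      # anybody can encrypt with the public key
decrypted = pow(encrypted, D, N)      # only the private key undoes it
round_trip = decrypted.to_bytes(len(message), "big")

print(same_key_result == message, wrong_key_result == message, round_trip == message)
```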

How does email encryption work?

Violet wants people to encrypt the email they send her. To help them do this, Violet creates a (public, private) key pair. She wraps up her public key in a secure package called an X509 Digital Certificate (more on this in future posts) and gives the certificate containing the public key to those she is corresponding with. To make it easy for others to find her public key, she even publishes her certificate in a public directory.

Violet’s good friend Toby decides to send her some encrypted email.

All Toby has to do is use Violet’s public key to encrypt the message, right? Wrong.

To use Violet’s public key to encrypt his email, Toby must use asymmetric encryption. Which, unfortunately, is slow. Toby cannot practically encrypt the content of his email using Violet’s asymmetric public key – it takes too much work!

To encrypt his email content, Toby needs a faster option – symmetric encryption. Toby generates a new symmetric encryption key and uses this key to efficiently encrypt the content of his email.

But how does Violet decrypt Toby’s email? To decrypt, Violet needs a copy of the symmetric encryption key, which she doesn’t have because Toby generated it on the fly and hasn’t given it to her yet! How does Toby securely send Violet a copy of his encryption key?

Toby cleverly solves the problem by attaching the encryption key to the email itself. The message brings its own key with it.

But isn’t that crazy? Anybody can now grab the key and decrypt the email, right? Wrong.

The clever Toby encrypts the symmetric encryption key before attaching it to the email. He does this using Violet’s public key, which he had obtained earlier. And even though this requires slow asymmetric encryption, the performance conscious Toby doesn’t mind because the encryption key is relatively small – usually only 256 bits long at most.

Toby sends his email to Violet. Naturally, Toby does not encrypt the addressing information on the message – the To & From – which have to travel in the clear, just like the addressing information on the envelope of a sealed snail-mail letter. Email servers use the addressing information to transport the email to its destination.

When Violet receives the email, she decrypts the attached encryption key using her private key. She then uses the encryption key to decrypt the email content and receives Toby’s friendly missive.
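
Toby's scheme can be sketched as follows. This is a toy under stated assumptions: `xor_cipher` is a made-up stand-in for a real symmetric cipher, Violet's RSA key is built from two well-known Mersenne primes with no padding, and the envelope is a plain dictionary rather than the real S/MIME message format.

```python
import hashlib
import secrets


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher; the same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


# Violet's toy RSA key pair (assumption for illustration only).
P, Q = 2**521 - 1, 2**607 - 1
E = 65537
VIOLET_PUBLIC = P * Q
VIOLET_PRIVATE = pow(E, -1, (P - 1) * (Q - 1))


def encrypt_email(body: bytes, recipient_public: int) -> dict:
    """Encrypt the body with a fresh symmetric key, then wrap that key."""
    session_key = secrets.token_bytes(32)             # new 256 bit key per message
    return {
        "wrapped_key": pow(int.from_bytes(session_key, "big"), E, recipient_public),
        "ciphertext": xor_cipher(session_key, body),  # fast symmetric step
    }


def decrypt_email(envelope: dict, public: int, private: int) -> bytes:
    """Unwrap the symmetric key with the private key, then decrypt the body."""
    session_key = pow(envelope["wrapped_key"], private, public).to_bytes(32, "big")
    return xor_cipher(session_key, envelope["ciphertext"])


body = b"Dear Violet, the squirrels are plotting something. Toby"
envelope = encrypt_email(body, VIOLET_PUBLIC)
print(decrypt_email(envelope, VIOLET_PUBLIC, VIOLET_PRIVATE))
```

However long the email gets, the slow asymmetric step only ever touches the 32 byte key.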

How do you encrypt email sent to multiple recipients?

Toby wants to send an email message to both Violet and Margaret. How does he encrypt this message?

Should Toby repeat the encryption process twice? Encrypt the email once for Violet and again for Margaret? And what happens if Toby also puts Gitanjali on the To line? Does Toby have to encrypt the message three times? And send out 3 different copies of the same message? Isn’t that getting really inefficient?

Toby has a much better idea. Just like before, he encrypts the email exactly once, using a symmetric encryption key. Then he attaches multiple copies of the same encryption key to the message – one for each recipient and encrypted with that recipient’s public key. Toby encrypts one copy of the encryption key with Violet’s public key. He encrypts a second copy with Margaret’s public key and third with Gitanjali’s. Then he attaches the 3 copies to the message.

When Margaret receives the email, she locates the copy of the encryption key that was intended for her. She decrypts the encryption key, then uses it to decrypt Toby’s note. Violet and Gitanjali do the same.

You can use the same technique to encrypt email sent to as many recipients as you like. Every new recipient merely means the small overhead of an additional attached copy of the encryption key.
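
The multiple recipient trick, sketched under toy assumptions. `xor_cipher` is a made-up stand-in for a real symmetric cipher, every recipient's RSA key is built from well-known Mersenne primes with no padding, and the envelope layout is invented here (S/MIME defines its own).

```python
import hashlib
import secrets

E = 65537  # public exponent shared by every toy key in this sketch


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher; the same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def toy_keypair(a: int, b: int):
    """Toy RSA key from the Mersenne primes 2**a - 1 and 2**b - 1."""
    p, q = 2**a - 1, 2**b - 1
    return p * q, pow(E, -1, (p - 1) * (q - 1))   # (public, private)


keys = {
    "Violet": toy_keypair(521, 607),
    "Margaret": toy_keypair(127, 1279),
    "Gitanjali": toy_keypair(89, 2203),
}


def encrypt_for_all(body: bytes, public_keys: dict) -> dict:
    """Encrypt the body once, then wrap the one key once per recipient."""
    session_key = secrets.token_bytes(32)
    key_number = int.from_bytes(session_key, "big")
    return {
        "ciphertext": xor_cipher(session_key, body),
        "wrapped_keys": {name: pow(key_number, E, public)
                         for name, public in public_keys.items()},
    }


def open_as(envelope: dict, name: str, public: int, private: int) -> bytes:
    """Find my copy of the key, unwrap it, then decrypt the shared ciphertext."""
    wrapped = envelope["wrapped_keys"][name]
    session_key = pow(wrapped, private, public).to_bytes(32, "big")
    return xor_cipher(session_key, envelope["ciphertext"])


body = b"Picnic on Saturday. Bring tennis balls. Toby"
publics = {name: public for name, (public, _) in keys.items()}
envelope = encrypt_for_all(body, publics)

for name, (public, private) in keys.items():
    print(name, "reads:", open_as(envelope, name, public, private))
```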

S/MIME

You should now have a high level notion of how email encryption works. Those of you who are interested in the gory details should deep dive into S/MIME – the de facto standard for securing email. Please do peruse the S/MIME and Direct Transport specs for a bit by bit commentary.

It takes more than encryption to secure email. See my follow up posts to learn how.

Source Code

The open source Direct Project Reference implementation contains a full S/MIME and secure email implementation. To learn how to encrypt and sign email and email content in C#, check out the SMIME source code.