Friday, November 6, 2015

Secure Cloud Storage

Google will tell you that Drive is secure. Dropbox will tell you their service is secure. Microsoft tells us that OneDrive is secure. Amazon tells their clients that S3 is secure. Everyone tells us that the cloud storage they give us is secure. But what does that really mean?

For the most part it means that the data might be encrypted at rest (many cloud storage providers tell us this), that you need to authenticate to get access to the data, and that the data is backed up. And they are right, up to a point.

What they aren't telling us, except for one provider (as far as I know), is that our data is not protected from the cloud storage provider itself. As in: the staff and applications of the cloud provider are able to read our files without us knowing (although in the case of staff it is probably very limited and requires significant effort).

Google basically admits this in their EULA, where they tell us that they will use the content of our files (and e-mails) to help choose ads that are more suited to us. Microsoft claims that they can't see our files because they are encrypted using our login password (or they used to), but I bet there is a way to recover our password if we forget it (which means they must be able to decrypt the files without our password being entered). The others fall somewhere between these two positions.

Why does this matter?

For starters, you might want to use the cloud storage to back up sensitive information that you don't want to lose in case your computer hard drive crashes. Say, for example, your financial records. Or, for many of us, that huge list of passwords for all the different websites, forums, and online services we all use now. Maybe you are a private type of person who feels uncomfortable knowing that other people might be able to look at your files. There could be files of a personal nature you don't want to get out, even accidentally.

The point is, there are a lot of very legitimate reasons why a person would want cloud storage and not want the cloud provider to ever have access to those files. Even if it means you might lose access to those files yourself.

The big question is how do we protect our data when it is not on systems entirely under our control? The obvious answer is simple. We encrypt it on systems that are under our control before sending it away. To keep things convenient we really want to have this happen in a way that is transparent, or nearly transparent to us.

Just like cloud storage, we want to write a file to a directory and let some program that runs in the background encrypt it before it gets sent away for safe keeping.

What can we do?

Once upon a time, a long time ago, a smart person with a strong understanding of encryption created a tool called Truecrypt. He created it to solve a different problem. He wanted to be sure that if his laptop got stolen, people couldn't get access to his data. He wasn't worried as much about cloud storage. At least that is what he said. 

Much of the world saw value in Truecrypt and started using it. Many of us saw value in it beyond keeping our laptops safe should they get stolen. We saw it as a way to pass large files to other people securely. We saw it as a way to be sure our files could not be looked at even when they were placed in cloud storage. We liked this too. We trusted this tool.

Unfortunately, the author(s) of Truecrypt decided it was no longer needed and they abruptly shut down the project. The shutdown was so abrupt that many people don't believe the reasons given by the authors of the software. But they gave a reason, and it was simple: operating systems now have built-in disk encryption. So, they didn't need to maintain Truecrypt anymore and they stopped.

Truecrypt was never an ideal solution for cloud storage. It needed big files to be useful. You put your little files in the big files to keep them safe. This meant a lot of data had to go to the cloud storage provider and come back every time one little file changed (except in the case of Dropbox, which uploads only the changed parts of a file).

In any case, Truecrypt is no longer supported. There are some alternatives. VeraCrypt and CipherShed have both taken the last public release of the source code to Truecrypt and begun making their own changes and improvements. But, Truecrypt was never ideal for cloud storage because of that big file problem.

Encrypted File System (encfs)

There is this obscure tool that came out in the Linux world around the same time as Truecrypt. It was unstable, unproven, and only worked on Linux at the time. But, it had one major advantage over Truecrypt. It worked at the file level. That is, it encrypts each file separately. So, when the cloud sync happens, it only needs to send the files that actually changed.
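To make the per-file advantage concrete, here is a minimal sketch of how a sync client might decide what to upload. It is an illustration only, not how EncFS or any particular cloud provider works; the function names `snapshot` and `files_to_upload` are made up for this example. With per-file encryption, editing one file changes one encrypted file, so only that file shows up in the list. With a single big container, the container itself is the one file that always changes.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-256 digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def files_to_upload(old, new):
    """Return the paths that are new or whose contents changed since the old snapshot."""
    return sorted(path for path, digest in new.items() if old.get(path) != digest)
```

Taking a snapshot before and after editing a single file, then passing both to `files_to_upload`, returns just that one path.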

EncFS has improved since then. It is now up to version 1.7 on Linux and we are starting to see some effort in getting it to common operating systems in a reliable and consistent manner. The user interface is still ugly. The tools for Windows and OSX are still pretty sparse and buggy but it looks like it is coming.

At the time of this writing there are a couple of Windows and OSX ports worth keeping an eye on:

  • Safe has both Windows and OSX support but seems unstable.
  • EncFSmp supports both Windows and OSX. It appears to be a bit more stable than Safe, is still in Beta, and has a few growing pains to work out with the UI. This is the one I'm currently using.
  • OSXFUSE is an implementation of another Linux tool for OSX. FUSE is the tool that encfs was built to use (Safe and EncFSmp appear to have replaced FUSE functionality with alternatives built into Windows and OSX). While there is an OSX FUSE supported encfs, it requires that you build it yourself.
  • Encfs4Win is an experimental port of encfs on Windows that requires fuse4win and is Windows only.

One very important thing to keep in mind if using encfs is that the encryption key is derived from the password you set. This means that the encryption is only as strong as the password is hard to guess. Explaining this can get a bit complicated, so here is a simple example:

A one-word password taken from the words you know can be guessed by a modern desktop in under 10 seconds.

A password that is 8 characters long and randomly generated using printable characters has roughly 53 bits of randomness, which is within reach of a determined attacker with modern hardware. This is about on par with the 56-bit keys of an old encryption algorithm from the 1970s called DES. It is not supposed to be used anymore.

A password that is 16 characters long and randomly generated using printable characters has roughly 105 bits of randomness, which approaches the 128-bit keys used for AES (the replacement for DES), created in the 1990s. This is still considered acceptable for use today by governments and banks. It is probably good enough for us.

A password that is 32 characters long and randomly generated has roughly 210 bits of randomness, which approaches the strength of AES-256 encryption keys. This is as strong as encfs gets when put in paranoid mode. Guessing a password like this by brute force is considered infeasible with current technology, even for a determined government.
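The comparisons above come down to counting how many passwords an attacker would have to try. The sketch below estimates that number in bits, assuming every character is picked independently and uniformly from the 95 printable ASCII characters. It also shows one common way of turning a password into a key, PBKDF2 from Python's standard library. The salt size and iteration count here are illustrative assumptions, not the settings encfs actually uses.

```python
import hashlib
import math
import os

PRINTABLE_ASCII = 95  # assumption: space through tilde, each equally likely


def password_bits(length, alphabet_size=PRINTABLE_ASCII):
    """Bits of randomness in a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


def derive_key(password, salt, iterations=200_000):
    """Stretch a password into a 32-byte (256-bit) key with PBKDF2-HMAC-SHA256.

    The iteration count is an illustrative assumption, not an encfs setting.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


if __name__ == "__main__":
    for length in (8, 16, 32):
        print(f"{length:2d} random characters: about {password_bits(length):.0f} bits")

    salt = os.urandom(16)  # random salt, stored alongside the encrypted data
    key = derive_key("a one word password", salt)
    print(f"derived key is always {len(key) * 8} bits long")
```

Note that the derived key is 256 bits long no matter how weak the password is. The key length only sets an upper limit; an attacker who guesses passwords rather than keys only has to search as many possibilities as the password allows.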

The problem with the usable passwords (above) is that most people won't be able to memorize them, and who in their right mind wants to type 16 (or even 32) characters at a password prompt?

If you aren't trying to prevent people who have access to your computer (or laptop) from seeing the encrypted files, you can keep the password in a text file and copy and paste it into the password prompt for encfs. Even better, EncFSmp conveniently saves the password so you don't have to keep entering it.

As to how you generate such an ugly, complicated password, that is up to you. I use a tool (that runs on both OSX and Windows) called KeePass. Conveniently, it also stores the passwords in a secure way.
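If you would rather not install anything, a few lines of Python can do the generating part. This is a minimal sketch, not what KeePass does internally; the function name `generate_password` and the choice of alphabet are assumptions made for this example. It uses the standard `secrets` module, which draws from the operating system's cryptographic random source.

```python
import secrets
import string

# Assumption: letters, digits, and punctuation (94 characters, no space),
# so the password survives being pasted into most prompts.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=16):
    """Return a random password with each character drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_password(16))
    print(generate_password(32))
```

You still need somewhere safe to keep the result, which is the part a password manager handles for you.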
