I use a program called ESPlanner to help with planning our insurance and retirement portfolio. ESPlanner wants to move to the cloud. Below I explore who I imagine would want to attack a site like ESPlanner and what sort of things cloud services like ESPlanner can do to frustrate their attackers. I especially look at using derived keys and per user encryption to potentially slow down attacks. But in the end, I'm uncomfortable with the legal protections afforded me as a service user in the US and so I really want a download version of ESPlanner.
Important Disclaimer - Please Read
Although this article is theoretically about ESPlanner, in reality it’s about securing data in the cloud. ESPlanner had no input to this article. They just make an interesting example to use to explore this problem space.
1 What is ESPlanner?
ESPlanner is a piece of software in which one enters all the gory details about one’s financial life and in return it produces recommendations for buying insurance, projects future income, etc.
2 Who would want to attack them?
ESPlanner does not have your banking passwords. In fact, it doesn’t even know what your specific bank accounts or investments are. It needs more rolled up data such as “how much cash do you have total across all accounts?” or “how much money in all accounts do you have in municipal bonds?”. That sort of thing. It also doesn’t want or need information such as social security numbers.
So at first glance it looks pretty harmless. Not much here to motivate an attack.
But I believe it’s a goldmine for smart attackers.
My guess is that the most obvious people to have an interest in ESPlanner’s data are 419 scammers. I have been informed that there are forums online where stolen data can be sold. A reliable seller (yes, apparently they have reputations) selling a list of email addresses, names, ages, retirement status, future home buying plans, future college plans, some personal information (such as children’s names) along with some financial data is a gift from heaven for 419 scams. And since the market for this data already exists an attacker can quickly calculate their likely profit.
Another group who I believe would be interested in ESPlanner’s data are thieves who specialize in stealing data from banks. The problem, so I’m told, is that there is a lot of effort in stealing from banks. The process goes something like:
- Find a likely target’s email address
- Phish them to get a Trojan on their machine
- Use the Trojan to collect the user’s name and passwords to their bank accounts
- Use the names and passwords to enter the bank accounts and transfer their money to an account the attacker controls
- Get a shill to pull the money out of the attacker controlled account and move it as cash to another account (thus killing the trail)
- Profit!
This process is not easy. But enterprising criminals have made it easier than one might think. I’m told that there are black markets where one can buy phishing services along with password trojans. There are also black market services for hiring shills to pull money out of accounts. In other words this whole attack can be arranged completely remotely using nothing but online markets. I know this sounds nuts but type in something like ’black market stolen data’ to your favorite search engine and you will get a sense for how well run these markets are. There are even standard prices for standard services like bot nets, phishing, trojans, front end attacks, etc.
But the real trick, I’m told, with bank attacks is actually step 4. Most banks make it hard to move money around. It’s the kind of thing that typically requires paperwork, signatures, etc. It’s expensive and easily detected. That’s why most attackers focus on things like credit cards rather than direct bank hacks.
But... what if... what if our enterprising hackers could find a list of email addresses (for phishing) associated with their owners’ net worths? In other words, what if the attackers could know ahead of time which email addresses were owned by people with enough money in their bank accounts to be interesting but not enough to really call too much attention to themselves?
That’s where ESPlanner comes into the picture. It’s a directory of valuable targets. It lets attackers focus on where the meat is.
What I have no idea about is how likely either of these attacks is to actually happen. I’m not aware of any good statistics on this, especially since there is no motivation for most companies to know that they were attacked (see section 4). So who knows how common this all is?
3 What can Cloud Services like ESPlanner do?
3.1 Eating their Wheaties
There are obvious things they can do. For example, they should make sure that their cloud VMs are as stripped down as possible (i.e. reduce their security surface). They can make sure they are running the latest patches. They can encrypt all data in transit. They can eat an apple every day. This is all pretty standard cloud security stuff.
Even trimmed down software still has a lot of security surface area to cover and holes are constantly being found in any and all software. So an attacker can go on the black market, buy access to the latest exploit engines and have a shot at breaking just about any service’s security.
So assuming you are “eating your Wheaties” what more can you do?
3.2 Watchful waiting
Next up we can make sure there are third party monitors on data access. Remember, the key attack here is stealing the user data. Typically in a cloud solution the data is going to be stored in some kind of storage/table/etc. service. So the cloud provider should be able to provide tools that can monitor storage usage and provide alerts (and shut downs) if things go sideways. For example, if data usage suddenly spikes to levels much higher than normal that is a good sign of an attack. Or... you know... a successful day. The reality is that monitoring is useful but it’s really hard to tell the difference between success and an attack. So on its own it’s not enough.
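As a rough illustration of the kind of threshold alert described above, here is a minimal Python sketch. The function name `is_spike`, the three-standard-deviation cutoff and the sample read counts are all assumptions made up for this example, not something any particular cloud provider ships.

```python
import statistics

def is_spike(history, current, k=3.0):
    """Flag `current` if it exceeds the historical mean by more than k standard deviations.

    history: past per-hour read counts from the storage layer (hypothetical numbers).
    k: assumed sensitivity; 3.0 is an arbitrary illustrative choice.
    """
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    return current > mean + k * stdev

# Hypothetical hourly read counts for a quiet service.
normal_hours = [100, 110, 90, 105, 95]

bulk_theft = is_spike(normal_hours, 500)  # someone dumping the table
busy_hour = is_spike(normal_hours, 108)   # ordinary fluctuation
press_day = is_spike(normal_hours, 450)   # a legitimately successful day
```

Note that the "successful day" trips the same alarm as the bulk theft, which is exactly why a threshold on its own can't separate the two.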
3.3 Encryption pixie dust
So usually the next step is to sprinkle some encryption pixie dust. A classic approach is to encrypt all user data with some global key. This is a good thing to do for two reasons. First, it makes attacks just against the back end useless since they only get noise. Second, it provides defense in depth against bad handling of the disk drives. Ideally a cloud service should clean drives before getting rid of them but that doesn’t always happen. By now I would imagine that just about all cloud services run their customers’ data exclusively on top of encrypted partitions, but I don’t know that for a fact.
Unfortunately encrypting the data on the back end provides no protection against a front end attack. If the attacker can get into the front end then they can abscond with the global encryption key and that’s that. Ideally monitoring systems might detect the misuse of the global key but even a mildly intelligent hacker should be able to work around such monitoring.
3.4 More encryption pixie dust - Hardware Security Module (HSM)
We can make the attacker’s life slightly more annoying if instead of keeping the global decryption key on the front end we instead keep the key in an HSM. That way there is, as a practical matter, no key to steal. Instead the attacker will need to pass all the encrypted data through the HSM in order to get access to it. This provides two points of monitoring, the storage layer and the HSM layer.
But in practice nobody does it this way because passing data through the HSM would be ridiculously expensive, not to mention slow. What most folks are going to do is something slightly less secure but more economical.
Let’s say that we have the data for user U, we’ll call that Du. Let’s further say that we generate a unique encryption key for user U, let’s call that Ku. We then store in the storage layer Encryption(Ku,Du) = EKuDu.
The trick then is that we store in the back end Encryption(Kg,Ku) = EKgKu where Kg is a global key for the whole service. When we want to get to user U’s data we read in EKgKu and pass it to the HSM, which uses Kg to decrypt it and return Ku. We then use Ku in the front end to decrypt EKuDu to get Du. We then discard Ku when we don’t need it anymore for that session.
What this does is allow us to encrypt all the data in the back end with a key we can’t leak (Kg, which is in the HSM) but still get access to user data without having to push it all through the HSM (which would be slow as all get out).
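The wrapping scheme above can be sketched in a few lines of Python. This is only a sketch under loud assumptions: `ToyHSM`, `store_user_data` and `load_user_data` are hypothetical names, and the `encrypt`/`decrypt` pair is a toy HMAC-based cipher that exists purely so the example runs on the standard library. A real service would use a vetted AEAD (such as AES-GCM) and a real HSM.

```python
import hashlib
import hmac
import secrets

# --- Toy authenticated cipher: a stand-in for a real AEAD. NOT for production. ---
def _subkey(key, label):
    return hmac.new(key, label, hashlib.sha256).digest()

def _keystream(key, nonce, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def decrypt(key, blob):
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

class ToyHSM:
    """Holds Kg. Callers can ask it to wrap or unwrap Ku but never see Kg itself."""
    def __init__(self):
        self._kg = secrets.token_bytes(32)
        self.unwrap_log = []  # monitoring hook: every unwrap request is recorded

    def wrap(self, ku):
        return encrypt(self._kg, ku)

    def unwrap(self, user_id, wrapped_ku):
        self.unwrap_log.append(user_id)
        return decrypt(self._kg, wrapped_ku)

hsm = ToyHSM()
storage = {}  # stand-in for the back end storage layer

def store_user_data(user_id, du):
    ku = secrets.token_bytes(32)  # per user key Ku
    storage[user_id] = {"EKgKu": hsm.wrap(ku), "EKuDu": encrypt(ku, du)}

def load_user_data(user_id):
    record = storage[user_id]
    ku = hsm.unwrap(user_id, record["EKgKu"])  # only the small key goes through the HSM
    return decrypt(ku, record["EKuDu"])        # bulk data is decrypted on the front end

store_user_data("alice", b"net worth: 123")
recovered = load_user_data("alice")
```

Only the 32-byte Ku ever passes through the HSM, which is what keeps this economical compared to pushing all of Du through it.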
It’s not a perfect solution since any attacker that has access to the front end can still make requests to the HSM to retrieve various users’ keys and then use those keys to steal from the back end. But at least we now have two points of monitoring, HSM accesses and data accesses, that we can try to use to look for odd behavior. One could imagine doing correlations between the Identity Provider (IdP), the front end, HSM and storage layer to look for things like user data being accessed for users who aren’t logged in.
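One way to picture that correlation is the following minimal Python sketch. It assumes, purely for illustration, that the HSM log yields (timestamp, user id) pairs and that login and logout times are available per user; the function name and the sample data are hypothetical.

```python
def unwraps_without_session(unwrap_events, sessions):
    """Return HSM unwrap events that fall outside any login session for that user.

    unwrap_events: iterable of (timestamp, user_id) taken from the HSM's log.
    sessions: dict of user_id -> list of (login_ts, logout_ts) from the IdP/front end.
    """
    flagged = []
    for ts, user_id in unwrap_events:
        windows = sessions.get(user_id, [])
        if not any(login <= ts <= logout for login, logout in windows):
            flagged.append((ts, user_id))
    return flagged

# Hypothetical logs: alice was logged in from t=100 to t=200, bob never logged in.
sessions = {"alice": [(100, 200)]}
unwrap_events = [(150, "alice"), (300, "alice"), (160, "bob")]

alerts = unwraps_without_session(unwrap_events, sessions)
```

Here the unwrap for alice at t=150 is fine, while the one at t=300 and the one for bob would be worth a look.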
But we have to remember that monitoring is never perfect. It’s a third string defense at best. If you are relying on monitoring to protect your system you are probably going to have a bad time.
3.5 Even more encryption pixie dust - Per user keys
The problem with just using the HSM is that anyone who hacks the front end can steal any data they ask for. That isn’t going to really slow the attacker down much. But what if the attacker can only get data for users who are actively logged in? For services, like ESPlanner, where users typically don’t log in very frequently (how often do you mess with your financial profile?) this could be a powerful protection.
Before we continue a very little bit of terminology:
Identity Provider (IdP) This is a service that logs users in. It holds the user’s credentials and validates their identity. Examples include Microsoft Active Directory or Google Plus.
Relying Party (RP) This is a service that uses an IdP to log in users. ESPlanner is an example of an RP.
So let’s go back to user U’s data, Du. We will generate a per user key Ku. So we store Encryption(Ku, Du) = EKuDu. But then we encrypt Ku using a derived key. A derived key is a key created via an algorithm that takes some number of inputs and uses them to create a key. The magic of derived key algorithms is that it’s practically impossible to calculate what a key is unless you have all the inputs to the derived key algorithm. So even if some of the inputs leak, you are still secure.
The derived key in our scenario would be derived from two secrets. The first secret is a global secret for ESPlanner we will call Se. The other secret is generated by the user’s IdP and is unique per user/per relying party, we’ll call that Sieu for the secret for IdP I, RP ESPlanner and user U. So now we can run DerivedKey(Se, Sieu) = DKu. So we store Encryption(DKu, Ku) = EDKuKu.
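The text above doesn’t say which derived key algorithm to use. As one possible sketch, here is DerivedKey(Se, Sieu) built on HKDF with SHA-256 (RFC 5869), using only Python’s standard library. The choice of HKDF, the length prefixing of the inputs and the `info` label are assumptions for illustration.

```python
import hashlib
import hmac
import secrets

def hkdf_sha256(ikm, salt, info, length=32):
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                            # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def derived_key(se, sieu):
    """DerivedKey(Se, Sieu) = DKu. Both inputs are required to reproduce the output."""
    # Length-prefix each secret so the boundary between Se and Sieu is unambiguous.
    ikm = len(se).to_bytes(4, "big") + se + len(sieu).to_bytes(4, "big") + sieu
    return hkdf_sha256(ikm, salt=b"\x00" * 32, info=b"per-user DKu")

se = secrets.token_bytes(32)    # RP's global secret Se
sieu = secrets.token_bytes(32)  # IdP's per RP/per user secret Sieu
dku = derived_key(se, sieu)
```

The same pair of inputs always yields the same DKu, so nothing about DKu itself needs to be stored.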
Now for a front end to successfully access a user’s data the front end has to know two pieces of data, Se and Sieu. Se is presumably recorded in the RP’s HSM and so always available (if not leakable). Sieu is generated and persistently stored by the IdP. The IdP will generate a unique secret on a per RP/per user basis. So users A, B and C of ESPlanner that are using the same IdP will have secrets Siea, Sieb and Siec, respectively, generated for them. The point being that for each combination of RP and user there is a unique secret held by the IdP. This means that even if the front end has Sieu for User U that doesn’t provide the ability to access user A, B or C’s data since they each have different secrets that can only be retrieved from the IdP during login. Hence an attacker can only access the data of users who happened to log in while the attack was underway.
Just to hammer home the point: the only way ESPlanner’s front end can get to a user’s data is if the front end has both ESPlanner’s global secret and the IdP’s per RP/per user secret. ESPlanner will only get the IdP’s per RP/per user secret during login and will discard it when done. So an attack can only get access to user data for users who log in during the attack and therefore provide the attacker with the IdP’s per RP/per user secret. No login? No attack.
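To make the "no login, no attack" property concrete, here is a minimal end-to-end Python sketch. Everything in it is an assumption for illustration: the function names are hypothetical, the cipher is a toy HMAC-based stand-in for a real AEAD, and the key derivation is a one-line HMAC stand-in for a proper KDF.

```python
import hashlib
import hmac
import secrets

# --- Toy authenticated cipher: a stand-in for a real AEAD. NOT for production. ---
def _subkey(key, label):
    return hmac.new(key, label, hashlib.sha256).digest()

def _keystream(key, nonce, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def decrypt(key, blob):
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

SE = secrets.token_bytes(32)  # RP's global secret Se (would really live in the HSM)
storage = {}                  # stand-in for the back end

def derive_dku(se, sieu):
    # Stand-in for DerivedKey(Se, Sieu); a real system would use a proper KDF.
    return hmac.new(se, b"DKu" + sieu, hashlib.sha256).digest()

def enroll(user_id, du, sieu):
    ku = secrets.token_bytes(32)
    storage[user_id] = {
        "EKuDu": encrypt(ku, du),
        "EDKuKu": encrypt(derive_dku(SE, sieu), ku),
    }
    # Neither Ku nor Sieu is kept once this function returns.

def read_on_login(user_id, sieu):
    record = storage[user_id]
    ku = decrypt(derive_dku(SE, sieu), record["EDKuKu"])
    return decrypt(ku, record["EKuDu"])

siea, sieb = secrets.token_bytes(32), secrets.token_bytes(32)
enroll("A", b"A's finances", siea)
enroll("B", b"B's finances", sieb)

data_a = read_on_login("A", siea)  # user A logged in, so Siea is available

# An attacker on the front end holds Se and Siea, but B never logged in.
try:
    read_on_login("B", siea)
    b_exposed = True
except ValueError:
    b_exposed = False
```

In this sketch user A’s data is readable because A’s secret arrived with the login, while B’s record stays opaque even to someone who holds Se and the full contents of storage.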
Also note that the IdP can’t compromise user data either. Even if the IdP published all of its secrets those values on their own are not sufficient to recover the user’s data. One must also have access to Se, that is, the RP’s global secret.
The point of this exercise is to slow the attacker down and hopefully make the attack less worthwhile. That is, the attacker can only steal data at the rate that users log in. If most users don’t log in that frequently then this is a powerful way to slow down attacks.
3.5.1 How does ESPlanner get Sieu?
In most situations login/permission protocols like OAuth or OpenID Connect or SAML or what have you follow the same pattern. The relying party (RP) will redirect the user to the Identity Provider (IdP) with a login request. The IdP then confirms the user’s identity and responds with a login token. So clearly Sieu gets sent in the login token.
In the simplest case the IdP can encrypt Sieu using the RP’s global public key. That way nobody can see the secret. But in practice this isn’t ideal.
An attacker could try to silently stockpile large numbers of login tokens. Then, once they have enough of them, they could use those tokens along with the RP’s HSM to get access to users’ data all at once. The idea is to run the attack quickly, and for a short enough period of time, to potentially escape getting caught by any threshold monitors. Presumably one is using TLS for all communications between all parties (IdP, RP and user), which means that tokens can’t be collected via a straightforward man in the middle attack on the network. But an attacker who has compromised the RP’s front end can still quietly collect tokens as they arrive, storing them away somewhere but not trying to use them. Only when a critical mass of tokens has been collected would the real attack be launched.
A way to frustrate this ’sit and wait’ strategy is to use nonce keys for the exchange. The RP’s front end would generate a new key pair inside the HSM before issuing a login request and then include the nonce public key in the signed/encrypted login request. The IdP will then encrypt Sieu with the nonce key. Once Sieu is decrypted the nonce key is discarded inside the HSM. This provides forward secrecy and renders any stockpiled tokens useless. The point is to force the attacker to issue requests against the HSM, storage, etc. in order to increase the probability of detection.
3.5.2 What happens when Se, Sieu or Ku need to be rolled over?
It is a truth of cryptography that keys need to be rolled over. They can be compromised. They can be used to the point where there is enough ciphertext around to make cryptanalysis easier. Etc. In the case of replacing Se the easiest approach is to just stop using the old Se for any new values. As users log in we check if their derived key used the old Se and, if so, use the old Se to regenerate their old DKu, decrypt Ku with it, and then re-encrypt Ku under a new DKu generated from the new Se.
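The lazy Se rollover described above might look like the following minimal Python sketch. It assumes each stored record carries a version tag saying which Se produced its DKu; that tag, the function names and the toy `wrap`/`unwrap` helpers (a stand-in for a real key wrapping algorithm) are all hypothetical.

```python
import hashlib
import hmac
import secrets

def derive_dku(se, sieu):
    # Stand-in for DerivedKey(Se, Sieu); a real system would use a proper KDF.
    return hmac.new(se, b"DKu" + sieu, hashlib.sha256).digest()

def wrap(dku, ku):
    """Toy wrap for a 32-byte Ku: XOR with a pad derived from DKu, plus a MAC."""
    nonce = secrets.token_bytes(16)
    pad = hmac.new(dku, b"pad" + nonce, hashlib.sha256).digest()
    body = bytes(a ^ b for a, b in zip(ku, pad))
    tag = hmac.new(dku, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def unwrap(dku, blob):
    nonce, body, tag = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(dku, b"tag" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    pad = hmac.new(dku, b"pad" + nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(body, pad))

se_versions = {1: secrets.token_bytes(32)}  # every Se still needed, keyed by version
current_version = 1

def ku_on_login(record, sieu):
    """Recover Ku at login and, if the record used an old Se, re-wrap it under the new one."""
    old_dku = derive_dku(se_versions[record["se_version"]], sieu)
    ku = unwrap(old_dku, record["EDKuKu"])
    if record["se_version"] != current_version:
        new_dku = derive_dku(se_versions[current_version], sieu)
        record["EDKuKu"] = wrap(new_dku, ku)
        record["se_version"] = current_version
    return ku

# A user enrolled under Se version 1.
sieu = secrets.token_bytes(32)
ku = secrets.token_bytes(32)
record = {"se_version": 1, "EDKuKu": wrap(derive_dku(se_versions[1], sieu), ku)}

# Roll Se over: version 2 becomes current, version 1 is kept only for migration.
se_versions[2] = secrets.token_bytes(32)
current_version = 2

first_login = ku_on_login(record, sieu)   # migrates the record to version 2
second_login = ku_on_login(record, sieu)  # already migrated, nothing to re-wrap
```

The old Se has to stay available until every user has logged in at least once, which for a rarely used service could be a long time.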
Rotating Sieu requires a protocol extension where the IdP can send both the old Sieu and the new Sieu so that we can do a similar decryption (to retrieve Ku) and re-encryption with the new Sieu.
Rotating Ku is easy if we only do it when the user logs in. If we want to do it earlier then we need a protocol with the IdP to request Sieu in order to finally get to Ku and change it.
Key rotation, it should be pointed out, is one of the more dangerous moments in a service’s existence. This is a point in time where a large number of user secrets are being accessed in a short period of time. So a lot of precautions have to be taken. Otherwise a smart adversary will make it look like an attack has happened (even if it hasn’t), trigger a key roll over and then attack during the key roll over process.
3.5.3 What happens if the IdP doesn’t support generating Sieu?
Since I don’t actually have a magic wand I must assume that there will be a long period of time before the features I propose above are adopted. But that doesn’t remove the utility of the idea. As cloud services offer their own standalone RP login services (e.g. a service that handles the login process on behalf of the relying party, which I’ll call the RP Proxy) it would be pretty easy to extend those services to support a variant of this protocol. In the normal course of business these services work by having the RP’s front end bounce the user to the RP Proxy, which then bounces the user to the IdP. The IdP bounces the user back to the RP Proxy, which then bounces the user back to the RP front end. The RP Proxy handles different protocols, IdP types, etc. so the RP front end doesn’t need to worry itself with these details.
Thus the RP Proxy is a great place to plug in this functionality. When the RP Proxy deals with an IdP that doesn’t support generating Sieu then the RP Proxy can generate the value itself and forward it in the response login token to the RP front end. This still provides a strong level of security because the RP Proxy is a separate service run by separate folks using different software than the RP front end. So the RP front end would still keep its own Se and the RP Proxy the Sieus.
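A minimal Python sketch of the proxy side, assuming the proxy simply mints a random secret the first time it sees a given RP/user pair and returns the same one on every later login. The class name, method names and token fields are hypothetical, and in a real deployment the secret would be encrypted to the RP before being placed in the token.

```python
import secrets

class RPProxySecretStore:
    """Generates and persists one Sieu per (RP, user) pair on behalf of the IdP."""

    def __init__(self):
        self._secrets = {}  # stand-in for the proxy's own durable, separate storage

    def secret_for(self, rp_id, user_id):
        key = (rp_id, user_id)
        if key not in self._secrets:
            self._secrets[key] = secrets.token_bytes(32)
        return self._secrets[key]

    def login_token(self, rp_id, user_id):
        # Called only after the IdP has confirmed the user's identity.
        # Assumption: a real token would carry Sieu encrypted to the RP, not in the clear.
        return {"rp": rp_id, "sub": user_id, "sieu": self.secret_for(rp_id, user_id)}

proxy = RPProxySecretStore()
first = proxy.login_token("esplanner", "alice")
again = proxy.login_token("esplanner", "alice")
other_user = proxy.login_token("esplanner", "bob")
other_rp = proxy.login_token("some-other-rp", "alice")
```

The secret is stable across logins for the same RP and user, and differs as soon as either the user or the RP changes.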
3.5.4 Is this worth doing?
Security is just a form of insurance: don’t buy more than you need. If an RP’s data usage pattern is amenable to per user encryption and if an off the shelf library is available to provide this capability as part of a cloud infrastructure then this is a reasonable and fairly low pain way to frustrate attackers. The pain is in implementing the library; in actual usage it’s just ’Get me user X’s data’ and you are on your way.
But one suspects that nothing can stop a sufficiently dedicated and well resourced attacker. A dedicated attacker can sit quietly on the front end, intercepting actual user data, recording desired information in hidden places (temporary directories, log files, etc.) and then slowly exfiltrating it so alarms won’t go off. A dedicated attacker can hack the developer’s machines using phishing or zero days and spoil the software at the source. A dedicated attacker can break into the data center or bribe cloud hosting employees and so on.
The point is - how dedicated is your attacker? Take a look at the prices listed at the bottom here. Where does your data come in? Now multiply that by your number of users. Now you have some idea of what your data is worth on the market and can decide what is and is not worth defending against.
4 Why I still want a downloadable version
Even if ESPlanner takes these precautions, I still wouldn’t feel comfortable using a cloud version. The reality is that I have no way of knowing what they actually implement or not and how well they have done it and how well they have maintained it. By running the software on my own machine in a restricted VM with locked down networking I can control what data is available and to whom. Of course it’s highly likely that ESPlanner’s VM will be more secure than my local VM. Remember, my VM is running a user level OS with UX and programs and such. It is nowhere near as locked down as what ESPlanner could run. But nevertheless unless I’m directly being targeted (in which case the attacker need not bother with ESPlanner) I’m still likely to be more secure in practice.
But what makes things even worse is that companies (at least in the US) appear to have little incentive to worry about security. As near as I can tell companies have basically zero liability in the US for hacking attacks, even if such attacks succeed due to the failure of the victim to take reasonable precautions. That is, the T.J. Hooper standard is truly dead. Heck, last I checked, we were still trying to get some kind of federal law to even require companies to tell people when they have been hacked!
And for corporations there seem to be no reputational repercussions from being massively hacked. Go take a look at something like this website. How many of those companies went out of business or were even seriously affected by leaking their employees’ and customers’ data? Have people stopped buying from Sony? Home Depot? Have people abandoned JP Morgan Chase?
In America it would appear that a service can be hacked, leak user data all over the place, and face literally no meaningful consequences either legal or reputation based.
So personally, I want a download version.