Software Security Field Guide for the Bewildered

If you have worked your way in software for a number of years and you’re not a security specialist, you might be occasionally confronted by someone from ‘security’ who generally says ‘no’ to things you deliver.

For a long time I was in this position and was pretty bewildered by how to interpret what they were saying, or understand how they thought.

Without being trained or working in the field, it can be difficult to discern the underlying principles and distinctions that mark out a security magus from a muggle.

While it’s easy to find specific advice on what practices to avoid…

XKCD: Little Bobby Tables

…if you’ve ever been locked in a battle with a security consultant to get something accepted then it can be hard to figure out what rules they are working to.

WhenSecOps Shows up to the meeting When SecOps Shows Up to the Meeting (hat tip to http://devopsreactions.tumblr.com)

So here I try and help out anyone in a similar position by attempting to lay out clearly (for the layperson) some of the principles (starting with the big ones) of security analysis before moving onto more detailed matters of definition and technology.

Principles

‘There’s no such thing as a secure system’

The broadest thing to point out that is not immediately obvious to everyone is that security is not a science, it’s an art. There is no such thing as a secure system, so to ask a security consultant ‘is that secure?’ is to invite them to think of you as naive.

Any system that contains information that is in any way private is vulnerable, whether to a simple social engineering attack, or a state-funded attempt to infiltrate your systems that uses multiple ways to attack your system. What security consultants generally try to do is establish both where these weaknesses may be, and how concerned to be about them.

IT security is an art, not a science

This makes IT security an art, not a science, which took me some time to catch onto. There’s usually no magic answer to getting your design accepted, and often you can get to a position where some kind of tradeoff between security and risk is evaluated, and may get you to acceptance.

Anecdote: I was once in a position where a ‘secrets store’ that used base64 encoding was deemed acceptable for an aPaaS platform because the number of users was deemed low enough for the risk to be acceptable. A marker was put down to review that stance after some time, in case the usage of the platform spread, and a risk item added to ensure that encryption at rest was addressed by no later than two years.

A corollary of security being an art is that ‘layer 8’ of the stack (politics and religion) can get in the way of your design, especially if it’s in any way novel. Security processes tend to be an accretion of: specific directions derived from regulations; the vestigial scars of past breaches; personal prejudice; and plain superstition.

Trust has to begin somewhere

Often when you are discussing security with people you get into situations where you get into a ‘turtles all the way down’ scenario, where you wonder how anything can be done because nothing is ever trusted.

Anecdote: I have witnessed a discussion with a (junior) security consultant where a demand was made to encrypt a public key, based on a general injunction that ‘all data must be encrypted’. ‘Using what key?’ was the natural question, but an answer was not forthcoming…

The plain fact is that everyone has to trust something at some point in order to move information around anything. Examples of things you might (or might not) trust are:

The veracity of the output of dmesg on a Linux VM
The Chef server keys stored on your hardened VM image
Responses to calls to the metadata IP address when running on AWS (viz: http://169.254.169.254)
That Alice in Accounts will not publish her password on Twitter
That whatever is in RAM has not been tampered with or stolen
The root public keys shipped with your browser

Determine your points of trust

Very often determining what you are allowed to trust is the key to unlocking various security conundrums when designing systems. When you find a point of trust, exploit it (in a good way) as much as you can in your designs. If you’ve created a new point of trust as part of your designs, then prepare to be challenged.

Responsibility has to end somewhere

When you trust something, usually someone or something must be held responsible when it fails to honour that trust. If Alice publishes her password on Twitter, and the company accounts are leaked to the press, then Alice is held responsible for that failure of trust. Establishing and making clear where the trust failure would lie in the event of a failure of trust is also a way of getting your design accepted in the real world.

Determining what an acceptable level of trust to place in Alice will depend on what her password gives her access to. Often there are data classification levels which determine minimum requirements before trust can be given for access to that data. At the extreme end of “secret”, root private keys can be subject to complex ceremonies that attempt to ensure that no one person can hijack the process for their own ends.

Consequences of failure determines level of paranoia

Another principle that follows from the ‘security is an art, not a science’ principle is that the extent to which you fret about security will depend on the consequences of failure. The loss of a password that allows someone to read some publicly-available data stored on a company server will not in itself demand much scrutiny from security.

The loss of a root private key, however, is about as bad as it can get from a security standpoint, as that can potentially give access to all data across the entire domain of that key hierarchy.

If you want to reduce the level of scrutiny your design gets put under, reduce the consequences of a breach.

Key distinctions

If you want to keep pace with a security consultant as they explain their concerns to you, then there are certain key distinctions that they may frequently refer to, and assume you understand.

Getting these distinctions and concepts under your belt will help you convince the security folks that you know what you’re doing.

Encryption vs encoding

This is a 101 distinction you should grasp.

Encoding is converting some data into some other format. Anyone who understands the encoding can convert the data back into readable form. ASCII and UTF-8 are examples of encodings that convert numbers into characters. If you give someone some encoded data, it won’t take them long to figure out what the data is, unless the encoding is extremely complex or obscure.

Encryption involves needing some secret or secure process to get access to the data, like a private ‘key’ that you store in your ~/.ssh folder. A key is just a number that’s very difficult to guess, like your house key’s (probably) unique shape. Without access to that secret key, you can’t work out what that data is without a lot of resources (sometimes more than all the world’s current computing power) to overcome the mathematical challenge.

Hashing vs encryption

Hashing and encryption may be easily confused also. Hashing is the process of turning one set of data into another through a reproducible algorithm. The key point about hashing is that the data goes one-way. If you have the hash value (say, ae5690f1aff) then you can’t easily reverse that to the original

Hashing has a weakness. Let’s say you ‘md5sum’ an insecure password like password. You will always get the value: 5f4dcc3b5aa765d61d8327deb882cf99&oq=5f4dcc3b5aa765d61d8327deb882cf99

from the hash.

If you store that hashed password in a database, then anyone can google it to find out what your password really is, even though it’s a hash. Try it with other commonly-used passwords to see what happens.

This is why it’s important to ‘salt‘ your hash with a secret key so that knowledge of the hash algorithm isn’t enough to crack a lot of passwords.

Authentication vs authorization

Sometimes shortened to ‘authn‘ and ‘authz‘, this distinction is another standard one that gets slipped into security discussions.

Authentication

Authentication is the process of determining what your identity is. The one we’re all familiar with is photo id. You have a document with a name and a photo on it that’s hard to fake (and therefore ‘trusted’), and when asked to prove who you are you produce this document and it’s examined before law enforcement or customs accepts your claimed identity.

There have been many interesting ways to identify authenticity of identity. My favourite is the scene in Big where the Tom Hanks character has to persuade his friend that he is who he says he is, even though he’s trapped in the body of a man:

Shared Secret Authentication

To achieve this he uses a shared secret: a song (and associated dance data) that only they both know. Of course it’s possible that the song was overheard or some government agency had listened in to their conversations for years to fake the authentication, but the chances of this are minimal, and would raise the question of: why would they bother?

What would justify that level of resources just to trick a boy into believing something so ludicrous? This is another key question that can be asked when evaluating the security of a design.

The other example I like is the classic spy trope of using two halves of a torn postcard, giving one half to each side of a communication, making a ‘symmetric key’ that is difficult to forge unless you have access to one side of it.

torn paper 3

Symmetric vs asymmetric keys

This also exemplifies nicely what a symmetric key is. It’s a key that is ‘the same’ one used on both sides of the communication. A torn postcard is not ‘the same’ on both sides, but it can be argued that if you have one part of it, it’s relatively easy to fake the other. This could be complicated if the back of the postcard had some other message known only to both sides written on it. Such a message would be harder to fake since you’d have to know the message in both people’s minds.

An asymmetric key is one where access to the key used to encrypt the message does not imply access to decrypt the message. Public key encryption is an example of this: anyone can encrypt a message with the public key, but the private key is kept secret by the receiver. Anyone can know the public key (and write a message using it), but only the holder of the private key can read the message.

No authentication process is completely secure (remember, nothing is secure, right?), but you can say that you have prohibitively raised the cost of cheating security by demanding evidence of authenticity (such as a passport or a driver’s licence) that is costly to fake, to the point where it’s reasonable to say acceptably few parties would bother.

If the identification object itself contains no information (like a bearer token), then there is an additional level of security as you have to both own the objects, and know what it’s for. So even if the key is lost, more has to happen before there is a compromise of the system.

Authorization

Authorization is the process of determining whether you are allowed to do something or not. While authentication is a binary fact about one piece of information (you are either who you say you are, or you are not), authorization will depend on both who you are and what you are asking to do.

In other words: Dave is still Dave. But Dave can’t open the bay doors anymore. Sorry Dave.

Concepts

RBAC

Following on from Authentication and Authorization, Role-Based Access Control gives permission to a more abstract entity called a role.

Rather than giving access to that user directly, you give the user access to the role, and then that role has the access permissions set for it. This abstraction allows you to manage large sets of users more easily. If you have thousands of users that have access to the same role, then changing that role is easier than going through thousands of users one-by-one and changing their permissions.

To take a concrete example, you might think of a police officer as having access to the ‘police officer’ role in society, and has permission to stop someone acting suspiciously in addition to their ‘civilian’ role permissions. If they quit, that role is taken away from them, but they’re still the same person.

Security through obscurity

Security through obscurity is security through the design of a system. In other words, if the design of your system were to become public then it would be easy to expose.

Placing your house key under a plant next to the door, or under the doormat would be the classic example. Anyone aware of this security ‘design’ (keeping the key in some easy-to-remember place near the door) would have no trouble breaking into that house.

By contrast, the fact that you know that I use public key encryption for my ssh connections, and even the specifics of the algorithms and ciphers used in those communications does not give you any advantage in breaking in. The security of the system depends on maths, specifically the difficulty in factoring a specific class of large numbers.

If there are weaknesses in these algorithms then they’re not publicly known. That doesn’t preclude the possibility that someone, somewhere can break them (state security agencies are often well ahead of their time in cryptography, and don’t share their knowledge, for obvious reasons).

‘Anybody wanna shut down the Federal Reserve?’

It’s a cliche to say that security through obscurity is bad, but it can be quite effective at slowing an attacker down. What’s bad about it is when you depend on security through obscurity for the integrity of your system.

An example of security through obscurity being ‘acceptable’ might be if you run an ssh server on (say) port 8732 rather than 22. You depend on ssh security, but the security through obscurity of running on a non-standard port prevents casual attackers from ‘seeing’ that your port 22 is open, and as a secondary effect also can prevent your ssh logs from getting overloaded (perhaps exposing to other kinds of attack). But any cracker worth her salt wouldn’t be put off by this security measure alone.

If you really want to impress your security consultant, then casually mention Kerckhoffs Principle which is a more formal way of saying ‘security through obscurity is not sufficient’.

Principle of least privilege

The principle of least privilege states that any process, user or program has only the privileges it needs to do its job.

Authentication works the same way, but authorization is only allowed for a minimal set of functions. This reduces the blast radius of compromise.

Blast radius is a metaphor from nuclear weapons technology.
IT people use it in various contexts to make what they do sound significant.

A simple example might be a process that starts as root (because it might need access to a low-numbered port, like an http server), but then drops down. This ensures that if the server is compromised after that initial startup then the consequences would be far less than before. It is then up for debate whether that level of security is sufficient.

Anecdote: I once worked somewhere where the standard http server had this temporary root access removed. Users had to run on a higher-numbered port and low-numbered ports were run on more restricted servers.

In certain NSA-type situations, you can even get data stores that users can write to, but not read back! For example, if a junior security agent submits a report to a senior, they then get no access to that document once submitted. This gives the junior the minimal level of privilege they need to do their job. If they could read the data back, then that increases the risk of compromise as the data would potentially be in multiple places instead of just one.

Blast radius

There are other ways of reducing the blast radius of compromise. One way is to use tokens for authentication and authorization that have very limited scope.

At an extreme, an admin user of a server might receive a token to log into it (from a highly secured ‘login server’) that:

can only be used once
limits the session to two minutes
expires in five minutes
can only perform a very limited action (eg change a single file)
can only be used from a specific subnet

If that token is somehow lost (or copied) in transit then it could only be used before it’s used (within five minutes) by the intended recipient for a maximum of two minutes, and the damage should be limited to a specific file if (and only if) the user misusing the token already has access to the specified network.

By limiting the privileges and access that that token has the cost of failure is far reduced. Of course, this focusses a large amount of risk onto the login server. If the login server itself were compromised then the blast radius would be huge, but it’s often easier for organisations to manage that risk centrally as a single cost rather than spreading it across a wide set of systems. In the end, you’ve got to trust something.

Features like these are available in Hashicorp’s Vault product, which centralise secrets management with open source code. It’s the most well-known, but other products are available.

N-Factor authentication

You might have noticed in the ‘Too Many Secrets’ clip from the film Sneakers above that access to all the systems was granted simply by being able to decrypt the communications. You could call this one-factor authentication, since it was assumed that the identity of the user was ‘admin’ just by virtue of having the key to the system.

Of course, in the real world that situation would not exist today. I would hope that the Federal Reserve money transfer system would at least have a login screen as well before you identify yourself as someone that can move funds arbitrarily around the world.

A login page can also be regarded as one-factor authentication, as the password (or token) is the only secret piece of information required to prove authenticity.

Multi-factor authentication makes sure that the loss of one piece of authentication information is not sufficient to get access to the system. You might need a password (something you know), and a secret pin (another thing you have), and a number generated by your mobile phone, and a fingerprint, and the name of your first pet. That would be 5-factor encryption.

Of course, all this is undermined if the recovery process sends a link to an authentication reset to an email address that isn’t secured so well secured. All it takes then is for an attacker to compromise your email, and then tell the system that you’ve lost your login credentials. If your email is zero- or one-factor authentication than the system is only as secure as that and all the work to make it multi-factor has been wasted.

This is why get those ‘recovery questions’ that supposedly only you know (name of your first pet). Then, when people forget those, you get other recovery processes, like sending a letter to your home with a one-time password on it (which of course means trusting the postal service end-to-end), or an SMS (which means trusting the network carrier’s security). Once again, it’s ‘things you can trust’ all the way down.

So it goes.

Acceptable risk and isolation

We’ve touched on this already above when discussing the ‘prohibitive cost of compromising a system’ and the ‘consequences of a breach’, but it’s worth making explicit the concept of ‘acceptable risk’. An acceptable risk is a risk that is known about, but whose consequences of compromise are less than the effort of

A sensible organisation concerned about security in the real world will have provisions for these situations in their security standards, as it could potentially save a lot of effectively pointless effort at the company level.

For example, a username/password combination may be sufficient to secure an internal hotel booking system. Even if that system were compromised, then (it might be argued) you would still need to compromise the credit card system to exploit it for material gain.

The security consultant may raise another factor at this point, specifically: whether the system is appropriately isolated. If your hotel booking system sits on the same server as your core transaction system, then an exploit of the book system could result in the compromise of your core transaction system.

Sometimes, asking a security consultant “is that an acceptable risk?” can yield surprising results, since they may be so locked into saying ‘no’ that they may have overlooked the possibility that the security standards they’re working to do indeed allow for a more ‘risk-based’ approach.