Storing Passwords in Databases

One thing to keep in mind when storing passwords is that sooner or later the people those passwords belong to will forget them. The system you use therefore needs to provide some way for them to regain access to their account without making it easy for anyone else to get into an account that doesn't belong to them.

Just how far you go in securing your system will depend on how sensitive the data protected by the password is, since the harder you make it to gain unwanted access, the harder you also make it for those who have forgotten their password.

The first thing to avoid is storing passwords in the database in plain text. Plain text makes it possible to tell someone what their password is when they have forgotten it, but it also means that anyone who manages to break into the database by any means can read all the passwords. Even where the data in this database isn't sensitive, knowing what passwords people have used may make it easier to break into their accounts on other systems that do contain sensitive data, where the person has reused the same password in both places.

Encrypting the passwords doesn't help much. Anyone with access to the database most likely has sufficient access to the system to obtain the key and run whatever is needed to decrypt the passwords as well. After all, the decryption routine has to be callable by whatever processing is designed to retrieve forgotten passwords from the system.

The minimum level of security required for passwords is to hash them. Hashing algorithms were designed to provide a means of detecting tampering with original content. The original purpose of a hash was to generate a value from the content that will be completely different if even a small change is made to the original source. By transmitting the original content and the hash of that content separately and having the recipient rehash the content using the same algorithm, the recipient can easily tell whether the content is the same as what was originally hashed. There are only a limited number of possible hash values, so an unlimited number of possible inputs will map to each hash, but those inputs will be so different from each other that finding a replacement that matches the same hash is not only difficult but also serves no purpose, since the replacement would be nothing like what the recipient is expecting.
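As a minimal sketch of that original tamper-detection use, the following Python uses the standard hashlib module. The message text and the digest helper are made up for illustration, and SHA-256 is simply one convenient choice of algorithm.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return the SHA-256 hash of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


original = b"Pay Alice 100 dollars"
tampered = b"Pay Alice 900 dollars"  # a single character differs

# The sender transmits this value separately from the content.
sent_hash = digest(original)

# The recipient rehashes whatever arrived and compares the two values.
print(digest(original) == sent_hash)  # True: content is unchanged
print(digest(tampered) == sent_hash)  # False: content was altered
```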

As hashes were not designed for protecting passwords, there are security issues with using a straight hash on the password that do not apply where the hash is used for its original purpose and the recipient gets both the hash and the original content. Here we not only want to prevent an attacker getting the original content, we also need to prevent them finding any other value that maps to the same hash. To start with, simpler hashing algorithms such as md5 and sha1 are insufficient for this purpose when used by themselves. They are fast to compute and large precomputed tables of their hashes exist, so it is too easy to work out a value that will match the hash. Even more complex hashing algorithms such as sha224, sha256, sha384 and sha512 will eventually suffer the same issue as computers become more powerful.

The solution to this issue is to use what is known as a salt. This is a string of characters that is added to the password before it is hashed. Since the original value must now not only map to the hash but must also contain the salt, it is far more difficult to find any value that will work as the password. Generally the only value matching the hash that is ever likely to be found is the actual password itself. Also, while precomputed lookup tables exist to help find a value that will generate a given hash, those tables are of no help when the value you are looking for must also contain a specific salt, because the table would have to be rebuilt for that salt.
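A rough sketch of salting, assuming a single site-wide salt and one pass of SHA-256, might look like this in Python. The names SITE_SALT, hash_password and check_password are hypothetical, and the salt value shown is only a placeholder for a long random string.

```python
import hashlib
import hmac

# Hypothetical site-wide salt. A real one would be a long random value
# kept in configuration rather than written into the source.
SITE_SALT = b"example-site-salt-change-me"


def hash_password(password: str) -> str:
    """Hash the salt followed by the password and return the hex digest."""
    return hashlib.sha256(SITE_SALT + password.encode("utf-8")).hexdigest()


def check_password(password: str, stored_hash: str) -> bool:
    """Rehash the attempt with the same salt and compare in constant time."""
    return hmac.compare_digest(hash_password(password), stored_hash)
```

Only the value returned by hash_password would be stored in the database. A login attempt is checked by hashing what was typed in and comparing the result, so the password itself never needs to be kept.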

Some places go one step further and use a different salt for each password. This makes it harder to break into multiple accounts. Instead of being able to try lots of different values with the one salt added and see whether the result matches any password in the database, the person trying to break in can now only attack one account at a time, since that is the only account using that salt.
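One way a per-password salt could be handled is sketched below. As an assumption beyond what is described here, the sketch swaps the single hash pass for PBKDF2 from Python's standard library, which repeats the hash many times to slow down guessing. The function names and the iteration count are illustrative only.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; adjust to suit your hardware


def new_password_record(password: str) -> tuple[bytes, bytes]:
    """Generate a fresh random salt and derive the value to store with it."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key  # both are stored against the account


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the value using the account's own salt and compare."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, stored_key)
```

Because each call to new_password_record picks its own salt, two people who choose the same password end up with different stored values.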

The final step in protecting the password, which has nothing to do with the database at all, is to use SSL (these days TLS, as used by HTTPS) for the page where the password is first entered so that it is encrypted in the browser and only decrypted again on the server. That prevents someone who has no access to the server from intercepting the password on its way there. Just using SSL for the login page doesn't do much good, though, if the pages protected by the password don't also use SSL. Whatever token is used to maintain the session once the password has been entered provides access to the account while the person is logged in, so if that token is intercepted it may give an attacker all the access they need without ever knowing the password.
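If the session token is kept in a cookie, which is an assumption since no particular mechanism is named here, marking the cookie Secure tells the browser to send it only over encrypted connections. A small sketch using Python's standard http.cookies module follows. The cookie name "session" and the helper function are hypothetical.

```python
import secrets
from http.cookies import SimpleCookie


def session_cookie_header() -> str:
    """Build a Set-Cookie header for a new random session token."""
    cookie = SimpleCookie()
    cookie["session"] = secrets.token_urlsafe(32)
    cookie["session"]["secure"] = True    # only sent over HTTPS
    cookie["session"]["httponly"] = True  # hidden from page scripts
    cookie["session"]["path"] = "/"
    return cookie.output()


print(session_cookie_header())
```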

So those are the types of security that passwords require. Just how far through that list you go depends on how sensitive the data is that you need to protect. Once you use a hash with a salt, the hashes in your database will be of little direct help in breaking into other sites. Even if the person does use the same password somewhere else, the salt used with the password on that site will be different, so there is no way to tell by looking at the stored hashes whether anyone is or isn't using the same password.

So let's return to our original problem. A legitimate user has forgotten their password. We have no way to tell what it is because we have made it as difficult as possible for anyone with access to the hashed value to work backwards to the original password. We therefore can't just look up the password for someone who has forgotten it.

All we can do in this instance is to allow them to reset their password. This means that we need some alternative means of identifying who they are before allowing them to proceed with the password reset. Generally the way most sites do this is to send the person an email containing information that will allow them to get back into their account. The first thing necessary for this to work is for the email to be sent to an address that is already attached to their account.

We then have two alternatives. We can allocate a new password to them and send that in an email, and if we want to be extra secure we can require that they change the password once they log in. Even if we don't require that they change the password, they most likely will anyway if we generate a random string of characters as the password. The alternative is to send them an email containing a special token value. They can then use that token to reset their password to whatever they want it to be. Having the token expire after a specific amount of time will also reduce the opportunity for someone to intercept the value and thus hijack the account.
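The token approach could be sketched roughly as follows. Everything named here is hypothetical: the in-memory reset_requests dictionary stands in for a database table, the 30 minute lifetime is an arbitrary choice, and storing only a hash of the token is an extra precaution this sketch assumes.

```python
import hashlib
import hmac
import secrets
import time
from typing import Dict, Optional, Tuple

TOKEN_LIFETIME_SECONDS = 30 * 60  # assumed expiry window

# Hypothetical store: email address -> (hash of token, expiry time)
reset_requests: Dict[str, Tuple[str, float]] = {}


def _token_hash(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def create_reset_token(email: str, now: Optional[float] = None) -> str:
    """Create a random token to email to the address on the account."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    reset_requests[email] = (_token_hash(token), now + TOKEN_LIFETIME_SECONDS)
    return token


def redeem_reset_token(email: str, token: str, now: Optional[float] = None) -> bool:
    """Return True only for an unexpired, matching token that is still unused."""
    now = time.time() if now is None else now
    record = reset_requests.get(email)
    if record is None:
        return False
    stored_hash, expires_at = record
    if now > expires_at:
        del reset_requests[email]  # expired requests are discarded
        return False
    if not hmac.compare_digest(stored_hash, _token_hash(token)):
        return False
    del reset_requests[email]  # a token can be used only once
    return True
```

When redeem_reset_token returns True, the site would go on to let the person choose a new password, which is then salted and hashed like any other.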

Some sites don't send an email when the person has forgotten their password. Instead they collect answers to a set of security questions up front, which the person can later use to identify themselves in order to be allowed to set a new password. The answers to these questions are effectively the equivalent of passwords in that they allow someone access to reset the password. As such, the answers should have the same protection applied to them as is applied to passwords.
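Applying the same protection to an answer might look something like the sketch below. The normalise step, which ignores case and extra spaces so that small typing differences still match, is an assumption of this sketch, as are the function names and the use of PBKDF2 with a per-answer salt.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor


def normalise(answer: str) -> bytes:
    """Assumed normalisation: lower case with single spaces between words."""
    return " ".join(answer.lower().split()).encode("utf-8")


def store_answer(answer: str) -> tuple[bytes, bytes]:
    """Return the random salt and derived value to store for this answer."""
    salt = secrets.token_bytes(16)
    return salt, hashlib.pbkdf2_hmac("sha256", normalise(answer), salt, ITERATIONS)


def check_answer(answer: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive the value from the attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", normalise(answer), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)
```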


This article written by Stephen Chapman, Felgall Pty Ltd.
