Welcome to Software Development on Codidact!
Should a salt be stored in the same database as the hash?
To protect against dictionary and rainbow table attacks, it is well known that passwords should be salted before hashing. The salt (unique to each password) gets stored with the hash, often in the same string, separated by a delimiter such as a semicolon.
However, if the salts and hashes are stored together and the database is compromised, then the attacker will have access to the salt used for each hash, which seems to defeat the purpose of the salt.
Is this a legitimate concern? Should salts be stored in a separate database from the hashes?
5 answers
The purpose of a cryptographic salt is to make the same input (password) hash to different values in different instances, yet retain the hash function's deterministic properties. Salting accomplishes that by concatenating a random value with the password itself before hashing, and storing the value of the salt somewhere alongside the salt+password hash.
This, in turn, renders generating and storing lists of precomputed hashes impractical, because each list of candidate inputs would need to be hashed separately for each salt value. If the salt value is large enough, this causes the work and storage requirements to become prohibitive, even for relatively small lists of candidate input values (passwords). This forces the attacker to compute each hash separately for the particular salt; they can't precompute hashes and then reduce finding the password to little more than a table lookup or a search.
Since the salt is chosen at random, there is a very good chance that every single user account (and every time each user changes their password) has a different salt. A large enough salt can virtually guarantee global uniqueness simply by being picked at random.
At the same time, any process that needs to verify that a given candidate input matches the hash will by necessity require access to the salt value for that particular hash anyway.
The idea behind this is that it forces the attacker to basically do the same work as the legitimate software for each candidate user account and password combination. In the case of a legitimate user authenticating to a legitimate system, making computing the password hash take 10 or 20 or even 100 ms is largely inconsequential, and storing precomputed values is of no benefit; but making an attacker do that work for each combination of user account and candidate password greatly increases the attacker's workload compared to being able to do the work just once for each candidate password.
The fact that the salt is stored together with the hashed password does not materially change that.
Therefore, because of what a salt is intended to do and how it does that, as long as the salt is meaningfully large and selected at random, storing the salt together with the password hash should not materially decrease system security compared to storing the salt separately. As already discussed, it also likely makes it much easier to keep the two values in sync both when reading and when writing, both of which are critical to enable successful authentication by legitimate users.
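As a minimal sketch of the scheme described above, the following stores a random per-password salt in the same string as the hash and reads it back for verification. The function names, the `salt:hash` hex layout, the use of PBKDF2-HMAC-SHA256 and the iteration count are all assumptions made for illustration; none of them come from the answer, and the work factor is not a recommendation.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor only (assumption)


def hash_password(password: str) -> str:
    """Return 'salt_hex:hash_hex' using a fresh random 16-byte salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}:{digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split(":")
    salt = bytes.fromhex(salt_hex)
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


# The same password hashed twice yields two different stored records,
# because each call draws its own salt.
record_a = hash_password("password123")
record_b = hash_password("password123")
```

Both records verify against the same password even though the stored strings differ, and verification needs only the single stored string, which is why keeping salt and hash together makes them easy to keep in sync.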
There is, however, one significant exception to the above reasoning. In NIST SP 800-63B, a reference is made to a "secret salt", more commonly referred to as a pepper. This is a value that acts similarly to a salt, but is stored separately and normally is (but does not need to be) the same for all accounts. The purpose of a pepper is to mitigate the risk of an attacker obtaining a copy of both the salt and the hashed password. The pepper must be large enough and inaccessible to the attacker, for example by being stored and processed only within a Hardware Security Module which only exposes, say, an interface that allows hashing a single input, or even only one confirming whether or not a specific input matches a specific hash. In that case, even given a full data dump, the attacker does not have all the information required to confirm whether a specific password guess is correct or incorrect for a specific account; they also need access to the pepper.
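One common way to apply a pepper, sketched here under stated assumptions, is to key the password with the pepper via HMAC before the salted hash is computed. The `peppered_hash` function, the HMAC-then-PBKDF2 construction and the in-process `PEPPER` variable are hypothetical choices for illustration; in the setup the answer describes, the pepper would live in an HSM or other separate store rather than in application memory.

```python
import hashlib
import hmac
import secrets

# Hypothetical pepper; in practice it would be held outside the database.
PEPPER = secrets.token_bytes(32)


def peppered_hash(password: str, salt: bytes, pepper: bytes) -> bytes:
    """Key the password with the pepper, then apply the salted slow hash."""
    keyed = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, 100_000)


salt = secrets.token_bytes(16)
stored = peppered_hash("password123", salt, PEPPER)

# An attacker holding the database dump (salt and stored hash) but not the
# pepper cannot confirm even a correct password guess.
attacker_pepper = secrets.token_bytes(32)
attacker_result = peppered_hash("password123", salt, attacker_pepper)
```

The legitimate system, which has the pepper, reproduces `stored` from the correct password; the attacker's computation with a different pepper does not match.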
When I first learned about salting, I thought the same thing. But as I understand it, because each salt is unique (or substantially unique within a given database table), storing the salt and hash together does not open you up to a rainbow table attack. On the other hand, suppose they are stored separately, meaning in a separate database (separate fields in a table, or even separate tables in one database, would in the end be little different from keeping everything in one field). You now need to do multiple retrievals to verify the data: one to get the salt to create the hash, the other to verify the hash. That slows down access, raises serious synchronization issues, and may even increase possible security vulnerabilities.
The other answers are correct, but overcomplicate things.
Suppose you have a database of 1,000,000 users' email addresses and password hashes, and 20% of those users are idiots who have used "password123" as their password.
Without salts - the attacker calculates the hash of "password123" once, compares it with all the hashes, and a millisecond later knows the password of 200,000 of the users, all at once.
With salts - the attacker has to calculate the hash of "password123" with EVERY SINGLE unique salt. So checking which users have used "password123" as their password just cost them a million times more processing power.
The attacker having access to the salts does not "defeat the purpose of the salt": the salt doesn't attempt to defend against "The attacker is targeting one and only one password, and is going to attack that until they find it, and then stop". You salt hashes to defend against "The attacker is hoping to retrieve at least some of the passwords in the database, and they don't care which ones".
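The difference in attacker effort can be sketched with a scaled-down simulation. Everything here is an assumption for illustration: the table has 1,000 users rather than 1,000,000, the helper `h` uses plain SHA-256 (far too fast for real password storage) so the example runs quickly, and the variable names are invented.

```python
import hashlib
import secrets


def h(password: str, salt: bytes = b"") -> bytes:
    """Toy hash for the demo: SHA-256 over salt + password."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


NUM_USERS = 1000
# Every fifth user (20%) picked the same weak password.
passwords = ["password123" if i % 5 == 0 else f"unique-{i}" for i in range(NUM_USERS)]

unsalted_table = [h(p) for p in passwords]
salts = [secrets.token_bytes(16) for _ in passwords]
salted_table = [h(p, s) for p, s in zip(passwords, salts)]

# Without salts: one hash computation, then cheap comparisons.
target = h("password123")
unsalted_hash_count = 1
unsalted_hits = sum(1 for stored in unsalted_table if stored == target)

# With salts: the guess must be rehashed once per user, even though the
# attacker can read every salt.
salted_hash_count = 0
salted_hits = 0
for s, stored in zip(salts, salted_table):
    salted_hash_count += 1
    if h("password123", s) == stored:
        salted_hits += 1
```

Both attacks find the same 200 accounts, but the salted one costs one hash computation per user instead of one in total, which is the multiplier the answer describes.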
With bcrypt, the salt is stored in the same string as the hash. This is done so that the string holds everything you need to recompute that hash if you know the password.[1] Wikipedia breaks down the format:
Description
The input to the bcrypt function is the password string (up to 72 bytes), a numeric cost, and a 16-byte (128-bit) salt value. The salt is typically a random value. The bcrypt function uses these inputs to compute a 24-byte (192-bit) hash. The final output of the bcrypt function is a string of the form:
$2<a/b/x/y>$[cost]$[22 character salt][31 character hash]
For example, with input password abc123xyz, cost 12, and a random salt, the output of bcrypt is the string
$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW
\__/\_/\____________________/\_____________________________/
Alg Cost       Salt                       Hash
Where:
- $2a$: The hash algorithm identifier (bcrypt)
- 12: Input cost (2^12, i.e. 4096 rounds)
- R9h/cIPz0gi.URNNX3kh2O: A base-64 encoding of the input salt
- PST9/PgBkqquzi.Ss7KIUgO2t0jWMUW: A base-64 encoding of the first 23 bytes of the computed 24 byte hash
The base-64 encoding in bcrypt uses the table ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789, which differs from RFC 4648 Base64 encoding.
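To make the layout concrete, here is a small sketch that splits a bcrypt output string into its fields using only string operations. The `parse_bcrypt` helper and its dictionary keys are hypothetical names introduced for illustration; the code does not compute or verify any hash and does not use a bcrypt library.

```python
def parse_bcrypt(hash_string: str) -> dict:
    """Split '$<alg>$<cost>$<22-char salt><31-char hash>' into its parts."""
    # The leading '$' produces an empty first element when splitting.
    empty, algorithm, cost, rest = hash_string.split("$")
    if empty != "" or len(rest) != 22 + 31:
        raise ValueError("not a bcrypt string in the expected layout")
    return {
        "algorithm": algorithm,
        "cost": int(cost),
        "rounds": 2 ** int(cost),
        "salt": rest[:22],   # bcrypt-specific base-64 encoding of the salt
        "hash": rest[22:],   # bcrypt-specific base-64 encoding of the hash
    }


parsed = parse_bcrypt("$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW")
```

For the example string, `parsed["salt"]` is `R9h/cIPz0gi.URNNX3kh2O` and `parsed["rounds"]` is 4096, so a verifier reading this one field from the database has the salt and cost it needs.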
[1] Or password + pepper, if you do that.
The issue with using two separate databases is that you need to:
- store both access strings
- back up both databases
- manage both databases
- keep both databases patched
By the time you have done that, the risk of both databases getting hacked is much the same as the risk of a single database getting hacked.