
What is the difference between hashing and encryption?

+2
−1

According to this article:

Since encryption is two-way, the data can be decrypted so it is readable again. Hashing, on the other hand, is one-way, meaning the plaintext is scrambled into a unique digest, through the use of a salt, that cannot be decrypted.

What specific algorithm makes it possible to scramble data into an unrecoverable form, yet still be usable for its intended purpose? Is it something like a checksum, in which a function can be applied to a hash to validate something about it while making it impossible to forge an unauthorized piece of hashed data?


3 answers

+1
−1

Hashing works like lossy compression: you can't recover the input of a hash from its result.

That obviously would not work as encryption. How would you decrypt a message if half of it has been destroyed? :)


Consider the SHA family of hashes. You can hash a 1 GB file into a string of roughly 0.1 kB. Wow! Why don't we just send people hashes and save all that bandwidth? The problem is that a vast number of possible files all hash to the same value, so you can't know which one is correct.

Ciphertext, by contrast, is usually about the same size as the plaintext.
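To make the size contrast concrete, here is a minimal sketch using Python's standard `hashlib`. The `toy_xor_cipher` function is a hypothetical stand-in for a real cipher, written only for this illustration; it is not secure and says nothing about how real encryption algorithms work internally.

```python
import hashlib

def toy_xor_cipher(data: bytes, key: bytes) -> bytes:
    """Hypothetical toy cipher: XOR each byte with a repeating key.
    NOT secure; it only illustrates that encryption is reversible."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

small = b"hello"
large = b"x" * 1_000_000  # stand-in for a big file

# A SHA-256 digest is always 32 bytes, no matter how large the input is.
digest_small = hashlib.sha256(small).digest()
digest_large = hashlib.sha256(large).digest()
print(len(digest_small), len(digest_large))  # 32 32

# The toy ciphertext is as long as the plaintext, and applying the same
# key a second time recovers the original message.
key = b"secret"
ciphertext = toy_xor_cipher(small, key)
recovered = toy_xor_cipher(ciphertext, key)
print(len(ciphertext) == len(small), recovered == small)
```

The digest cannot be turned back into `large`, whereas the ciphertext can be turned back into `small` by anyone holding the key.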


What specific algorithm makes it possible to scramble data into an unrecoverable form, yet still be usable for its intended purpose? Is it something like a checksum

Yes. Simply counting the number of 1 bits in a file is a very primitive hash. You can see why it's lossy: you're throwing away where the 1 bits were, and with that, where the 0 bits were.

Multiplying the bytes together and taking the result modulo some number is another primitive one. Again, it is painfully obvious why it's lossy. The utility is just as obvious: if some of the bytes are corrupted, the final value will probably change.
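Both primitive hashes can be sketched in a few lines of Python. The function names, the modulus 257, and the `+ 1` applied to each byte are assumptions made for this illustration (the `+ 1` keeps a zero byte from wiping out the product); they are not taken from any real hash function.

```python
def bit_count_hash(data: bytes) -> int:
    """Count the 1 bits in the input; their positions are discarded."""
    return sum(bin(byte).count("1") for byte in data)

def product_mod_hash(data: bytes, modulus: int = 257) -> int:
    """Multiply (byte + 1) for every byte, reducing modulo `modulus`."""
    result = 1
    for byte in data:
        result = (result * (byte + 1)) % modulus
    return result

original = b"snake oil"
reordered = b"oil snake"   # same bytes in a different order
corrupted = b"snake oiL"   # one byte changed

# Lossy: two different inputs give identical hashes, because the order
# of the bytes was thrown away.
print(bit_count_hash(original) == bit_count_hash(reordered))
print(product_mod_hash(original) == product_mod_hash(reordered))

# Useful anyway: this particular corruption changes both hash values.
print(bit_count_hash(original) == bit_count_hash(corrupted))
print(product_mod_hash(original) == product_mod_hash(corrupted))
```

The first two comparisons show the collision, and the last two show the corruption being detected.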

Modern hashes, especially secure ones like SHA, are a bit more complex but they still rely on things like sums, products and mods which "eat" information.


A non-digital example would be if I "hashed" a book by giving you the last word of each page. This would very effectively identify the book, but it is obvious that you cannot reconstruct the whole book from this hash, and why. There's no special "scrambling" going on; you're just throwing away data.


+6
−0

What specific algorithm makes it possible to scramble data into an unrecoverable form, yet still be usable for its intended purpose?

It isn't any one specific algorithm. There are many different algorithms both for hashing and for encryption.

Is it something like a checksum, in which a function can be applied to a hash to validate something about it, while making it impossible to forge an unauthorized piece of hashed data?

Actually, the hash itself is something like a checksum. Typical usage is to run the hash function on the original data (e.g., a password) and save the result. Then, at the next login, the user enters the password, the same function is run, and the results are compared. This is better than storing the password because if someone broke into the database they would not get the password, just the hash result. In this particular case, there is no need to retrieve the original value. In fact, the ability to retrieve the original value would make the database more vulnerable.
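The login flow described above might look roughly like this in Python. This is a sketch under stated assumptions: the function names are hypothetical, and the per-user salt and the PBKDF2 key-derivation function are additions of mine that go beyond the plain "hash and compare" idea; the iteration count is illustrative only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a fixed-size hash from the password and a salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )

def register(password: str) -> tuple:
    """Return (salt, hash): what the database stores, never the password."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def check_login(attempt: str, salt: bytes, stored_hash: bytes) -> bool:
    """Re-run the same function on the attempt and compare the results."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)

salt, stored = register("correct horse")
print(check_login("correct horse", salt, stored))  # same password
print(check_login("wrong guess", salt, stored))    # different password
```

Nothing in `register` or `check_login` ever needs to turn `stored` back into the password, which is the point the answer makes.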

One additional advantage of hashing is size. If you encrypt a megabyte of data you will need (more or less) a megabyte to store the result. If you hash a megabyte of data you need a much smaller, fixed size, perhaps 256 or 512 bits (32 or 64 bytes). Hashing a large amount of data does no good if you want to retrieve it, but it can be used to verify things, much like a checksum. If you have a large amount of data and a good hash function, you can transmit the data and the hash value separately and then perform the hash again at the destination to verify that the data is valid. A common example of this is MD5. MD5 is often referred to as a checksum, but is really a hash function.
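A minimal sketch of that verify-at-the-destination idea, using Python's standard `hashlib`. It uses SHA-256 rather than MD5, and the `checksum` helper and the simulated "transfer" are assumptions made for the example.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Sender: one mebibyte of data and its digest, sent separately.
payload = bytes(range(256)) * 4096
sent_digest = checksum(payload)
print(len(payload), len(bytes.fromhex(sent_digest)))  # 1048576 32

# Receiver, case 1: the data arrived intact.
received_ok = payload
print(checksum(received_ok) == sent_digest)

# Receiver, case 2: a single bit was flipped in transit.
received_bad = bytearray(payload)
received_bad[12345] ^= 0x01
print(checksum(bytes(received_bad)) == sent_digest)
```

The 32-byte digest is enough to notice the flipped bit, even though it cannot be used to repair or reconstruct the payload.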


+4
−1

I would say the difference is one of intent.

In encryption, the objective is to hide the information contained in such a way that third parties cannot get it, but approved second parties can retrieve it (with the key). In pursuit of this, security is the primary goal, with performance as a secondary goal.

The intent of encryption is to hide the data, but eventually be able to recover it.

In hashing, the objective is to map incoming data to a pre-determined range of values, with unpredictability and performance as primary goals. No consideration is given to the ability to retrieve the data, nor to preventing its retrieval. Secondary goals include an even distribution of result values, avoiding collisions, etc.

The intent of hashing is to convert the data into a smaller, simpler form that can be used as a proxy for the original data.
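As a rough illustration of mapping data to a pre-determined range, the sketch below assigns string keys to one of eight buckets, the way a hash table might. The bucket count, the key names, and the choice of `zlib.crc32` as the hash are all assumptions for the example; CRC32 is not a secure hash.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 8  # the "pre-determined range" in this sketch

def bucket_for(key: str) -> int:
    """Map any string to a bucket index in range(NUM_BUCKETS)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS

keys = [f"user-{n}" for n in range(1000)]
counts = Counter(bucket_for(k) for k in keys)

# Every key lands in some bucket; many keys share each bucket, so the
# bucket index stands in for the key without being able to recover it.
for bucket in range(NUM_BUCKETS):
    print(bucket, counts[bucket])
```

How evenly the keys spread across the buckets is the "distribution of result values" goal mentioned above.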
