

Post History

Q&A Why often times data compression causes data loss?

I understand data compression as making data structures nearer (if they are mere machine code without any abstract representation) or representing them in less and less abstract computer languages...

posted 3y ago by meriton‭

Answer
#1: Initial revision by meriton · 2021-04-19T19:00:53Z (over 3 years ago)
> I understand data compression as making data structures nearer (if they are mere machine code without any abstract representation) or representing them in less and less abstract computer languages (from the "complicated" to the "simple").

Nonsense.

Let's turn to Wikipedia for a better definition:

> In signal processing, data compression, source coding,[1] or bit-rate reduction is the process of encoding information using fewer bits than the original representation.

That is, the same information can be represented in different ways, each requiring a different number of bits. By picking a representation with fewer bits, we can transmit the same information more cheaply.

For instance, consider the information contained in this answer. I could have sent it to you by holding my phone to the screen, taking a picture, and emailing you that picture. Or I could have copied and pasted the text into the email. Either way, you receive the same information, but one email will be far smaller than the other, and will be transmitted far more quickly.
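To make the "same information, different number of bits" point concrete, here is a minimal Python sketch. The list of values is made up purely for illustration; it stores the same numbers once as readable text and once as raw bytes, and both forms decode back to the identical list.

```python
# Illustrative data only: six numbers, each in the range 0..255.
values = [200, 13, 255, 0, 97, 128]

# Representation 1: comma-separated decimal text.
as_text = ",".join(str(v) for v in values).encode("ascii")

# Representation 2: one raw byte per value.
as_bytes = bytes(values)

# Both representations decode to exactly the same information.
from_text = [int(s) for s in as_text.decode("ascii").split(",")]
from_bytes = list(as_bytes)

print(len(as_text), len(as_bytes))  # 19 6
```

Nothing was lost in the shorter form; it is simply a more economical encoding of the same numbers.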

> Why often times data compression causes data loss?

Suppose I wanted to tell someone what your avatar looks like. I could do this by telling them, for each pixel, the exact RGB color of that pixel. That would take a long time, but would preserve every detail.

Or I could say "It's a pink unicorn with rainbow hair in front of a reddish sky with two white clouds and greenish-gray ground". That's far shorter, and enough information to recognize your avatar, but not enough information to recreate it precisely. Or I could simply say "It's a pink unicorn". That's even shorter, and still enough information to distinguish your avatar from mine.

Put differently, the easiest way to compress data is to discard information that doesn't matter. And that's why the most efficient compression (particularly for audio and video) loses information.
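As a rough illustration of discarding information that doesn't matter, here is a small Python sketch that keeps only the top 4 bits of each 8-bit color channel. The functions `quantize` and `restore` and the sample pixel are hypothetical names and values chosen for this example; real image and audio codecs are far more sophisticated.

```python
def quantize(pixel, bits=4):
    """Keep only the top `bits` bits of each 8-bit channel (lossy)."""
    shift = 8 - bits
    return tuple(channel >> shift for channel in pixel)


def restore(quantized, bits=4):
    """Best-effort reconstruction; the dropped low bits cannot be recovered."""
    shift = 8 - bits
    return tuple(channel << shift for channel in quantized)


original = (255, 105, 180)      # a pink-ish RGB pixel, for illustration
compressed = quantize(original)  # half as many bits per channel
approx = restore(compressed)

print(compressed)  # (15, 6, 11)
print(approx)      # (240, 96, 176)
```

The restored pixel is still recognizably pink, but it is no longer the exact color we started with: the low-order bits were thrown away and no decoder can bring them back.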

But that is not the only way to compress data. I could have said "It's unicorn 36363 at unicornify.pictures". Then anyone with access to the internet, or who knows the algorithm unicornify uses to turn 36363 into a picture of a unicorn, would be able to recreate your avatar perfectly. Sometimes, data is best described by the process that created it :-)
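The idea of describing data by the process that created it can be sketched with a seeded pseudo-random generator. To be clear, this is not unicornify's actual algorithm, which I am not reproducing here; `make_data` is a hypothetical stand-in for any deterministic generator that sender and receiver both know.

```python
import random


def make_data(seed, size):
    """Deterministically generate `size` bytes from `seed` (stand-in generator)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(size))


seed = 36363
sender_copy = make_data(seed, 1000)    # the "picture" on the sender's side

# Only the seed (a few bytes) needs to be transmitted.
receiver_copy = make_data(seed, 1000)  # recreated from the seed alone

print(sender_copy == receiver_copy)  # True
```

A thousand bytes are recreated exactly from one small number, but only because both sides share the same generator.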

Even barring such special cases, a general-purpose compression algorithm may be able to exploit redundancy in the original message to shorten it. For instance, rather than saying

> Her Triumphant Radiance, the Wisdom of the Storm, Duchess of the Seven Seas visited the lands of Duke Henry. After a lengthy stay, Her Triumphant Radiance, the Wisdom of the Storm, Duchess of the Seven Seas traveled to Permbridge Hold, where Her Triumphant Radiance, the Wisdom of the Storm, Duchess of the Seven Seas visited with Lady Alnor.

you could transmit

> Her Triumphant Radiance, the Wisdom of the Storm, Duchess of the Seven Seas (henceforth called HTR), visited the lands of Duke Henry. After a lengthy stay, HTR traveled to Permbridge Hold, where HTR visited with Lady Alnor.

This is far shorter, yet it permits the original message to be recreated perfectly. Of course, such general-purpose compression only works if the initial message is redundant in a way the compression algorithm recognizes.
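The abbreviation trick can be sketched in a few lines of Python. The helpers `shorten` and `expand` are hypothetical names for this example; the last part uses the standard library's `zlib` module to show a real general-purpose compressor finding the same redundancy on its own.

```python
import zlib

TITLE = "Her Triumphant Radiance, the Wisdom of the Storm, Duchess of the Seven Seas"
message = (
    f"{TITLE} visited the lands of Duke Henry. "
    f"After a lengthy stay, {TITLE} traveled to Permbridge Hold, "
    f"where {TITLE} visited with Lady Alnor."
)


def shorten(text, phrase, token):
    """Replace every occurrence of `phrase` with the shorter `token`."""
    if token in text:
        # Otherwise expanding would also rewrite pre-existing tokens.
        raise ValueError("token already occurs in the text")
    return text.replace(phrase, token)


def expand(text, phrase, token):
    """Undo `shorten`: put the full phrase back wherever the token appears."""
    return text.replace(token, phrase)


short = shorten(message, TITLE, "HTR")
restored = expand(short, TITLE, "HTR")

# A general-purpose lossless compressor exploits the same repetition.
packed = zlib.compress(message.encode("utf-8"))
unpacked = zlib.decompress(packed).decode("utf-8")

print(restored == message, unpacked == message)  # True True
print(len(message), len(short), len(packed))
```

Note that the receiver must also be told what "HTR" stands for, so the honest cost of the short form is the shortened text plus the one-time definition. That is still well below the length of the original here.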