Data compression uses a wide variety of tactics to reduce the storage needed for data, including (just off the top of my head):
- Run length encoding - e.g., store aaaaa as ax5
- Tokenizing commonly used items - e.g., a BASIC interpreter might store each keyword as a single byte, since the location/usage is unambiguous (determined by the language structure). The output may actually look different (e.g., UPPER vs. lower case) but functionally the result will be the same as the original.
- Analysis to determine frequently used strings and assign special codes for those strings - this can get quite complex but can result in significant lossless compression across a wide variety of data files.
- Stripping unused bits - e.g., a 7-bit ASCII file can be stored in 7/8 the space of the original, provided that 8th bit is always 0. The catch is that any character with the 8th bit set needs an escape code to represent it, and the escape code itself must then be escaped whenever it appears as a literal character, and so on.
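The first bullet, run-length encoding, is simple enough to sketch in a few lines. This is a minimal illustration, not any particular production format:

```python
def rle_encode(s):
    # Collapse runs of repeated characters: "aaaaa" -> [["a", 5]]
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1][1] += 1
        else:
            out.append([ch, 1])
    return out

def rle_decode(pairs):
    # Expand each (character, count) pair back to the original run
    return "".join(ch * n for ch, n in pairs)

encoded = rle_encode("aaaaabbbc")
assert rle_decode(encoded) == "aaaaabbbc"  # lossless: roundtrip is exact
```

Note the roundtrip assertion: that exactness is what makes it lossless compression.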
The above items are all, if done correctly, lossless compression.
There is another category: lossy compression. This is most often used with images, but could be applied to other data as well, though hopefully not to bank account balances. Lossy compression is typically based on combining things that look alike (e.g., several pixels that are all very close shades of blue - make them all one color so that they can be treated as a compressible block), removing detail that is insignificant to the human eye, or deliberately lowering resolution in a simple space vs. quality tradeoff. Probably the most commonly encountered example is JPEG.
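The "combine things that look alike" idea can be sketched by quantizing pixel values. This is a toy illustration (snapping values to a fixed step), not how JPEG actually works:

```python
def quantize(pixels, step=32):
    # Lossy: snap each value to the nearest multiple of `step`,
    # so near-identical shades collapse into one compressible run
    return [step * round(p / step) for p in pixels]

pixels = [118, 121, 119, 120, 122, 117]  # six very close shades
q = quantize(pixels)
print(q)  # all six collapse to the same value -> one run for RLE
assert q != pixels  # the original detail is gone for good
```

After quantizing, a run-length encoder sees one long run instead of six distinct values - that is where the space savings come from, and why the original can never be recovered.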
The bottom line is that lossless compression has limits. A typical example is that plain text data might compress by 90% - e.g., 100k -> 10k with zip or a similar algorithm. Process that same file again, with the same or a similar algorithm, and the size will still be around 10k. If you could keep compressing indefinitely, every file would eventually compress down to a single bit, and storage manufacturers would be out of business.
Back to some terms in the original question:
> making data structures nearer (if they are mere machine code without any abstract representation)
At the data structure level, it is sometimes possible to compress things. For example, using integers as references between data objects instead of human-readable text can make a big difference in storage space and in speed of access. That speed difference can cut both ways though: integers make for more compact index files, but the human-readable text then has to be looked up in another file, costing an extra storage access.
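A quick sketch of that tradeoff, using JSON just to make the sizes easy to measure (the field names and data here are made up for illustration):

```python
import json

countries = ["United States of America", "United Kingdom", "Germany"]

# Verbose form: every record repeats the full human-readable string
verbose = [{"name": f"user{i}", "country": countries[i % 3]}
           for i in range(1000)]

# Compact form: records hold a small integer index; the strings
# live once in a separate lookup table (the "other file" above)
compact = {"countries": countries,
           "users": [{"name": f"user{i}", "country_id": i % 3}
                     for i in range(1000)]}

print(len(json.dumps(verbose)), len(json.dumps(compact)))
```

The compact form is much smaller, but displaying a record now requires the extra lookup into the `countries` table.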
> representing them in less and less abstract computer languages (from the "complicated" to the "simple").
Not exactly languages. Generally speaking, any language (from C to Python to LISP, etc.) in the end boils down to simple assembly language. The issue here is the storage of the data, which is, largely, language independent.
> either "removing gaps"
If done by run-length-encoding (or similar) of whitespace (in text) or null data (in a data file), then that should be lossless.
> re-organize data better (as in data defragmentation)
Defragmentation can make access faster but does not necessarily provide any data compression.
> or a simpler representation (simpler machine code in computer memory) would necessitate data loss.
A simpler representation is, by definition, a lossy compression. Something must be taken away to make it simpler.