I've had this showerthought where I wonder if it's possible to losslessly compress large amounts of data into an extremely small space. Like fitting a JPEG image into a few sentences. It sounds impossible...

I am afraid this kind of thinking is not at all new, and also not feasible. Say I toss a fair coin 100 times, and I want to store the resulting head/tail values in order (as bits).

There simply is no way around the fact that I would need a minimum of 100 bits to account for the entropy in the data. If instead there were a common pattern (i.e., not a fair coin), compression would in theory be possible. For example, if we were guaranteed that the coin would come up heads exactly 3 times, we could just store the indices of those 3 heads.
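To put a number on that special case, here is a quick sketch; the figure it prints is just the information-theoretic floor for this restricted "exactly 3 heads" source, not a concrete encoding:

```python
from math import comb, log2

# 100 flips with exactly 3 heads: instead of 100 raw bits,
# we only need to say WHICH 3 of the 100 positions are heads.
n_outcomes = comb(100, 3)        # number of distinct such sequences
bits_needed = log2(n_outcomes)   # entropy of this restricted source

print(n_outcomes)                # 161700
print(round(bits_needed, 1))     # 17.3 -- instead of 100 bits
```

A naive version (three 7-bit indices, 21 bits total) already beats 100 bits by a wide margin; the 17.3 bits is the best any scheme could do.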

Now you may say: I don't know all algorithms, I cannot be sure that AES, for example, does not remove entropy; I certainly haven't done the math on all of them, and I gave no proof.

But think of it this way: take the whole thing to the extreme. Suppose I use only 4 applications: macOS, OpenOffice, Brave browser and GIMP. Should I then be able to compress these so much that I only need 2 bits to save their entire code? macOS=00, OpenOffice=01, Brave=10, GIMP=11. And when there is a new version of macOS that I want, what then? Should I be able to compress the 4 apps AGAIN with the same codes as before, so that now suddenly two versions of macOS can be rebuilt from just the bit pattern 00? Does that make logical sense?
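The counting behind this objection can be made concrete. A sketch of the pigeonhole argument, assuming nothing beyond basic counting:

```python
# Pigeonhole count: how many distinct bit strings are strictly
# shorter than n bits? 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1.
def shorter_strings(n):
    return sum(2**k for k in range(n))

n = 20
print(2**n, shorter_strings(n))   # 1048576 vs 1048575

# There are 2^n inputs of length n but only 2^n - 1 shorter outputs,
# so any compressor that shrinks EVERY n-bit input must map two
# different inputs to the same output -- decompression is ambiguous.
```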

I could give a more rigorous logical proof if you wanted, but I will not do that unless you really have a use for it. Btw, someone else may well have proved this already.

I understand the basic criticism here. Entropy is something that you can't just make smaller, according to theory. Yeah, you could never compress stuff down to, like, a single letter.

Another thing. Isn't it weird how, if you flip a coin 100 times and it for example does 50 heads in a row and then 50 tails in a row, we could totally represent this in way less than 100 bits?

If it was 100 heads, and it always was, sure, you could save that in zero bits. If it was often 100 heads, it could be 1 bit for this special case, at a cost of one bit for all other outcomes.
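The "50 heads then 50 tails" case is exactly what run-length encoding exploits. A minimal sketch (the `rle_encode`/`rle_decode` helpers are hypothetical names, not a standard library API):

```python
from itertools import groupby

def rle_encode(s):
    # collapse runs like "HHH...TTT..." into (symbol, run length) pairs
    return [(ch, len(list(g))) for ch, g in groupby(s)]

def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

seq = "H" * 50 + "T" * 50
enc = rle_encode(seq)
print(enc)                      # [('H', 50), ('T', 50)] -- tiny
assert rle_decode(enc) == seq   # and fully lossless

# A typical RANDOM 100-flip sequence has around 50 runs, so on that
# input the same scheme costs more than the raw 100 symbols.
```

So the scheme wins big on this one special sequence precisely by paying a little extra on all the ordinary ones, which is the responder's point.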

It is definitely objective in my example. Think of it this way: instead of 100 coin flips, do 2, so in binary: 00, 01, 10, 11. Now try to map that information into 1 bit: 0 or 1.

You can map 0 to 00 and 1 to 01, but then what do you map to 10 and 11? If you knew that 10 and 11 never occur, sure, you would have compressed the data, but you do not know that.
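You can even check this exhaustively: there are only 2^4 = 16 possible ways to assign a single bit to each of the four 2-bit inputs, and none of them can be decoded uniquely. A sketch:

```python
from itertools import product

inputs = ["00", "01", "10", "11"]

# try every possible mapping from the 4 inputs to a single bit
lossless = 0
for codes in product("01", repeat=4):
    mapping = dict(zip(inputs, codes))
    if len(set(mapping.values())) == len(inputs):  # invertible?
        lossless += 1

print(lossless)  # 0: none of the 16 mappings is uniquely decodable
```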

This is proof that you cannot reduce entropy, only change the data format (and often add some overhead). Even a compression algorithm will INCREASE the size of a totally random dataset.

About compression: for a given special dataset it may indeed decrease the size, but if you use it over and over on different random datasets, you will on average spend more space than storing them uncompressed.

That is, it is a "proof" (not very rigorous or mathematically precise) that you cannot reduce entropy by a whole (information-)bit. It could easily be generalized to be more fine-grained.
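This is easy to check empirically: feed a general-purpose compressor random bytes and the output comes out slightly larger than the input, because of format overhead, exactly as argued above. A sketch using Python's zlib:

```python
import os
import zlib

data = os.urandom(100_000)           # incompressible by construction
packed = zlib.compress(data, level=9)

print(len(data), len(packed))
# On random input, deflate falls back to near-raw storage plus
# headers and checksums, so the "compressed" output is a bit LARGER.
assert len(packed) > len(data)
```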

| number lives on in irrational and transcendental numbers such as pi, right? (Never mind the infeasibility, we just care about whether or not it's logically possible.) Could we not identify a string of data, represented as numbers, as an index position in pi? For example, '14159265' could be p[1:8]. Is that not potential for insane bit reduction, theoretically? (If the

So you use some Pi magic to compress, and when it doesn't work you use some more magic. I am sorry, it will not help you; "the devil is in the details" here.
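One of those details can be shown with pure counting, without touching pi itself: there are 10^n distinct n-digit strings, so whatever indexing scheme you use, some strings must sit at positions whose index takes about n digits to write down. The address is as long as the data. A sketch:

```python
# There are 10**n distinct n-digit strings. Any scheme that assigns
# each one its own position must hand out indices up to at least
# 10**n - 1, and writing such an index takes n digits.
n = 8                    # e.g. '14159265'
distinct = 10**n
worst_index = distinct - 1
digits_for_index = len(str(worst_index))

print(digits_for_index)  # 8 -- the pointer into pi costs as much
                         # as the data it points at
```

(In fact it is typically worse: the first occurrence of a random n-digit string in pi tends to sit around position 10^n, not below it.)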

| taking the square root of a Very Large number (1000 characters) can massively decrease the number of characters, then isn't that by itself a compression? All you gotta do is not add

My advice for sqrt: DO THE MATH. It is really not hard, and it will help you understand the issues involved. DO NOT try to add AES or something else into the mix; do ONE thing first.
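Doing the math here is short. The integer square root does halve the digit count, but the information it discards does not vanish; it lands in the remainder, and root plus remainder together are as long as the original. A sketch with a number deliberately constructed so the sizes are easy to verify (the specific value is a made-up example, not anything special):

```python
from math import isqrt

# x = (10**499)**2 + (10**498 + 7), so isqrt(x) is exactly 10**499.
x = 10**998 + 10**498 + 7        # 999 digits

r = isqrt(x)                     # the "square-rooted" value: 500 digits
rem = x - r * r                  # what taking the root threw away: 499 digits

assert r * r != x                # the root alone cannot rebuild x
assert r * r + rem == x          # root + remainder is lossless...
print(len(str(x)), len(str(r)), len(str(rem)))   # 999 500 499
# ...but 500 + 499 = 999 digits: nothing was actually saved.
```

Drop the remainder and you have a hash, not a compression; keep it and you are back where you started.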

| realistic, no, but what if the shrinking process involved with square-rooting results in more compression than not? If the compression even had a mere 51% chance of being superior,