
  • I’m no expert in this subject either, but the theoretical limit could be well beyond 200x, depending on the data.

    For example, a basic compression approach is to use a lookup table that maps large values to smaller lookup ids. So, say the possible data contains only 2 values: one is 10,000 letter 'a's, the other is 10,000 letter 'b's. We can map the first to the number 1 and the second to the number 2. With this lookup in place, a compressed value of “12211” would uncompress to 50,000 characters. That's 5 characters standing in for 50,000, a 10,000x compression ratio. Extrapolate that example out (make the blocks longer) and there is no theoretical maximum to the compression ratio.

    But that only works when the data set is known in advance and small. As the complexity grows, the table gets bigger and the ids get longer, so it does seem logical that a maximum limit would be introduced.

    So, it might be possible to achieve 200x compression, but only if the complexity of the data set is below some threshold I’m not smart enough to calculate.
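
The lookup-table example above can be sketched in a few lines of Python. This is only a toy under the same assumptions as the comment: the data is built from exactly two known 10,000-character blocks, and the names `LOOKUP`, `compress` and `decompress` are made up for illustration.

```python
# Hypothetical two-entry lookup table: id -> the large value it stands for.
BLOCK_SIZE = 10_000
LOOKUP = {"1": "a" * BLOCK_SIZE, "2": "b" * BLOCK_SIZE}


def decompress(ids: str) -> str:
    """Expand each one-character id into its 10,000-character block."""
    return "".join(LOOKUP[i] for i in ids)


def compress(data: str) -> str:
    """Replace each known block with its id.

    Only works for data made entirely of blocks in LOOKUP;
    anything else raises KeyError.
    """
    reverse = {block: key for key, block in LOOKUP.items()}
    return "".join(
        reverse[data[i:i + BLOCK_SIZE]]
        for i in range(0, len(data), BLOCK_SIZE)
    )


compressed = "12211"
original = decompress(compressed)
ratio = len(original) / len(compressed)
print(len(original), ratio)
```

Note that the ratio here doesn't count the lookup table itself, which both sides have to already share. That's the part that stops being "free" once the data gets more varied.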