I've long been fascinated by compression algorithms ever since studying the original Lempel-Ziv algorithm some years ago. The key here is the dictionary, which, in a simplified nutshell, stores patterns that have been encountered: you encounter pattern A, compress/encode it, and the next time you see pattern A, since it's already in the dictionary, you can just refer back to it.

All compression algorithms basically work by eliminating repetition and patterns, so being able to recognize them is vital. For dictionary-based algorithms (which comprise most mainstream general-purpose algorithms), a small dictionary forces you to forget what you saw earlier, hurting your ability to recognize those patterns and repetitions: small dictionary = amnesia. With a small dictionary, you'll encounter pattern A, then patterns B, C, D, etc., and by the time you encounter pattern A again, it's been pushed out of the dictionary, so you can't reuse it and you take a hit on your size.

Solid compression matters for the same reason: without it, you're using a separate dictionary for every file. Regular non-solid compression = amnesia: compress file 1, reset the dictionary, compress file 2, reset the dictionary, and so on. With solid compression, you treat all the files as one big stream and never reset your dictionary. The downside is that it's harder to get at a file in the middle or end of the archive, since you must decompress everything before it to reconstruct the dictionary, and any damage to the archive affects every file beyond the point of damage.

(This is going way off-topic; if a mod could split out the recent compression-related posts into a new thread, that'd be great!)
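Both kinds of "amnesia" are easy to demonstrate with zlib's DEFLATE, whose dictionary is a sliding window over recently seen input. Below is a minimal Python sketch; the sample "files" (a shared random header plus a tiny unique payload) are made up purely for illustration and aren't from any real archiver:

```python
import os
import zlib

# Four hypothetical "files" sharing a 2 KiB incompressible header, so the
# only real savings come from remembering the header across files.
shared_header = os.urandom(2048)
files = [shared_header + bytes([i]) * 64 for i in range(4)]

def deflate(data: bytes, wbits: int = 15) -> bytes:
    """Compress with a sliding window of 2**wbits bytes (zlib allows 9..15)."""
    c = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return c.compress(data) + c.flush()

blob = b"".join(files)

# 1. Small dictionary = amnesia: with a 512-byte window, the later copies of
#    the header have scrolled out of reach, so the repeats go unrecognized.
print("32 KiB window:", len(deflate(blob, wbits=15)), "bytes")
print("512 B window: ", len(deflate(blob, wbits=9)), "bytes")

# 2. Non-solid vs. solid: compressing each file independently resets the
#    dictionary per file; one concatenated stream lets files 2-4 encode
#    their headers as back-references to file 1.
non_solid = sum(len(deflate(f)) for f in files)
solid = len(deflate(blob))
print("non-solid:", non_solid, "bytes   solid:", solid, "bytes")
```

Since random bytes don't compress, each non-solid stream pays the full header cost while the solid stream pays it only once, and the 512-byte window lands close to the non-solid total for the same reason.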