Solid archive definition
Solid compression is an option meant to improve the compression ratio by
giving the compression algorithm a wider context when compressing multiple
files, which are treated as a single block rather than as separate entities.
Advantages of solid compression
The ideas behind solid compression are simple and effective:
- when multiple files are processed as a single solid block (especially
similar files, e.g. same type / extension, or even revisions of the same
file), the compression algorithm can find redundant data across the files
of the group, so the compressed representation of the data is usually more
efficient than when each file is compressed separately
- solid compression is especially beneficial for archives containing
multiple copies of the same or very similar files, which can be archived
with minimal overhead
- when many small files are processed as a single solid block, overhead
content (file begin / end markers, checksum, table of contents) is written
only once rather than once per file, saving extra bytes for each input
object. This is a double advantage when compressing a large number of small
files, each of which would normally provide little context to optimize
compression and would add its own overhead content to the output archive.
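The effect described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 50 "files" are made-up in-memory byte strings, and the standard library zlib module stands in for the compressor of a real archiver; it is not how 7Z, RAR or PeaZip are implemented.

```python
import zlib

# Hypothetical input: 50 small, similar "files" (revisions of one short text).
files = [
    f"config revision {i}\nname=example\nmode=solid\nlevel=9\n".encode()
    for i in range(50)
]

# Non-solid: each file is compressed as its own independent stream.
separate_size = sum(len(zlib.compress(data, 9)) for data in files)

# Solid: all files are joined and compressed as one single block.
solid_size = len(zlib.compress(b"".join(files), 9))

print("separate:", separate_size, "bytes")
print("solid:   ", solid_size, "bytes")
```

With input like this the solid total should come out clearly smaller, because the text shared by all revisions is encoded once and the per-stream overhead is paid once.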
Archive formats supporting solid compression
Solid mode is natively supported by some archive format standards, such as
7Z and RAR, and is available as an option in PeaZip file compression
utility: it can be set in the archiving screen (add to archive), "advanced
options" tab.
A form of solid compression is used in compressed TAR files (TAR.GZ,
TAR.BR, TAR.BZ2, TAR.ZST, TGZ, TXZ...): the data is first added to a single
container, the uncompressed tar archive, then the tar file is compressed as
a single block, e.g. with the GZip, Brotli, BZip2, LZMA, or Zstandard
algorithm.
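As a minimal sketch of the compressed TAR approach, the Python standard library tarfile module can write the tar container through a single compression stream. The file names and contents below are hypothetical and kept in memory only to make the example self-contained.

```python
import io
import tarfile

# Hypothetical file names and contents, used only for illustration.
files = {
    "notes_v1.txt": b"solid compression example\n" * 40,
    "notes_v2.txt": b"solid compression example\n" * 40 + b"one more line\n",
}

buffer = io.BytesIO()
# "w:xz" writes the whole tar container through one XZ (LZMA2) stream;
# "w:gz" and "w:bz2" do the same with GZip and BZip2.
with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
    for name, data in files.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))

# Read the archive back: the members come out of the single compressed stream.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:xz") as archive:
    names = archive.getnames()
    restored = archive.extractfile("notes_v2.txt").read()

print(names, len(buffer.getvalue()), "bytes compressed")
```

Because the compressor sees the tar file as one stream, the second revision can reuse the data already seen in the first one.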
Disadvantages of solid compression
The main drawbacks of solid data compression are:
- speed penalties on some operations: the context information is needed
during both compression and extraction. Partial, selective extraction (a
single file or group of files rather than the whole archive), adding or
deleting files, and updating existing files therefore take more time on a
solid archive, because all the relevant context data (usually called the
"solid block") must be parsed. This makes solid archives slower and less
simple to manage than non-solid archives for common archiving operations
- for the very same reason, damage to data in any part of the archive may
make all the content after that point unusable, for lack of the context
information needed for extraction, while data corruption in a non-solid
archive usually harms only the data of a single file. Solid archives are
therefore usually less resilient to storage media / transmission failures,
and even a small loss / corruption of data can damage them severely.
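The cost of selective extraction can be illustrated with a simplified model. The sketch below assumes a toy "archive" held in memory: a solid layout (one zlib stream plus an offset table) and a non-solid layout (one zlib stream per file). The names index, extract_solid and extract_separate are hypothetical helpers, not part of any archive format or tool.

```python
import zlib

# Hypothetical archive members, for illustration only.
files = [f"file {i}: ".encode() + b"payload " * 200 for i in range(20)]

# Solid layout: one compressed stream, plus each member's (offset, size).
index, position = [], 0
for data in files:
    index.append((position, len(data)))
    position += len(data)
solid_stream = zlib.compress(b"".join(files))

# Non-solid layout: one independent compressed stream per member.
separate_streams = [zlib.compress(data) for data in files]

def extract_solid(member):
    """Return (data, bytes decompressed) for one member of the solid stream."""
    start, size = index[member]
    # All data stored before the wanted member must be decompressed as well.
    unpacked = zlib.decompressobj().decompress(solid_stream, start + size)
    return unpacked[start:], len(unpacked)

def extract_separate(member):
    """Return (data, bytes decompressed) for one independently stored member."""
    data = zlib.decompress(separate_streams[member])
    return data, len(data)

last = len(files) - 1
print("solid:    ", extract_solid(last)[1], "bytes decompressed")
print("non-solid:", extract_separate(last)[1], "bytes decompressed")
```

In this model, extracting the last member from the solid stream means decompressing every member stored before it, while the non-solid layout touches only the requested member.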
Role of solid block size in improving the compression ratio
To mitigate those disadvantages, the 7Z format allows setting the block
size used for solid mode operation (the "window" of data context that is
parsed by the compression / extraction algorithm), reducing the drawbacks
both in terms of overhead during extraction and of the possible impact of
data corruption.
However, for the very same reason, reducing the solid block size reduces
the potential benefits and compression ratio improvements, because it gives
the compression algorithm a smaller context window to work with.
So the tradeoff between compression ratio on one side, and resilience to
data corruption and ease of update on the other, should be set carefully.
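A rough way to see this tradeoff is to compress the same set of files with different numbers of files per block. The sketch below uses made-up in-memory files, the standard library zlib module, and a hypothetical archive_size helper; actual figures depend on the data and on the compressor used.

```python
import zlib

# Hypothetical input: 64 small, similar files.
files = [f"log entry {i}: status=ok user=guest\n".encode() * 5 for i in range(64)]

def archive_size(files, files_per_block):
    """Total compressed size when each block holds at most files_per_block files."""
    total = 0
    for start in range(0, len(files), files_per_block):
        block = b"".join(files[start:start + files_per_block])
        total += len(zlib.compress(block, 9))
    return total

# 1 file per block is non-solid, 64 files per block is fully solid.
sizes = {n: archive_size(files, n) for n in (1, 8, 64)}
for n, size in sizes.items():
    print(f"{n:>2} files per block: {size} bytes")
```

Smaller blocks limit how much data must be parsed to reach one file, or is lost after a corruption, at the price of a larger total size.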
Solid block "size" can be defined by size in bytes, or by number of files
in a block. PeaZip file compression utility features this option in the
archiving screen, advanced options tab.
An alternative optimization strategy is to use blocks separated by file
extension, to provide more homogeneous data to each block of the archive;
this setting is also featured in PeaZip's advanced compression options tab.
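The three grouping strategies (by size in bytes, by number of files, by file extension) can be sketched as a small planning function. This is an illustrative assumption about how such grouping could work, not the actual logic of 7Z or PeaZip; solid_blocks and its parameters are hypothetical names.

```python
from collections import defaultdict
from pathlib import PurePath

def solid_blocks(files, max_bytes=None, max_files=None, by_extension=False):
    """Split (name, size) pairs into solid blocks, returned as lists of names."""
    # Optionally keep files with the same extension together.
    groups = defaultdict(list)
    for name, size in files:
        key = PurePath(name).suffix.lower() if by_extension else ""
        groups[key].append((name, size))

    blocks = []
    for members in groups.values():
        block, block_bytes = [], 0
        for name, size in members:
            too_big = max_bytes is not None and block_bytes + size > max_bytes
            too_many = max_files is not None and len(block) >= max_files
            # Close the current block when adding this file would exceed a limit.
            if block and (too_big or too_many):
                blocks.append(block)
                block, block_bytes = [], 0
            block.append(name)
            block_bytes += size
        if block:
            blocks.append(block)
    return blocks

# Hypothetical file list: (name, size in bytes).
files = [("a.txt", 400), ("b.txt", 500), ("c.png", 900), ("d.txt", 300)]
by_size = solid_blocks(files, max_bytes=1000)
by_count = solid_blocks(files, max_files=3)
by_type = solid_blocks(files, by_extension=True)
print(by_size, by_count, by_type, sep="\n")
```

Each returned list of names would then be compressed as one solid block, so grouping by extension feeds the compressor more homogeneous data per block.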
Read more: solid compression definition on Wikipedia.
Learn how to create 7z archives with solid compression, and how to create
compressed TAR archives using Brotli file compressor (very fast), BZ2 file
compressor (average / good compression), GZ file compressor (fast), or
Zstandard file compressor (very fast).
Synopsis: definition of solid compression; advantages and disadvantages of
solid data compression; how to use solid compression mode to improve the
compression ratio in archives containing similar files; how to define the
solid block size to improve compression performance with minimal
decompression overhead and fewer speed drawbacks.
Topics: what is solid compression, solid block size, advantages and
disadvantages of using solid compression