TTBOMK, LZ77 (in its "pure" form) is not concerned with how symbols are encoded in binary. Instead, it just deals with "symbols", regardless of how they are represented. It uses a "sliding window" approach, which means that every input symbol that has been processed is also appended to the "window" buffer. Once the "window" has reached its maximum size, the oldest symbols are dropped, so that the size limit is never exceeded.
Every sequence of input symbols is stored as a triple of:
1. The number of symbols to copy (i.e. re-use) from the current "window" buffer
2. The offset within the "window" buffer where to start copying
3. The next symbol (literal) after the symbols that were copied from the buffer
Note: The compressor chooses the values for (1) and (2) so that the longest possible sequence is copied from the "window" buffer before the literal symbol is stored. This maximizes the compression. Still, the sequence to be copied may have length zero, e.g. if the next input symbol is not found in the buffer at all.
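
To make the triples concrete, here is a minimal Python sketch of the scheme described above. The names `lz77_encode`, `lz77_decode` and the `window_size` parameter are made up for illustration. It assumes that the offset is counted from the start of the current window and that a copy never runs past the end of the window; real implementations differ in such details, and they use far faster match searching than this brute-force loop.

```python
from typing import List, Tuple

# (number of symbols to copy, offset within the window, next literal)
Triple = Tuple[int, int, str]

def lz77_encode(data: str, window_size: int = 16) -> List[Triple]:
    """Greedy LZ77 sketch: emit the longest window match plus one literal."""
    out: List[Triple] = []
    pos = 0
    while pos < len(data):
        window_start = max(0, pos - window_size)
        best_len, best_off = 0, 0
        # Always keep one symbol in reserve for the literal of the triple.
        max_len = len(data) - pos - 1
        for start in range(window_start, pos):
            length = 0
            # Assumption: the copied run must lie entirely inside the window.
            while (length < max_len and start + length < pos
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, start - window_start
        out.append((best_len, best_off, data[pos + best_len]))
        pos += best_len + 1
    return out

def lz77_decode(triples: List[Triple], window_size: int = 16) -> str:
    """Rebuild the input by replaying the copies and literals."""
    out: List[str] = []
    for length, offset, literal in triples:
        window_start = max(0, len(out) - window_size)
        start = window_start + offset
        out.extend(out[start:start + length])
        out.append(literal)
    return "".join(out)

print(lz77_encode("abcabcabcd"))
# [(0, 0, 'a'), (0, 0, 'b'), (0, 0, 'c'), (3, 0, 'a'), (2, 1, 'd')]
```

Note how the first three triples have length zero, because the window does not contain those symbols yet. The output is still just a list of abstract triples; nothing here says how they would be written as bits.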
In practice, we usually use a combination of LZ77 (or one of its variants) and entropy coding. It's the entropy coding that is concerned with how the symbols (literals) are encoded in binary. And not just the symbols, but also the offsets and lengths! Common methods of entropy coding include Huffman trees and arithmetic coding.
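
As a rough illustration of the entropy-coding step, the following Python sketch builds a Huffman code for an arbitrary sequence of symbols. The function name `huffman_code` is made up for this example. It only shows the basic idea (frequent symbols get shorter bit strings) and is not how any particular compressor lays out its code tables.

```python
import heapq
from collections import Counter
from typing import Dict, Hashable, Iterable

def huffman_code(symbols: Iterable[Hashable]) -> Dict[Hashable, str]:
    """Return a prefix-free bit string for every distinct symbol."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs at least one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

print(huffman_code("aaaabbc"))
# {'c': '00', 'b': '01', 'a': '1'}
```

The same function could be fed the literals, the offsets, or the lengths produced by the LZ77 stage, since it does not care what the symbols are.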
For example, Deflate, the compression algorithm used in ZIP and in many other applications (probably the most widely used compression algorithm), is a combination of LZSS (a variant of LZ77) and Huffman trees.
https://en.wikipedia.org/wiki/Deflate
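
If you just want to see Deflate in action, Python's standard `zlib` module exposes it. The sketch below produces a raw Deflate stream (a negative `wbits` value omits the zlib header and checksum); the helper name `deflate_roundtrip` is made up for this example.

```python
import zlib

def deflate_roundtrip(data: bytes):
    """Compress to a raw Deflate stream and decompress it again."""
    # wbits=-15: raw Deflate, 32 KiB window, no zlib/gzip wrapper.
    compressor = zlib.compressobj(level=9, wbits=-15)
    raw = compressor.compress(data) + compressor.flush()
    restored = zlib.decompress(raw, wbits=-15)
    return raw, restored

data = b"abcabcabc" * 100
raw, restored = deflate_roundtrip(data)
print(len(data), len(raw), restored == data)
```

Highly repetitive input like this shrinks a lot, because the LZ stage turns the repeats into copies and the Huffman stage then packs the resulting symbols into few bits.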