In some cases, bzip2 compresses less efficiently in absolute terms than the 7z and RAR formats. As Moore's law continues to make computation time less of a concern, compression methods of this kind are becoming more popular. According to its author, bzip2 comes within 10% to 15% of the best class of known compression algorithms (PPM), while being about twice as fast at compression and six times as fast at decompression.
bzip2 uses the Burrows-Wheeler transform to convert frequently recurring character sequences into runs of identical letters, then applies a move-to-front transform, and finally compresses the result with Huffman coding. In bzip2, the blocks are of uniform size in plaintext; the size is selected with a command-line option (100 kB to 900 kB). In the compressed stream, each block is marked by a fixed 48-bit sequence taken from the decimal representation of π (0x314159265359).
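The first two stages described above can be illustrated with a minimal Python sketch. It is not bzip2's actual implementation: the Burrows-Wheeler transform here naively sorts all rotations, the Huffman stage and other details of the real format are omitted, and the function names (`bwt`, `inverse_bwt`, `mtf_encode`, `mtf_decode`, `stream_header_and_block_marker`) are chosen only for illustration. The last helper uses Python's standard `bz2` module to look at the start of a compressed stream, assuming a non-empty input so that at least one block is present.

```python
from __future__ import annotations

import bz2


def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform (illustration only, roughly O(n^2 log n)).

    Sorts all rotations of the input and returns the last column together
    with the row index of the original string, which is needed to invert it.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    # The rotation starting at i ends with the byte just before position i.
    last = bytes(data[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last: bytes, index: int) -> bytes:
    """Invert bwt() by following the first-column to last-column mapping."""
    n = len(last)
    # A stable sort of positions by byte value maps each row of the first
    # column to the matching position in the last column.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = index
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Move-to-front: emit each byte's position in a recency list."""
    alphabet = list(range(256))
    out = []
    for b in data:
        pos = alphabet.index(b)
        out.append(pos)
        alphabet.insert(0, alphabet.pop(pos))  # move the symbol to the front
    return out


def mtf_decode(codes: list[int]) -> bytes:
    """Invert mtf_encode() by replaying the same list updates."""
    alphabet = list(range(256))
    out = bytearray()
    for pos in codes:
        b = alphabet.pop(pos)
        out.append(b)
        alphabet.insert(0, b)
    return bytes(out)


def stream_header_and_block_marker(payload: bytes, level: int = 9) -> tuple[str, str]:
    """Compress with the standard bz2 module and return the 4-byte stream
    header as text and the following 6 bytes (the first block marker) as hex.
    Assumes a non-empty payload, so the stream contains at least one block."""
    stream = bz2.compress(payload, compresslevel=level)
    return stream[:4].decode("ascii"), stream[4:10].hex()


if __name__ == "__main__":
    last, idx = bwt(b"banana")
    print(last, idx)                 # identical letters end up next to each other
    print(mtf_encode(last))          # runs of identical letters become zeros
    print(stream_header_and_block_marker(b"banana"))
```

In this sketch the Burrows-Wheeler output groups identical letters together, so the move-to-front stage emits many small values (zeros for repeated letters), which is what makes the subsequent Huffman coding effective. The `level` argument corresponds to the block-size setting selected on the command line.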
bzip2's predecessor, bzip, originally compressed the data blocks with arithmetic coding. Because of software patent restrictions, arithmetic coding was dropped.