bpo-21417: Compression level for zipfile#5385
Conversation
|
|
||
| Open a ZIP file, where *file* can be a path to a file (a string), a | ||
| file-like object or a :term:`path-like object`. | ||
|
|
There was a problem hiding this comment.
Per the style guide, I took the liberty of adding line breaks here; the constructor explanation was already quite long.
| 'filename', | ||
| 'date_time', | ||
| 'compress_type', | ||
| '_compress_level', |
There was a problem hiding this comment.
I've prefixed this with an underscore because it's not exposed when reading back. ZipInfo has another "private" attribute, _raw_time below.
| writing files to the archive. When using :const:`ZIP_STORED` or | ||
| :const:`ZIP_LZMA` it has no effect. When using :const:`ZIP_DEFLATED` | ||
| integers ``0`` through ``9`` are accepted. When using :const:`ZIP_BZIP2` | ||
| integers ``1`` through ``9`` are accepted. |
There was a problem hiding this comment.
Maybe it is worth linking this to the compresslevel parameters of the :class:`gzip.GzipFile` and :class:`bz2.Bz2File` constructors, which explain the acceptable values.
There was a problem hiding this comment.
Hmm, I wonder if zlib.compressobj and bz2.BZ2Compressor would be more appropriate? I don't want to give the false impression that GzipFile or Bz2File are actually used at all.
There was a problem hiding this comment.
I went with your suggestion; the link to the zlib.compressobj() function looks weird when rendered.
There was a problem hiding this comment.
Sorry, yeah zlib.compressobj would have been more appropriate (I was confused; the gzip format is not even used in zip files, just the underlying Deflate algorithm). My point was that you should explain what the different values mean, compared to None. How do the different values control the compression level relative to the default of None?
| overrides the value given for the *compression* parameter to the constructor for | ||
| the new entry. | ||
| the new entry. Similarly, if given, *compress_level* overrides the *compresslevel* | ||
| parameter. |
There was a problem hiding this comment.
Spelling is too similar yet inconsistent. Wouldn’t it be better to remove the underscore from all parameters, and differentiate them as method parameters versus the constructor parameter?
There was a problem hiding this comment.
I was following the example of compression (constructor's argument) to compress_type (method's argument). It's not clear to me why those are different, but if it's OK for compresslevel to be the same throughout that's fine by me?
| self.assertEqual(a_info.compress_type, self.compression) | ||
| self.assertEqual(a_info._compress_level, 1) | ||
|
|
||
| # Compression level is overriden. |
|
The only way is to recompress the zip file with different levels until you find the one that matches the lengths. You could just recompress one of the entries to find the level, on the assumption that the entire zip file used the same level. Even that only works if you know the tool, and the version of the tool that was used. E.g. 7z, WinZip, Info-ZIP. |

This PR adds a
compresslevelparameter tozipfile.ZipFile(), similar to theone that exists for
tarfile.open().A few things to point out:
compresslevel=Nonemight make it look like no compression is being done - this is in fact the case with the defaultZIP_STORED.ZIP_DEFLATEDandZIP_BZIP2compression types support different levels -0through9for the former;1through9for the latter. Specifyingmode='w:bz2'intarfile.open()leads to aValueError; the same thing happens here.compresslevelonmode='r', but it doesn't do anything. This matchestarfile.open()'s behavior - there you can do stuff likecompresslevel=9000andcompresslevel='kittens'with no error.ZIP_LZMAcompression type doesn't support "levels." It has a similar concept called a preset, however we can't set that here because we're using custom filters. These are mutually exclusive.mode='w:xz'andcompresslevel=9(or whatever) intarfile.open, you get an obscureTypeError. I will file a new bug for that.compresslevelforZIP_LZMAandZIP_STOREDhave no effect forzipfile.ZipFile. I could raise an exception instead, but given thattarfile.open()ignores parameters that don't make sense (e.g.mode='r:gz'andcompresslevel=9000) I thought this was appropriate.test_zipfilemodule. In my own testing I used this script, which shows things in action.PyZipFile()for simplicity's sake; I can be persuaded to try it out.I will add a few more notes inline.
https://bugs.python.org/issue21417