bpo-32728: Add compresslevel support for zipfile and LZMA #5534

Closed
bbayles wants to merge 11 commits into python:master from bbayles:zipfile-lzma-presets

Contributor

bbayles commented Feb 4, 2018

This PR extends the compresslevel support added to zipfile.ZipFile in #5385 to the LZMA compression method. Currently ZIP_DEFLATED and ZIP_BZIP2 can be used with compresslevel, but ZIP_LZMA can't.
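
For context, here is a minimal sketch of how the existing `compresslevel` keyword behaves for the two methods that already support it. The payload and the `zipped_size` helper are made up for illustration, and the bzip2 method is spelled `zipfile.ZIP_BZIP2` in the standard library. It assumes Python 3.7 or later.

```python
import io
import zipfile

# Repetitive sample payload so that the compression level visibly matters.
payload = b"".join(b"line %d of sample data\n" % i for i in range(20000))


def zipped_size(method, level):
    """Return the archive size for one member written with the given settings."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method, compresslevel=level) as zf:
        zf.writestr("sample.txt", payload)
    # Round-trip check: the member must decompress to the original bytes.
    with zipfile.ZipFile(buf) as zf:
        if zf.read("sample.txt") != payload:
            raise RuntimeError("round trip failed")
    return len(buf.getvalue())


deflate_fast = zipped_size(zipfile.ZIP_DEFLATED, 1)
deflate_best = zipped_size(zipfile.ZIP_DEFLATED, 9)
bzip2_fast = zipped_size(zipfile.ZIP_BZIP2, 1)
bzip2_best = zipped_size(zipfile.ZIP_BZIP2, 9)
print(deflate_fast, deflate_best, bzip2_fast, bzip2_best)
```

The goal of this PR is for the same keyword to have an effect when `compression=zipfile.ZIP_LZMA` is passed.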


zipfile currently defines a custom LZMACompressor instead of using lzma.LZMACompressor, so that the parameters needed for the "LZMA Properties Header" in the ZIP file can be written. This custom compressor sets the LZMA dict_size to 8 MiB, but leaves the other compression parameters (those determined by LZMA's presets) unspecified.
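
As a rough illustration of what that header looks like on disk, the sketch below builds it by hand with `struct`. The helper name `lzma_zip_header` is hypothetical, the `lc=3, lp=0, pb=2` values are assumed to be the usual LZMA defaults, and the `9, 4` version bytes are assumed to mirror what zipfile currently writes; treat it as a layout sketch, not as the code in this PR.

```python
import struct


def lzma_zip_header(lc, lp, pb, dict_size):
    """Build the header that precedes the raw LZMA stream in a ZIP member."""
    # Properties byte as defined by the LZMA format: (pb * 5 + lp) * 9 + lc,
    # followed by the dictionary size as a little-endian 32-bit integer.
    props = struct.pack("<BI", (pb * 5 + lp) * 9 + lc, dict_size)
    # Two version bytes (assumed 9 and 4), then the little-endian size of the
    # properties block, then the properties themselves.
    return struct.pack("<BBH", 9, 4, len(props)) + props


header = lzma_zip_header(lc=3, lp=0, pb=2, dict_size=8 * 1024 * 1024)
print(header.hex())  # 090405005d00008000
```

A reader of the archive needs exactly these bytes to set up a matching raw LZMA decoder, which is why the compressor must know its own parameters rather than leaving them to the library defaults.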

Here I have added a dictionary that maps the presets (0 through 9, plus 0 through 9 OR-ed with lzma.PRESET_EXTREME; see these docs) to their compression parameters. See here for what each compression parameter means. The values can mostly be assembled from the xz man page (example), but see also liblzma's lzma_lzma_preset function.
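
To make the preset space concrete, here is a small sketch using only the public `lzma` module. It does not reproduce the PR's dictionary of explicit parameters; instead it lets liblzma expand the preset through the `preset` key of a filter specification. `VALID_PRESETS` and `raw_lzma1_roundtrip` are illustrative names, not part of zipfile.

```python
import lzma

# The 20 preset values described above: 0-9, and 0-9 OR-ed with PRESET_EXTREME.
VALID_PRESETS = set(range(10)) | {p | lzma.PRESET_EXTREME for p in range(10)}


def raw_lzma1_roundtrip(data, preset):
    """Compress and decompress data as a raw LZMA1 stream using one preset."""
    if preset not in VALID_PRESETS:
        raise ValueError("invalid LZMA preset: %r" % (preset,))
    # The preset supplies defaults for dict_size, lc, lp, pb and the other options.
    filters = [{"id": lzma.FILTER_LZMA1, "preset": preset}]
    comp = lzma.LZMACompressor(format=lzma.FORMAT_RAW, filters=filters)
    packed = comp.compress(data) + comp.flush()
    decomp = lzma.LZMADecompressor(format=lzma.FORMAT_RAW, filters=filters)
    return packed, decomp.decompress(packed)


sample = b"some repetitive sample text\n" * 2000
for preset in (0, 3, 1 | lzma.PRESET_EXTREME):
    packed, restored = raw_lzma1_roundtrip(sample, preset)
    print(preset, len(packed), restored == sample)
```

An explicit mapping, as in this PR, is still needed for the ZIP case because the chosen parameters have to be written into the archive, and the public API does not report what a preset expands to.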

Once we have the compression parameters for the selected preset, then we write the header such that the ZIP file can be decoded later. I've added a few comments about this in the code; the current process is a bit unclear if you don't have the ZIP spec memorized.


One can see from this script that the compression results are different when using the keyword (default behavior does not change).

I've also added a test that checks that the correct property header is written to the ZIP file; this we can do with LZMA much more easily than with zlib or bzip2.
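
The kind of check described here can be sketched against an unmodified zipfile as follows. This is not the test from the PR: `read_lzma_header` is a hypothetical helper that walks the local file header by hand, and the expected values assume current zipfile behaviour (version bytes 9 and 4, an 8 MiB dictionary).

```python
import io
import struct
import zipfile


def read_lzma_header(archive_bytes, member_name):
    """Return ((major, minor), props) from the LZMA header of one ZIP member."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        offset = zf.getinfo(member_name).header_offset
    # Fixed 30-byte local file header: signature, version, flags, method,
    # time, date, CRC, compressed size, uncompressed size, name and extra lengths.
    fields = struct.unpack_from("<4s5H3L2H", archive_bytes, offset)
    signature, method, name_len, extra_len = fields[0], fields[3], fields[9], fields[10]
    if signature != b"PK\x03\x04" or method != zipfile.ZIP_LZMA:
        raise ValueError("not an LZMA-compressed local file entry")
    data_start = offset + 30 + name_len + extra_len
    major, minor, props_size = struct.unpack_from("<BBH", archive_bytes, data_start)
    props = archive_bytes[data_start + 4:data_start + 4 + props_size]
    return (major, minor), props


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_LZMA) as zf:
    zf.writestr("sample.txt", b"hello " * 1000)

version, props = read_lzma_header(buf.getvalue(), "sample.txt")
props_byte, dict_size = struct.unpack("<BI", props)
print(version, hex(props_byte), dict_size)
```

Because the header is stored uncompressed and has a fixed layout, a test can compare it byte for byte with the expected parameters, which is not possible for the zlib or bzip2 streams.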


It may be noted that tarfile has a way of setting an LZMA archive's preset with a preset keyword (instead of compresslevel), but I don't think this is worth bringing here: (a) that capability is currently undocumented in tarfile; (b) the zipfile and tarfile modules already differ substantially (e.g., write() vs. add() methods); (c) it seems a bit wrong to make people remember that zlib and bz2 use compresslevel but lzma uses preset in this module.
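
For comparison, the tarfile spelling mentioned above looks roughly like this. It is a minimal sketch with an in-memory archive and a made-up member name; the `preset` keyword is passed through `tarfile.open` for the `w:xz` mode.

```python
import io
import tarfile

payload = b"sample data\n" * 1000

# Write an xz-compressed tar archive, choosing the LZMA preset explicitly.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:xz", preset=1) as tf:
    info = tarfile.TarInfo("sample.txt")
    info.size = len(payload)
    tf.addfile(info, io.BytesIO(payload))

# Read it back to confirm the member survives the round trip.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:xz") as tf:
    restored = tf.extractfile("sample.txt").read()

print(restored == payload)
```

With this PR, the zipfile equivalent would keep using `compresslevel` for all three compression methods rather than introducing a second keyword.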

https://bugs.python.org/issue32728

Comment thread Doc/library/zipfile.rst Outdated
Comment thread Lib/test/test_zipfile.py
# Write a ZIP archive with that LZMA compression preset and ensure
# that the property header above is written to that archive.
with io.BytesIO() as fp:
    kwargs = {'compression': self.compression, 'compresslevel': 1}
Contributor

I don't think there's a need for this kwargs variable.

Contributor Author

Quite right. I put it there to avoid a long line.

Comment thread Misc/NEWS.d/next/Library/2018-02-04-13-18-14.bpo-32728.2u0pLO.rst Outdated
Comment thread Lib/zipfile.py

def __init__(self, preset=None):
    self._comp = None
    if (preset is not None) and (preset not in self._PRESET_OPTIONS_MAP):
Contributor

I don't think the parentheses here are needed.

Contributor Author

Indeed. My thought was that it helped with readability.

Comment thread Doc/library/zipfile.rst Outdated
Comment thread Lib/zipfile.py Outdated
Comment thread Doc/library/zipfile.rst
.. versionadded:: 3.8
   The *strict_timestamps* keyword-only argument

.. versionchanged:: 3.8
Contributor

Suggested change
- .. versionchanged:: 3.8
+ .. versionchanged:: 3.9

Comment thread Lib/zipfile.py
5 participants