Skip to content

bpo-30977: make uuid.UUID use __slots__ to reduce its memory footprint#9078

Merged
taleinat merged 5 commits intopython:masterfrom
taleinat:bpo-30977/uuid-slots
Sep 6, 2018
Merged

bpo-30977: make uuid.UUID use __slots__ to reduce its memory footprint#9078
taleinat merged 5 commits intopython:masterfrom
taleinat:bpo-30977/uuid-slots

Conversation

@taleinat
Copy link
Copy Markdown
Contributor

@taleinat taleinat commented Sep 6, 2018

Based on original patch by Wouter Bolsterlee. (see PR #2785)

Added pickle/unpickle logic to maintain forward and backward compatibility with other Python versions.

Manually tested that unpickling works with Python 3.6 and 2.7.

https://bugs.python.org/issue30977

Based on original patch by Wouter Bolsterlee.
Copy link
Copy Markdown
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. You should use "Co-Authored-By: name " syntax in the (final) commit message to credit Wouter Bolsterlee. I just have a minor request on coding style.

Comment thread Lib/test/test_uuid.py Outdated

# Python 2.7 protocol 0-2 pickles of u
py27_pickles = [
b'ccopy_reg\n_reconstructor\np0\n(cuuid\nUUID\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS\'int\'\np6\nL24197857161011715162171839636988778104L\nsb.',
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you try to reformat to limit to 80 characters?

@taleinat taleinat merged commit 3e2b29d into python:master Sep 6, 2018
@taleinat taleinat deleted the bpo-30977/uuid-slots branch September 6, 2018 11:34
@taleinat
Copy link
Copy Markdown
Contributor Author

taleinat commented Sep 6, 2018

Thanks for the quick review, @vstinner!

Comment thread Lib/test/test_uuid.py
self.addCleanup(restore_uuid_module)

def test_pickle_roundtrip(self):
self._setup_for_pickle()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be easier to use with support.swap_item(sys.modules, 'uuid', self.uuid):?

Comment thread Lib/test/test_uuid.py
self._setup_for_pickle()

u = self.uuid.UUID('12345678123456781234567812345678')
self.assertEqual(u, pickle.loads(pickle.dumps(u)))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to test with all pickle protocols. Need to test shallow and deep copying too, since they use the same protocol.

Comment thread Lib/test/test_uuid.py

u = self.uuid.UUID('12345678123456781234567812345678')

# Python 2.7 protocol 0-2 pickles of u
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to pass pickles through pickletools.optimize() for removing unneeded memoization. This will make the input data shorter.

Comment thread Lib/uuid.py
object.__setattr__(self, 'is_safe', is_safe)

def __getstate__(self):
d = {attr: getattr(self, attr) for attr in self.__slots__}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use self.__slots__ instead of just hardcoding two attribute names? In any case this code doesn't work in case of subclasses with additional attributes.

If is_safe is SafeUUID.unknown, it can be omitted for saving space.

Comment thread Lib/uuid.py
# is_safe was added in 3.7
state.setdefault('is_safe', SafeUUID.unknown.value)

for attr in self.__slots__:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code would be simpler if just set int and is_safe attributes.

Comment thread Lib/test/test_uuid.py
def test_unpickle_previous_python_versions(self):
self._setup_for_pickle()

u = self.uuid.UUID('12345678123456781234567812345678')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use UUID that doesn't contain regular patterns?

Comment thread Lib/uuid.py
try:
value = SafeUUID(value)
except ValueError:
value = SafeUUID.unknown
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this error is silenced?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You brought up the need to support pickles from future Python versions; those could potentially define new values in the SafeUUID enum.

Comment thread Lib/test/test_uuid.py
def test_pickle_roundtrip(self):
self._setup_for_pickle()

u = self.uuid.UUID('12345678123456781234567812345678')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to test with different is_safe values. And please used UUID with less regular hexadecimal representation.

Comment thread Lib/test/test_uuid.py
b'\x12xV4\x12xV4\x12sb.',
]

for pickled in py27_pickles + py36_pickles:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two lists are used?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clearly differentiate between those generated with different Python versions, using variable names rather than comments.

Comment thread Lib/uuid.py

def __getstate__(self):
d = {attr: getattr(self, attr) for attr in self.__slots__}
# is_safe is a SafeUUID instance. Return just its value, so that
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the value portable? Seems there is a problem in 3.7 which should be fixed (this is a different issue).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. The value is one of None, 0 or -1, so it should be "portable" for any meaning of that word.

We have to keep just the value to avoid errors when unpickling in Python versions <3.7, where the SafeUUID enum isn't defined. Using the .value, is_safe is ignored and unpickling succeeds.

@taleinat
Copy link
Copy Markdown
Contributor Author

taleinat commented Sep 7, 2018

Thanks for the in-depth review @serhiy-storchaka! I learned a lot.

Since this has already been merged, see new PR #9106 with the appropriate changes.

@taleinat taleinat added the type-feature A feature request or enhancement label Sep 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type-feature A feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants