Skip to content

bpo-32107 - Improve MAC address calculation and fix test_uuid.py#4600

Merged
warsaw merged 40 commits intopython:masterfrom
warsaw:test_uuid_3
Nov 28, 2017
Merged

bpo-32107 - Improve MAC address calculation and fix test_uuid.py#4600
warsaw merged 40 commits intopython:masterfrom
warsaw:test_uuid_3

Conversation

@warsaw
Copy link
Copy Markdown
Member

@warsaw warsaw commented Nov 28, 2017

Here's yet another attempt, but this time without the bitmask check for universally administered MAC addresses. That check fails miserably on the buildbots and Travis, so it's not worth it.

https://bugs.python.org/issue32107

warsaw and others added 30 commits November 21, 2017 12:00
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
While it is required that the multicast bit (1<<40) should be set in
random generated addresses, there is no requirement that it should be
cleared in addresses obtained from network cards.
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
…torchaka/cpython into test_uuid

This merges the relevant bits of python#4572 and should be considered instead of
that PR.
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
While it is required that the multicast bit (1<<40) should be set in
random generated addresses, there is no requirement that it should be
cleared in addresses obtained from network cards.
@vstinner
Copy link
Copy Markdown
Member

bpo-32107 - Third attempt at fixing test_uuid.py #4600

Please use a better commit title

Why does this PR have so many commits? Try maybe to rebase it?

Comment thread Lib/uuid.py
# RFC 4122, $4.1.6 says "For systems with no IEEE address, a randomly or
# pseudo-randomly generated value may be used; see Section 4.5. The
# multicast bit must be set in such addresses, in order that they will
# never conflict with addresses obtained from network cards."
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer "the multicast bit" rather than "eight bit". Please update also the documentation:
https://docs.python.org/dev/library/uuid.html#uuid.getnode

Comment thread Lib/uuid.py
# significant, or 1<<41. We'll skip over any locally administered MAC
# addresses, as it makes no sense to use those in UUID calculation.
#
# See https://en.wikipedia.org/wiki/MAC_address#Universal_vs._local
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must document this change in the doc:
https://docs.python.org/dev/library/uuid.html#uuid.getnode

Add also a ".. versionchanged:: 3.7" describing that it changed in Python 3.7.

Comment thread Lib/uuid.py Outdated
#
# This bit works out to be the 42nd bit counting from 1 being the least
# significant, or 1<<41. We'll skip over any locally administered MAC
# addresses, as it makes no sense to use those in UUID calculation.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "it makes no sense", I suggest to explain it differently, something like: "prefer universally administered addresses since UUID1 objects should be unique".

Comment thread Lib/uuid.py
pass
except OSError:
pass
return first_local_mac or None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"or None" seems useless here. Same comment above.

@bedevere-bot
Copy link
Copy Markdown

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Comment thread Lib/test/test_uuid.py
def test_netbios_getnode(self):
node = self.uuid._netbios_getnode()
self.check_node(node, network=True)
self.check_node(node)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems like a regression, why removing the following test? Maybe added it back here?
self.assertFalse(node & 0x010000000000, hex)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the original report. This check never failed before on developer's machines and buildbots, but Barry has a machine on which it is failed. While the initial diagnostic of this failure was wrong, actually there are no guaranties that this check always passed.

@warsaw warsaw changed the title bpo-32107 - Third attempt at fixing test_uuid.py bpo-32107 - Improve MAC address calculation and fix test_uuid.py Nov 28, 2017
@warsaw
Copy link
Copy Markdown
Member Author

warsaw commented Nov 28, 2017 via email

@warsaw
Copy link
Copy Markdown
Member Author

warsaw commented Nov 28, 2017 via email

* Update the getnode() documentation.
* Add a versionchanged note.
* Elaborate a comment.
@warsaw
Copy link
Copy Markdown
Member Author

warsaw commented Nov 28, 2017

I have made the requested changes; please review again.

@bedevere-bot
Copy link
Copy Markdown

Thanks for making the requested changes!

@serhiy-storchaka, @vstinner: please review the changes made to this pull request.

@vstinner
Copy link
Copy Markdown
Member

Look closely though - “eight bit” is in a removed part of the docstring. We don’t go into such detail in other docstrings, so I opted to remove it here and explain in more detail in the comment.

I proposed to replace "eighth bit" with "multicast bit" at:
https://docs.python.org/dev/library/uuid.html#uuid.getnode

I was talking about this documentation, the Doc/ directory, not docstrings.

@warsaw
Copy link
Copy Markdown
Member Author

warsaw commented Nov 28, 2017 via email

Copy link
Copy Markdown
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I just have a minor suggestion on the NEWS entry.

@@ -0,0 +1,4 @@
Improve the private ``*_getnode()`` methods for UUID1 such that universally
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the NEWS entry, I suggest to only describe changes on the public API. In this case, the change impacts getnode().

@vstinner
Copy link
Copy Markdown
Member

Yes, my PR already made that change to the docs.

Oh yes, now I saw the changes, good!

@warsaw
Copy link
Copy Markdown
Member Author

warsaw commented Nov 28, 2017 via email

Copy link
Copy Markdown
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@warsaw warsaw merged commit 23df2d1 into python:master Nov 28, 2017
@warsaw warsaw deleted the test_uuid_3 branch November 28, 2017 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants