bpo-32107 - Improve MAC address calculation and fix test_uuid.py#4600
bpo-32107 - Improve MAC address calculation and fix test_uuid.py#4600warsaw merged 40 commits intopython:masterfrom
Conversation
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
While it is required that the multicast bit (1<<40) should be set in random generated addresses, there is no requirement that it should be cleared in addresses obtained from network cards.
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
…torchaka/cpython into test_uuid This merges the relevant bits of python#4572 and should be considered instead of that PR.
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
While it is required that the multicast bit (1<<40) should be set in random generated addresses, there is no requirement that it should be cleared in addresses obtained from network cards.
Please use a better commit title Why does this PR have so many commits? Try maybe to rebase it? |
| # RFC 4122, $4.1.6 says "For systems with no IEEE address, a randomly or | ||
| # pseudo-randomly generated value may be used; see Section 4.5. The | ||
| # multicast bit must be set in such addresses, in order that they will | ||
| # never conflict with addresses obtained from network cards." |
There was a problem hiding this comment.
I prefer "the multicast bit" rather than "eight bit". Please update also the documentation:
https://docs.python.org/dev/library/uuid.html#uuid.getnode
| # significant, or 1<<41. We'll skip over any locally administered MAC | ||
| # addresses, as it makes no sense to use those in UUID calculation. | ||
| # | ||
| # See https://en.wikipedia.org/wiki/MAC_address#Universal_vs._local |
There was a problem hiding this comment.
You must document this change in the doc:
https://docs.python.org/dev/library/uuid.html#uuid.getnode
Add also a ".. versionchanged:: 3.7" describing that it changed in Python 3.7.
| # | ||
| # This bit works out to be the 42nd bit counting from 1 being the least | ||
| # significant, or 1<<41. We'll skip over any locally administered MAC | ||
| # addresses, as it makes no sense to use those in UUID calculation. |
There was a problem hiding this comment.
Instead of "it makes no sense", I suggest to explain it differently, something like: "prefer universally administered addresses since UUID1 objects should be unique".
| pass | ||
| except OSError: | ||
| pass | ||
| return first_local_mac or None |
There was a problem hiding this comment.
"or None" seems useless here. Same comment above.
|
When you're done making the requested changes, leave the comment: |
| def test_netbios_getnode(self): | ||
| node = self.uuid._netbios_getnode() | ||
| self.check_node(node, network=True) | ||
| self.check_node(node) |
There was a problem hiding this comment.
This change seems like a regression, why removing the following test? Maybe added it back here?
self.assertFalse(node & 0x010000000000, hex)
There was a problem hiding this comment.
See the original report. This check never failed before on developer's machines and buildbots, but Barry has a machine on which it is failed. While the initial diagnostic of this failure was wrong, actually there are no guaranties that this check always passed.
|
On Nov 28, 2017, at 04:31, Victor Stinner ***@***.***> wrote:
Why does this PR have so many commits? Try maybe to rebase it?
It doesn’t matter since this branch will be squash merged anyway.
|
|
On Nov 28, 2017, at 04:48, Victor Stinner ***@***.***> wrote:
def _random_getnode():
- """Get a random node ID, with eighth bit set as suggested by RFC 4122."""
+ """Get a random node ID."""
+ # RFC 4122, $4.1.6 says "For systems with no IEEE address, a randomly or
+ # pseudo-randomly generated value may be used; see Section 4.5. The
+ # multicast bit must be set in such addresses, in order that they will
+ # never conflict with addresses obtained from network cards."
I prefer "the multicast bit" rather than "eight bit". Please update also the documentation:
https://docs.python.org/dev/library/uuid.html#uuid.getnode
Look closely though - “eight bit” is in a removed part of the docstring. We don’t go into such detail in other docstrings, so I opted to remove it here and explain in more detail in the comment.
except OSError:
pass
+ return first_local_mac or None
"or None" seems useless here. Same comment above.
I think there could be a case where first_local_mac is False but not None. The implementation requires a None to be returned when no suitable MAC is found.
|
* Update the getnode() documentation. * Add a versionchanged note. * Elaborate a comment.
|
I have made the requested changes; please review again. |
|
Thanks for making the requested changes! @serhiy-storchaka, @vstinner: please review the changes made to this pull request. |
I proposed to replace "eighth bit" with "multicast bit" at: I was talking about this documentation, the Doc/ directory, not docstrings. |
|
On Nov 28, 2017, at 15:52, Victor Stinner ***@***.***> wrote:
I proposed to replace "eighth bit" with "multicast bit" at:
https://docs.python.org/dev/library/uuid.html#uuid.getnode
I was talking about this documentation, the Doc/ directory, not docstrings.
Yes, my PR already made that change to the docs.
|
vstinner
left a comment
There was a problem hiding this comment.
LGTM. I just have a minor suggestion on the NEWS entry.
| @@ -0,0 +1,4 @@ | |||
| Improve the private ``*_getnode()`` methods for UUID1 such that universally | |||
There was a problem hiding this comment.
For the NEWS entry, I suggest to only describe changes on the public API. In this case, the change impacts getnode().
Oh yes, now I saw the changes, good! |
|
On Nov 28, 2017, at 16:08, Victor Stinner ***@***.***> wrote:
For the NEWS entry, I suggest to only describe changes on the public API. In this case, the change impacts getnode().
Good point, thanks.
|
Here's yet another attempt, but this time without the bitmask check for universally administered MAC addresses. That check fails miserably on the buildbots and Travis, so it's not worth it.
https://bugs.python.org/issue32107