bpo-32285: Add unicodedata.is_normalized to check the current norma…#4806
benjaminp merged 11 commits into python:master
…lization of a unistr
This was tested locally using |
vstinner
left a comment
I like the new function, but it should be documented in Doc/library/unicodedata.rst.
You may also add it to Doc/whatsnew/3.7.rst, in a "unicodedata" section of Improved Modules.
Please also add a NEWS entry for the Changelog using the "blurb" tool:
@vstinner any other changes you'd like to see here? Just made a tiny signature change to ensure consistency with the rest of the module, otherwise I think this is good to go.
@vstinner should I rebase this patch for 3.8?
    self.assertTrue(is_normalized("NFC", c2))
    self.assertTrue(is_normalized("NFD", c3))
    self.assertTrue(is_normalized("NFKC", c4))
    self.assertTrue(is_normalized("NFKD", c5))
There should be some negative cases, too. Make sure the MAYBE case is being exercised.
Increased coverage + confirmed that this is exercising the MAYBE path.
Maybe also add tests for when it returns False. If the function always returned True, the tests would still pass ;-)
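To illustrate the kind of negative coverage requested above, here is a minimal sketch of a test case. It is not the test that was merged in this PR; the class name `IsNormalizedSketchTest` and the chosen sample strings are assumptions for illustration. It assumes a Python version that ships `unicodedata.is_normalized` (3.8 or later). The last test uses a combining mark that has no precomposed form, which is the sort of input expected to need more than a quick check, although which internal path is taken cannot be observed from Python.

```python
import unicodedata
import unittest


class IsNormalizedSketchTest(unittest.TestCase):  # hypothetical name
    def test_positive_cases(self):
        composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
        decomposed = "e\u0301"   # "e" + COMBINING ACUTE ACCENT
        self.assertTrue(unicodedata.is_normalized("NFC", composed))
        self.assertTrue(unicodedata.is_normalized("NFD", decomposed))

    def test_negative_cases(self):
        # The composed form is not NFD, and the decomposed form is not NFC.
        self.assertFalse(unicodedata.is_normalized("NFD", "\u00e9"))
        self.assertFalse(unicodedata.is_normalized("NFC", "e\u0301"))
        # U+FB01 (LATIN SMALL LIGATURE FI) has a compatibility
        # decomposition to "fi", so it is neither NFKC nor NFKD.
        self.assertFalse(unicodedata.is_normalized("NFKC", "\ufb01"))
        self.assertFalse(unicodedata.is_normalized("NFKD", "\ufb01"))

    def test_combining_mark_that_does_not_compose(self):
        # There is no precomposed "q with acute", so this string is
        # already in NFC even though it contains a combining mark.
        self.assertTrue(unicodedata.is_normalized("NFC", "q\u0301"))
```

With negative assertions like these, an implementation that always returned True would fail the suite.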
    PyObject *result;
    int nfc = 0;
    int k = 0;
This is meant to conform to the existing implementation of is_normalized, which takes in ints. Could change is_normalized, but I preferred to avoid making changes outside the scope of my own.
Introduces unicodedata.is_normalized, which can check whether a unistr is in a given normal form.
This makes use of the internal helper (also called is_normalized) that can "quick check" normalization, but falls back on creating a normalized copy and comparing when necessary.
https://bugs.python.org/issue32285
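As a rough illustration of the fallback described above, the following sketch compares the new function against the "normalize a copy and compare" approach in pure Python. The helper name `is_normalized_by_copy` is hypothetical and is not part of the patch; the C implementation differs in that it tries a quick check first. The sketch assumes Python 3.8 or later, where `unicodedata.is_normalized(form, unistr)` is available.

```python
import unicodedata


def is_normalized_by_copy(form: str, text: str) -> bool:
    """Hypothetical slow path: build a normalized copy and compare it."""
    return unicodedata.normalize(form, text) == text


# Sample inputs chosen for illustration: plain ASCII, a composed and a
# decomposed accented letter, a compatibility ligature, and a combining
# mark with no precomposed form.
samples = ["abc", "\u00e9", "e\u0301", "\ufb01", "q\u0301"]

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    for text in samples:
        # Both approaches should agree; is_normalized may simply avoid
        # the extra copy when a quick check is conclusive.
        assert unicodedata.is_normalized(form, text) == is_normalized_by_copy(form, text)
```

The two approaches should always give the same answer; the benefit of `is_normalized` is that it can skip allocating the normalized copy when the quick check already settles the question.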