The characters that Python's `isalpha` accepts
The `isalpha` string method in Python is often used to limit accepted strings to "alpha" characters. But the set of characters that it accepts may be larger than you'd expect. The documentation states: "Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”."
The categories referred to here are:
Lm (Modifier Letter): https://www.compart.com/en/unicode/category/Lm
Lt (Titlecase Letter): https://www.compart.com/en/unicode/category/Lt
Lu (Uppercase Letter): https://www.compart.com/en/unicode/category/Lu
Ll (Lowercase Letter): https://www.compart.com/en/unicode/category/Ll
Lo (Other Letter): https://www.compart.com/en/unicode/category/Lo
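The documented rule can be sketched with the standard `unicodedata` module. This is only an illustration of the rule as quoted, not how the interpreter implements it; the helper name `is_letter_by_category` is made up for this example.

```python
import unicodedata

# The five "Letter" general categories named in the documentation.
LETTER_CATEGORIES = {"Lm", "Lt", "Lu", "Ll", "Lo"}


def is_letter_by_category(text):
    """Hypothetical helper that mirrors the documented rule for str.isalpha."""
    # Like str.isalpha, an empty string does not count as alphabetic.
    return bool(text) and all(
        unicodedata.category(ch) in LETTER_CATEGORIES for ch in text
    )


# Compare the built-in method with the category-based check.
for sample in ["hello", "héllo", "日本語", "abc1", ""]:
    print(repr(sample), sample.isalpha(), is_letter_by_category(sample))
```

For each of these samples the two columns should agree, which is a quick way to convince yourself that `isalpha` is about Unicode letter categories and not about the 26 letters of the English alphabet.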
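If you look through these tables, you'll see that these categories include many characters that don't intuitively feel very "alphabetical", e.g. characters that look like quotation marks (ˮ, U+02EE) or like a semicolon (ꓼ, U+A4FC). Even a character that looks like the degree symbol is considered alphabetical: º (U+00BA). Strictly speaking, that one is the masculine ordinal indicator; the actual degree sign, ° (U+00B0), is not in a letter category and is rejected by `isalpha`.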
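A short script makes it easy to check this for yourself. The sketch below assumes the three look-alikes are U+02EE, U+A4FC and U+00BA (the masculine ordinal indicator, which resembles a degree sign), and contrasts them with the ordinary quotation mark, semicolon and degree sign (U+00B0). The variable names are just for illustration.

```python
import unicodedata

# Look-alikes from the text, followed by the ordinary characters they resemble.
samples = ["\u02ee", "\ua4fc", "\u00ba", '"', ";", "\u00b0"]

# Map each character to (general category, isalpha result).
results = {ch: (unicodedata.category(ch), ch.isalpha()) for ch in samples}

for ch, (category, alpha) in results.items():
    # unicodedata.name raises ValueError for unnamed characters unless a default is given.
    name = unicodedata.name(ch, "<no name>")
    print(f"U+{ord(ch):04X} {name}: category={category}, isalpha={alpha}")
```

The first three characters should report a letter category and `isalpha=True`, while the ordinary punctuation and the real degree sign should not. If what you actually want is plain ASCII letters, one stricter option is `s.isascii() and s.isalpha()` (`str.isascii` is available from Python 3.7).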