Python: How to convert an emoji's Unicode to its CLDR short name


I'm using Python to extract comments and display them. When I print them, they look like this:

This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f
How can I convert an emoji's Unicode to the corresponding CLDR short name?
For example, U+1F44D should be printed as thumbs up.

EDIT: I think I found a solution to the problem with the code \ud83d\udc9c:

text = text.encode('utf-16', 'surrogatepass').decode('utf-16')

It converts the surrogate pair \ud83d\udc9c into the correct emoji value \U0001f49c.
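
As a minimal sketch of that round trip (the helper name fix_surrogates is made up for illustration and is not part of any library), the repaired string can then be passed to unicodedata.name(). Note that this returns the Unicode character name, which is not always identical to the CLDR short name.

```python
import unicodedata


def fix_surrogates(text):
    """Join UTF-16 surrogate pairs that are stored as separate code points."""
    # 'surrogatepass' lets the encoder write the surrogates as raw UTF-16
    # code units; decoding then reads each pair as a single character.
    return text.encode('utf-16', 'surrogatepass').decode('utf-16')


broken = 'Amazing compassion \ud83d\udc9c'
fixed = fix_surrogates(broken)

# The two surrogates collapse into one character that has a name.
print(len(broken), len(fixed))
print(unicodedata.name(fixed[-1]))
```

If the text contains an unpaired surrogate, the decode step raises UnicodeDecodeError, so real input may need extra handling.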


Using Google I found:

print('\U0001F44D'.encode('ascii', 'namereplace').decode())

Result:

\N{THUMBS UP SIGN}

So it is a good idea to use Google before asking on Stack Overflow.


The same works for the whole text:

text = '''This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f'''

print(text.encode('ascii', 'namereplace').decode())
Result:

This was heart wrenching \N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
Now you may have to remove \N{ and }.
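
One possible way to strip those markers is a regular expression. This is only a sketch: the function name inline_names and the :name: output format are assumptions chosen for illustration, and any literal \N{...} text already present in the input would be rewritten too.

```python
import re


def inline_names(text):
    """Replace non-ASCII characters with their lowercased Unicode names."""
    escaped = text.encode('ascii', 'namereplace').decode('ascii')
    # Turn each \N{SOME NAME} escape into :some name:
    return re.sub(r'\\N\{([^}]+)\}',
                  lambda m: ':{}:'.format(m.group(1).lower()),
                  escaped)


print(inline_names('Nice \U0001F44D'))
```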

But it has a problem with \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c: the surrogates have no name, so they are left as escapes.


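You can use unicodedata in a for-loop to get the name of every character in the text, but it may fail when a character has no name, e.g. '\n'. It also gives names for ordinary characters, so you may have to use unicodedata.category() to decide which characters to replace.

This also has a problem with \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c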

import unicodedata

# http://www.unicode.org/reports/tr44/#General_Category_Values

for char in text:
    try:
        print(char, '|', unicodedata.category(char), '|', unicodedata.name(char))
    except ValueError:
        print(repr(char), '| (repr)')
Result:

T | Lu | LATIN CAPITAL LETTER T
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
h | Ll | LATIN SMALL LETTER H
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
t | Ll | LATIN SMALL LETTER T
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
r | Ll | LATIN SMALL LETTER R
e | Ll | LATIN SMALL LETTER E
n | Ll | LATIN SMALL LETTER N
c | Ll | LATIN SMALL LETTER C
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
'\n' | (repr)
A | Lu | LATIN CAPITAL LETTER A
m | Ll | LATIN SMALL LETTER M
a | Ll | LATIN SMALL LETTER A
z | Ll | LATIN SMALL LETTER Z
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
c | Ll | LATIN SMALL LETTER C
o | Ll | LATIN SMALL LETTER O
m | Ll | LATIN SMALL LETTER M
p | Ll | LATIN SMALL LETTER P
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
s | Ll | LATIN SMALL LETTER S
i | Ll | LATIN SMALL LETTER I
o | Ll | LATIN SMALL LETTER O
n | Ll | LATIN SMALL LETTER N
  | Zs | SPACE
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
  | Zs | SPACE
# | Po | NUMBER SIGN
t | Ll | LATIN SMALL LETTER T
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
s | Ll | LATIN SMALL LETTER S
'\n' | (repr)
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16

Because it has a problem with \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c, I replaced those characters with ?

import unicodedata

text = '''This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f'''

result = []

for char in text:
    category = unicodedata.category(char)
    if category in ('So', 'Mn'):
        # symbols and combining marks: replace with :NAME:
        result.append(':{}:'.format(unicodedata.name(char)))
    elif category == 'Cs':
        # surrogate: it has no name, so use a placeholder
        result.append('?')
    else:
        result.append(char)

print(''.join(result))
Result:

This was heart wrenching :HEAVY BLACK HEART::VARIATION SELECTOR-16:
Amazing compassion ?????? #tears
:HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16:
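
A variant of this loop that joins the surrogate pairs first can name the purple hearts as well. This is a sketch under assumptions: the function name describe_emoji, the decision to drop VARIATION SELECTOR-16, and the lowercased :name: format are illustrative choices, and the names are Unicode character names, not CLDR short names.

```python
import unicodedata

VARIATION_SELECTOR_16 = '\ufe0f'


def describe_emoji(text):
    # Join surrogate pairs first so emoji outside the BMP get a name.
    text = text.encode('utf-16', 'surrogatepass').decode('utf-16')
    result = []
    for char in text:
        if char == VARIATION_SELECTOR_16:
            continue  # presentation hint only, skipped in this sketch
        if unicodedata.category(char) == 'So':
            result.append(':{}:'.format(unicodedata.name(char).lower()))
        else:
            result.append(char)
    return ''.join(result)


print(describe_emoji('wrenching \u2764\ufe0f compassion \ud83d\udc9c'))
```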

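EDIT: Using Google again, I found the external module emoji, which can convert some names, but it also has a problem with \ud83d\udc9c, so I use repr to display the result, although that also prints newlines as \n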

text = '''This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f'''

import emoji

#print( repr(emoji.demojize(text, use_aliases=True)) ) 
print( repr(emoji.demojize(text)) ) 
Result:

'This was heart wrenching :heart:\nAmazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears\n:heart::heart::heart:'


BTW: I found the module demoji, which can find emoji and give their names. But it also has a problem with the code \ud83d\udc9c

import demoji

# run only once after installing module
demoji.download_codes()

print(demoji.findall(text))
You only need to run download_codes() once after installing the module.

Result:

{'❤️': 'red heart'}

If you get it as JSON data "\ud83d\udc9c", then you should have no problem: json.loads() should convert it automatically.

import json

# escaped unicode in " "  
data = r'"\ud83d\udc9c"' 
print(json.loads(data))
In other cases you have to convert it:

# convert to escaped unicode and put in " "  
data = '"{}"'.format('\ud83d\udc9c'.encode('unicode-escape').decode())
print(json.loads(data))
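
The unicode-escape approach above does not escape double quotes, so it may break on text that contains them. A sketch of an alternative, assuming a JSON round trip is acceptable (the helper name join_surrogates is made up for illustration), lets json.dumps() do the escaping:

```python
import json


def join_surrogates(text):
    # json.dumps escapes every non-ASCII character (ensure_ascii=True is
    # the default), and json.loads joins escaped surrogate pairs again.
    return json.loads(json.dumps(text))


fixed = join_surrogates('say "hi"\n\ud83d\udc9c')
print(fixed == 'say "hi"\n\U0001f49c')
```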

编辑:我想我找到了代码问题的解决方案
\ud83d\udc9c

text = text.encode('utf-16', 'surrogatepass').decode('utf-16')
它将代理项值
\ud83d\udc9c
转换为正确的表情符号值
\U0001f49c

资料来源:

维基百科:

其他:


使用谷歌我发现

print('\U0001F44D'.encode('ascii', 'namereplace').decode())
结果

\N{THUMBS UP SIGN}

结果:

THUMBS UP SIGN
This was heart wrenching \N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
\N{THUMBS UP SIGN}
T | Lu | LATIN CAPITAL LETTER T
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
h | Ll | LATIN SMALL LETTER H
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
t | Ll | LATIN SMALL LETTER T
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
r | Ll | LATIN SMALL LETTER R
e | Ll | LATIN SMALL LETTER E
n | Ll | LATIN SMALL LETTER N
c | Ll | LATIN SMALL LETTER C
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
'\n' | (repr)
A | Lu | LATIN CAPITAL LETTER A
m | Ll | LATIN SMALL LETTER M
a | Ll | LATIN SMALL LETTER A
z | Ll | LATIN SMALL LETTER Z
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
c | Ll | LATIN SMALL LETTER C
o | Ll | LATIN SMALL LETTER O
m | Ll | LATIN SMALL LETTER M
p | Ll | LATIN SMALL LETTER P
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
s | Ll | LATIN SMALL LETTER S
i | Ll | LATIN SMALL LETTER I
o | Ll | LATIN SMALL LETTER O
n | Ll | LATIN SMALL LETTER N
  | Zs | SPACE
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
  | Zs | SPACE
# | Po | NUMBER SIGN
t | Ll | LATIN SMALL LETTER T
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
s | Ll | LATIN SMALL LETTER S
'\n' | (repr)
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
This was heart wrenching :HEAVY BLACK HEART::VARIATION SELECTOR-16:
Amazing compassion ?????? #tears
:HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16:
'This was heart wrenching :heart:\nAmazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears\n:heart::heart::heart:'
{'❤️': 'red heart'}
因此,在询问Stackoverflow之前,最好使用
Google


文本也一样

text = '''This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f'''

print(text.encode('ascii', 'namereplace').decode())
结果:

THUMBS UP SIGN
This was heart wrenching \N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
\N{THUMBS UP SIGN}
T | Lu | LATIN CAPITAL LETTER T
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
h | Ll | LATIN SMALL LETTER H
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
t | Ll | LATIN SMALL LETTER T
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
r | Ll | LATIN SMALL LETTER R
e | Ll | LATIN SMALL LETTER E
n | Ll | LATIN SMALL LETTER N
c | Ll | LATIN SMALL LETTER C
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
'\n' | (repr)
A | Lu | LATIN CAPITAL LETTER A
m | Ll | LATIN SMALL LETTER M
a | Ll | LATIN SMALL LETTER A
z | Ll | LATIN SMALL LETTER Z
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
c | Ll | LATIN SMALL LETTER C
o | Ll | LATIN SMALL LETTER O
m | Ll | LATIN SMALL LETTER M
p | Ll | LATIN SMALL LETTER P
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
s | Ll | LATIN SMALL LETTER S
i | Ll | LATIN SMALL LETTER I
o | Ll | LATIN SMALL LETTER O
n | Ll | LATIN SMALL LETTER N
  | Zs | SPACE
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
  | Zs | SPACE
# | Po | NUMBER SIGN
t | Ll | LATIN SMALL LETTER T
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
s | Ll | LATIN SMALL LETTER S
'\n' | (repr)
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
This was heart wrenching :HEAVY BLACK HEART::VARIATION SELECTOR-16:
Amazing compassion ?????? #tears
:HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16:
'This was heart wrenching :heart:\nAmazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears\n:heart::heart::heart:'
{'❤️': 'red heart'}
现在您可能必须删除
\N{
}

但它在
\ud83d\udc9c\ud83d\udc9c\ud83d\udc9c

import unicodedata

# http://www.unicode.org/reports/tr44/#General_Category_Values

for char in text:
    try:
        print(char, '|', unicodedata.category(char), '|', unicodedata.name(char))
    except ValueError:
        print(repr(char), '| (repr)')
import demoji

# run only once after installing module
demoji.download_codes()

print(demoji.findall(text))

您可以在
for
-循环中使用
unicodedata
来获取文本中每个字符的名称,但如果没有名称,则可能会出现问题,例如。
'\n'
。它还提供了普通字符的名称,因此您可能必须使用
unicodedata.category()
来决定要替换哪些字符

这在
\ud83d\udc9c\ud83d\udc9c\ud83d\udc9c\ud83d\udc9c

import unicodedata

# http://www.unicode.org/reports/tr44/#General_Category_Values

for char in text:
    try:
        print(char, '|', unicodedata.category(char), '|', unicodedata.name(char))
    except ValueError:
        print(repr(char), '| (repr)')
import demoji

# run only once after installing module
demoji.download_codes()

print(demoji.findall(text))
结果:

THUMBS UP SIGN
This was heart wrenching \N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
\N{THUMBS UP SIGN}
T | Lu | LATIN CAPITAL LETTER T
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
h | Ll | LATIN SMALL LETTER H
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
t | Ll | LATIN SMALL LETTER T
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
r | Ll | LATIN SMALL LETTER R
e | Ll | LATIN SMALL LETTER E
n | Ll | LATIN SMALL LETTER N
c | Ll | LATIN SMALL LETTER C
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
'\n' | (repr)
A | Lu | LATIN CAPITAL LETTER A
m | Ll | LATIN SMALL LETTER M
a | Ll | LATIN SMALL LETTER A
z | Ll | LATIN SMALL LETTER Z
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
c | Ll | LATIN SMALL LETTER C
o | Ll | LATIN SMALL LETTER O
m | Ll | LATIN SMALL LETTER M
p | Ll | LATIN SMALL LETTER P
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
s | Ll | LATIN SMALL LETTER S
i | Ll | LATIN SMALL LETTER I
o | Ll | LATIN SMALL LETTER O
n | Ll | LATIN SMALL LETTER N
  | Zs | SPACE
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
  | Zs | SPACE
# | Po | NUMBER SIGN
t | Ll | LATIN SMALL LETTER T
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
s | Ll | LATIN SMALL LETTER S
'\n' | (repr)
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
This was heart wrenching :HEAVY BLACK HEART::VARIATION SELECTOR-16:
Amazing compassion ?????? #tears
:HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16:
'This was heart wrenching :heart:\nAmazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears\n:heart::heart::heart:'
{'❤️': 'red heart'}

因为它在
\ud83d\udc9c\ud83d\udc9c\ud83d\udc9c
上有问题,所以我将其替换为

import unicodedata

text = '''This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f'''

result = []

for char in text:
    if unicodedata.category(char) in ('So', 'Mn'):
        result.append(':{}:'.format(unicodedata.name(char)))
    elif unicodedata.category(char) in ('Cs'):
        result.append('?') #char)
    else:
        result.append(char)

print(''.join(result)) 
结果:

THUMBS UP SIGN
This was heart wrenching \N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}\N{HEAVY BLACK HEART}\N{VARIATION SELECTOR-16}
\N{THUMBS UP SIGN}
T | Lu | LATIN CAPITAL LETTER T
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
  | Zs | SPACE
h | Ll | LATIN SMALL LETTER H
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
t | Ll | LATIN SMALL LETTER T
  | Zs | SPACE
w | Ll | LATIN SMALL LETTER W
r | Ll | LATIN SMALL LETTER R
e | Ll | LATIN SMALL LETTER E
n | Ll | LATIN SMALL LETTER N
c | Ll | LATIN SMALL LETTER C
h | Ll | LATIN SMALL LETTER H
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
'\n' | (repr)
A | Lu | LATIN CAPITAL LETTER A
m | Ll | LATIN SMALL LETTER M
a | Ll | LATIN SMALL LETTER A
z | Ll | LATIN SMALL LETTER Z
i | Ll | LATIN SMALL LETTER I
n | Ll | LATIN SMALL LETTER N
g | Ll | LATIN SMALL LETTER G
  | Zs | SPACE
c | Ll | LATIN SMALL LETTER C
o | Ll | LATIN SMALL LETTER O
m | Ll | LATIN SMALL LETTER M
p | Ll | LATIN SMALL LETTER P
a | Ll | LATIN SMALL LETTER A
s | Ll | LATIN SMALL LETTER S
s | Ll | LATIN SMALL LETTER S
i | Ll | LATIN SMALL LETTER I
o | Ll | LATIN SMALL LETTER O
n | Ll | LATIN SMALL LETTER N
  | Zs | SPACE
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
'\ud83d' | (repr)
'\udc9c' | (repr)
  | Zs | SPACE
# | Po | NUMBER SIGN
t | Ll | LATIN SMALL LETTER T
e | Ll | LATIN SMALL LETTER E
a | Ll | LATIN SMALL LETTER A
r | Ll | LATIN SMALL LETTER R
s | Ll | LATIN SMALL LETTER S
'\n' | (repr)
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
❤ | So | HEAVY BLACK HEART
️ | Mn | VARIATION SELECTOR-16
This was heart wrenching :HEAVY BLACK HEART::VARIATION SELECTOR-16:
Amazing compassion ?????? #tears
:HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16::HEAVY BLACK HEART::VARIATION SELECTOR-16:
'This was heart wrenching :heart:\nAmazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears\n:heart::heart::heart:'
{'❤️': 'red heart'}

编辑:再次使用Google,我发现外部模块可以转换一些名称,但它也有问题
\ud83d\udc9c
,所以我使用
repr
来显示它-但它也会将新行打印为
\n

text = '''This was heart wrenching \u2764\ufe0f
Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears
\u2764\ufe0f\u2764\ufe0f\u2764\ufe0f'''

import emoji

#print( repr(emoji.demojize(text, use_aliases=True)) ) 
print( repr(emoji.demojize(text)) ) 
结果:

'This was heart wrenching :heart:\nAmazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears\n:heart::heart::heart:'
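The purple hearts are still shown as \ud83d\udc9c in that result because they arrive as two separate surrogate halves rather than one character. A minimal sketch, using only the standard library and the utf-16 / surrogatepass round trip from the question's edit, of repairing the text before asking for names. The helper name fix_surrogates is made up for this illustration, and the sketch assumes every surrogate in the text is part of a valid pair (an unpaired one would most likely make the decode step raise an error).

```python
def fix_surrogates(text):
    """Hypothetical helper: merge UTF-16 surrogate pairs into real code points."""
    # 'surrogatepass' lets the surrogate halves through the encoder;
    # decoding the resulting UTF-16 bytes then joins each pair.
    return text.encode('utf-16', 'surrogatepass').decode('utf-16')


raw = 'Amazing compassion \ud83d\udc9c\ud83d\udc9c\ud83d\udc9c #tears'
fixed = fix_surrogates(raw)

# namereplace now sees real characters instead of surrogate halves
print(fixed.encode('ascii', 'namereplace').decode())
# Amazing compassion \N{PURPLE HEART}\N{PURPLE HEART}\N{PURPLE HEART} #tears
```

After this step the same text can be passed to emoji.demojize() or demoji.findall() like any other string.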


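To see which characters the text really contains, unicodedata can print the category and name of every character. Characters without a name (the newline and the surrogate halves) raise ValueError, so they are displayed with repr:

import unicodedata

# http://www.unicode.org/reports/tr44/#General_Category_Values

for char in text:
    try:
        print(char, '|', unicodedata.category(char), '|', unicodedata.name(char))
    except ValueError:
        print(repr(char), '| (repr)')

BTW: I found the module demoji, which can find emoji and give their names. But it also has a problem with the code
\ud83d\udc9c

import demoji

# run only once after installing module
# (newer demoji releases may bundle the data and not need this call)
demoji.download_codes()

print(demoji.findall(text))

download_codes() has to be run only once, after installing the module.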

Result:

{'❤️': 'red heart'}
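Note that demoji reports "red heart" for the same character that unicodedata calls HEAVY BLACK HEART: the Unicode character name and the emoji short name are different things. If only the standard library is available, a rough sketch along the following lines gives :NAME: style output without the VARIATION SELECTOR-16 noise. The function name name_symbols is hypothetical, it treats every character of category So as an emoji (an assumption that is too broad for general text), and it produces Unicode character names, not CLDR short names.

```python
import unicodedata

VS16 = '\ufe0f'  # VARIATION SELECTOR-16, asks for emoji presentation


def name_symbols(text):
    """Hypothetical helper: replace 'So' characters with :UNICODE NAME:."""
    out = []
    for ch in text:
        if ch == VS16:
            continue  # drop the selector instead of naming it
        if unicodedata.category(ch) == 'So':
            out.append(':' + unicodedata.name(ch, 'UNKNOWN') + ':')
        else:
            out.append(ch)
    return ''.join(out)


print(name_symbols('This was heart wrenching \u2764\ufe0f'))
# This was heart wrenching :HEAVY BLACK HEART:
```

Text that still contains surrogate halves such as \ud83d\udc9c has to be repaired first, otherwise the halves pass through unchanged.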

If you get it as JSON data, "\ud83d\udc9c", then you should not have a problem: json should convert it automatically.

import json

# escaped unicode in " "  
data = r'"\ud83d\udc9c"' 
print(json.loads(data))
In other cases you have to convert it yourself:

# convert to escaped unicode and put in " "  
data = '"{}"'.format('\ud83d\udc9c'.encode('unicode-escape').decode())
print(json.loads(data))
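One caveat with building the JSON string by hand: it appears to work for this example, but unicode-escape does not escape double quotes and writes some characters in forms that JSON does not accept, so other input may fail to parse. A possible variation, sketched here under the assumption that the text is an ordinary Python str containing surrogate pairs, is to let json.dumps() do the escaping and json.loads() do the merging. The name fix_via_json is made up for this example.

```python
import json


def fix_via_json(text):
    """Hypothetical helper: merge surrogate pairs with a JSON round trip."""
    # json.dumps() (ensure_ascii=True by default) writes every non-ASCII
    # code unit as \uXXXX and escapes quotes and backslashes;
    # json.loads() then joins each high/low surrogate pair.
    return json.loads(json.dumps(text))


print(ascii(fix_via_json('\ud83d\udc9c')))
# '\U0001f49c'
print(ascii(fix_via_json('say "hi" \ud83d\udc9c')))
# 'say "hi" \U0001f49c'
```

The result is the same as with the utf-16 / surrogatepass round trip; which one to use is mostly a matter of taste.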