Warning: file_get_contents(/data/phpspider/zhask/data//catemap/8/python-3.x/17.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Python 从文本中删除所有表情符号_Python_Python 3.x_Emoji_Python Unicode_Unicode String - Fatal编程技术网

Python 从文本中删除所有表情符号

Python 从文本中删除所有表情符号,python,python-3.x,emoji,python-unicode,unicode-string,Python,Python 3.x,Emoji,Python Unicode,Unicode String,这个问题在这里被问到了,但没有一个解决办法,我已经向解决办法迈出了一步。但我需要帮助完成它 我从表情符号网站获得了所有表情符号十六进制代码点: 然后我像这样读入文件: file = open('emoji-ordering.txt') temp = file.readline() final_list = [] while temp != '': #print(temp) if not temp[0] == '#' : utf_8_values = (

这个问题在这里被问到了,但没有一个解决办法,我已经向解决办法迈出了一步。但我需要帮助完成它

我从表情符号网站获得了所有表情符号十六进制代码点:

然后我像这样读入文件:

file = open('emoji-ordering.txt')
temp = file.readline()

final_list = []

while temp != '':
    #print(temp)
    if not temp[0] == '#' :
            utf_8_values = ((temp.split(';')[0]).rstrip()).split(' ')
            values = ["u\\"+(word[0]+((8 - len(word[2:]))*'0' + word[2:]).rstrip()) for word in utf_8_values]
            #print(values[0])
            final_list = final_list + values
    temp = file.readline()

print(final_list)

我希望这会给我unicode文本。事实并非如此,我的目标是获得unicode文本,这样我就可以使用上一个问题的部分解决方案,并能够排除所有表情符号。您知道我们需要什么解决方案吗?

首先安装表情符号:

pip install emoji
import emoji

def give_emoji_free_text(self, text):
    allchars = [str for str in text]
    emoji_list = [c for c in allchars if c in emoji.UNICODE_EMOJI]
    clean_text = ' '.join([str for str in text.split() if not any(i in str for i in emoji_list)])

    return clean_text

text = give_emoji_free_text(text)
emoji_pattern = re.compile("["
        u"\U0001F600-\U0001F64F"  # emoticons
        u"\U0001F300-\U0001F5FF"  # symbols & pictographs
        u"\U0001F680-\U0001F6FF"  # transport & map symbols
        u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
        u"\U0001F1F2-\U0001F1F4"  # Macau flag
        u"\U0001F1E6-\U0001F1FF"  # flags
        u"\U0001F600-\U0001F64F"
        u"\U00002702-\U000027B0"
        u"\U000024C2-\U0001F251"
        u"\U0001f926-\U0001f937"
        u"\U0001F1F2"
        u"\U0001F1F4"
        u"\U0001F620"
        u"\u200d"
        u"\u2640-\u2642"
        "]+", flags=re.UNICODE)

text = emoji_pattern.sub(r'', text)

这样做:

pip install emoji
import emoji

def give_emoji_free_text(self, text):
    allchars = [str for str in text]
    emoji_list = [c for c in allchars if c in emoji.UNICODE_EMOJI]
    clean_text = ' '.join([str for str in text.split() if not any(i in str for i in emoji_list)])

    return clean_text

text = give_emoji_free_text(text)
emoji_pattern = re.compile("["
        u"\U0001F600-\U0001F64F"  # emoticons
        u"\U0001F300-\U0001F5FF"  # symbols & pictographs
        u"\U0001F680-\U0001F6FF"  # transport & map symbols
        u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
        u"\U0001F1F2-\U0001F1F4"  # Macau flag
        u"\U0001F1E6-\U0001F1FF"  # flags
        u"\U0001F600-\U0001F64F"
        u"\U00002702-\U000027B0"
        u"\U000024C2-\U0001F251"
        u"\U0001f926-\U0001f937"
        u"\U0001F1F2"
        u"\U0001F1F4"
        u"\U0001F620"
        u"\u200d"
        u"\u2640-\u2642"
        "]+", flags=re.UNICODE)

text = emoji_pattern.sub(r'', text)
这是我的工作

或者您可以试试:

pip install emoji
import emoji

def give_emoji_free_text(self, text):
    allchars = [str for str in text]
    emoji_list = [c for c in allchars if c in emoji.UNICODE_EMOJI]
    clean_text = ' '.join([str for str in text.split() if not any(i in str for i in emoji_list)])

    return clean_text

text = give_emoji_free_text(text)
emoji_pattern = re.compile("["
        u"\U0001F600-\U0001F64F"  # emoticons
        u"\U0001F300-\U0001F5FF"  # symbols & pictographs
        u"\U0001F680-\U0001F6FF"  # transport & map symbols
        u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
        u"\U0001F1F2-\U0001F1F4"  # Macau flag
        u"\U0001F1E6-\U0001F1FF"  # flags
        u"\U0001F600-\U0001F64F"
        u"\U00002702-\U000027B0"
        u"\U000024C2-\U0001F251"
        u"\U0001f926-\U0001f937"
        u"\U0001F1F2"
        u"\U0001F1F4"
        u"\U0001F620"
        u"\u200d"
        u"\u2640-\u2642"
        "]+", flags=re.UNICODE)

text = emoji_pattern.sub(r'', text)

下面是一个使用表情库的
get\u emoji\u regexp()
的Python脚本

它从一个文件中读取文本,并将表情符号自由文本写入另一个文件

import emoji
import re

def strip_emoji(text):
    print(emoji.emoji_count(text))
    new_text = re.sub(emoji.get_emoji_regexp(), r"", text)
    return new_text


with open("my_file.md", "r") as file:
    old_text = file.read()

no_emoji_text = strip_emoji(old_text)

with open("file.md", "w+") as new_file:
    new_file.write(no_emoji_text)