Python: replace punctuation with spaces

Tags: python, string, python-3.x

I have a problem with my code and don't know how to move forward.

tweet = "I am tired! I like fruit...and milk"
clean_words = tweet.translate(None, ",.;@#?!&$")
words = clean_words.split()

print tweet
print words
Output:

['I', 'am', 'tired', 'I', 'like', 'fruitand', 'milk']

I want to replace the punctuation with spaces, but I don't know which function or loop to use. Can anyone help me?

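There are several ways to solve this. I have one that works, but I believe it is suboptimal. Hopefully someone who knows regex better will come along and improve this answer or offer a better one.

Your question is tagged python-3.x, but your code is Python 2.x, so my code is 2.x as well. I have included a 3.x version as a comment.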

#!/usr/bin/env python

import re

tweet = "I am tired! I like fruit...and milk"
# print tweet

clean_words = tweet.translate(None, ",.;@#?!&$")  # Python 2
# clean_words = tweet.translate(str.maketrans("", "", ",.;@#?!&$"))  # Python 3
print(clean_words)  # Deletes punctuation, so "fruit...and" becomes "fruitand"

regex_sub = re.sub(r"[,.;@#?!&$]+", ' ', tweet)  # + means match one or more
print(regex_sub)  # extra space between tired and I

regex_sub = re.sub(r"\s+", ' ', regex_sub)  # Collapses each run of whitespace into one space
print(regex_sub)  # looks good
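
The two regex steps above can be sketched for Python 3 as a single helper. This is only a minimal sketch: the function name `punct_to_spaces` and the final `strip()` are my own assumptions, not part of the answer, and the character set is the one from the question.

```python
import re

# Same punctuation set as in the question; extend it as needed.
PUNCT_RUN = r"[,.;@#?!&$]+"


def punct_to_spaces(text):
    """Replace each run of listed punctuation with a space, then collapse whitespace."""
    spaced = re.sub(PUNCT_RUN, " ", text)
    # strip() is an addition here so trailing punctuation leaves no trailing space.
    return re.sub(r"\s+", " ", spaced).strip()


tweet = "I am tired! I like fruit...and milk"
cleaned = punct_to_spaces(tweet)
print(cleaned)
print(cleaned.split())
```

Since `split()` with no arguments already ignores repeated whitespace, the second substitution only matters if you need the cleaned string itself and not just the word list.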

If you are using Python 2.x, you can try:

import string

tweet = "I am tired! I like fruit...and milk"
clean_words = tweet.translate(string.maketrans("",""), string.punctuation)

print clean_words
For Python 3.x, this works:

import string

tweet = "I am tired! I like fruit...and milk"
transtable = str.maketrans('', '', string.punctuation)
clean_words = tweet.translate(transtable)

print(clean_words)

These snippets remove all punctuation from the string.
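
Because removing punctuation is not the same as replacing it with spaces, a side-by-side comparison may help. The following is a hedged Python 3 sketch, not the answerer's code: it uses the dictionary form of `str.maketrans` to map each character in `string.punctuation` to a space, and the variable names are illustrative.

```python
import string

tweet = "I am tired! I like fruit...and milk"

# Deleting punctuation (third argument of str.maketrans) glues neighbours together.
deleted = tweet.translate(str.maketrans("", "", string.punctuation))

# Mapping every punctuation character to a space keeps the word boundary.
spaced = tweet.translate(str.maketrans({ch: " " for ch in string.punctuation}))

print(deleted.split())
print(spaced.split())
```

The first list still contains `fruitand`, while the second separates `fruit` and `and`, which is what the question asks for.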

Here is a regex-based solution, tested under Python 3.5.1. I think it is both simple and concise.

import re

tweet = "I am tired! I like fruit...and milk"
clean = re.sub(r"""
               [,.;@#?!&$]+  # Accept one or more copies of punctuation
               \ *           # plus zero or more copies of a space,
               """,
               " ",          # and replace it with a single space
               tweet, flags=re.VERBOSE)
print(tweet + "\n" + clean)
Result:

I am tired! I like fruit...and milk
I am tired I like fruit and milk
Compact version:

import re

tweet = "I am tired! I like fruit...and milk"
clean = re.sub(r"[,.;@#?!&$]+\ *", " ", tweet)
print(tweet + "\n" + clean)
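
If the goal is the word list from the question and not a cleaned string, one alternative (a different technique from the substitution above, offered only as a sketch) is to match the words directly with `re.findall`. The pattern assumes the same punctuation set as the question.

```python
import re

tweet = "I am tired! I like fruit...and milk"

# Match runs of characters that are neither whitespace nor listed punctuation.
words = re.findall(r"[^\s,.;@#?!&$]+", tweet)
print(words)
```

This skips the intermediate string entirely, so there is no extra or trailing space to clean up.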

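This is easy to achieve by changing the maketrans call as follows:

import string

tweet = "I am tired! I like fruit...and milk"
# Map every punctuation character to a space.
# string.maketrans is the Python 2 spelling; on Python 3 use str.maketrans instead.
translator = string.maketrans(string.punctuation, ' ' * len(string.punctuation))
print(tweet.translate(translator))

This runs on my machines under Python 2.x as written, and under Python 3.5.2 with str.maketrans in place of string.maketrans.
I hope it works for your punctuation too.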

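This replaces punctuation with nothing, which produces the single word "fruitand". Be careful: that is not what was asked. The question wants the punctuation replaced with spaces.

Not sure about Python 3, but for Python 2.7.x change
str.maketrans(…)
to
string.maketrans(…)

For Python 3, use str.maketrans instead of string.maketrans.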
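
The version difference raised in these comments could be handled with a small shim. This is a sketch under the assumption that you need one script for both interpreters; the `maketrans` name bound below is my own and not from the thread.

```python
import string

# Hypothetical compatibility shim: str.maketrans exists on Python 3,
# string.maketrans on Python 2. The "or" only evaluates the fallback on Python 2.
maketrans = getattr(str, "maketrans", None) or string.maketrans

# Build a table that maps each punctuation character to a space.
table = maketrans(string.punctuation, " " * len(string.punctuation))

tweet = "I am tired! I like fruit...and milk"
words = tweet.translate(table).split()
print(words)
```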