Python: replace periods and commas with spaces in every file in a folder, then print the files


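I have a folder containing a number of files. Each file holds a text string that includes periods and commas. I want to replace the periods and commas with spaces and then print all the files.

However, after replacing the commas and periods with spaces and printing the files, I get the output shown in the image. I don't want the periods and commas to be printed; I want to print the files after the commas and periods have been replaced.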

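Your problem is not the replace call but the loop for word in text, which iterates over the text character by character instead of word by word.

This code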
stopwords = ['and', 'or']

text = "Hello World. Bye."
tokens_without_sw = [word for word in text if word not in stopwords]
print(tokens_without_sw)

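gives
['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '.', ' ', 'B', 'y', 'e', '.']
but you expect ['Hello', 'World', 'Bye'].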

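You should use text.split(' ') to get the words, but replace the commas and periods first. The text may contain words with no space between them, such as "World.Bye", which would otherwise be treated as a single word. Also, if you do not replace the commas and periods before filtering against the stop word list, stop words followed by a period or comma, such as "or...", may be kept.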

stopwords = ['and', 'or']

text = "Hello World. Bye."

text = text.replace('.', ' ').replace(',', ' ')

text = text.replace('  ', ' ')  # convert double space into single space

text = text.strip()  # remove leading and trailing spaces

tokens_without_sw = [word for word in text.split(' ') if word not in stopwords]

print(tokens_without_sw)

Result:
['Hello', 'World', 'Bye']
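
As a possible variation on the steps above, the double-space cleanup and the strip() call can be avoided by calling split() with no argument, which splits on any run of whitespace and drops empty strings. The sketch below is only an illustration; the helper name tokenize is hypothetical and does not come from the original answer.

```python
stopwords = ['and', 'or']


def tokenize(text):
    """Return the words of text with '.' and ',' removed and stop words dropped.

    tokenize is a hypothetical helper name used only for this sketch.
    """
    # Map each period and comma to a space in a single pass.
    table = str.maketrans({'.': ' ', ',': ' '})
    cleaned = text.translate(table)
    # split() with no argument splits on runs of whitespace,
    # so no empty strings appear and no strip() is needed.
    return [word for word in cleaned.split() if word not in stopwords]


print(tokenize("Hello World. Bye."))
```

Because split() ignores repeated and trailing whitespace, text such as "World.Bye" or "or..." should be handled without an extra normalisation step.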


An example that uses "or..." in the text and a regex to collapse runs of multiple spaces:
import re

stopwords = ['and', 'or']

text = "Hello World. or... Bye."

text = text.replace('.', ' ').replace(',', ' ')
#text = re.sub(r'\.|,', ' ', text)

#text = text.replace('  ', ' ')  # convert double space into single space
text = re.sub(r'\s+', ' ', text)  # convert many spaces into single space

text = text.strip()  # remove leading and trailing spaces

tokens_without_sw = [word for word in text.split(' ') if word not in stopwords]

print(tokens_without_sw)
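
To connect this back to the original question, the same cleanup could be applied to every file in a folder. The following is a minimal sketch under stated assumptions: the files are assumed to be UTF-8 text, and the names clean_text and print_cleaned_files are hypothetical helpers, not part of the asker's code.

```python
import re
from pathlib import Path


def clean_text(text):
    """Replace periods and commas with spaces and collapse repeated whitespace."""
    text = re.sub(r'[.,]', ' ', text)
    # split() + join() collapses whitespace runs and trims both ends.
    return ' '.join(text.split())


def print_cleaned_files(folder):
    """Print the cleaned content of every regular file in folder.

    Assumes each file is UTF-8 text (an assumption of this sketch).
    Returns a dict mapping file name to cleaned text.
    """
    cleaned = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            cleaned[path.name] = clean_text(path.read_text(encoding='utf-8'))
            print(path.name, cleaned[path.name])
    return cleaned
```

It could be called as print_cleaned_files('my_folder'), where 'my_folder' stands in for the actual folder path. The files on disk are left unchanged; only the printed text is cleaned.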

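You print tokens_without_sw, but you should print filtered_sentence.

If text is a string, then for word in text iterates over single characters, not words. You can use text.split(' ') to split it into words, but you should replace the commas and periods first. The code [word for word in "Hello World"] gives ['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd'], not ['Hello', 'World'].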