Python: creating a function that counts words and characters (including punctuation, but excluding whitespace)

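I need to write a function that counts the number of characters (including punctuation, excluding whitespace) and the number of words in a given phrase. So far I have created a function that can count characters, but it also includes spaces and does not count words. How can I exclude whitespace and implement the word count?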

text = " If I compare myself to someone else, then I am playing a game I will never win. "
def count_chars_words(txt):
    chars = len(txt.replace(' ',''))
    words = len(txt.split(' '))
    return [words,chars]

print(count_chars_words(text))


Output: [19, 63]

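The function string.split() may be useful to you! It takes a string, finds every instance of whatever you pass to it (for example " "), and splits the string into a list of the groups of characters that were separated by " " (basically, it separates the string by word). With this, you should be able to continue.

"If I compare myself to someone else, then I am playing a game I will never win.".split(" ")

gives

['If', 'I', 'compare', 'myself', 'to', 'someone', 'else,', 'then', 'I', 'am', 'playing', 'a', 'game', 'I', 'will', 'never', 'win.']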
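One caveat, offered as a small sketch rather than as part of the answer above: split(" ") splits at every single space, so leading, trailing, or doubled spaces produce empty strings that inflate the count, while split() with no argument splits on runs of whitespace and drops the empty strings. The variable names below are made up for illustration.

```python
# Sample with a leading and a trailing space, like the text in the question.
padded = " If I compare myself to someone else, then I am playing a game I will never win. "

by_space = padded.split(' ')    # splits at every single space; keeps empty strings
by_whitespace = padded.split()  # splits on runs of whitespace; no empty strings

print(len(by_space))                          # 19
print(len(by_whitespace))                     # 17
print(by_space[0] == '', by_space[-1] == '')  # True True
```

This would explain why the question's code reports 19 words for a 17-word sentence.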

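To avoid counting whitespace, have you considered using an if statement? You may find the in operator useful here.
As for counting words, split() is your friend. In fact, if you split the text into words first, is there a simple way to avoid the if mentioned above?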
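The hints above could be turned into code roughly as follows. This is only a sketch: it assumes "whitespace" means the characters in string.whitespace, and the function names count_chars_with_if and count_chars_via_split are hypothetical.

```python
import string

def count_chars_with_if(txt):
    """Count characters, skipping whitespace with an if test and the `in` operator."""
    count = 0
    for ch in txt:
        if ch not in string.whitespace:
            count += 1
    return count

def count_chars_via_split(txt):
    """Split into words first, then add up the word lengths; no if is needed."""
    return sum(len(word) for word in txt.split())

text = "If I compare myself to someone else, then I am playing a game I will never win."
print(count_chars_with_if(text))    # 63
print(count_chars_via_split(text))  # 63
print(len(text.split()))            # 17
```

Splitting first gives both numbers from the same list of words, which is probably what the second hint is pointing at.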
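Count the characters by stripping the spaces from the text with replace(' ', '') and then taking the length of the resulting string.
Count the words by splitting the sentence into a list of words and checking the length of the list.
Then return both values in a list.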
text = "If I compare myself to someone else, then I am playing a game I will never win."
def count_chars_words(txt):
    chars = len(txt.replace(' ',''))
    words = len(txt.split(' '))
    return [words,chars]

print(count_chars_words(text))

Output:
[17, 63]

To understand replace() and split():
>>> text.replace(' ','')
'IfIcomparemyselftosomeoneelse,thenIamplayingagameIwillneverwin.'
>>> text.split(' ')
['If', 'I', 'compare', 'myself', 'to', 'someone', 'else,', 'then', 'I', 'am', 'playing', 'a', 'game', 'I', 'will', 'never', 'win.']

This is just an idea, not an efficient approach. If you need a good approach, use regex:
text = "If I compare myself to someone else, then I am playing a game I will never win."

total_num = len(text)
spaces = len([s for s in text if s == ' '])
words = len([w for w in text.split()])

print('total characters = ', total_num)
print('words = ', words)
print('spaces=', spaces)
print('characters w/o spaces = ', total_num - spaces)

Output:
total characters =  79
words =  17
spaces= 16
characters w/o spaces =  63

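Edit: using regular expressions, which is more efficient:
import re

chars_without_spaces = re.findall(r'[^\s]', text)  # list of non-whitespace characters; take len() for the count
words = re.findall(r'\b\w+', text)  # list of words; take len() for the count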
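For completeness, the regex idea could be wrapped in the [words, chars] function the question asks for. This is a sketch, not the answerer's code: it swaps in \S+ (a run of non-whitespace characters) as the definition of a word, which is an assumption, because \b\w+ would split a word such as "don't" into two matches.

```python
import re

def count_chars_words(txt):
    """Return [words, chars], where chars excludes all whitespace."""
    chars = len(re.findall(r'\S', txt))   # \S matches one non-whitespace character
    words = len(re.findall(r'\S+', txt))  # assumption: a word is a run of non-whitespace
    return [words, chars]

text = " If I compare myself to someone else, then I am playing a game I will never win. "
print(count_chars_words(text))          # [17, 63]
print(count_chars_words("don't stop"))  # [2, 9]
```

Unlike split(' '), this version is not thrown off by the leading and trailing spaces in the question's text.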

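There is no need to loop over the text twice here.
@glhr Where is the loop here?
You have two list comprehensions.
Yes, as I mentioned at the top of my answer, the whole approach is not efficient for text processing. It is better to use regex.
Wow, I never knew about replace or split. This is very useful, thanks!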