Average number of characters per word in a Python list

python regex string python-3.x

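I'm new to Python and need to calculate the average number of characters per word in a list, using these definitions and the helper function clean_up.

A token is a str that you get by calling the string method split on a line of a file.

A word is a non-empty token from the file that isn't made up entirely of punctuation. Find the "words" in a file by using str.split to find the tokens, then removing the punctuation from them with the helper function clean_up.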

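A sentence is a sequence of characters terminated by (but not including) one of the characters !, ? or ., or by the end of the file. It excludes whitespace on either end and is not empty.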

This is a homework question from my college computer science class.

The clean_up function is:

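def clean_up(s):
    # Characters stripped from both ends of a token.
    # The trailing \\ is a literal backslash, so the string closes properly.
    punctuation = """!"',;:.-?)([]<>*#\n\\"""
    result = s.lower().strip(punctuation)
    return result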

I get 5.0, but I'm really confused. Any help would be greatly appreciated :)

PS: I'm using Python 3.

Let's clean up some of these functions with imports and generator expressions, shall we?

import string

def clean_up(s):
    # I'm assuming you REQUIRE this function as per your assignment
    # otherwise, just substitute str.strip(string.punctuation) anywhere
    # you'd otherwise call clean_up(str)
    return s.strip(string.punctuation)

def average_word_length(text):
    total_length = sum(len(clean_up(word)) for sentence in text for word in sentence.split())
    num_words = sum(len(sentence.split()) for sentence in text)
    return total_length/num_words

You may notice that this actually condenses down to one long, unreadable line:

average = sum(len(word.strip(string.punctuation)) for sentence in text for word in sentence.split()) / sum(len(sentence.split()) for sentence in text)

That's gross, which is why you shouldn't do it ;). Readability counts.

Here is a short and sweet way to solve your problem that is still readable:
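As a quick sanity check, the readable function and the one-liner can be run side by side. This is only a sketch: the `sample` list is a hypothetical input standing in for lines read from a file, and the function is repeated here so the snippet runs on its own.

```python
import string

def average_word_length(text):
    # Sum of cleaned word lengths divided by the number of whitespace-separated tokens.
    total_length = sum(len(word.strip(string.punctuation))
                       for sentence in text for word in sentence.split())
    num_words = sum(len(sentence.split()) for sentence in text)
    return total_length / num_words

# Hypothetical sample input (an assumption, not the asker's data).
sample = ['James Fennimore Cooper\n', 'Peter, Paul and Mary\n']

# The same computation squeezed into a single expression.
one_liner = sum(len(word.strip(string.punctuation)) for sentence in sample for word in sentence.split()) / sum(len(sentence.split()) for sentence in sample)

# Both forms perform the same arithmetic, so they agree exactly.
print(average_word_length(sample) == one_liner)
```

Both versions split every line twice, once for the lengths and once for the count, which is one more reason to prefer the named function where that cost is at least visible.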

def clean_up(word, punctuation="!\"',;:.-?)([]<>*#\n\\"):
    return word.lower().strip(punctuation)  # you don't really need ".lower()"

def average_word_length(text):
    cleaned_words = [clean_up(w) for w in (w for l in text for w in l.split())]
    return sum(map(len, cleaned_words))/len(cleaned_words)  # Python2 use float

>>> average_word_length(['James Fennimore Cooper\n', 'Peter, Paul and Mary\n'])
5.142857142857143


The burden of meeting all those preconditions falls on you.
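One precondition that can be handled explicitly is the question's definition of a word: a token that becomes empty after cleaning (for example `--` or `...`) should not count. A minimal sketch under that reading follows. The names `PUNCTUATION` and `words_in` are hypothetical helpers introduced for illustration, and returning 0.0 for input with no words is an assumption, not something the assignment specifies.

```python
PUNCTUATION = "!\"',;:.-?)([]<>*#\n\\"

def clean_up(token, punctuation=PUNCTUATION):
    # Lowercase the token and strip punctuation from both ends.
    return token.lower().strip(punctuation)

def words_in(lines):
    # Yield each cleaned token that is still non-empty,
    # i.e. skip tokens made up entirely of punctuation.
    for line in lines:
        for token in line.split():
            word = clean_up(token)
            if word:
                yield word

def average_word_length(lines):
    lengths = [len(word) for word in words_in(lines)]
    if not lengths:
        return 0.0  # assumption: no words means an average of 0.0
    return sum(lengths) / len(lengths)

# 'wait' and 'what' are the only words; '--' and '...' are skipped.
print(average_word_length(['Wait -- what?\n', '...\n']))
```

Without the `if word` filter, the punctuation-only tokens would be counted as zero-length words and pull the average down.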

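You want float(len(word)).

@Hoopdady The OP is using Python 3.

This will only give the average for the last item in the input, because the average is computed inside the for loop. This would be a good place for a generator.

@Hoopdady Yes, that's why from __future__ import division works in Python 2. Python 3 is the future!!! :)

@ashwinaudichhary Or a list comprehension. :)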
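The point about float in the comments comes down to how `/` behaves. In Python 3 it is always true division, while `//` floors the result; converting one operand to float is the portable spelling that also gives true division on Python 2. A small sketch, where `mean_length` is a hypothetical helper name and the numbers are illustrative:

```python
def mean_length(total_length, num_words):
    # float() makes the division a true division on Python 2 as well;
    # on Python 3 it is harmless because / is already true division.
    return float(total_length) / num_words

total_length, num_words = 36, 7

print(total_length / num_words)              # true division in Python 3
print(total_length // num_words)             # floor division, prints 5
print(mean_length(total_length, num_words))  # same value as the first line
```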