Python: average number of phrases per sentence


I am given these two functions:

def split_on_separators(original, separators):
    """ (str, str) -> list of str

    Return a list of non-empty, non-blank strings from the original string
    determined by splitting the string on any of the separators.
    separators is a string of single-character separators.

    >>> split_on_separators("Hooray! Finally, we're done.", "!,")
    ['Hooray', ' Finally', " we're done."]
    """

    # To do: Complete this function's body to meet its specification.
    # You are not required to keep the two lines below but you may find
    # them helpful. (Hint)
    for i in separators:
        original = original.replace(i, "<*)))>{")
        ret = original.split("<*)))>{")
    return ret

def clean_up(s):
    """ (str) -> str

    Return a new string based on s in which all letters have been
    converted to lowercase and punctuation characters have been stripped
    from both ends. Inner punctuation is left untouched.

    >>> clean_up('Happy Birthday!!!')
    'happy birthday'
    >>> clean_up("-> It's on your left-hand side.")
    " it's on your left-hand side"
    """

    punctuation = """!"',;:.-?)([]<>*#\n\t\r"""
    result = s.lower().strip(punctuation)
    return result
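My avg_sentence_complexity function (listed at the end of this post) only gives me 3.0, which is the total number of phrases divided by the total number of sentences.
My question is how to compute (number of phrases in the first sentence) / (total number of sentences) + (number of phrases in the second sentence) / (total number of sentences) + ...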

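I mean, technically, as you describe it, you are just computing 1 / total_sentences * num_phrases, which equals num_phrases / total_sentences, because each phrase, as I understand it, simply counts as 1.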
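
The identity is easy to check numerically. Below is a minimal sketch, assuming hypothetical per-sentence phrase counts of 5 and 2 (chosen so that the result matches the 3.5 expected by the docstring example). Summing the per-sentence terms gives the same value as dividing the total by the sentence count, so the formula itself is not what changes the answer. What matters is how the phrases are counted.

```python
# Hypothetical per-sentence phrase counts, assumed purely for illustration.
phrase_counts = [5, 2]
total_sentences = len(phrase_counts)

# (phrases in sentence 1) / N + (phrases in sentence 2) / N + ...
term_by_term = sum(count / total_sentences for count in phrase_counts)

# (total phrases) / N
all_at_once = sum(phrase_counts) / total_sentences

print(term_by_term, all_at_once)  # 3.5 3.5
```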

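What you really want to do is count the number of phrases in each sentence. You can then use numpy.mean on the list of phrase counts to find the average phrase count.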
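
One way that idea could look is sketched below. It is only an illustration under stated assumptions: split_on_any and phrases_per_sentence are hypothetical helpers (not the functions required by the assignment), and statistics.mean from the standard library is swapped in for numpy.mean so the example has no third-party dependency.

```python
from statistics import mean


def split_on_any(text, separators):
    """Split text on any single-character separator, dropping blank pieces.

    Hypothetical stand-in for split_on_separators, assumed for this sketch.
    """
    pieces = [text]
    for sep in separators:
        pieces = [part for piece in pieces for part in piece.split(sep)]
    return [piece for piece in pieces if piece.strip()]


def phrases_per_sentence(lines):
    """Return one phrase count per sentence (hypothetical helper)."""
    sentences = split_on_any(''.join(lines), '?!.')
    # Split each sentence separately so that a phrase never spans
    # a sentence boundary.
    return [len(split_on_any(sentence, ',;:')) for sentence in sentences]


text = ['The time has come, the Walrus said\n',
        'To talk of many things: of shoes - and ships - and sealing wax,\n',
        'Of cabbages; and kings.\n',
        'And why the sea is boiling hot;\n',
        'and whether pigs have wings.\n']

counts = phrases_per_sentence(text)
print(counts)        # [5, 2]
print(mean(counts))  # 3.5
```

Splitting the whole text on ,;: in one pass instead yields only 6 phrases, because "and kings.\nAnd why the sea is boiling hot" is never broken at the sentence boundary, and 6 / 2 gives the 3.0 reported in the question.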


I won't be more specific, since this is obviously homework :p

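Instead of

    for lines in text:
        huge_str += lines

you can simply write huge_str = ''.join(text) on the line where you declare it. Similarly, the if '' in clean_sentences clause is useless: for any string s, '' in s is True. Try it.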
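
Both points can be tried in a few lines. The sketch below uses a made-up two-line text, assumed only for illustration. Note that clean_sentences in the question is a list rather than a string, so the in test there compares whole elements instead of searching for a substring; the last two lines show the difference.

```python
# Made-up sample input, assumed only for this illustration.
text = ['first line\n', 'second line\n']

# Concatenating in a loop and str.join build the same string.
huge_str = ''
for line in text:
    huge_str += line
joined = ''.join(text)
print(huge_str == joined)  # True

# The empty string is a substring of every string ...
print('' in 'abc')  # True

# ... but membership in a list compares whole elements.
print('' in ['abc', 'def'])  # False
print('' in ['abc', ''])     # True
```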
def avg_sentence_complexity(text):
    """ (list of str) -> float

    Return the average number of phrases per sentence.

    A sentence is defined as a non-empty string of non-terminating
    punctuation surrounded by terminating punctuation
    or beginning or end of file. Terminating punctuation is defined as !?.
    Phrases are substrings of sentences, separated by one or more of the
    following delimiters ,;:

    >>> text = ['The time has come, the Walrus said\n',
         'To talk of many things: of shoes - and ships - and sealing wax,\n',
         'Of cabbages; and kings.\n',
         'And why the sea is boiling hot;\n',
         'and whether pigs have wings.\n']
    >>> avg_sentence_complexity(text)
    3.5
    """

    huge_str = ''
    clean_sentences = []
    for lines in text:
        huge_str += lines
    list_of_sentences = split_on_separators(huge_str, '?!.')
    for strings in list_of_sentences:
        cleaned = clean_up(strings)
        clean_sentences.append(cleaned)
        if '' in clean_sentences:
            clean_sentences.remove('')
    num_sentences = len(clean_sentences)

    large = ''
    for phrases in text:
        large += phrases
    list_of_phrases = split_on_separators(large, ',;:')
    num_phrases = len(list_of_phrases)

    asc = num_phrases / num_sentences
    return asc