Python: reducing cyclomatic complexity
I need help reducing the cyclomatic complexity of the following code:

import numpy

def avg_title_vec(record, lookup):
    avg_vec = []
    word_vectors = []
    # Collect the vector of every title word that has an entry in the lookup.
    for tag in record['all_titles']:
        titles = clean_token(tag).split()
        for word in titles:
            if word in lookup.value:
                word_vectors.append(lookup.value[word])
    # Average the collected vectors component-wise, if there are any.
    if len(word_vectors):
        avg_vec = [
            float(val) for val in numpy.mean(
                numpy.array(word_vectors),
                axis=0)]
    output = (record['id'],
              ','.join([str(a) for a in avg_vec]))
    return output

Example input:

record = {'all_titles': ['hello world', 'hi world', 'bye world']}
lookup.value = {'hello': [0.1, 0.2], 'world': [0.2, 0.3], 'bye': [0.9, -0.1]}

def clean_token(input_string):
    return input_string.replace("-", " ").replace("/", " ").replace(
        ":", " ").replace(",", " ").replace(";", " ").replace(
        ".", " ").replace("(", " ").replace(")", " ").lower()

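So for every word that appears in lookup.value, I take the average of the corresponding vectors.

This may not really count as a correct answer, because in the end the cyclomatic complexity is not reduced. This variant is a little shorter, but I can't see any way to generalize it further. It also seems that you do need the ifs you already have.
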
def avg_title_vec(record, lookup):
    word_vectors = [lookup.value[word] for tag in record['all_titles']
                    for word in clean_token(tag).split() if word in lookup.value]
    if not word_vectors:
        return (record['id'], None)
    avg_vec = [float(val) for val in numpy.mean(
        numpy.array(word_vectors),
        axis=0)]
    output = (record['id'],
              ','.join([str(a) for a in avg_vec]))
    return output

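To make it easier to see what the shorter variant computes, here is a minimal sketch that runs it on the sample input from the question. Several things are assumptions made only for illustration: the 'id' value in the record is made up (the sample input has none), lookup is faked with types.SimpleNamespace since only its .value attribute is used, and statistics.fmean is swapped in for numpy.mean so the sketch needs nothing beyond the standard library.

```python
from statistics import fmean
from types import SimpleNamespace


def clean_token(input_string):
    # Same separators as the question's clean_token, replaced one at a time.
    for char in "-/:,;.()":
        input_string = input_string.replace(char, " ")
    return input_string.lower()


def avg_title_vec(record, lookup):
    # One comprehension replaces the two nested loops and the inner if.
    word_vectors = [lookup.value[word] for tag in record['all_titles']
                    for word in clean_token(tag).split() if word in lookup.value]
    if not word_vectors:
        return (record['id'], None)
    # zip(*word_vectors) groups the values by component, so each component
    # is averaged separately (stand-in for numpy.mean(..., axis=0)).
    avg_vec = [fmean(column) for column in zip(*word_vectors)]
    return (record['id'], ','.join(str(a) for a in avg_vec))


# Sample input from the question; the 'id' value is a made-up placeholder.
record = {'id': 1, 'all_titles': ['hello world', 'hi world', 'bye world']}
lookup = SimpleNamespace(value={'hello': [0.1, 0.2],
                                'world': [0.2, 0.3],
                                'bye': [0.9, -0.1]})

record_id, joined = avg_title_vec(record, lookup)
averages = [float(x) for x in joined.split(',')]
print(record_id, averages)
```

'hi' has no entry in the lookup, so five vectors are averaged (one for 'hello', three for 'world', one for 'bye'), which gives roughly 0.32 and 0.2 up to floating-point rounding.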
Your CC is 6, which is already good. You can reduce the CC of the function itself by moving work into helper functions, for example:

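import re

def get_tags(record):
    return [tag for tag in record['all_titles']]

def sanitize_and_split_tags(tags):
    # Replace the separator characters with spaces, lowercase, split into words.
    return [word for tag in tags for word in
            re.sub(r'[\-/:,;\.()]', ' ', tag).lower().split()]

def get_vectors_words(words, lookup):
    # lookup is passed in explicitly so the helper does not rely on a global.
    return [lookup.value[word] for word in words if word in lookup.value]
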
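That lowers the average CC per function, but the total CC stays the same or goes up. I don't see how you could get rid of the ifs that check whether a word is in lookup.value, or the check for whether there are any vectors to work with.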
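
The trade-off between average and total CC can be checked with a small counter. The sketch below is a simplified approximation built on the standard ast module, not the rule set of any particular tool, so its numbers may differ from the 6 quoted above or from what a tool such as radon reports. The two source strings are only parsed, never executed, and the lookup parameter on get_vectors_words is an assumption added for this illustration.

```python
import ast

# Statement and expression types counted as one decision point each.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(func_node):
    """Approximate CC: 1 plus the number of decision points in the function."""
    complexity = 1
    for node in ast.walk(func_node):
        if isinstance(node, ast.comprehension):
            # Each `for` clause of a comprehension counts once, plus its ifs.
            complexity += 1 + len(node.ifs)
        elif isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


def complexity_by_function(source):
    """Map each top-level function name in `source` to its approximate CC."""
    tree = ast.parse(source)
    return {node.name: cyclomatic_complexity(node)
            for node in tree.body if isinstance(node, ast.FunctionDef)}


MONOLITHIC = """
def avg_title_vec(record, lookup):
    word_vectors = [lookup.value[word] for tag in record['all_titles']
                    for word in clean_token(tag).split() if word in lookup.value]
    if not word_vectors:
        return (record['id'], None)
    avg_vec = [float(val) for val in numpy.mean(numpy.array(word_vectors), axis=0)]
    return (record['id'], ','.join([str(a) for a in avg_vec]))
"""

SPLIT = """
def get_tags(record):
    return [tag for tag in record['all_titles']]

def sanitize_and_split_tags(tags):
    return [word for tag in tags for word in
            re.sub(r'[-/:,;.()]', ' ', tag).lower().split()]

def get_vectors_words(words, lookup):
    return [lookup.value[word] for word in words if word in lookup.value]

def avg_title_vec(record, lookup):
    words = sanitize_and_split_tags(get_tags(record))
    word_vectors = get_vectors_words(words, lookup)
    if not word_vectors:
        return (record['id'], None)
    avg_vec = [float(val) for val in numpy.mean(numpy.array(word_vectors), axis=0)]
    return (record['id'], ','.join([str(a) for a in avg_vec]))
"""

for label, source in (("monolithic", MONOLITHIC), ("split", SPLIT)):
    scores = complexity_by_function(source)
    total = sum(scores.values())
    print(label, scores, "total:", total, "average:", total / len(scores))
```

Under these counting rules the single function scores 7, while the split version scores 2, 3, 3 and 4: the average per function drops from 7 to 3, but the total rises from 7 to 12, which is the effect described above.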
或检查我们是否有任何向量可以使用。你介意解释一下代码首先要做什么吗?添加了一些更详细的信息我从一开始就尝试过自己编写代码,最后得到了相同的代码:)你介意吗介意先解释一下代码要做什么吗?添加了更多的细节。我从一开始就尝试自己编写代码,最后得到了相同的代码:)