Python: Reducing cyclomatic complexity


I need help reducing the cyclomatic complexity of the following code:

import numpy


def avg_title_vec(record, lookup):
    avg_vec = []
    word_vectors = []
    for tag in record['all_titles']:
        titles = clean_token(tag).split()
        for word in titles:
            if word in lookup.value:
                word_vectors.append(lookup.value[word])
    if len(word_vectors):
        avg_vec = [
            float(val) for val in numpy.mean(
                numpy.array(word_vectors),
                axis=0)]

    output = (record['id'],
              ','.join([str(a) for a in avg_vec]))
    return output

Example input:

record = {'all_titles': ['hello world', 'hi world', 'bye world']}

lookup.value = {'hello': [0.1, 0.2], 'world': [0.2, 0.3], 'bye': [0.9, -0.1]}

def clean_token(input_string):
    return input_string.replace("-", " ").replace("/", " ").replace(
    ":", " ").replace(",", " ").replace(";", " ").replace(
    ".", " ").replace("(", " ").replace(")", " ").lower()

So for every word that appears in lookup.value, I take its vector and then average all of those vectors.
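To make the intended result concrete, here is a minimal sketch that works through the example input by hand. It rests on several assumptions: `lookup` is modelled as a `SimpleNamespace` exposing a dict through `.value`, the `'id'` value is made up because the example input omits it, and the column-wise mean is computed with plain Python rather than `numpy.mean` so the snippet has no dependencies.

```python
from types import SimpleNamespace

# Assumption: `lookup` is any object exposing a dict through `.value`.
lookup = SimpleNamespace(value={'hello': [0.1, 0.2],
                                'world': [0.2, 0.3],
                                'bye': [0.9, -0.1]})
# Assumption: the 'id' value is invented; the example input has no 'id' key.
record = {'id': 'r1',
          'all_titles': ['hello world', 'hi world', 'bye world']}

# These titles contain no punctuation, so lower-casing and splitting yields
# the same words that clean_token(tag).split() would.
words = [word for tag in record['all_titles'] for word in tag.lower().split()]

# Keep one vector per occurrence of a known word ('hi' is dropped).
known = [lookup.value[word] for word in words if word in lookup.value]

# Column-wise mean, the same intent as numpy.mean(..., axis=0).
avg_vec = [sum(column) / len(column) for column in zip(*known)]

print(words)                          # ['hello', 'world', 'hi', 'world', 'bye', 'world']
print(len(known))                     # 5
print([round(v, 6) for v in avg_vec]) # [0.32, 0.2]
```

Note that 'world' contributes three times to the average because it occurs in three titles.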

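This may not really count as a proper answer, because the cyclomatic complexity does not end up any lower.

This variant is a little shorter, but I can't see any way to generalize it. It also seems that you need the if checks you already have.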


def avg_title_vec(record, lookup):
    word_vectors = [lookup.value[word] for tag in record['all_titles']
                    for word in clean_token(tag).split() if word in lookup.value]
    if not word_vectors:
        return (record['id'], None)
    avg_vec = [float(val) for val in numpy.mean(
               numpy.array(word_vectors),
               axis=0)]

    output = (record['id'],
              ','.join([str(a) for a in avg_vec]))
    return output
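
The remark that the if checks seem unavoidable can be illustrated with a small sketch. The function names and the `table` dict below are hypothetical, and the second version assumes no stored vector is `None`. It removes the `if` keyword, but `filter()` still makes the same keep-or-drop decision for every word, so a counter that only looks at syntax may report a different number while the runtime behaviour is unchanged.

```python
def vectors_with_if(words, table):
    """Explicit membership test, as in the comprehension above."""
    return [table[word] for word in words if word in table]


def vectors_with_filter(words, table):
    """Same result without the `if` keyword; filter() still decides per word."""
    # Assumption: no value stored in `table` is None.
    return list(filter(lambda vec: vec is not None, map(table.get, words)))


table = {'hello': [0.1, 0.2], 'world': [0.2, 0.3], 'bye': [0.9, -0.1]}
words = ['hello', 'world', 'hi', 'world']

# Both versions keep one vector per occurrence of a known word.
assert vectors_with_if(words, table) == vectors_with_filter(words, table)
print(vectors_with_filter(words, table))
# [[0.1, 0.2], [0.2, 0.3], [0.2, 0.3]]
```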

Your CC is 6, which is already good. You can reduce the CC of the function itself by moving work into helper functions, such as:
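import re


def get_tags(record):
    return [tag for tag in record['all_titles']]

def sanitize_and_split_tags(tags):
    # Replace punctuation with spaces, lower-case, then split into words.
    return [word for tag in tags for word in
            re.sub(r'[\-/:,;\.()]', ' ', tag).lower().split()]

def get_vectors_words(words, lookup):
    # lookup is passed in explicitly rather than read from a global name.
    return [lookup.value[word] for word in words if word in lookup.value]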

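This lowers the average CC per function, but the total CC stays the same or increases. I can't see how you could get rid of the if checks that test whether a word is in lookup.value, or the one that checks whether there are any vectors to work with.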
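
The difference between average and total CC can be checked with a rough counter built on the standard `ast` module. This is only a sketch: the counting rules are a simplification (real tools apply more detailed rules, so their numbers may differ), the helper names `words_of` and `vectors_of` are hypothetical, and the function bodies are trimmed to their branching parts. The source strings are parsed, never executed.

```python
import ast

# Node types treated as decision points in this simplified counter.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                  ast.ExceptHandler, ast.comprehension)


def complexity(func_node):
    """Approximate CC of one function: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func_node):
        if isinstance(node, DECISION_NODES):
            score += 1
        if isinstance(node, ast.comprehension):
            score += len(node.ifs)          # each `if` filter is a branch
        if isinstance(node, ast.BoolOp):
            score += len(node.values) - 1   # each extra and/or operand
    return score


def report(source):
    """Map each top-level function name in `source` to its approximate CC."""
    tree = ast.parse(source)
    return {node.name: complexity(node)
            for node in tree.body if isinstance(node, ast.FunctionDef)}


MONOLITH = '''
def avg_title_vec(record, lookup):
    word_vectors = []
    for tag in record['all_titles']:
        for word in clean_token(tag).split():
            if word in lookup.value:
                word_vectors.append(lookup.value[word])
    if not word_vectors:
        return (record['id'], None)
    return (record['id'], word_vectors)
'''

SPLIT = '''
def words_of(record):
    return [word for tag in record['all_titles']
            for word in clean_token(tag).split()]

def vectors_of(words, lookup):
    return [lookup.value[word] for word in words if word in lookup.value]

def avg_title_vec(record, lookup):
    word_vectors = vectors_of(words_of(record), lookup)
    if not word_vectors:
        return (record['id'], None)
    return (record['id'], word_vectors)
'''

for label, source in (('monolith', MONOLITH), ('split', SPLIT)):
    scores = report(source)
    total = sum(scores.values())
    print(label, scores, 'total:', total, 'average:', total / len(scores))
```

With this counter the single function scores 5, while the split version scores 3, 3 and 2: the total rises to 8 but the per-function average falls below 3, which matches the point made above.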
Do you mind first explaining what the code is supposed to do?
I've added some more details.
I tried writing the code myself from scratch and ended up with the same code :)