Python: counting the matches between two lists, given conditions that split the original list


I have a list of floats with some hidden "level" information encoded in the magnitude of each float. I can split the floats into levels like this:

import math
import numpy as np

all_scores = [1.0369411057174144e+22, 2.7997409854370188e+23, 1.296176382146768e+23,
6.7401171871631936e+22, 6.7401171871631936e+22, 2.022035156148958e+24, 8.65845823274041e+23,
1.6435516525621017e+24, 2.307193960221247e+24, 1.285806971089594e+24, 9603539.08653573,
17489013.841076534, 11806185.6660164, 16057293.564414097, 8546268.728385007, 53788629.47091801,
31828243.07349571, 51740168.15200098, 53788629.47091801, 22334836.315934014,
4354.0, 7474.0, 4354.0, 4030.0, 6859.0, 8635.0, 7474.0, 8635.0, 9623.0, 8479.0]

easy, med, hard = [], [], []

for i in all_scores:
    if i > math.exp(50):
        easy.append(i)
    elif i > math.exp(10):
        med.append(i)
    else:
        hard.append(i)

print ([easy, med, hard])

[out]:

[[1.0369411057174144e+22, 2.7997409854370188e+23, 1.296176382146768e+23, 6.7401171871631936e+22, 6.7401171871631936e+22, 2.022035156148958e+24, 8.65845823274041e+23, 1.6435516525621017e+24, 2.307193960221247e+24, 1.285806971089594e+24], [9603539.08653573, 17489013.841076534, 11806185.6660164, 16057293.564414097, 8546268.728385007, 53788629.47091801, 31828243.07349571, 51740168.15200098, 53788629.47091801, 22334836.315934014], [4354.0, 7474.0, 4354.0, 4030.0, 6859.0, 8635.0, 7474.0, 8635.0, 9623.0, 8479.0]]


I also have another list that corresponds position by position to the all_scores list:

input_scores = [0.0, 2.7997409854370188e+23, 0.0, 6.7401171871631936e+22, 0.0, 0.0, 8.6584582327404103e+23, 0.0, 2.3071939602212471e+24, 0.0, 0.0, 17489013.841076534, 11806185.6660164, 0.0, 8546268.728385007, 0.0, 31828243.073495708, 51740168.152000979, 0.0, 22334836.315934014, 4354.0, 7474.0, 4354.0, 4030.0, 0.0, 8635.0, 0.0, 0.0, 0.0, 8479.0]

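I need to check how many of the easy, med and hard scores match all_scores. I can get one boolean per position of the flattened all_scores list, telling me whether there is a match, like this: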
matches = [i == j for i, j in zip(input_scores, all_scores)]
print ([i == j for i, j in zip(input_scores, all_scores)])

[out]:
[False, True, False, True, False, False, True, False, True, False, False, True, True, False, True, False, True, True, False, True, True, True, True, True, False, True, False, False, False, True]

4 10 3.52041505391e+24
6 10 143744715.777
6 10 37326.0

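Is there a way to find out how many easy/med/hard scores are among the matches, and the sum of the matched scores for each level?
I have tried this, and it works: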
matches = [int(i == j) for i, j in zip(input_scores, all_scores)]

print(sum(matches[:len(easy)]) , len(easy), sum(np.array(easy) * matches[:len(easy)]) )
print(sum(matches[len(easy):len(easy)+len(med)]), len(med), sum(np.array(med) * matches[len(easy):len(easy)+len(med)]) )
print (sum(matches[len(easy)+len(med):]) , len(hard), sum(np.array(hard) * matches[len(easy)+len(med):]) )

[out]:
4 10 3.52041505391e+24
6 10 143744715.777
6 10 37326.0

But there must be a less verbose way to get the same output.
You can use a dict:
k = ('easy', 'medium', 'hard')
outlist = []
# assumes every level holds exactly 10 consecutive entries of matches
for index, i in enumerate(range(0, len(matches), 10)):
    count = {k[index]: sum(matches[i:i + 10])}
    outlist.append(count)

print(outlist)
[{'easy': 4}, {'medium': 6}, {'hard': 6}]
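The slicing above relies on every level holding exactly 10 scores. As a hedged sketch of the same dict idea without that assumption, the level can be derived from each score while walking both lists together. The helper name level_of, the stats layout and the short sample lists are illustrative assumptions, not part of the original answer:

```python
import math
from collections import defaultdict

def level_of(score):
    """Return the level name, using the thresholds from the question."""
    if score > math.exp(50):
        return "easy"
    if score > math.exp(10):
        return "med"
    return "hard"

# Small illustrative data, not the question's full lists.
all_scores = [1e23, 2e23, 3e23, 5e6, 7e6, 4354.0]
input_scores = [1e23, 0.0, 3e23, 5e6, 0.0, 4354.0]

# level -> [number of matches, number of scores, sum of matched scores]
stats = defaultdict(lambda: [0, 0, 0.0])
for score, inp in zip(all_scores, input_scores):
    entry = stats[level_of(score)]
    entry[1] += 1
    if score == inp:
        entry[0] += 1
        entry[2] += score

for level in ("easy", "med", "hard"):
    print(level, *stats[level])
```

Because the level is computed per score, the lists do not need to be pre-sorted into equal-sized blocks.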

I'm not sure whether this approach is any less verbose, but I would use np.in1d to match the scores:
# we need numpy arrays
easy = np.array(easy)
med = np.array(med)
hard = np.array(hard)

for level in [easy, med, hard]:
    matches = level[np.where(np.in1d(level, input_scores))]
    print(len(matches), len(level), np.sum(matches))

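This code does not produce the same output as yours, but I think the data you provided is corrupted. For example, the hard array contains two copies each of 7474.0 and 4354.0. Is that intended? The easy array also contains 6.7401171871631936e+22 twice.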

Output of my approach with the current data:
5 10 3.58781622578e+24
6 10 143744715.777
8 10 53435.0

Also, I'm not entirely sure how you want the sum computed, so I simply sum all the matched scores (which is why our values differ).
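The different counts come from the kind of test being made: the question compares the two lists position by position, while a membership test only asks whether a value occurs anywhere in the other list, so repeated scores are counted every time they appear. A minimal sketch with made-up data (the arrays level and inputs are illustrative assumptions), using np.isin, which for 1-D inputs behaves like the np.in1d call above:

```python
import numpy as np

# Illustrative data only: one level with a repeated score, and the
# inputs at the same three positions (0.0 marks a miss).
level = np.array([4354.0, 7474.0, 4354.0])
inputs = np.array([4354.0, 0.0, 0.0])

positional = level == inputs         # compares element i with element i
membership = np.isin(level, inputs)  # does this value occur anywhere in inputs?

# The repeated 4354.0 is a positional match once, but a member twice.
print(positional.sum(), membership.sum())
```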

Edit: use the matching input_scores instead of all_scores. The only thing that has to change is that we need to match twice with np.in1d:
input_scores = np.array(input_scores)  # a plain list cannot be indexed with the result of np.where
scores = input_scores[np.where(np.in1d(input_scores, all_scores))]
for level in [easy, med, hard]:
    matches = scores[np.where(np.in1d(scores, level))]
    print(len(matches), len(level), np.sum(matches))

This gets rid of the earlier problem with duplicates. Output:
4 10 3.52041505391e+24
6 10 143744715.777
6 10 37326.0


Edit 2: I realized that my use of np.where is redundant and can be dropped entirely:
scores = input_scores[np.in1d(input_scores, all_scores)]
for level in [easy, med, hard]:
    matches = scores[np.in1d(scores, level)]
    print(len(matches), len(level), np.sum(matches))

This produces the same output as the first edit.
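To see why np.where can be dropped, note that a boolean array can index a NumPy array directly. The sketch below uses small made-up arrays rather than the question's data, and np.isin, which NumPy's documentation recommends over np.in1d for new code:

```python
import numpy as np

# Illustrative data, not the question's full lists.
all_scores = np.array([1e23, 2e23, 5e6, 7e6, 4354.0, 4354.0])
input_scores = np.array([1e23, 0.0, 5e6, 0.0, 4354.0, 0.0])

mask = np.isin(input_scores, all_scores)  # one boolean per input score
via_where = input_scores[np.where(mask)]  # index with the positions of True
via_mask = input_scores[mask]             # index with the boolean array itself

print(mask)
print(np.array_equal(via_where, via_mask))
```

Both forms select the same elements in the same order, so the shorter boolean-mask form is enough here.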

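Edit 3: I have put everything together in one program. The split into easy/med/hard scores can also be done conveniently with numpy. It could probably be made more efficient, but this is fairly readable: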
import math
import numpy as np

all_scores = np.array([1.0369411057174144e+22, 2.7997409854370188e+23, 1.296176382146768e+23,
6.7401171871631936e+22, 6.7401171871631936e+22, 2.022035156148958e+24, 8.65845823274041e+23,
1.6435516525621017e+24, 2.307193960221247e+24, 1.285806971089594e+24, 9603539.08653573,
17489013.841076534, 11806185.6660164, 16057293.564414097, 8546268.728385007, 53788629.47091801,
31828243.07349571, 51740168.15200098, 53788629.47091801, 22334836.315934014,
4354.0, 7474.0, 4354.0, 4030.0, 6859.0, 8635.0, 7474.0, 8635.0, 9623.0, 8479.0])

input_scores = np.array([0.0, 2.7997409854370188e+23, 0.0, 6.7401171871631936e+22, 0.0, 0.0, 8.6584582327404103e+23, 0.0, 2.3071939602212471e+24, 0.0, 0.0, 17489013.841076534, 11806185.6660164, 0.0, 8546268.728385007, 0.0, 31828243.073495708, 51740168.152000979, 0.0, 22334836.315934014, 4354.0, 7474.0, 4354.0, 4030.0, 0.0, 8635.0, 0.0, 0.0, 0.0, 8479.0])

easy = all_scores[all_scores > math.exp(50)]
med = all_scores[(all_scores > math.exp(10)) & (all_scores <= math.exp(50))]  # & is the element-wise boolean `and`
hard = all_scores[all_scores <= math.exp(10)]  # <= mirrors the `else` branch of the question's loop

scores = input_scores[np.in1d(input_scores, all_scores)]
for level in [easy, med, hard]:
    matches = scores[np.in1d(scores, level)]
    print(len(matches), len(level), np.sum(matches))

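This sounds like a job for collections.Counter. In case you haven't come across it: a Counter is like a dict, but with methods such as .update() the new values are added to the old ones instead of replacing them. So: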
from collections import Counter

counter = Counter({'a': 2})
counter.update({'a': 3})
counter['a']
> 5

So you can get the result above with the following code:
import math
from collections import Counter

matches, counts, scores = [
    Counter({'easy': 0, 'med': 0, 'hard': 0}) for _ in range(3)
]

for score, inp in zip(all_scores, input_scores):
    category = (
        'easy' if score > math.exp(50) else
        'med' if score > math.exp(10) else
        'hard'
    )
    matches.update({category: score == inp})
    counts.update({category: 1})
    scores.update({category: score if score == inp else 0})

for cat in ('easy', 'med', 'hard'):
    print(matches[cat], counts[cat], scores[cat])
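The loop above leans on two Counter behaviours: update adds to an existing count rather than overwriting it, and a boolean value is added as 0 or 1. A small sketch contrasting this with a plain dict (the variable names plain and counted are illustrative only):

```python
from collections import Counter

plain = {"a": 2}
plain.update({"a": 3})         # dict.update overwrites: "a" is now 3

counted = Counter({"a": 2})
counted.update({"a": 3})       # Counter.update adds: "a" is now 5
counted.update({"a": True})    # True is added as 1
counted.update({"a": False})   # False is added as 0

print(plain["a"], counted["a"])
print(counted["missing"])      # absent keys read as 0 instead of raising KeyError
```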

You can use a pair of dicts as lookup tables:
from collections import defaultdict
import math

scores = defaultdict(list)  # maps each score to the category it belongs to
values = defaultdict(int)   # counts how many scores fall into each category
for i in all_scores:
    if i > math.exp(50):
        values["easy"] += 1
        scores[i] = "easy"
    elif i > math.exp(10):
        values["med"] += 1
        scores[i] = "med"
    else:
        values["hard"] += 1
        scores[i] = "hard"

input_scores = [0.0, 2.7997409854370188e+23, 0.0, 6.7401171871631936e+22, 0.0, 0.0, 8.6584582327404103e+23, 0.0, 2.3071939602212471e+24, 0.0, 0.0, 17489013.841076534, 11806185.6660164, 0.0, 8546268.728385007, 0.0, 31828243.073495708, 51740168.152000979, 0.0, 22334836.315934014, 4354.0, 7474.0, 4354.0, 4030.0, 0.0, 8635.0, 0.0, 0.0, 0.0, 8479.0]

# look up the category of each input score
r = [(scores[i], i) for i in input_scores if i in scores]

# group by category to get the counts
res = defaultdict(list)
for k, v in r:
    res[k].append(v)
for k, v in res.items():
    print(k, len(v), values[k], sum(v))

Output (the order of the lines may vary):
med 6 10 143744715.777
hard 6 10 37326.0
easy 4 10 3.52041505391e+24
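Here is a numpy solution that uses digitize to create the categories and bincount to count and sum the matches. As a free bonus, the same statistics are also computed for the leftover (unmatched) items: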
categories = 'hard', 'med', 'easy'

# get group membership by splitting at e^10 and e^50
# the 'right' keyword tells digitize to include right boundaries
cat_map = np.digitize(all_scores, np.exp((10, 50)), right=True)
# cat_map has a zero in all the 'hard' places of all_scores
# a one in the 'med' places and a two in the 'easy' places

# add a fourth group to mark all non-matches
# we have to force at least one np.array for element-by-element
# comparison to work
cat_map[np.asanyarray(all_scores) != input_scores] = 3

# count
numbers = np.bincount(cat_map)
# count again, this time using all_scores as weights
sums = np.bincount(cat_map, all_scores)

# print
for c, n, s in zip(categories + ('unmatched',), numbers, sums):
    print('{:12}  {:2d}  {:6.4g}'.format(c, n, s))

# output:
#
# hard           6  3.733e+04
# med            6  1.437e+08
# easy           4  3.52e+24
# unmatched     14  5.159e+24
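For readers who have not used these two functions, here is a toy illustration of how they cooperate. The arrays values and bins are made-up numbers chosen so the result is easy to check by hand, not data from the question:

```python
import numpy as np

values = np.array([1.0, 10.0, 5.0, 100.0, 50.0, 10.0])
bins = np.array([10.0, 50.0])

# With right=True a value equal to a boundary goes to the lower group:
# x <= 10 -> 0, 10 < x <= 50 -> 1, x > 50 -> 2
groups = np.digitize(values, bins, right=True)

counts = np.bincount(groups)                # how many values per group
sums = np.bincount(groups, weights=values)  # sum of the values per group

print(groups)
print(counts)
print(sums)
```

Overwriting selected entries of groups with an extra index, as the answer does with 3 for non-matches, simply adds one more bucket to both bincount results.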

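Although your question has already been answered, I still wanted to give it a try (for practice). The function gives the expected output, but Paul Panzer's solution is by far the best one. :)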
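The values in all_scores and input_scores are not unique. The only things that constrain them are their order and whether their values match.
Cool, I hadn't heard of np.digitize!! By the way, what is "unmatched"? Why would there be unmatched cases?
@alvas I mean the positions where input_scores and all_scores do not match. They have to be moved into an extra group so that they are not counted with any of the other three groups.
Ah, that makes sense. Thanks for the explanation!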