Python numpy list filtering

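Is it possible to optimize/vectorize the code below? Right now it does not feel like the right way to do this, and it is not very Pythonic. The code is meant to process large datasets, so performance matters a lot.

The idea is to remove every name, together with its value, that does not appear in both lists.

For example, the code below leaves two lists that contain only name2 and name4, with values [2,4] in pos1 and [5,6] in pos2.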

import numpy as np

names1=np.array(["name1","name2","name3","name4"])
names2=np.array(["name2","name4","name5","name6"])

pos1=np.array([1,2,3,4])
pos2=np.array([5,6,7,8])


for entry in names2:
    if not np.any(names1==entry):
        pointer=np.where(names2==entry)
        pos2=np.delete(pos2,pointer)
        names2=np.delete(names2,pointer)

for entry in names1:
    if not np.any(names2==entry):
        pointer=np.where(names1==entry)
        pos1=np.delete(pos1,pointer)
        names1=np.delete(names1,pointer)

Here is a vectorized answer:

import numpy as np

names1=np.array(["name1","name2","name3","name4"])
names2=np.array(["name2","name4","name5","name6"])

pos1=np.array([1,2,3,4])
pos2=np.array([5,6,7,8])

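# Names that occur in both arrays
intersection=np.intersect1d(names1,names2)
# Indices of the entries that are NOT in the intersection
# (np.isin supersedes np.in1d, which newer NumPy releases deprecate)
pointer1=np.flatnonzero(~np.isin(names1, intersection))
pointer2=np.flatnonzero(~np.isin(names2, intersection))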

pos2=np.delete(pos2,pointer2)
names2=np.delete(names2,pointer2)

pos1=np.delete(pos1,pointer1)
names1=np.delete(names1,pointer1)
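
The same filtering can also be written with boolean masks, which avoids the intermediate index arrays and the `np.delete` calls. The following is only a sketch under a few assumptions: the helper name `filter_common` is made up for illustration, and each array keeps its own original order, so the two results line up by name only if the shared names occur in the same relative order in both inputs (as they do in the example data).

```python
import numpy as np

def filter_common(names_a, pos_a, names_b, pos_b):
    """Keep only the entries whose name occurs in both name arrays.

    Hypothetical helper, not part of NumPy.
    """
    keep_a = np.isin(names_a, names_b)  # True where a name from A also exists in B
    keep_b = np.isin(names_b, names_a)  # True where a name from B also exists in A
    return names_a[keep_a], pos_a[keep_a], names_b[keep_b], pos_b[keep_b]

names1 = np.array(["name1", "name2", "name3", "name4"])
names2 = np.array(["name2", "name4", "name5", "name6"])
pos1 = np.array([1, 2, 3, 4])
pos2 = np.array([5, 6, 7, 8])

names1, pos1, names2, pos2 = filter_common(names1, pos1, names2, pos2)
print(names1, pos1)
print(names2, pos2)
```

If the names can appear in a different order in the two arrays, a merge keyed on the name is probably the safer route, because it pairs the values explicitly.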

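Do you really want to use numpy for this? This is more of a pandas problem.

I have no experience with pandas. Any hints?

FWIW, in pandas this is a simple merge operation:
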
>>> import pandas as pd
>>> df1 = pd.DataFrame({"name": names1, "pos": pos1})
>>> df2 = pd.DataFrame({"name": names2, "pos": pos2})
>>> df1
    name  pos
0  name1    1
1  name2    2
2  name3    3
3  name4    4
>>> df2
    name  pos
0  name2    5
1  name4    6
2  name5    7
3  name6    8
>>> df1.merge(df2, on="name", suffixes=("1", "2"))
    name  pos1  pos2
0  name2     2     5
1  name4     4     6
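
For anyone new to pandas, the session above can be turned into a standalone script that also converts the merged columns back into NumPy arrays. This is a minimal sketch, assuming pandas is installed and that the names within each input are unique; the variable names `merged`, `common_names`, `common_pos1` and `common_pos2` are illustrative only.

```python
import numpy as np
import pandas as pd

names1 = np.array(["name1", "name2", "name3", "name4"])
names2 = np.array(["name2", "name4", "name5", "name6"])
pos1 = np.array([1, 2, 3, 4])
pos2 = np.array([5, 6, 7, 8])

df1 = pd.DataFrame({"name": names1, "pos": pos1})
df2 = pd.DataFrame({"name": names2, "pos": pos2})

# The default inner merge keeps only the names present in both frames;
# the suffixes distinguish the two "pos" columns in the result.
merged = df1.merge(df2, on="name", suffixes=("1", "2"))

# Convert the aligned columns back to plain NumPy arrays.
common_names = merged["name"].to_numpy()
common_pos1 = merged["pos1"].to_numpy()
common_pos2 = merged["pos2"].to_numpy()

print(common_names, common_pos1, common_pos2)
```

Because the merge matches rows by name, `common_pos1[i]` and `common_pos2[i]` always belong to `common_names[i]`, regardless of the order in which the names appeared in the inputs.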