Python: NumPy/Pandas slicing based on intervals

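I'm trying to find a way to slice non-contiguous, unequal-length runs of rows in a pandas/numpy matrix so that I can set those values to a common value. Has anyone come up with an elegant solution?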

import numpy as np
import pandas as pd
x = pd.DataFrame(np.arange(12).reshape(3,4))
#x is the matrix we want to index into

"""
x before:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
"""
y = pd.DataFrame([[0,3],[2,3],[1,3],[0,1]])
#y holds one row per column of x: a start row index and an end row index (exclusive)

"""
   0  1
0  0  3
1  2  3
2  1  3
3  0  1
"""
What I'm looking for is a way to efficiently select slices of x of varying length based on the rows of y:

x[y] = 0  # pseudocode for the desired operation
"""
x afterwards:
array([[ 0,  1,  2,  0],
       [ 0,  5,  0,  7],
       [ 0,  0,  0, 11]])
"""

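Masking is still useful: even if the loop cannot be avoided entirely, the main DataFrame x does not need to take part in it, so this will speed things up: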

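mask = np.zeros_like(x, dtype=bool)
for i in range(len(y)):
    # the + 1 treats the end index in y as inclusive; drop it if y stores exclusive ends
    mask[y.iloc[i,0]:y.iloc[i,1]+1, i] = True
x[mask] = 0
x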
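As a self-contained sketch of the loop-based masking above, the version below keeps both x and y as plain NumPy arrays. The helper name `interval_mask_loop` and its `inclusive_end` flag are illustrative assumptions, not part of the original answer, and the end indices in y are assumed to be exclusive.

```python
import numpy as np


def interval_mask_loop(n_rows, intervals, inclusive_end=False):
    """Build a boolean (n_rows, n_cols) mask with one run of rows per column.

    `intervals` is an (n_cols, 2) integer array of start/end row indices.
    """
    intervals = np.asarray(intervals)
    mask = np.zeros((n_rows, len(intervals)), dtype=bool)
    extra = 1 if inclusive_end else 0  # widen the slice for inclusive ends
    for col, (start, end) in enumerate(intervals):
        mask[start:end + extra, col] = True
    return mask


x = np.arange(12).reshape(3, 4)
y = np.array([[0, 3], [2, 3], [1, 3], [0, 1]])  # exclusive ends (assumed)

mask = interval_mask_loop(x.shape[0], y)
x[mask] = 0  # the loop above never touches x; only this assignment does
print(x)
```

The loop runs once per column of x and only writes into the small boolean mask, so its cost does not depend on how x is stored.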

As a further improvement, consider defining y as a NumPy array if possible.

My answer, tailored to your question:

y_t = y.values.transpose().copy() # copy so that y itself is not modified below
y_t[1,:] = y_t[1,:] - 1 # make the end index inclusive; or remove this line and change '>= r' below to '> r'

r = np.arange(x.shape[0])

mask = ((y_t[0,:,None] <= r) & (y_t[1,:,None] >= r)).transpose()

res = x.where(~mask, 0)
res
#    0  1  2   3
# 0  0  1  2   0
# 1  0  5  0   7
# 2  0  0  0  11
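The same broadcasting idea can be sketched without modifying y at all by comparing against the exclusive end directly. This is a minimal NumPy-only sketch; the helper name `interval_mask` is an assumption for illustration, and y is assumed to hold `[start, stop)` pairs, one per column of x.

```python
import numpy as np


def interval_mask(n_rows, starts, stops):
    """Boolean (n_rows, n_cols) mask, True where starts[c] <= row < stops[c]."""
    starts = np.asarray(starts)
    stops = np.asarray(stops)
    rows = np.arange(n_rows)[:, None]         # shape (n_rows, 1)
    return (starts <= rows) & (rows < stops)  # broadcasts to (n_rows, n_cols)


x = np.arange(12).reshape(3, 4)
y = np.array([[0, 3], [2, 3], [1, 3], [0, 1]])  # [start, stop) per column of x

mask = interval_mask(x.shape[0], y[:, 0], y[:, 1])
result = np.where(mask, 0, x)  # x itself is left untouched
print(result)
```

Because the row indices are shaped as a column vector, the comparison yields the mask in (rows, columns) order directly and no transpose is needed. With a DataFrame x, the mask can be applied with `x.where(~mask, 0)` as in the answer above.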

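You may be looking for a for loop to assign the values back.

Thanks @ti7. I don't think a mask works directly here, because the condition is not a simple boolean but comes from a list of intervals. A for loop is the approach I'm currently running, but it is slow.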