Python: selecting specific index/column pairs from a DataFrame
I have a DataFrame x:
x = pd.DataFrame(np.random.randn(3,3), index=[1,2,3], columns=['A', 'B', 'C'])
x
A B C
1 0.256668 -0.338741 0.733561
2 0.200978 0.145738 -0.409657
3 -0.891879 0.039337 0.400449
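I want to select a set of index/column pairs to populate a new Series. For example, I could select [(1, 'A'), (1, 'B'), (1, 'A'), (3, 'C')], which would produce a list, array, or Series with 4 elements: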
[0.256668, -0.338741, 0.256668, 0.400449]
Any idea how I should do this?
A label-based indexer should let you locate elements in the DataFrame, as follows:
import pandas as pd
# using your data sample
df = pd.read_clipboard()
df
Out[170]:
A B C
1 0.256668 -0.338741 0.733561
2 0.200978 0.145738 -0.409657
3 -0.891879 0.039337 0.400449
# column labels must be quoted strings; bare A, B, C would be undefined names
l = [(1, 'A'), (1, 'B'), (1, 'A'), (3, 'C')]
# iterate over the pairs and look up each element by label
# (.ix was removed in pandas 1.0; .loc is the label-based replacement)
[df.loc[row, col] for row, col in l]
Out[172]: [0.25666800000000001, -0.33874099999999996, 0.25666800000000001, 0.400449]
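Since the question asks for a new Series, the per-pair lookup above can be wrapped as in the following minimal sketch. The names `pairs`, `values`, and `result` are illustrative, the data is typed in from the question's sample, and keeping the pairs as a MultiIndex is only one possible choice of index.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "A": [0.256668, 0.200978, -0.891879],
        "B": [-0.338741, 0.145738, 0.039337],
        "C": [0.733561, -0.409657, 0.400449],
    },
    index=[1, 2, 3],
)
pairs = [(1, "A"), (1, "B"), (1, "A"), (3, "C")]

# .at is the scalar, label-based accessor
values = [df.at[row, col] for row, col in pairs]

# Keep the (row, column) pairs as the index; duplicate entries are allowed
result = pd.Series(values, index=pd.MultiIndex.from_tuples(pairs))
print(result.tolist())
```

This does one scalar lookup per pair, so it is simple rather than fast; for long lists of pairs a vectorized approach is likely preferable.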
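I think get_value() and lookup() are faster (note that both have since been removed from pandas: get_value() in 1.0 and lookup() in 2.0):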
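import numpy as np
import pandas as pd
x = pd.DataFrame(np.random.randn(3,3), index=[1,2,3], columns=['A', 'B', 'C'])
locations = [(1, "A"), (1, "B"), (1, "A"), (3, "C")]
# scalar access; this was x.get_value(1, "A") on pandas < 1.0
print(x.at[1, "A"])
row_labels, col_labels = zip(*locations)
# vectorized access; this was x.lookup(row_labels, col_labels) on pandas < 2.0
print(x.to_numpy()[x.index.get_indexer(list(row_labels)), x.columns.get_indexer(list(col_labels))])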
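On pandas versions where `lookup()` is no longer available, a similar vectorized lookup can be sketched with `Index.get_indexer` plus NumPy integer indexing. The helper name `lookup_pairs` is hypothetical, the sample frame uses fixed values instead of random ones so the result is predictable, and the sketch assumes the index and column labels are unique (`get_indexer` raises otherwise).

```python
import numpy as np
import pandas as pd


def lookup_pairs(frame, pairs):
    """Return the values of `frame` at (row_label, col_label) pairs as an array."""
    row_labels, col_labels = zip(*pairs)
    # Translate labels to integer positions; -1 marks a label that was not found
    rows = frame.index.get_indexer(list(row_labels))
    cols = frame.columns.get_indexer(list(col_labels))
    if (rows == -1).any() or (cols == -1).any():
        raise KeyError("one or more labels not found")
    # Paired integer indexing picks one element per (row, col) position
    return frame.to_numpy()[rows, cols]


x = pd.DataFrame(
    np.arange(9.0).reshape(3, 3), index=[1, 2, 3], columns=["A", "B", "C"]
)
print(lookup_pairs(x, [(1, "A"), (1, "B"), (1, "A"), (3, "C")]))
```

Wrapping the result in `pd.Series(...)` gives the Series the question asks for.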
If the pairs are positions rather than index/column labels:
row_position = [0,0,0,2]
col_position = [0,1,0,2]
x.values[row_position, col_position]
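Or derive the positions from the labels with np.searchsorted (searchsorted returns positions in the sorted order, so they are mapped back through the sorter, which matters when the index is unsorted):
order = np.argsort(x.index.to_numpy())
row_position = order[np.searchsorted(x.index.to_numpy(), row_labels, sorter=order)]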
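To see why the `sorter` result needs care, here is a small sketch with a deliberately unsorted set of row labels. The arrays are made up for illustration, and the sketch assumes every requested label is present, since `np.searchsorted` returns insertion points and does not flag missing values.

```python
import numpy as np

index = np.array([3, 1, 2])   # deliberately unsorted row labels
row_labels = [1, 1, 1, 3]     # labels to locate

# Positions that would sort `index`
order = np.argsort(index)
# With `sorter`, the result refers to positions in the sorted view of `index`
in_sorted = np.searchsorted(index, row_labels, sorter=order)
# Map back to positions in the original, unsorted `index`
row_position = order[in_sorted]
print(row_position)
```

When the index is already sorted, `order` is the identity permutation and the extra mapping step changes nothing.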