Python: How can I filter a pandas.Series with a function without losing the index association?

python filter pandas

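I have a pandas.DataFrame whose rows I iterate over. On each row I need to filter out some worthless values while keeping the index association. This is where I am right now: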

for i,row in df.iterrows():
    my_values = row["first_interesting_column":]
    # here I need to filter 'my_values' Series based on a function
    # what I'm doing right now is using the built-in python filter function, but what I get back is a list with no index anymore
    my_valuable_values = filter(lambda x: x != "-", my_values)

How can I do that?

Someone on IRC suggested an answer to this question. Here it is:

w = my_values != "-" # creates a boolean Series marking which values to include/exclude
my_valuable_values = my_values[w]

...which can also be shortened to

my_valuable_values = my_values[my_values != "-"]

...and, of course, to skip one more step:

row["first_interesting_column":][row["first_interesting_column":] != "-"]
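To check that the boolean mask really keeps the labels, and to cover the case from the title where the test is an arbitrary function, here is a minimal sketch. The sample Series and the is_valuable helper are made up for illustration; they do not come from the question.

```python
import pandas as pd

# Hypothetical row, standing in for row["first_interesting_column":]
my_values = pd.Series(["1", "-", "4", "-"], index=["2009", "2010", "2011", "2012"])

# Boolean mask: the comparison returns True/False values carrying the same index
by_mask = my_values[my_values != "-"]


# Arbitrary predicate (hypothetical helper): map it over the values to build the mask
def is_valuable(x):
    return x != "-"


by_function = my_values[my_values.map(is_valuable).astype(bool)]

print(by_mask.index.tolist())  # ['2009', '2011']
```

Both results are still Series, so the surviving values stay attached to their original labels, unlike the output of the built-in filter.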

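Iterating over rows is usually bad practice (and very slow). As @JohnE suggested, you would want to use applymap.

If I understand your question correctly, I think this is what you want to do: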

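import numpy as np
import pandas as pd
from io import StringIO

datastring = StringIO("""\
2009    2010    2011   2012
1       4       -      4
3       -       2      3
4       -       8      7
""")
# columns are separated by two or more whitespace characters
df = pd.read_table(datastring, sep=r'\s\s+', engine='python')
# keep the cells that are not '-' (the others become NaN), then convert to floats
# (applymap was renamed to DataFrame.map in pandas 2.1)
a = df[df.applymap(lambda x: x != '-')].astype(float).values
# flat array holding only the valid values
a[~np.isnan(a)]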
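The flat array above no longer carries row or column labels. If those labels matter, one possible variant (my assumption, not part of the answer) converts non-numeric cells to NaN with pd.to_numeric and stacks the frame into a Series indexed by (row, column). The DataFrame is built inline here to mirror the sample data.

```python
import pandas as pd

# Same sample data as above, built directly instead of parsed from text
df = pd.DataFrame(
    {
        "2009": [1, 3, 4],
        "2010": ["4", "-", "-"],
        "2011": ["-", "2", "8"],
        "2012": [4, 3, 7],
    }
)

# Convert every column to numbers; "-" cannot be parsed and becomes NaN
numeric = df.apply(pd.to_numeric, errors="coerce")

# Stack into one Series keyed by (row label, column label), then drop the NaN cells
valuable = numeric.stack().dropna()

print(valuable.loc[(1, "2011")])  # 2.0
```

The explicit dropna() is there because, depending on the pandas version, stack() may or may not discard NaN entries on its own.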

I don't think you need to iterate at all; just do something like this outside of any loop:

df.applymap(lambda x: str(x).find('-'))

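It is usually a good idea to post some sample data and show exactly the result you want. By the way, I'm not sure exactly what you mean by "losing the index association", but note that when you iterate this way each row is a Series. Perhaps you mean that "i" does not reflect the DataFrame index? Either way, it won't matter if you avoid iterating in the first place.

@JohnE I'm iterating because I use the loop to insert values from an XLS file into a database. I don't know whether that is the right way to do it. When I say I lose the index association, I mean that by passing the pandas.Series to the built-in filter function I get back a python list, which no longer has the index of the original pandas.Series.

@JohnE By the way, the answer I got elsewhere is my_valuable_values = my_values[my_values != "-"].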
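If the loop has to stay, for example because each row is written to a database, the mask still works inside it, and Series.items() yields the (column label, value) pairs. A rough sketch follows; the DataFrame and the records list are illustrative stand-ins, and the actual database insert is left out because it is not shown in the question.

```python
import pandas as pd

# Hypothetical frame; the value columns hold strings so "-" can be compared directly
df = pd.DataFrame(
    {"name": ["a", "b"], "2009": ["1", "3"], "2010": ["4", "-"], "2011": ["-", "2"]}
)

records = []
for i, row in df.iterrows():
    my_values = row["2009":]  # label-based slice, as in the question
    my_valuable_values = my_values[my_values != "-"]
    # items() walks the filtered Series as (label, value) pairs
    for column, value in my_valuable_values.items():
        records.append((i, column, value))

print(records)
```

Each tuple keeps the row index and the column label next to the value, which is the association that the built-in filter discards.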