Pandas: drop a row from a DataFrame based on integer index position

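My time series data has a timestamp index with duplicates, but I want to drop just one row based on its integer position. For example, given the following: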

import numpy as np
import pandas as pd

dates       = pd.to_datetime(["2015-10-22 09:40:00","2015-10-22 09:40:00","2015-10-22 09:40:00","2015-10-22 09:50:00","2015-10-22 10:00:00"])
data_rand   = np.random.rand(len(dates),3)
col_head    = ['A','B','C']

df          = pd.DataFrame(data=data_rand, index=dates, columns=col_head)

print(df)
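rowindex    = 1

# drop() works on labels: df.index[rowindex] is the 09:40:00 timestamp,
# so every row sharing that label is removed, not just row 1
df.drop(df.index[rowindex], inplace=True)
# attempt that does not work: an Index has no .iloc accessor
#df.drop(df.index.iloc[[rowindex]], inplace=True)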

print(df)

The first print shows a DataFrame that looks like this:

                            A         B         C
2015-10-22 09:40:00  0.755642  0.797471  0.366410
2015-10-22 09:40:00  0.475411  0.629229  0.733368
2015-10-22 09:40:00  0.003278  0.461901  0.184833
2015-10-22 09:50:00  0.803465  0.218510  0.864337
2015-10-22 10:00:00  0.153356  0.950724  0.249950

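Now, if I want to delete the second row, I would use the drop function, but because two other rows have exactly the same index label, all three rows get dropped. Is there a way to remove only the middle one of the three duplicated timestamps? I would like to do this without resetting the index.

I want the data to look like this: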

                            A         B         C
2015-10-22 09:40:00  0.755642  0.797471  0.366410
2015-10-22 09:40:00  0.003278  0.461901  0.184833
2015-10-22 09:50:00  0.803465  0.218510  0.864337
2015-10-22 10:00:00  0.153356  0.950724  0.249950

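Use np.arange and iloc to select every row except rowindex. The effect is much the same as deleting rowindex. (If you are looking to drop multiple row positions, I suggest the answer given by @Zero.)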

Output:

                            A         B         C
2015-10-22 09:40:00  0.568431  0.302549  0.497309
2015-10-22 09:40:00  0.683263  0.916699  0.108929
2015-10-22 09:50:00  0.751543  0.480892  0.797728
2015-10-22 10:00:00  0.282703  0.433418  0.009757

df

                            A         B         C
2015-10-22 09:40:00  0.568431  0.302549  0.497309
2015-10-22 09:40:00  0.683263  0.916699  0.108929
2015-10-22 09:40:00  0.495492  0.232836  0.436861
2015-10-22 09:50:00  0.751543  0.480892  0.797728
2015-10-22 10:00:00  0.282703  0.433418  0.009757
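
A minimal sketch of the np.arange approach described in this answer. The exact expression the answerer used is not shown here, so the comparison `np.arange(len(df)) != rowindex` is an assumption about what was meant, and the data uses fixed integers (0 to 14) rather than random values so the rows are easy to tell apart.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime([
    "2015-10-22 09:40:00", "2015-10-22 09:40:00", "2015-10-22 09:40:00",
    "2015-10-22 09:50:00", "2015-10-22 10:00:00",
])
# Illustrative data: row k holds the values 3k, 3k+1, 3k+2
df = pd.DataFrame(np.arange(15).reshape(5, 3), index=dates, columns=["A", "B", "C"])

rowindex = 1

# Boolean array that is True at every position except rowindex
keep = np.arange(len(df)) != rowindex

# iloc selects by position, so the duplicated labels play no role here
result = df.iloc[keep]
print(result)
```

Because the selection is positional, the other two 09:40:00 rows survive and the index is left untouched.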

You can use iloc or loc, like this:

In [5055]: idx = np.ones(len(df.index), dtype=bool)

In [5057]: idx[rowindex] = False

In [5058]: df.iloc[idx]     # or df.loc[idx]
Out[5058]:
                            A         B         C
2015-10-22 09:40:00  0.704959  0.995358  0.355915
2015-10-22 09:40:00  0.151127  0.398876  0.240856
2015-10-22 09:50:00  0.343456  0.513128  0.666625
2015-10-22 10:00:00  0.105908  0.130895  0.321981

Details

In [5059]: df
Out[5059]:
                            A         B         C
2015-10-22 09:40:00  0.704959  0.995358  0.355915
2015-10-22 09:40:00  0.762548  0.593177  0.691702
2015-10-22 09:40:00  0.151127  0.398876  0.240856
2015-10-22 09:50:00  0.343456  0.513128  0.666625
2015-10-22 10:00:00  0.105908  0.130895  0.321981
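
The same boolean-mask idea extends to several positions at once. The sketch below wraps it in a helper; the name `drop_positions` is hypothetical (it is not a pandas function), and the data uses fixed integers instead of random values.

```python
import numpy as np
import pandas as pd

def drop_positions(frame, positions):
    """Return a copy of frame without the rows at the given integer positions.

    Hypothetical helper for illustration; not part of pandas.
    """
    mask = np.ones(len(frame), dtype=bool)           # start by keeping every row
    mask[np.asarray(positions, dtype=int)] = False   # switch off the unwanted positions
    return frame.iloc[mask]

dates = pd.to_datetime([
    "2015-10-22 09:40:00", "2015-10-22 09:40:00", "2015-10-22 09:40:00",
    "2015-10-22 09:50:00", "2015-10-22 10:00:00",
])
df = pd.DataFrame(np.arange(15).reshape(5, 3), index=dates, columns=["A", "B", "C"])

# Drop the middle 09:40:00 row and the 09:50:00 row
trimmed = drop_positions(df, [1, 3])
print(trimmed)
```

The original df is not modified; assign the result back (df = drop_positions(df, ...)) if that is what you want.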

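You can use the integer position to look up the row's name (i.e. its index label):

df = df.drop(df.iloc[i].name)

Note that drop still works on the label, so with a duplicated index like the one in the question this removes every row sharing that label.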
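
A short check of how label-based drop behaves on the question's duplicated index, compared with positional selection. This is a sketch with fixed integer data; the variable names `by_label` and `by_position` are just for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime([
    "2015-10-22 09:40:00", "2015-10-22 09:40:00", "2015-10-22 09:40:00",
    "2015-10-22 09:50:00", "2015-10-22 10:00:00",
])
df = pd.DataFrame(np.arange(15).reshape(5, 3), index=dates, columns=["A", "B", "C"])

i = 1
label = df.iloc[i].name          # the 09:40:00 timestamp

# Dropping by label removes all three rows that carry this label
by_label = df.drop(label)

# Selecting by position removes only row i
by_position = df.iloc[np.arange(len(df)) != i]

print(len(by_label))      # 2
print(len(by_position))   # 4
```

So the name lookup is only safe when the index labels are unique.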

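I like the flexibility of this solution. Is there any way to use something like drop's inplace option? Otherwise I will have to reassign the DataFrame (i.e. your last line would be df = df.iloc[idx]).

No, I don't think so.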
