Python: count of already-used IDs

python pandas

I have a problem: I need to count the IDs that have already been used. My dataset has the attributes id, time, and Bi, as shown below:

id   time   Bi   | wanted_results    used
1     3     NAN  |       0           []
1     3      1   |       1           [1] 
1     2     NAN  |       1           [1] 
2     2      1   |       2           [1, 2]
2     1      1   |       2           [1, 2] 
2     1      1   |       2           [1, 2]

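Attribute descriptions:

- `id`: the value being counted
- `time`: used for the timeline; it runs from `n` down to `0`
- `Bi`: indicates whether the id was used at that time
- `used`: the ids that have been counted so far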

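So now I want the number of unique IDs that have already been used. How can I group the data so that the used IDs are tracked and I get the wanted result?

Thanks, everyone!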

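You can do this by iterating over the DataFrame and adding the ids to a set:

df['wanted_result'] = 0
used_set = set()  # ids seen on earlier rows
for row in df.itertuples():
    # record how many distinct ids were seen before this row
    df.loc[row.Index, 'wanted_result'] = len(used_set)
    used_set.add(row.id)

This results in: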

  id  time   Bi   wanted_result
0  1  3      NAN  0
1  1  3      1    1
2  1  2      NAN  1
3  2  2      1    1
4  2  1      1    2
5  2  1      1    2
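
The loop above reads the count before adding the current id and does not look at Bi, so row 3 shows 1 where the question's wanted_results column shows 2. Below is a minimal sketch of a variant that reproduces wanted_results, assuming Bi is NaN whenever the id was not used on that row. The function name count_used_ids and the sample frame are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

def count_used_ids(df):
    """Running count of distinct ids whose Bi is set, including the current row."""
    used = set()
    counts = []
    for row in df.itertuples(index=False):
        # Assumption: a non-NaN Bi means the id was used on this row.
        if pd.notna(row.Bi):
            used.add(row.id)
        counts.append(len(used))
    return pd.Series(counts, index=df.index, name="wanted_result")

# Sample data copied from the question's table.
df = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 2],
    "time": [3, 3, 2, 2, 1, 1],
    "Bi":   [np.nan, 1, np.nan, 1, 1, 1],
})
df["wanted_result"] = count_used_ids(df)
print(df)
```

Collecting the counts in a list and assigning them once avoids a df.loc write per row, which tends to be slow on larger frames.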

You can combine expanding and apply:

import numpy as np
df['id'].expanding().apply(lambda x: len(np.unique(x)))

This returns a Series with the running count of unique ids.
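
If the count should only include ids whose Bi is set, as in the question's wanted_results column, one possible vectorized sketch is to mark the first row on which each used id appears and take a cumulative sum. It assumes Bi is NaN when the id was not used. The function name running_used_count and the sample frame are illustrative.

```python
import numpy as np
import pandas as pd

def running_used_count(df):
    # Keep the id only where Bi is set; other rows become NaN.
    used = df["id"].where(df["Bi"].notna())
    # True on the first row where each used id appears.
    first_use = used.notna() & ~used.duplicated()
    # Cumulative sum of first appearances = distinct used ids so far.
    return first_use.cumsum()

# Sample data copied from the question's table.
df = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 2],
    "time": [3, 3, 2, 2, 1, 1],
    "Bi":   [np.nan, 1, np.nan, 1, 1, 1],
})
print(running_used_count(df).tolist())
```

This avoids calling a Python function on every expanding window, so it should scale better than expanding().apply on long frames.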
