Python pandas: highlight the nth consecutive equal value in a column

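I am trying to figure out how to highlight the nth consecutive equal value (and the values that follow it) in a pandas DataFrame, to get a result like this: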

 Example 1: highlight 3rd subsequent equal value in Column A:

 Column A | Desired_output
 1        | 0
 1        | 0
 1        | 1
 1        | 1
 1        | 1
 1        | 1
 0        | 0
 0        | 0

 Example 2: highlight 5th subsequent equal value in Column A:

 Column A | Desired_output
 1        | 0
 1        | 0
 1        | 0
 1        | 0
 1        | 1
 1        | 1
 0        | 0
 0        | 0

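This should work not only when Column A equals 1, but also for zeros. The main idea: if there are not enough consecutive equal values, my code should not flag them.

I was thinking of using pd.rolling_sum with a dynamic window, but I am struggling to apply it. Any idea how to proceed? Thanks for taking a look. Here is my code: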

import pandas as pd
df = pd.DataFrame({'A': [1,1,1,1,1,1,0,0]})

# set n as the number of repetitions to highlight:
n=3 #or n=5

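There are two different ways to approach this:

Special case: this solves your specific problem (it requires the column to contain only 1s and 0s) and needs numpy: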

import numpy as np

# roll over column A only, so the result stays one-dimensional
df['Desired Output']=np.where(df['A'].rolling(n).sum()%n==0, True, False)
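To see why the modulo test works, it can help to look at the rolling sum itself. The sketch below is only an illustration on the sample data from the question; the names window_sum and flag are introduced here and are not part of the original answer. For a 0/1 column, a window of n rows sums to n when all are 1 and to 0 when all are 0, so sum % n == 0 marks exactly the windows whose values are all equal. The first n-1 rows have an incomplete window (NaN) and are never flagged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 1, 1, 1, 0, 0]})
n = 3

# sum of the current row and the n-1 rows before it (NaN until the window is full)
window_sum = df['A'].rolling(n).sum()

# True only where the window sum is 0 or n, i.e. all n values are equal
flag = np.where(window_sum % n == 0, True, False)

print(pd.DataFrame({'A': df['A'], 'window_sum': window_sum, 'flag': flag}))
```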

General case: this lets you handle other kinds of comparisons between rows (not only equality checks):

comparison = True

# compare each row with itself and the n-1 rows before it; all must match
for i in range(n):
    comparison &= df['A'] == df['A'].shift(i)

df['Desired Output'] = comparison
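A different way to get the same flags, not taken from the answer above, is to label each run of equal values and count the position inside the run. This is a minimal sketch under that assumption; the helper name flag_nth_repeat is hypothetical. It works for any values, not only 1s and 0s, and avoids looping over n.

```python
import pandas as pd

def flag_nth_repeat(s: pd.Series, n: int) -> pd.Series:
    """Return True from the nth consecutive equal value of each run onward."""
    # a new run starts wherever the value differs from the previous row
    run_id = (s != s.shift()).cumsum()
    # 1-based position of each row inside its run
    position = s.groupby(run_id).cumcount() + 1
    return position >= n

df = pd.DataFrame({'A': [1, 1, 1, 1, 1, 1, 0, 0]})
df['Desired Output'] = flag_nth_repeat(df['A'], 3).astype(int)
print(df)
```

With n=3 the run of six 1s is flagged from its third row, and the run of two 0s is too short to be flagged at all.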

Result: both approaches give the same output. For n=3, you get:

    A   Desired Output
0   1   False
1   1   False
2   1   True
3   1   True
4   1   True
5   1   True
6   0   False
7   0   False

For n=5, you get:

    A   Desired Output
0   1   False
1   1   False
2   1   False
3   1   False
4   1   True
5   1   True
6   0   False
7   0   False

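Formatting: if you need the new column to hold 1s and 0s, with the special-case approach you can use 1 and 0 instead of True and False when creating the column: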

# note: with this approach, a run of n zeros is flagged with 1 as well
df['Desired Output']=np.where(df['A'].rolling(n).sum()%n==0, 1, 0)

If you go with the general case, just add astype(int) when creating the column:

df['Desired Output'] = comparison.astype(int)

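Looks neat! Thank you, Joe!

I have updated my answer; it now includes a solution using rolling, which is what you were looking for. Please check whether it solves your problem.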