Python pandas: string ends with, and creating two DataFrames

python pandas

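I have a DataFrame with two columns. One is a date column, stored as a string, that ends with a space followed by the hour. The other column is the count for that hour. I would like to create two DataFrames from this data. One should hold the business hours, 7 AM to 6 PM, i.e. the rows whose date string ends with 7 through 17. The other should hold the rows whose date string ends with 18, 19, 20, 21, 22, 23, 0, 1, 2, 3, 4, 5, or 6. I have looked around, but most of what I could find returns booleans, such as pandas.Series.str.endswith.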

The data looks like this:

df
    Date            Count
0   2018-11-20 0    0
1   2018-11-20 1    0
2   2018-11-20 2    0
3   2018-11-20 3    1
4   2018-11-20 4    0
5   2018-11-20 5    0
6   2018-11-20 6    0
7   2018-11-20 7    1
8   2018-11-20 8    6
9   2018-11-20 9    0
10  2018-11-20 10   0
11  2018-11-20 11   0
12  2018-11-20 12   0
13  2018-11-20 13   0
14  2018-11-20 14   2
15  2018-11-20 15   5
16  2018-11-20 16   23
17  2018-11-20 17   0
18  2018-11-20 18   0
19  2018-11-20 19   3

Expected output:

business_hours_df
    Date            Count
0   2018-11-20 7    1
1   2018-11-20 8    6
2   2018-11-20 9    0
3   2018-11-20 10   0
4   2018-11-20 11   0
5   2018-11-20 12   0
6   2018-11-20 13   0
7   2018-11-20 14   2
8   2018-11-20 15   5
9   2018-11-20 16   23
10  2018-11-20 17   0

non_business_hours_df
    Date            Count
0   2018-11-20 0    0
1   2018-11-20 1    0
2   2018-11-20 2    0
3   2018-11-20 3    1
4   2018-11-20 4    0
5   2018-11-20 5    0
6   2018-11-20 6    0
7   2018-11-20 18   0
8   2018-11-20 19   3

You can use a boolean mask:

import pandas as pd

data = [['2018-11-20 0', 0],
        ['2018-11-20 1', 0],
        ['2018-11-20 2', 0],
        ['2018-11-20 3', 1],
        ['2018-11-20 4', 0],
        ['2018-11-20 5', 0],
        ['2018-11-20 6', 0],
        ['2018-11-20 7', 1],
        ['2018-11-20 8', 6],
        ['2018-11-20 9', 0],
        ['2018-11-20 10', 0],
        ['2018-11-20 11', 0],
        ['2018-11-20 12', 0],
        ['2018-11-20 13', 0],
        ['2018-11-20 14', 2],
        ['2018-11-20 15', 5],
        ['2018-11-20 16', 23],
        ['2018-11-20 17', 0],
        ['2018-11-20 18', 0],
        ['2018-11-20 19', 3]]

df = pd.DataFrame(data=data, columns=['Date', 'Count'])

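# Take the last whitespace-separated token (the hour), convert it to int,
# and flag the rows whose hour falls in the 7..17 range (inclusive).
mask = df['Date'].apply(lambda x: 7 <= int(x.split()[-1]) <= 17)

# Rows where the mask is True are business hours; ~mask selects the rest.
business_hours_df = df[mask]
non_business_hours_df = df[~mask]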

print(business_hours_df)
print(non_business_hours_df)
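
As a possible variation on the same boolean-mask idea, the hour can be pulled out with pandas' vectorized string methods instead of `apply`, and the index can be reset so that each result is numbered from 0 as in the expected output. This is only a sketch: it assumes every `Date` value ends with a space followed by an integer hour, and it rebuilds the sample data with a list comprehension for brevity.

```python
import pandas as pd

# Same sample data as in the question: hours 0..19 on 2018-11-20.
counts = [0, 0, 0, 1, 0, 0, 0, 1, 6, 0, 0, 0, 0, 0, 2, 5, 23, 0, 0, 3]
df = pd.DataFrame({
    'Date': [f'2018-11-20 {hour}' for hour in range(len(counts))],
    'Count': counts,
})

# Split once from the right on whitespace, keep the last piece (the hour),
# and convert it to an integer Series.
hours = df['Date'].str.rsplit(n=1).str[-1].astype(int)

# between() is inclusive on both ends by default, so this keeps 7..17.
mask = hours.between(7, 17)

# reset_index(drop=True) renumbers the rows from 0 in each result.
business_hours_df = df[mask].reset_index(drop=True)
non_business_hours_df = df[~mask].reset_index(drop=True)

print(business_hours_df)
print(non_business_hours_df)
```

Without `reset_index(drop=True)`, the two results keep the row labels of the original DataFrame (7 to 17, and 0 to 6 plus 18 and 19), which may or may not matter depending on how they are used afterwards.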

This is a great explanation and answer, @Daniel Mesejo