Python: How to remove negative times from a DataFrame?


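I have created a DataFrame in pandas that shows the calculated total time taken to complete each work order. Because of human input errors, some of the calculated times come out negative, as in row 30. Even if I switched AM to PM, the result would still be wrong, because working hours are 07:30-16:00, so it is best to ignore these rows.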

    Work Order     WorkType  AST                  AFT                  comp_time
10  BAEBRO-898690  RM        1900-01-01 06:27:41  1900-01-01 08:05:28  01:37:47
13  BAEBRO-914693  RM        1900-01-01 08:30:00  1900-01-01 09:00:00  00:30:00
27  BAEBRO-898787  RM        1900-01-01 10:00:00  1900-01-01 10:30:00  00:30:00
30  BAEBRO-914680  RM        1900-01-01 14:32:08  1900-01-01 10:37:17  -1 days +20:05:09
37  BAEBRO-914660  RM        1900-01-01 10:47:39  1900-01-01 11:32:02  00:44:23
The code I used to get this result is:

import pandas as pd
from datetime import time
from datetime import timedelta
from pandas import DataFrame
import matplotlib.pyplot as plt

df = pd.read_excel('C:/Users/Nativ_Zero/Desktop/work data/July.xls')

# Copy the selected columns so later assignments do not write into a slice of df
df_work = df[['Work Order', 'WorkType', 'AST', 'AFT']].copy()

# Parse the clock times; values that cannot be parsed become NaT
df_work['AFT'] = pd.to_datetime(df_work['AFT'], format='%H:%M:%S', errors='coerce')
df_work['AST'] = pd.to_datetime(df_work['AST'], format='%H:%M:%S', errors='coerce')

# Keep only the 'RM' work orders
rm_work = df_work[df_work.WorkType == 'RM'].copy()

# Completion time = finish time (AFT) minus start time (AST)
rm_work['comp_time'] = rm_work['AFT'] - rm_work['AST']
rm_work.head()

The following code should work for you:

import pandas as pd

df = pd.read_excel('C:/Users/Nativ_Zero/Desktop/work data/July.xls')

# Copy the selected columns so later assignments do not write into a slice of df
df_work = df[['Work Order', 'WorkType', 'AST', 'AFT']].copy()

df_work['AFT'] = pd.to_datetime(df_work['AFT'], format='%H:%M:%S', errors='coerce')
df_work['AST'] = pd.to_datetime(df_work['AST'], format='%H:%M:%S', errors='coerce')

rm_work = df_work[df_work.WorkType == 'RM'].copy()

rm_work['comp_time'] = rm_work['AFT'] - rm_work['AST']
rm_work = rm_work[rm_work.comp_time >= pd.Timedelta(0)]  # Filtering condition: keep non-negative durations
rm_work.head()

You need to compare against the appropriate data type, which in this case is Timedelta.
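As a minimal, self-contained sketch of that comparison: the small frame below is a hypothetical stand-in for the Excel file, rebuilt by hand from the five rows shown in the question, so the exact values and the omission of the WorkType column are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data copied from the rows printed in the question;
# the real data comes from an Excel file that is not available here.
rm_work = pd.DataFrame(
    {
        "Work Order": ["BAEBRO-898690", "BAEBRO-914693", "BAEBRO-898787",
                       "BAEBRO-914680", "BAEBRO-914660"],
        "AST": ["06:27:41", "08:30:00", "10:00:00", "14:32:08", "10:47:39"],
        "AFT": ["08:05:28", "09:00:00", "10:30:00", "10:37:17", "11:32:02"],
    },
    index=[10, 13, 27, 30, 37],
)

# Parse the clock times; with no date given, pandas uses 1900-01-01.
for col in ("AST", "AFT"):
    rm_work[col] = pd.to_datetime(rm_work[col], format="%H:%M:%S", errors="coerce")

# Subtracting two datetime columns gives a timedelta column.
rm_work["comp_time"] = rm_work["AFT"] - rm_work["AST"]

# Compare with a Timedelta of zero rather than the integer 0.
non_negative = rm_work[rm_work["comp_time"] >= pd.Timedelta(0)]
print(non_negative[["Work Order", "comp_time"]])
```

Row 30 is the only one with a finish time earlier than its start time, so it is the only row the mask drops.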

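Use .apply to check whether each pandas time value is negative. Be sure to compare against pd.Timedelta(0) rather than plain 0, because comparing against 0 raises an error. If the value is negative, return NumPy NaN. Finally, drop the rows that contain NaN.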

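If your column already contains NaN values that you want to keep, this will cause a problem! In that case, you can change the function to return some other value and then exclude the rows holding that unique value.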

import numpy as np
import pandas as pd

def check_if_negative(pd_time):
    if pd_time >= pd.Timedelta(0): # positive time and 0 time
        return pd_time
    elif pd_time < pd.Timedelta(0): # negative time
        return np.nan
    else:
        # reached when both comparisons are False, e.g. for a missing value (NaT)
        print(f'problem! {pd_time} has an issue') # quick error check
        return pd_time

rm_work['comp_time'] = rm_work['AFT'] - rm_work['AST'] # create timedelta
rm_work['comp_time'] = rm_work.comp_time.apply(check_if_negative) # apply check to column

rm_work = rm_work.dropna(subset=['comp_time']) # delete rows with NaN
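For the case mentioned above, where the column already holds missing values that should be kept, one possible alternative to a sentinel value is a boolean mask that drops only the negative durations. This is a sketch under assumptions, not the answerer's method: the four durations below are made up for illustration, with one missing value and one negative value.

```python
import pandas as pd

# Hypothetical durations: index 13 is already missing, index 30 is negative.
df = pd.DataFrame(
    {
        "comp_time": pd.to_timedelta([
            pd.Timedelta(hours=1, minutes=37, seconds=47),
            pd.NaT,
            -pd.Timedelta(hours=3, minutes=54, seconds=51),
            pd.Timedelta(minutes=30),
        ])
    },
    index=[10, 13, 30, 37],
)

# A missing value compares as False against anything, so it is kept
# explicitly with isna(); only rows that are truly negative are dropped.
keep = df["comp_time"].isna() | (df["comp_time"] >= pd.Timedelta(0))
cleaned = df[keep]
print(cleaned)
```

Because nothing is overwritten with NaN, the rows that were missing from the start stay distinguishable from the rows that were removed.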

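You can filter out the negative values: rm_work = rm_work[rm_work.com_time >= 0]
I entered this code and got: AttributeError: 'DataFrame' object has no attribute 'com_time'
It is comp_time, @PaulLane. Sorry, I should have noticed that.
I now get: TypeError: cannot compare a TimedeltaIndex with type int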