Python: In a DataFrame, how do I filter timedelta64 by days and change the values to str?


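I am currently generating an xlsx file from a MySQL table that has many columns, although for this question only one of them matters. I use pandas to generate the xlsx: I open a connection to the database and then do the following: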

import pandas as pd

sql = 'SELECT * FROM bd.teste'
df = pd.read_sql_query(sql, sql_con_obj.con)
df.to_excel("generated.xlsx")
If I then run:

df.info()
I get this:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 147 entries, 0 to 146
Data columns (total 21 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   Number               147 non-null    object
 1   Date                 147 non-null    object
 ...
 20  IT                   147 non-null    object 
dtypes: object(20)
memory usage: 24.8+ KB

Date is now a datetime; df.info() shows it as datetime64[ns]. I want to create a new column that subtracts the date from the current time and gives the number of days:

df["Difference"] = pd.to_datetime("now") - df['Date']
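This creates the column, and its values match the difference from the date; the resulting column has dtype timedelta64[ns]. Now comes the big problem: I want to separate the values under 7 days from those of 7 days or more and change them to strings.
Do you know how to do that? I thought of iterrows, but I don't know how to use it for this...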

You can use map.

Example

df = pd.DataFrame({
    "datetime": pd.date_range(
        start=pd.to_datetime('now') - pd.Timedelta(days=10),
        end=pd.to_datetime('now')
    )
})
Compute the difference

df['difference'] = pd.to_datetime('now') - df.datetime
Output

                     datetime              difference
0  2021-03-15 20:50:30.231135 10 days 00:00:06.153861
1  2021-03-16 20:50:30.231135  9 days 00:00:06.153861
2  2021-03-17 20:50:30.231135  8 days 00:00:06.153861
3  2021-03-18 20:50:30.231135  7 days 00:00:06.153861
4  2021-03-19 20:50:30.231135  6 days 00:00:06.153861
5  2021-03-20 20:50:30.231135  5 days 00:00:06.153861
6  2021-03-21 20:50:30.231135  4 days 00:00:06.153861
7  2021-03-22 20:50:30.231135  3 days 00:00:06.153861
8  2021-03-23 20:50:30.231135  2 days 00:00:06.153861
9  2021-03-24 20:50:30.231135  1 days 00:00:06.153861
10 2021-03-25 20:50:30.231135  0 days 00:00:06.153861
Then create a boolean column that is True where the difference is less than 7 days

df['less_than_7'] = df.difference.dt.days < 7
Output

                     datetime              difference  less_than_7
0  2021-03-15 20:50:30.231135 10 days 00:00:06.153861        False
1  2021-03-16 20:50:30.231135  9 days 00:00:06.153861        False
2  2021-03-17 20:50:30.231135  8 days 00:00:06.153861        False
3  2021-03-18 20:50:30.231135  7 days 00:00:06.153861        False
4  2021-03-19 20:50:30.231135  6 days 00:00:06.153861         True
5  2021-03-20 20:50:30.231135  5 days 00:00:06.153861         True
6  2021-03-21 20:50:30.231135  4 days 00:00:06.153861         True
7  2021-03-22 20:50:30.231135  3 days 00:00:06.153861         True
8  2021-03-23 20:50:30.231135  2 days 00:00:06.153861         True
9  2021-03-24 20:50:30.231135  1 days 00:00:06.153861         True
10 2021-03-25 20:50:30.231135  0 days 00:00:06.153861         True
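
A note on the comparison above: .dt.days returns only the whole-day component of each timedelta, so a value such as 7 days 00:00:06 counts as 7 and is not "less than 7". The sketch below uses hand-made values (not the data above) to suggest that, for non-negative differences, comparing against pd.Timedelta(days=7) directly should give the same result:

```python
import pandas as pd

# Hypothetical sample values chosen to sit around the 7-day boundary.
diffs = pd.Series(
    pd.to_timedelta(["6 days 23:59:59", "7 days 00:00:00", "7 days 00:00:06"])
)

# Option 1: compare the whole-day component, as in the answer.
by_days = diffs.dt.days < 7

# Option 2: compare the timedelta itself against a 7-day Timedelta.
by_timedelta = diffs < pd.Timedelta(days=7)

print(by_days.tolist())
print(by_timedelta.tolist())
```

Either form can feed the mapping step; the second one avoids extracting the day component first.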
Finally, map the boolean column to a new column, Difference

map_dict = {False: '>=7', True: '<7'}
df['Difference'] = df.less_than_7.map(map_dict)
df[['datetime', 'Difference']]
Output

                     datetime Difference
0  2021-03-15 20:50:30.231135        >=7
1  2021-03-16 20:50:30.231135        >=7
2  2021-03-17 20:50:30.231135        >=7
3  2021-03-18 20:50:30.231135        >=7
4  2021-03-19 20:50:30.231135         <7
5  2021-03-20 20:50:30.231135         <7
6  2021-03-21 20:50:30.231135         <7
7  2021-03-22 20:50:30.231135         <7
8  2021-03-23 20:50:30.231135         <7
9  2021-03-24 20:50:30.231135         <7
10 2021-03-25 20:50:30.231135         <7
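
The boolean column and the mapping can also be folded into a single step. This is only a sketch of an alternative, not part of the answer: it swaps in numpy.where, and it uses a fixed reference timestamp (an assumption made so the result is reproducible) instead of 'now'.

```python
import numpy as np
import pandas as pd

# Fixed reference time (an assumption for reproducibility; the answer uses 'now').
now = pd.Timestamp("2021-03-25 20:50:36")

# Eleven daily timestamps ending at the reference time, as in the example.
df = pd.DataFrame({"datetime": pd.date_range(end=now, periods=11, freq="D")})
df["difference"] = now - df["datetime"]

# Label each row directly, without the intermediate boolean column.
df["Difference"] = np.where(
    df["difference"] < pd.Timedelta(days=7), "<7", ">=7"
)
print(df[["datetime", "Difference"]])
```

With these inputs the four oldest rows get '>=7' and the remaining seven get '<7'.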

Setting values conditionally is straightforward: just use .loc[condition, cols] = value

Sample df:

from pandas import DataFrame, to_timedelta
df = DataFrame()
df["deltas"] = [to_timedelta(i, unit= "D") for i in range(1, 11)]
df
   deltas
0  1 days
1  2 days
2  3 days
3  4 days
4  5 days
5  6 days
6  7 days
7  8 days
8  9 days
9 10 days
If you do this:

cond = df["deltas"] > to_timedelta(7, unit= "D")
the condition looks like this:

cond
0    False
1    False
2    False
3    False
4    False
5    False
6    False
7     True
8     True
9     True
Name: deltas, dtype: bool
You can use it in .loc:


df.loc[cond, "deltas"] = "> 7"
df.loc[~cond, "deltas"] = "<= 7"

df
  deltas
0   <= 7
1   <= 7
2   <= 7
3   <= 7
4   <= 7
5   <= 7
6   <= 7
7    > 7
8    > 7
9    > 7
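
Assigning strings into the deltas column replaces the original timedelta values, and mixing dtypes in one column this way may produce warnings in recent pandas versions. As a hedged alternative sketch, the labels can go into a separate column (the name label is made up for illustration) so the timedeltas stay available:

```python
from pandas import DataFrame, to_timedelta

# Same sample data as above: timedeltas of 1 to 10 days.
df = DataFrame({"deltas": [to_timedelta(i, unit="D") for i in range(1, 11)]})
cond = df["deltas"] > to_timedelta(7, unit="D")

# Write the labels into a separate column instead of overwriting "deltas".
df["label"] = "<= 7"           # default for every row
df.loc[cond, "label"] = "> 7"  # overwrite only the rows matching the condition
print(df)
```

The deltas column keeps its timedelta dtype, which helps if it is needed for further filtering.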

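Let me see if I got it: you changed the DataFrame to generate only 2 columns, with fixed dates. My first problem is that I don't know how to put the other columns I pull from MySQL into that DataFrame. I see you can do it manually, but the values change every day, so the rows change too.

When I tried to run the first code, it said "name 'to_timedelta' is not defined". Then I tried importing timedelta from datetime and it gave the same error...

It's a pandas function, so: from pandas import to_timedelta

OK, I added pd. in front and now it shows this error: "Length of values (10) does not match length of index (145)". I can't set a fixed range because the index changes every day...

Yes, that's it. I had been searching for an answer for three days and yours worked! Thank you very much!

Yes, it worked again. You're the man! Thank you.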
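
The "Length of values (10) does not match length of index (145)" error quoted in the comments is consistent with assigning the 10-element sample list to a DataFrame that has 145 rows. A minimal sketch follows, assuming the real DataFrame already has a datetime column called Date as in the question; the Number values, the dates and the fixed reference time are invented for illustration. Deriving the condition from the existing column keeps the lengths aligned, however many rows the query returns:

```python
import pandas as pd

# Fixed reference time instead of 'now' (an assumption, for reproducibility).
now = pd.Timestamp("2021-03-25 12:00:00")

# Stand-in for the DataFrame read from MySQL; values are invented.
df = pd.DataFrame({
    "Number": ["A1", "A2", "A3"],
    "Date": pd.to_datetime(["2021-03-24", "2021-03-18", "2021-03-01"]),
})

# Compute the difference from the existing column; no fixed range is needed.
delta = now - df["Date"]
cond = delta > pd.to_timedelta(7, unit="D")

# Label every row, then overwrite the rows that match the condition.
df["Difference"] = "<= 7"
df.loc[cond, "Difference"] = "> 7"
print(df)
```

The other columns are left untouched, so the frame can still be written out with to_excel afterwards.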