
Python: Pandas month

Tags: python, pandas, datetime, dataframe

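I want to create a column in a DataFrame that is derived from two other DataFrames.

In the example below, two DataFrames are created: df1 and df2.

A third DataFrame, df3, is then created by merging the first two. In df3, the "Dates" column is converted to datetime type.

After that, a "DateMonth" column is created, holding the month extracted from the "Dates" column.

In df3 I need a new column whose value is 1 if the month in the "DateMonth" column appears in the "months" column.

My difficulty is that the "months" column holds either zero or a list of months.

How can I achieve this result?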

Try the following solution:

import pandas as pd
from datetime import datetime  # matched() below calls datetime.strptime

# define function for df.apply
def matched(row):
    if isinstance(row['months'], str):
        # for the case ('Feb, Mar, Apr') - convert each abbreviation to its month number and return True if the 'Dates' month is among them
        return row['Dates'].month in [datetime.strptime(mon.strip(), '%b').month for mon in row['months'].split(',')]
    else:
        # for numbers - return True if months match
        return row['Dates'].month == row['months']

# df1 and df2:
id_sales   = [1, 2, 3, 4, 5, 6]
col_names  = ['Id', 'parrotId', 'Dates']
df1        = pd.DataFrame(columns = col_names)
df1.Id     = id_sales
df1.parrotId = [1, 2, 3, 1, 2, 3]
df1.Dates  = ['2012-12-25', '2012-08-20', '2013-07-23', '2014-01-14', '2016-02-21', '2015-10-31']

col_names2 = ['parrotId', 'months']
df2        = pd.DataFrame(columns = col_names2)
df2.parrotId = [1, 2, 3]
df2.months = [12, ('Feb, Mar, Apr'), 0]

df3 = pd.merge(df1, df2, on = 'parrotId')
df3.Dates = pd.to_datetime(df3.Dates)

# use apply to run the function on each row, astype converts boolean to int (0/1) 
df3['DateMonth'] = df3.apply(matched, axis=1).astype(int)
df3

Output:
   Id  parrotId      Dates         months  DateMonth
0   1         1 2012-12-25             12          1
1   4         1 2014-01-14             12          0
2   2         2 2012-08-20  Feb, Mar, Apr          0
3   5         2 2016-02-21  Feb, Mar, Apr          1
4   3         3 2013-07-23              0          0
5   6         3 2015-10-31              0          0
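
As a possible variation on the apply-based approach above, here is a minimal sketch that first turns every "months" cell into a set of month numbers and then tests membership. The names MONTH_ABBRS, months_to_set, month_flag and demo are made up for this illustration and are not part of the original answer. It assumes the abbreviations are the English three-letter forms; anything else raises a KeyError.

```python
import pandas as pd

# Hypothetical lookup table: English three-letter abbreviation -> month number.
MONTH_ABBRS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
MONTH_NUMBERS = {abbr: num for num, abbr in enumerate(MONTH_ABBRS, start=1)}


def months_to_set(value):
    """Return the set of month numbers described by one 'months' cell."""
    if isinstance(value, str):
        # 'Feb, Mar, Apr' -> {2, 3, 4}
        return {MONTH_NUMBERS[part.strip()] for part in value.split(",")}
    # A plain number; 0 is not a real month, so it never matches a date.
    return {int(value)}


def month_flag(df):
    """Return 1 where the month of 'Dates' is listed in 'months', else 0."""
    allowed = df["months"].map(months_to_set)
    flags = [int(month in months)
             for month, months in zip(df["Dates"].dt.month, allowed)]
    return pd.Series(flags, index=df.index)


# Small stand-in for the merged frame from the answer.
demo = pd.DataFrame({
    "Dates": pd.to_datetime(["2012-12-25", "2014-01-14", "2012-08-20",
                             "2016-02-21", "2013-07-23"]),
    "months": [12, 12, "Feb, Mar, Apr", "Feb, Mar, Apr", 0],
})
demo["DateMonth"] = month_flag(demo)
print(demo)
```

Using a fixed lookup table instead of strptime means the result does not depend on the active locale, at the cost of only accepting the spellings listed in MONTH_ABBRS.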

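What is a parrot?

Lukas, first of all, thank you very much for your help. Unfortunately, it didn't work. I got the error message "name 'datetime' is not defined". I even imported the datetime library and ran the last step again, which produced a new error: "AttributeError: ("module 'datetime' has no attribute 'strptime'", 'occurred at index 2')".

On my first attempt I had imported the library incorrectly. I imported it this way and it works perfectly: from datetime import datetime. Thanks again, Lukas.

Lukas, do you have any idea how to resolve the error below? ValueError: ('unconverted data remains: t', 'occurred at index 16772')

@ngelo, my guess is that one of the month abbreviations contains a typo. When I do this:
df2.months=[12,('Febt,Mar,Apr'),0]
I get a similar error (only the index differs). If you read the source data from a file, look through it carefully for an incorrectly formatted month abbreviation (probably four characters, ending in t).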
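
One way to hunt for such a typo, sketched under the assumption that the "months" values are either numbers or comma-separated strings as in the example: try to parse every token with the same '%b' format and collect the ones that fail. The function name bad_month_tokens and the sample Series are made up for this illustration.

```python
from datetime import datetime

import pandas as pd


def bad_month_tokens(cell):
    """Return the tokens in one 'months' cell that '%b' cannot parse."""
    if not isinstance(cell, str):
        return []  # numeric cells are not parsed with strptime at all
    bad = []
    for token in cell.split(","):
        token = token.strip()
        try:
            datetime.strptime(token, "%b")
        except ValueError:
            bad.append(token)
    return bad


# Hypothetical sample data containing the typo from the comment above.
months = pd.Series([12, "Febt, Mar, Apr", 0, "Jan, Dec"])

problems = months.map(bad_month_tokens)
problems = problems[problems.map(len) > 0]  # keep only rows with bad tokens
print(problems)
```

The index of problems points at the offending rows, so on real data you could run the same check on df2.months before calling apply.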