Python: wrangling data into a time series format
Here you will find the sample CSV file, and below it the Python code I used.

Company,State,Product,Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
Fred,NJ,A,2017,111,7,69,152,192,218,246,98,329,59,191,43
Fred,NJ,A,2018,391,317,241,115,40,300,295,288,89,178,73,110
Fred,NJ,A,2019,345,30,271,78,10,45,185,349,237,70,197,389
George,AZ,B,2017,260,55,258,389,218,7,263,241,191,398,266,42
George,AZ,B,2018,260,10,256,399,216,356,164,10,256,399,231,287
George,AZ,B,2019,173,10,360,45,16,297,25,1,134,35,294,193
Jamie,NY,C,2017,360,18,127,96,175,63,40,194,118,332,128,26
Jamie,NY,C,2018,70,175,301,259,160,89,3,314,115,115,117,281
Jamie,NY,C,2019,259,11,22,122,191,7,123,378,14,327,95,325
Barbara,UT,D,2017,338,,278,176,80,338,268,52,240,383,217,225
Barbara,UT,D,2018,316,242,129,158,238,97,234,266,62,223,287,21
Barbara,UT,D,2019,378,130,106,70,92,77,389,189,140,126,193,188
GETTY,CA,E,2017,50,5,77,47,8,169,163,58,324,186,208,57
GETTY,CA,E,2018,53,92,177,111,11,238,96,85,129,396,75,84
GETTY,CA,E,2019,355,,254,303,195,353,141,168,97,30,25,310
Barbara,UT,F,2017,167,192,37,53,390,357,340,48,389,202,92,51
Barbara,UT,F,2018,177,265,359,350,318,359,359,93,393,251,51,255
Barbara,UT,F,2019,35,302,234,306,156,13,122,354,0,214,81,86
Bruse,OR,A,2017,146,231,246,246,150,82,302,249,225,213,394,140
Bruse,OR,A,2018,244,385,36,86,270,381,309,98,163,321,337,139
Bruse,OR,A,2019,183,331,337,188,235,308,10,130,143,31,193,4
GETTY,CA,B,2017,152,273,60,162,68,53,334,204,238,262,220,223
GETTY,CA,B,2018,161,221,286,80,182,261,387,208,165,303,220,101
GETTY,CA,B,2019,310,218,332,349,144,3,2,82,132,199,375,257

Each customer is recorded on a separate row per year, with one column for each month from January to December. I first wrote some code to spread the data out. The problem I am running into is getting the new data into a time series layout like this:

Date, Units, Company, State, Product

I would like to use pandas melt or stack. Let me know your thoughts.

Python code:
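import pandas as pd

# Load the wide table: one row per Company/State/Product/Year, one column per month
df = pd.read_csv('Product.csv')
print(df)

# Copy each year's monthly values into new columns named after the date
df.loc[(df['Year']==2017), '01-01-2017'] = (df['Jan'])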
df.loc[(df['Year']==2017), '02-01-2017'] = (df['Feb'])
df.loc[(df['Year']==2017), '03-01-2017'] = (df['Mar'])
df.loc[(df['Year']==2017), '04-01-2017'] = (df['Apr'])
df.loc[(df['Year']==2017), '05-01-2017'] = (df['May'])
df.loc[(df['Year']==2017), '06-01-2017'] = (df['Jun'])
df.loc[(df['Year']==2017), '07-01-2017'] = (df['Jul'])
df.loc[(df['Year']==2017), '08-01-2017'] = (df['Aug'])
df.loc[(df['Year']==2017), '09-01-2017'] = (df['Sep'])
df.loc[(df['Year']==2017), '10-01-2017'] = (df['Oct'])
df.loc[(df['Year']==2017), '11-01-2017'] = (df['Nov'])
df.loc[(df['Year']==2017), '12-01-2017'] = (df['Dec'])
df.loc[(df['Year']==2018), '01-01-2018'] = (df['Jan'])
df.loc[(df['Year']==2018), '02-01-2018'] = (df['Feb'])
df.loc[(df['Year']==2018), '03-01-2018'] = (df['Mar'])
df.loc[(df['Year']==2018), '04-01-2018'] = (df['Apr'])
df.loc[(df['Year']==2018), '05-01-2018'] = (df['May'])
df.loc[(df['Year']==2018), '06-01-2018'] = (df['Jun'])
df.loc[(df['Year']==2018), '07-01-2018'] = (df['Jul'])
df.loc[(df['Year']==2018), '08-01-2018'] = (df['Aug'])
df.loc[(df['Year']==2018), '09-01-2018'] = (df['Sep'])
df.loc[(df['Year']==2018), '10-01-2018'] = (df['Oct'])
df.loc[(df['Year']==2018), '11-01-2018'] = (df['Nov'])
df.loc[(df['Year']==2018), '12-01-2018'] = (df['Dec'])
df.loc[(df['Year']==2019), '01-01-2019'] = (df['Jan'])
df.loc[(df['Year']==2019), '02-01-2019'] = (df['Feb'])
df.loc[(df['Year']==2019), '03-01-2019'] = (df['Mar'])
df.loc[(df['Year']==2019), '04-01-2019'] = (df['Apr'])
df.loc[(df['Year']==2019), '05-01-2019'] = (df['May'])
df.loc[(df['Year']==2019), '06-01-2019'] = (df['Jun'])
df.loc[(df['Year']==2019), '07-01-2019'] = (df['Jul'])
df.loc[(df['Year']==2019), '08-01-2019'] = (df['Aug'])
df.loc[(df['Year']==2019), '09-01-2019'] = (df['Sep'])
df.loc[(df['Year']==2019), '10-01-2019'] = (df['Oct'])
df.loc[(df['Year']==2019), '11-01-2019'] = (df['Nov'])
df.loc[(df['Year']==2019), '12-01-2019'] = (df['Dec'])
print(df)
Here is how I would go about it:
import pandas as pd
from io import StringIO
df_ts = pd.read_csv(StringIO("""Company,State,Product,Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
Fred,NJ,A,2017,111,7,69,152,192,218,246,98,329,59,191,43
Fred,NJ,A,2018,391,317,241,115,40,300,295,288,89,178,73,110
Fred,NJ,A,2019,345,30,271,78,10,45,185,349,237,70,197,389
George,AZ,B,2017,260,55,258,389,218,7,263,241,191,398,266,42
George,AZ,B,2018,260,10,256,399,216,356,164,10,256,399,231,287
George,AZ,B,2019,173,10,360,45,16,297,25,1,134,35,294,193
Jamie,NY,C,2017,360,18,127,96,175,63,40,194,118,332,128,26
Jamie,NY,C,2018,70,175,301,259,160,89,3,314,115,115,117,281
Jamie,NY,C,2019,259,11,22,122,191,7,123,378,14,327,95,325
Barbara,UT,D,2017,338,,278,176,80,338,268,52,240,383,217,225
Barbara,UT,D,2018,316,242,129,158,238,97,234,266,62,223,287,21
Barbara,UT,D,2019,378,130,106,70,92,77,389,189,140,126,193,188
GETTY,CA,E,2017,50,5,77,47,8,169,163,58,324,186,208,57
GETTY,CA,E,2018,53,92,177,111,11,238,96,85,129,396,75,84
GETTY,CA,E,2019,355,,254,303,195,353,141,168,97,30,25,310
Barbara,UT,F,2017,167,192,37,53,390,357,340,48,389,202,92,51
Barbara,UT,F,2018,177,265,359,350,318,359,359,93,393,251,51,255
Barbara,UT,F,2019,35,302,234,306,156,13,122,354,0,214,81,86
Bruse,OR,A,2017,146,231,246,246,150,82,302,249,225,213,394,140
Bruse,OR,A,2018,244,385,36,86,270,381,309,98,163,321,337,139
Bruse,OR,A,2019,183,331,337,188,235,308,10,130,143,31,193,4
GETTY,CA,B,2017,152,273,60,162,68,53,334,204,238,262,220,223
GETTY,CA,B,2018,161,221,286,80,182,261,387,208,165,303,220,101
GETTY,CA,B,2019,310,218,332,349,144,3,2,82,132,199,375,257"""))
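# Reshape from wide (one column per month) to long (one row per month).
# The result is assigned back so that the later steps can use the new columns.
df_ts = df_ts.melt(
    id_vars=["Company", "State", "Product", "Year"],
    var_name="month",
    value_name="demand"
)
df_ts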
This produces:
    Company State Product  Year month  demand
0      Fred    NJ       A  2017   Jan   111.0
1      Fred    NJ       A  2018   Jan   391.0
2      Fred    NJ       A  2019   Jan   345.0
3    George    AZ       B  2017   Jan   260.0
4    George    AZ       B  2018   Jan   260.0
...     ...   ...     ...   ...   ...     ...
283   Bruse    OR       A  2018   Dec   139.0
284   Bruse    OR       A  2019   Dec     4.0
285   GETTY    CA       B  2017   Dec   223.0
286   GETTY    CA       B  2018   Dec   101.0
287   GETTY    CA       B  2019   Dec   257.0
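As an aside, since the question also mentions stack: the same long layout can probably be reached with `stack` as well. The sketch below is only an illustration on a tiny hand-made table (the `wide` frame and the `id_cols` name are assumptions made for this example, with values copied from the first two sample rows). Note that `stack` returns the rows in a different order than `melt`, and that its handling of missing values depends on the pandas version, so the example deliberately contains no empty cells.

```python
import pandas as pd

# Tiny wide table in the same shape as the sample data (hypothetical subset:
# first two Fred rows, first three months only)
wide = pd.DataFrame({
    "Company": ["Fred", "Fred"],
    "State": ["NJ", "NJ"],
    "Product": ["A", "A"],
    "Year": [2017, 2018],
    "Jan": [111, 391],
    "Feb": [7, 317],
    "Mar": [69, 241],
})

id_cols = ["Company", "State", "Product", "Year"]

# stack: move the identifier columns into the index, name the column axis
# "month", then pivot the month columns down into rows
long_stack = (
    wide.set_index(id_cols)
        .rename_axis(columns="month")
        .stack()
        .rename("demand")
        .reset_index()
)

# melt: the approach used in the answer, for comparison
long_melt = wide.melt(id_vars=id_cols, var_name="month", value_name="demand")

print(long_stack)
print(long_melt)
```

Both frames hold the same six rows; `stack` lists all months of one record before moving to the next, while `melt` lists all records for one month first.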
After that, you can do the time manipulation like this:
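# Combine Year and month (e.g. "2017-Jan") into a timestamp and use it as the index
df_ts = df_ts.set_index(df_ts.apply(lambda x: pd.to_datetime(f"{x.Year}-{x.month}"), axis=1))
# Year and month are now encoded in the index, so the columns can be dropped
df_ts = df_ts.drop(["Year", "month"], axis=1)
df_ts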
That leaves you with:
           Company State Product  demand
2017-01-01    Fred    NJ       A   111.0
2018-01-01    Fred    NJ       A   391.0
2019-01-01    Fred    NJ       A   345.0
2017-01-01  George    AZ       B   260.0
2018-01-01  George    AZ       B   260.0
...            ...   ...     ...     ...
2018-12-01   Bruse    OR       A   139.0
2019-12-01   Bruse    OR       A     4.0
2017-12-01   GETTY    CA       B   223.0
2018-12-01   GETTY    CA       B   101.0
2019-12-01   GETTY    CA       B   257.0
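If you want exactly the Date, Units, Company, State, Product layout from the question, with the rows in chronological order per customer, something along these lines might work. This is a minimal sketch, not the only way to do it: `long_df` is a hypothetical four-row stand-in for the melted frame, the `Date` and `Units` column names come from the layout in the question, and the `%b` format assumes English month abbreviations (which depends on the locale).

```python
import pandas as pd

# Hypothetical stand-in for the melted frame (four rows copied from the sample)
long_df = pd.DataFrame({
    "Company": ["Fred", "Fred", "Fred", "Fred"],
    "State": ["NJ", "NJ", "NJ", "NJ"],
    "Product": ["A", "A", "A", "A"],
    "Year": [2017, 2018, 2017, 2018],
    "month": ["Jan", "Jan", "Feb", "Feb"],
    "demand": [111.0, 391.0, 7.0, 317.0],
})

# Build strings such as "2017-Jan" and parse the whole column in one call,
# instead of calling pd.to_datetime once per row
long_df["Date"] = pd.to_datetime(
    long_df["Year"].astype(str) + "-" + long_df["month"],
    format="%Y-%b",
)

# Rename, reorder and sort so each customer's series runs in date order
ts = (
    long_df.rename(columns={"demand": "Units"})
           [["Date", "Units", "Company", "State", "Product"]]
           .sort_values(["Company", "State", "Product", "Date"])
           .reset_index(drop=True)
)

print(ts)
```

Keeping `Date` as a regular column rather than the index is a design choice here: the same date appears once per customer, so the index would not be unique anyway.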
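After that, you can manipulate it however you like.

Welcome to SO. Please describe what you are trying to do and give an example of the expected output, so that we can help you.

Thanks Daman, it worked great!!!

Great, please accept it as the answer ;).

Thank you very much.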