Python: Pandas groupby to_csv

I want to output a Pandas groupby DataFrame to CSV. I tried various StackOverflow solutions, but none of them worked.

Python 3.6.1, Pandas 0.20.1

The groupby result looks like this:

id  month   year    count
week                
0   9066    82  32142   895
1   7679    84  30112   749
2   8368    126 42187   872
3   11038   102 34165   976
4   8815    117 34122   767
5   10979   163 50225   1252
6   8726    142 38159   996
7   5568    63  26143   582
I want a CSV that looks like:

week  count
0   895
1   749
2   872
3   976
4   767
5   1252
6   996
7   582
Current code:

week_grouped = df.groupby('week')
week_grouped.sum() #At this point you have the groupby result
week_grouped.to_csv('week_grouped.csv') #Can't do this - .to_csv is not a df function. 
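SO solutions I tried:

Result: AttributeError: Cannot access callable attribute 'drop_duplicates' of 'DataFrameGroupBy' objects, try using the 'apply' method

Result: AttributeError: Cannot access callable attribute 'reset_index' of 'DataFrameGroupBy' objects, try using the 'apply' method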

Try changing the second line to week_grouped = week_grouped.sum() and rerun all three lines.

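If you run week_grouped.sum() in its own Jupyter notebook cell, you will see that the statement returns its result to the cell's output instead of assigning it back to week_grouped. Some methods have an inplace=True parameter (for example, df.sort_values(by=col_name, inplace=True)), but sum() does not.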
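A minimal sketch of that difference, using a small made-up DataFrame (the sample data and the name summed are assumptions for illustration, not from the question): sum() hands back a new DataFrame and leaves the groupby object untouched, so the result has to be assigned before to_csv can be called on it.

```python
import pandas as pd

# Made-up sample data; 'week' repeats across rows as in the question.
df = pd.DataFrame({
    'week': [0, 0, 1, 1],
    'count': [400, 495, 300, 449],
})

week_grouped = df.groupby('week')

# sum() returns a new DataFrame; week_grouped itself is still a groupby object.
summed = week_grouped.sum()

print(type(week_grouped).__name__)  # DataFrameGroupBy
print(type(summed).__name__)        # DataFrame

# Calling to_csv() without a path returns the CSV text instead of writing a file.
csv_text = summed.to_csv()
print(csv_text)
```

Passing a file name to to_csv instead would write the same text to disk.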

Edit: Does each week number appear only once in the CSV? If so, here is a simpler solution that does not use groupby:

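import pandas as pd
df = pd.read_csv('input.csv')
# Keep only the two wanted columns; index=False omits the row index
df[['week', 'count']].to_csv('output.csv', index=False)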

I don't think groupby is necessary; you can simply drop the columns you don't need:

df = df.drop(['month', 'year'], axis=1)
df = df.reset_index()  # reset_index() returns a new DataFrame, so assign it
df.to_csv('Your path')
Try this:

week_grouped = df.groupby('week')
week_grouped.sum().reset_index().to_csv('week_grouped.csv')
This writes the entire DataFrame to the file. If you only want those two columns:

week_grouped = df.groupby('week')
week_grouped.sum().reset_index()[['week', 'count']].to_csv('week_grouped.csv')
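Note that to_csv also writes the row index by default, so the call above produces an extra unnamed first column. A sketch that yields exactly the week and count columns, assuming made-up sample data (index=False is a standard to_csv parameter; the name week_counts is only illustrative):

```python
import pandas as pd

# Made-up sample data standing in for the question's input.
df = pd.DataFrame({
    'week': [0, 0, 1],
    'id': [9000, 66, 7679],
    'count': [500, 395, 749],
})

# Select only 'count' before aggregating, then turn 'week' back into a column.
week_counts = df.groupby('week')['count'].sum().reset_index()

# index=False leaves out the 0, 1, 2, ... row labels.
csv_text = week_counts.to_csv(index=False)
print(csv_text)
```

Selecting the column before calling sum() also avoids summing columns such as id, month and year that are not needed in the output.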
Here is a line-by-line explanation of the original code:

# This creates a "groupby" object (not a dataframe object) 
# and you store it in the week_grouped variable.
week_grouped = df.groupby('week')

# This instructs pandas to sum up all the numeric type columns in each 
# group. This returns a dataframe where each row is the sum of the 
# group's numeric columns. You're not storing this dataframe in your 
# example.
week_grouped.sum() 

# Here you're calling the to_csv method on a groupby object... but
# that object type doesn't have that method. Dataframes have that method. 
# So we should store the previous line's result (a dataframe) into a variable 
# and then call its to_csv method.
week_grouped.to_csv('week_grouped.csv')

# Like this:
summed_weeks = week_grouped.sum()
summed_weeks.to_csv('...')

# Or with less typing simply
week_grouped.sum().to_csv('...')

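Group By returns key, value pairs, where the key is the identifier of the group and the value is the group itself, i.e. the subset of the original df that matches the key.

In your example, week_grouped = df.groupby('week') is a set of groups (a pandas.core.groupby.DataFrameGroupBy object), which you can inspect in detail as follows: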

for k, gr in week_grouped:
    # do your stuff instead of print
    print(k)
    print(type(gr)) # This will output <class 'pandas.core.frame.DataFrame'>
    print(gr)
    # You can save each 'gr' in a csv as follows
    gr.to_csv('{}.csv'.format(k))
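The loop above can be tried end to end with a sketch like the following. The sample data, the temporary output directory and the names out_dir and written are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data; week 0 appears on two rows.
df = pd.DataFrame({
    'week': [0, 0, 1],
    'count': [500, 395, 749],
})

out_dir = tempfile.mkdtemp()  # throwaway directory, for illustration only
written = []
for key, group in df.groupby('week'):
    # 'group' is an ordinary DataFrame holding the rows for this key.
    path = os.path.join(out_dir, '{}.csv'.format(key))
    group.to_csv(path, index=False)
    written.append(path)

print(sorted(os.listdir(out_dir)))
```

Each file holds the raw rows of one group, not an aggregate; call sum() on the groupby object when one row per week is wanted.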
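In your example, you need to assign the function's result to a variable, because sum() returns a new object rather than modifying the groupby object in place: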

some_variable = week_grouped.sum() 
some_variable.to_csv('week_grouped.csv') # This will work

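Basically, result.csv (the output of result = week_grouped.sum() followed by result.to_csv('result.csv')) and week_grouped.csv are the same.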

Pandas groupby can produce a lot of information (count, mean, std, etc.). If you want to save all of it in a CSV file, first convert it to a regular DataFrame:

import pandas as pd
...
...
MyGroupDataFrame = MyDataFrame.groupby('id')
pd.DataFrame(MyGroupDataFrame.describe()).to_csv("myTSVFile.tsv", sep='\t', encoding='utf-8')
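As a hedged illustration of what ends up in that file: in recent pandas versions describe() on a groupby object already returns a DataFrame whose columns are (column, statistic) pairs, so the pd.DataFrame(...) wrapper is harmless but probably not required. The sample data and the names stats and tsv_text below are made up.

```python
import pandas as pd

# Made-up sample data with two rows per id.
df = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'count': [10, 20, 30, 50],
})

stats = df.groupby('id').describe()

# Columns are (column, statistic) pairs such as ('count', 'mean').
print(stats[('count', 'mean')])

# Without a path, to_csv returns the text; sep='\t' makes it tab-separated.
tsv_text = stats.to_csv(sep='\t')
print(tsv_text)
```

The two-level column header is written as two header rows, which is worth keeping in mind when reading the file back.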

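Thanks. Why does it work when sum() is part of the to_csv statement, but not when sum() is called on its own line?

@kalmdown, if this answered your question, could you please mark it as such? Click the check mark to turn it green.

@kalmdown, did my answer address your question? It still has not been marked as accepted.

In the original data, a week appears on multiple rows. In this case, groupby is used to collect the weeks so that the count can be computed per week. By the way, thank you very much for explaining why sum was the problem.

Thanks for the in-depth explanation. It helps to understand the system, not just the problem.

Should be "axis=1"... but this will output the rows without grouping them by week or state.

If you landed here looking for how to save each groupby group to its own CSV file, see.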
result = week_grouped.sum()
# This will be already one row per key and its aggregation result
result.to_csv('result.csv') 