Python 2.7: Summing columns in a CSV file based on a user search

python-2.7 csv

I have the following CSV file:

data.csv

school,students,teachers,subs
us-school1,10,2,0
us-school2,20,4,2
uk-school1,10,2,0
de-school1,10,3,1
de-school1,15,3,3

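I'm trying to let the user search for the country a school belongs to (us, uk, or de) and then sum the corresponding columns (for example, the total number of students across all us-* schools, and so on). So far I can search using raw_input and display the column contents for the matching country. I would appreciate any suggestions on how to achieve this.

Expected output:

Country: us

Total students: 30

Total teachers: 6

Total subs: 2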

Your problem can be solved like this:

import csv

search = raw_input('Enter school (e.g. us): ')
with open('data.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    result_countrys = {}  # maps country code -> dict of column totals
    for row in reader:
        # csv values are always strings, so convert them before adding
        students = int(row['students'])
        teachers = int(row['teachers'])
        subs = int(row['subs'])
        country = row['school'][:2]  # first two letters are the country code
        if country in result_countrys:
            # country already seen: add this row to its running totals
            count = result_countrys[country]
            count['students'] = count['students'] + students
            count['teachers'] = count['teachers'] + teachers
            count['subs'] = count['subs'] + subs
        else:
            # first row for this country: start its totals
            dic = {}
            dic['students'] = students
            dic['teachers'] = teachers
            dic['subs'] = subs
            result_countrys[country] = dic

# print the totals for the country the user asked for
for k, v in result_countrys[search].iteritems():
    print("country " + str(search) + " has " + str(v) + " " + str(k))

I tried it with this set of values:

reader = [{'school': 'us-school1', 'students': 10, 'teachers': 2, 'subs': 0}, {'school': 'us-school2', 'students': 20, 'teachers': 4, 'subs': 2}, {'school': 'uk-school1', 'students': 10, 'teachers': 2, 'subs': 0}]

The result is:

Enter school (e.g. us):  us
country us has 30 students
country us has 6 teachers
country us has 2 subs
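
One case the snippet above does not cover is a search term with no matching rows, where result_countrys[search] would raise a KeyError. The following is a minimal sketch of one way to handle that, not the answerer's method. It assumes the country code is always the first two characters of the school column; the names summarize and DATA are hypothetical, and the rows are held in memory (copied from the question) so the example runs without a file.

```python
import csv

# Sample rows copied from the question, kept in memory so the sketch runs as-is.
DATA = """school,students,teachers,subs
us-school1,10,2,0
us-school2,20,4,2
uk-school1,10,2,0
de-school1,10,3,1
de-school1,15,3,3
"""


def summarize(lines, search):
    """Return column totals for one country prefix, or None if nothing matches.

    Hypothetical helper: `lines` is any iterable of CSV text lines.
    """
    totals = None
    for row in csv.DictReader(lines):
        if row['school'][:2] != search:
            continue  # row belongs to a different country
        if totals is None:
            totals = {'students': 0, 'teachers': 0, 'subs': 0}
        for column in totals:
            totals[column] += int(row[column])  # csv values are strings
    return totals


search = 'us'
totals = summarize(DATA.splitlines(), search)
if totals is None:
    print("No schools found for country: {}".format(search))
else:
    print("Country: {}".format(search))
    print("Total students: {}".format(totals['students']))
    print("Total teachers: {}".format(totals['teachers']))
    print("Total subs: {}".format(totals['subs']))
```

Returning None for an unknown country keeps the decision about what to print with the caller; raising an exception or returning zeros would be equally reasonable choices.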

This is relatively easy to do: all you need is a dict to keep track of your countries, and then you add all the values together:

import collections
import csv

result = {}  # store the results
with open("data.csv", "rb") as f:  # open our file
    reader = csv.DictReader(f)  # use csv.DictReader for convenience
    for row in reader:
        country = row.pop("school")[:2]  # get our country
        result[country] = result.get(country, collections.defaultdict(int))  # country group
        for column in row:  # loop through all other columns
            result[country][column] += int(row[column])  # add them together

# Now you can use or print your result by country:
for country in result:
    print("Country: {}".format(country))
    print("Total students: {}".format(result[country].get("students", 0)))
    print("Total teachers: {}".format(result[country].get("teachers", 0)))
    print("Total subs: {}\n".format(result[country].get("subs", 0)))

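This is also generic, in that you can add extra numeric columns (e.g. janitors :D) and it will happily sum them up. Keep in mind, though, that it works only with integers (if you want floats, replace the references to int with float), and it expects every field other than school to be a number.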
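
To illustrate swapping int for float, here is a small sketch that passes the conversion function in as a parameter. It is an assumption-laden variant, not the answer's code: the function name sum_by_country, the convert parameter, and the fractional sample data are all hypothetical, and the rows are kept in memory instead of being read from data.csv.

```python
import collections
import csv

# Hypothetical sample with fractional values (e.g. part-time teachers).
DATA = """school,students,teachers
us-school1,10,2.5
us-school2,20,4.0
uk-school1,10,1.5
"""


def sum_by_country(lines, convert=int):
    """Sum every non-school column per country, converting values with `convert`."""
    result = {}
    for row in csv.DictReader(lines):
        country = row.pop("school")[:2]  # first two letters are the country code
        # defaultdict(convert) starts each column at convert(), i.e. 0 or 0.0
        group = result.setdefault(country, collections.defaultdict(convert))
        for column in row:  # all remaining columns must be numeric
            group[column] += convert(row[column])
    return result


result = sum_by_country(DATA.splitlines(), convert=float)
print("us teachers: {}".format(result["us"]["teachers"]))
```

With convert=int this call would raise a ValueError on "2.5", which is the limitation described above.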

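You can do this with pandas. Look up groupby and aggregate, and think about what you want to happen if a school doesn't match your data.

Thanks. Is there a way to do this other than pandas?

Hi @zig, let me know if this solved the problem :) If it did, please don't forget to mark the answer using the check mark at the top left, below the score.

I get a result like this: us has 1020 students. Note that it is doing string concatenation... so I added this and it works: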

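students = int(row['students'])

I believe values read through the csv module are always of type str, regardless of the data. So you have to add

int(row['fieldname'])

@zig Great, I'll add it to the answer. Sorry I missed that.

Awesome, very useful. I like the comments too, they are really helpful for learners like me...

Very nice and very clear :) I like your work.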
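
The string-concatenation behavior described in the comments above can be reproduced with a short sketch. The two in-memory rows are taken from the question's data; the variable names are only illustrative.

```python
import csv

# Two rows from the question's data, kept in memory for the demonstration.
lines = ["school,students", "us-school1,10", "us-school2,20"]
rows = list(csv.DictReader(lines))

# Values come back as strings, so + concatenates them.
as_text = rows[0]["students"] + rows[1]["students"]

# Converting with int() first gives the arithmetic sum.
as_number = int(rows[0]["students"]) + int(rows[1]["students"])

print(as_text)    # 1020
print(as_number)  # 30
```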
