Python 2.7: Summing columns in a CSV file based on a user search


I have the following CSV file:

data.csv

school,students,teachers,subs
us-school1,10,2,0
us-school2,20,4,2
uk-school1,10,2,0
de-school1,10,3,1
de-school1,15,3,3
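I am trying to let the user search for the country a school is in (us, uk, or de) and then sum the corresponding columns (for example, the total number of students across all us-* schools, and so on). So far I can search using raw_input and display the column contents for the matching country. I would appreciate any suggestions on how to achieve this.

Expected output:

Country: us

Total students: 30

Total teachers: 6

Total subs: 2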

--


Your problem can be solved as follows:

import csv

search = raw_input('Enter school (e.g. us): ')
with open('data.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    result_countrys = {}
    for row in reader:
      # csv values are always strings, so convert before adding
      students = int(row['students'])
      teachers = int(row['teachers'])
      subs = int(row['subs'])
      # the country is the first two letters of the school name, e.g. 'us'
      country = row['school'][:2]
      if country in result_countrys:
        count = result_countrys[country]
        count['students'] = count['students'] + students
        count['teachers'] = count['teachers'] + teachers
        count['subs'] = count['subs'] + subs
      else :
        dic = {}
        dic['students'] = students
        dic['teachers'] = teachers
        dic['subs'] = subs
        result_countrys[country] = dic

for k, v in result_countrys[search].iteritems():
    print("country " + str(search) + " has " + str(v) + " " + str(k))
I tried it with this set of values:

reader = [{'school': 'us-school1', 'students': 10, 'teachers': 2, 'subs': 0}, {'school': 'us-school2', 'students': 20, 'teachers': 4, 'subs': 2}, {'school': 'uk-school1', 'students': 10, 'teachers': 2, 'subs': 0}]
The result is:

Enter school (e.g. us):  us
country us has 30 students
country us has 6 teachers
country us has 2 subs
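
One gap in the lookup above: result_countrys[search] raises a KeyError when the user types a prefix that is not in the file. Below is a minimal sketch of a guarded lookup, assuming totals shaped like result_countrys. The helper name format_totals is hypothetical and not part of the original answer.

```python
def format_totals(result_countrys, search):
    """Return report lines for one country, or a notice if it is absent."""
    key = search.strip().lower()  # tolerate stray spaces and upper case
    totals = result_countrys.get(key)
    if totals is None:
        return ["no schools found for country " + key]
    # sorted() gives a stable column order regardless of dict ordering
    return ["country {0} has {1} {2}".format(key, totals[name], name)
            for name in sorted(totals)]


totals = {"us": {"students": 30, "teachers": 6, "subs": 2}}
for line in format_totals(totals, " US "):
    print(line)
for line in format_totals(totals, "fr"):
    print(line)
```

Returning the lines instead of printing them keeps the function easy to test. It also runs unchanged on Python 2.7 and Python 3.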

This is relatively easy to do: all you need is a dict to keep track of your countries, and then you add all the values together:

import collections
import csv

result = {}  # store the results
with open("data.csv", "rb") as f:  # open our file
    reader = csv.DictReader(f)  # use csv.DictReader for convenience
    for row in reader:
        country = row.pop("school")[:2]  # get our country
        result[country] = result.get(country, collections.defaultdict(int))  # country group
        for column in row:  # loop through all other columns
            result[country][column] += int(row[column])  # add them together

# Now you can use or print your result by country:
for country in result:
    print("Country: {}".format(country))
    print("Total students: {}".format(result[country].get("students", 0)))
    print("Total teachers: {}".format(result[country].get("teachers", 0)))
    print("Total subs: {}\n".format(result[country].get("subs", 0)))

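This is also generic, since you can add extra numeric columns (e.g. janitors :D) and it will happily sum them as well. Keep in mind that it works only with integers (if you want floats, replace the references to int with float) and that it expects every field except school to be a number.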
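
The int-to-float swap described above can be sketched as a small function that takes the conversion as a parameter. The function name sum_by_country, the number parameter, and the budget column are assumptions made for illustration. The rows are plain dicts standing in for what csv.DictReader yields.

```python
import collections


def sum_by_country(rows, number=int):
    """Sum every column except 'school', grouped by the 2-letter prefix."""
    result = {}
    for row in rows:
        row = dict(row)  # copy so the caller's row is not modified
        country = row.pop("school")[:2]
        totals = result.setdefault(country, collections.defaultdict(number))
        for column, value in row.items():
            totals[column] += number(value)  # convert the string, then add
    return result


# 'budget' is a hypothetical extra column holding non-integer values
rows = [
    {"school": "us-school1", "students": "10", "budget": "1.5"},
    {"school": "us-school2", "students": "20", "budget": "2.25"},
]
totals = sum_by_country(rows, number=float)
print("{0} {1}".format(totals["us"]["students"], totals["us"]["budget"]))
```

Passing number=float makes both the starting value (0.0) and the conversion use floats, so the two stay consistent.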

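You can do this with pandas. Look up groupby and aggregate, and think about what you want to happen if a school doesn't match your data.

Thanks. Is there any other way besides pandas?

Hi @zig, let me know if this solved the problem :) If it did, please don't forget to mark the answer as accepted using the check mark at the top left, below the score.

The result I get looks like this: us has 1020 students. Note that it is doing string concatenation, so I added this and it works: students=int(row['students'])

I believe values read through the csv module are always of type str, regardless of the data, so you have to add int(row['fieldname']).

@zig Great, I'll add that to the answer. Sorry I missed that.

Awesome, very useful. I also like the comments; they are really helpful for learners like me :)

Very nice, very clear, and to the point :) I like your work.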
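
The "1020 students" symptom mentioned in the comments can be reproduced in a few lines. This is a small sketch that feeds csv.DictReader an in-memory list of lines instead of a file, with sample values taken from the question's data.

```python
import csv

# csv.DictReader accepts any iterable of lines, not only an open file
lines = ["school,students", "us-school1,10", "us-school2,20"]
rows = list(csv.DictReader(lines))

as_text = rows[0]["students"] + rows[1]["students"]  # str + str joins them
as_numbers = int(rows[0]["students"]) + int(rows[1]["students"])

print(as_text)     # 1020
print(as_numbers)  # 30
```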