Python: How to parse a CSV file and compute statistics from its data

python python-3.x csv pandas

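I have an assignment that requires me to write a Python program that reads a text file containing information about people (name, weight and height).

The program then needs to ask the user to enter a name, look that name up in the text file, and print the line containing the name together with the person's height and weight.

Finally, the program has to calculate the average weight and the average height of the people.

The text file is: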

James,73,1.82,M
Peter,78,1.80,M
Jay,90,1.90,M
Beth,65,1.53,F
Mags,66,1.50,F
Joy,62,1.34,F

So far I have code that prints the line matching the name the user types in, but I don't know how to extract the height and weight:

search = input("Who's information would you like to find?")
with open("HeightAndWeight.txt", "r") as f:
    for line in f:
        if search in line:
            print(line)

Using the pandas library, as suggested, you could do the following:

import pandas as pd
df = pd.read_csv('people.txt', header=None, index_col=0)
df.columns = ['weight', 'height', 'sex']
print(df)

       weight  height sex
0                        
James      73    1.82   M
Peter      78    1.80   M
Jay        90    1.90   M
Beth       65    1.53   F
Mags       66    1.50   F
Joy        62    1.34   F

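# numeric_only=True skips the non-numeric 'sex' column (required in pandas 2.0 and later)
print(df.mean(numeric_only=True))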

weight    72.333333
height     1.648333
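
The pandas answer above covers the averages but not the name lookup the question asks for. A minimal sketch of that step follows, assuming names are unique in the file. The helpers `load_people` and `lookup` and the inline `SAMPLE` string are hypothetical names introduced for illustration; they are not part of the original answer.

```python
import io

import pandas as pd

# Inline copy of the sample data so the sketch runs without a file on disk.
SAMPLE = """James,73,1.82,M
Peter,78,1.80,M
Jay,90,1.90,M
Beth,65,1.53,F
Mags,66,1.50,F
Joy,62,1.34,F
"""


def load_people(source):
    """Read the headerless file into a DataFrame indexed by name."""
    return pd.read_csv(
        source,
        header=None,
        names=["name", "weight", "height", "sex"],
        index_col="name",
    )


def lookup(df, name):
    """Return (weight, height) for one person, or None if the name is absent.

    Assumes each name occurs only once in the file.
    """
    if name not in df.index:
        return None
    row = df.loc[name]
    return float(row["weight"]), float(row["height"])


df = load_people(io.StringIO(SAMPLE))
print(lookup(df, "Beth"))
# Selecting the numeric columns explicitly avoids averaging the 'sex' column.
print(df[["weight", "height"]].mean())
```

With a real file, `load_people('people.txt')` should work the same way, since `pd.read_csv` accepts a path as well as a file-like object.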

The answer is actually in the title of your question: parse the file by splitting each line:

search = input("Who's information would you like to find?")
heights = []
weights = []
with open("HeightAndWeight.txt", "r") as f:
    for line in f:
        if search in line:
            print(line)
            # float, not int: values such as 1.82 would make int() raise ValueError
            heights.append(float(line.split(',')[2]))
            weights.append(float(line.split(',')[1]))
# your calculation stuff

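Split the line you just found into four parts, using the comma , as the separator. You can then use splitted_line[0] to get the first part (the name), splitted_line[1] to get the second part (the weight), and so on. So, to print the person's name, weight and height: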

print('The person %s weighs %s and is %s meters tall.' % (splitted_line[0], splitted_line[1], splitted_line[2]))
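
To make the splitting step concrete, here is a small sketch that turns one line of the file into typed fields. The function name `parse_line` is hypothetical, and the sketch assumes every line has exactly four comma-separated fields in the order name, weight, height, sex.

```python
def parse_line(line):
    """Split one 'name,weight,height,sex' line into typed fields."""
    # strip() removes the trailing newline and any stray spaces around a field.
    name, weight, height, sex = [part.strip() for part in line.split(',')]
    # Weight and height are converted with float() because heights such as
    # 1.82 are not integers.
    return name, float(weight), float(height), sex


print(parse_line("James,73,1.82,M\n"))
```

Unpacking into four names raises a ValueError on a malformed line, which is usually preferable to silently reading the wrong column.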

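To get the average weight and height, you need to know how many entries there are in the file; then add up the weights and heights and divide each sum by the number of entries (people). The whole thing looks like this: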

search = input("Who's information would you like to find?")
total = 0
weight = 0
height = 0
with open("HeightAndWeight.txt", "r") as f:
    for line in f:
        total += 1
        # strip() drops the trailing newline before splitting on commas
        splitted_line = line.strip().split(',')
        # float, not int: heights such as 1.82 are not integers
        weight += float(splitted_line[1])
        height += float(splitted_line[2])
        if search in line:
            print('The person %s weighs %s and is %s meters tall.' % (splitted_line[0], splitted_line[1], splitted_line[2]))
average_weight = weight / total
average_height = height / total

This is a simple, easy-to-understand approach.
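
One possible variation on the loop above, offered as a sketch rather than a definitive solution: it wraps the logic in a function, skips blank lines, and compares the name exactly, because `search in line` also matches substrings (searching for "Jo" would match "Joy"). The function name `summarize` and the inline sample data are assumptions made for illustration.

```python
import io


def summarize(lines, search):
    """Return (matching record or None, average weight, average height)."""
    found = None
    total = 0
    weight_sum = 0.0
    height_sum = 0.0
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines, e.g. an empty line at the end of the file
            continue
        name, weight, height, sex = line.split(',')
        total += 1
        weight_sum += float(weight)
        height_sum += float(height)
        if name == search:  # exact match, so "Jo" does not match "Joy"
            found = (name, float(weight), float(height))
    if total == 0:
        raise ValueError("no records found")
    return found, weight_sum / total, height_sum / total


# Inline stand-in for HeightAndWeight.txt so the sketch runs on its own.
sample = io.StringIO(
    "James,73,1.82,M\nPeter,78,1.80,M\nJay,90,1.90,M\n"
    "Beth,65,1.53,F\nMags,66,1.50,F\nJoy,62,1.34,F\n"
)
person, avg_weight, avg_height = summarize(sample, "Beth")
print(person)
print("Average weight: {:.2f}".format(avg_weight))
print("Average height: {:.2f}".format(avg_height))
```

To read the real file instead, pass an open file object, for example `summarize(open("HeightAndWeight.txt"), search)` inside a `with` block.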

You could use Python's built-in csv module to split each line of the file into a list of columns, as follows:

import csv

# In Python 3, open the file in text mode; newline='' is what the csv module recommends
with open('HeightAndWeight.txt', newline='') as f_input:
    csv_input = csv.reader(f_input)
    total_weight = 0
    total_height = 0

    # enumerate(..., start=1) leaves the number of rows in index after the loop
    for index, row in enumerate(csv_input, start=1):
        total_weight += float(row[1])
        total_height += float(row[2])

    print("Average weight: {:.2f}".format(total_weight / index))
    print("Average height: {:.2f}".format(total_height / index))

This would display the following output:

Average weight: 72.33
Average height: 1.65
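
The csv approach can also cover the name lookup from the question. The sketch below stores each row in a dictionary keyed by name and uses `statistics.mean` with generator expressions for the averages. The function name `read_people`, the dictionary layout and the inline sample data are assumptions for illustration, and the sketch assumes names are unique.

```python
import csv
import io
import statistics


def read_people(f):
    """Parse headerless 'name,weight,height,sex' rows into a dict keyed by name."""
    people = {}
    for row in csv.reader(f):
        if not row:  # csv.reader yields an empty list for a blank line
            continue
        name, weight, height, sex = (field.strip() for field in row)
        people[name] = {"weight": float(weight), "height": float(height), "sex": sex}
    return people


# Inline stand-in for HeightAndWeight.txt so the sketch runs on its own.
sample = io.StringIO(
    "James,73,1.82,M\nPeter,78,1.80,M\nJay,90,1.90,M\n"
    "Beth,65,1.53,F\nMags,66,1.50,F\nJoy,62,1.34,F\n"
)
people = read_people(sample)

# dict.get returns None when the name is not in the file.
print(people.get("Mags"))

avg_weight = statistics.mean(p["weight"] for p in people.values())
avg_height = statistics.mean(p["height"] for p in people.values())
print("Average weight: {:.2f}".format(avg_weight))
print("Average height: {:.2f}".format(avg_height))
```

Keying the dictionary by name makes each lookup a single step, at the cost of keeping only the last row if a name appears twice.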

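Look at str.split. You should try to break the task into small sub-problems. For example, have you tried computing the average of a simple list of numbers? You should read up on csv files in Python, and on generator expressions and list comprehensions. Or have a look at the pandas library, which probably offers all the functionality you need. Finally (better: first), get familiar with the language. Maybe go through a basic tutorial.

I have already made a program that calculates the average from a list of numbers in the program. I just had trouble splitting the line and taking only the numbers I need. Since this is an assignment, I don't think something like pandas would be accepted ;)

Since this site not only provides students with homework solutions but also helps other people with similar problems, I think this answer is very good.

At the time of writing this answer, csv wasn't mentioned in the title. In that case you might want to use the csv lib.

Thank you very much for your help, this really works and is easy to understand.

You're welcome. If you'd like to accept this answer, I wouldn't mind.

I didn't write this title. I'm new here, so I guess it may have been edited by someone else.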