Python: How to calculate the percentage of lines with a specific value in a field

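I have a comma-separated CSV file. I need to read the file, find the lines that have a specific value in a field (for example, Blue in the colour field), and calculate the percentage of lines that match.

My code so far: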

myfile = open('3517315a.csv','r')

myfilecount = 0

linecount = 0

firstline = True

for line in myfile:

    if firstline:
        firstline = False
        continue

    fields = line.split(',')

    linecount += 1
    count = int(fields[0])
    colour = str(fields[1])
    channels = int(fields[2])
    code = str(fields[3])
    correct = str(fields[4])
    reading = float(fields[5])

I don't know how to set up the condition and calculate the percentage.

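There are basically three steps:

1. Get the number of lines in the file. You already do this with linecount.
2. Count how many times your condition occurs. Taking colour as the example: you already extract colour, so you only need to compare it with the value you are looking for, e.g. if colour == 'Blue'.
3. Calculate the percentage, i.e. occurrences / number of lines (multiplied by 100 if you want a value out of 100).

It could look like this: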

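myfile = open('3517315a.csv','r')

myfilecount = 0

linecount = 0
occurences_blue = 0  # number of lines whose colour is 'Blue'

firstline = True

for line in myfile:

    # skip the header line
    if firstline:
        firstline = False
        continue

    fields = line.split(',')

    linecount += 1

    count = int(fields[0])
    colour = str(fields[1])
    channels = int(fields[2])
    code = str(fields[3])
    correct = str(fields[4])
    reading = float(fields[5])

    if colour == 'Blue':
        occurences_blue += 1

myfile.close()

# Fraction of matching lines; multiply by 100 for a percentage.
# On Python 2, use float(occurences_blue) / linecount to avoid integer division.
percentage_blue = occurences_blue / linecount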

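This is a very basic example, though. In any case, you should probably use Python's csv library to read the fields from the CSV, as suggested in the comments on your post. I would also expect there to be libraries that solve your problem more efficiently.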
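
As a rough sketch of that suggestion, the loop can be rewritten with `csv.reader`. The sample rows, the header names, and the helper name `percentage_of` are made up for illustration; only the column order (colour in the second field) is taken from the question.

```python
import csv
import io

# Hypothetical sample data; the column order follows the question,
# the header names and values are invented for illustration.
SAMPLE = """count,colour,channels,code,correct,reading
1,Blue,3,A1,yes,0.5
2,Red,3,B2,no,1.25
3,Blue,1,C3,yes,2.0
4,Green,2,D4,no,0.75
"""


def percentage_of(lines, column_index, wanted):
    """Return the percentage of data rows whose field at column_index equals wanted."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    total = 0
    matches = 0
    for row in reader:
        if not row:  # ignore blank lines
            continue
        total += 1
        if row[column_index].strip() == wanted:
            matches += 1
    # Avoid dividing by zero when there are no data rows
    return 100 * matches / total if total else 0.0


print(percentage_of(io.StringIO(SAMPLE), 1, "Blue"))
```

With a real file, the object returned by `open('3517315a.csv', newline='')` can be passed in place of the `io.StringIO` wrapper.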

Try this: it is more configurable than the other answers and, thanks to the csv module, it handles all kinds of CSV files. Tested with Python 3.6.1.

import csv
import io # needed because our file is not really a file

CSVFILE = """name,occupation,birthyear
John,Salesman,1992
James,Intern,1997
Abe,Salesman,1983
Michael,Salesman,1994"""

f = io.StringIO(CSVFILE) # needed because our file is not really a file

# This is the name of the column we want to know about
our_row = 'occupation'
# If we want to limit the output to one value, put it here.
our_value = None # For example, try 'Intern'
# This will hold the total number of rows
row_total = 0

totals = dict()

for row in csv.DictReader(f):
    v = row[our_row]
    # If we've already come across this value before, add 1 to its count
    if v in totals:
        totals[v] += 1
    else: # First time we see this value: start its count at 1
        totals[v] = 1

    row_total += 1

for k, v in totals.items():
    if our_value:
        if k != our_value: continue

    print("{}: {:.2f}%".format(k, v/row_total*100))
Output:

Salesman: 75.00%
Intern: 25.00%

If you are willing to use a third-party module, I strongly recommend Pandas. The code would look roughly like this:

import pandas as pd

df = pd.read_csv("my_data.csv")
blues = len(df[df.colour == "blue"])
percentage = 100 * blues / len(df)  # scale the fraction to a percentage
print(f"{percentage:.2f}% of the colours are blue")
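
A slightly shorter variant, sketched here on invented in-memory data: the mean of a boolean mask is the fraction of rows that match, and `value_counts(normalize=True)` gives the share of every value at once. The column name `colour` is assumed from the question's variable name, and the sample rows are hypothetical.

```python
import io

import pandas as pd

# Hypothetical data; the real file's header names are not shown in the question.
SAMPLE = """colour,reading
Blue,0.5
Red,1.25
Blue,2.0
Green,0.75
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# The mean of a boolean Series is the fraction of True values
blue_percentage = (df["colour"] == "Blue").mean() * 100
print(f"{blue_percentage:.2f}% of the colours are Blue")

# Share of every distinct value, expressed as percentages
shares = df["colour"].value_counts(normalize=True) * 100
print(shares)
```

Note that the comparison is case-sensitive, so "Blue" and "blue" count as different values.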

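Maybe you should use the csv module instead.

Assuming the CSV is always formatted consistently, this job is best handled by pandas. Can you provide a snippet of your data?

It would be cleaner to use a defaultdict to hold totals.

@silel If you don't mind, why? Does totals = {} have the same drawback?

With a defaultdict, you don't need to check whether you have come across a value before; you always just add 1. There is also a special format specifier, but I'm just nitpicking.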
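
To illustrate the defaultdict suggestion, here is a sketch of the same tally using `collections.defaultdict`: missing keys start at 0, so the membership check disappears. The `%` presentation type used in the output line is an assumption about which format specifier the commenter had in mind.

```python
import csv
import io
from collections import defaultdict

CSVFILE = """name,occupation,birthyear
John,Salesman,1992
James,Intern,1997
Abe,Salesman,1983
Michael,Salesman,1994"""

totals = defaultdict(int)  # unseen keys default to 0
row_total = 0

for row in csv.DictReader(io.StringIO(CSVFILE)):
    totals[row["occupation"]] += 1
    row_total += 1

for value, n in totals.items():
    # The "%" presentation type multiplies by 100 and appends a percent sign
    print("{}: {:.2%}".format(value, n / row_total))
```

This prints the same two lines as the csv.DictReader answer (Salesman: 75.00%, Intern: 25.00%).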