Pandas: how do I filter values above a number x?
I am trying to calculate the percentage of students with a passing math score (70 or higher). The table I am using is the school data, school_data_complete. I tried applying a condition to the column:
passing_math= school_data_complete.[['math_score'] > 70]
passing_math.sum()
File "<ipython-input-42-5d92405eb6b2>", line 14
passing_math= school_data_complete.[['math_score'] > 70]
^
SyntaxError: invalid syntax
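Try filtering the rows with query, or take the passing share straight from a boolean mask with value_counts(normalize=True).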
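Because the goal is a percentage rather than the filtered rows, the mean of the boolean mask gives the passing share directly. The following is a minimal sketch, assuming a small stand-in DataFrame (only the math_score values come from the sample rows in this thread; the frame itself is hypothetical) and using >= 70, since the question counts 70 as passing:

```python
import pandas as pd

# Hypothetical stand-in for school_data_complete; the scores match the
# five sample rows shown in this thread.
school_data_complete = pd.DataFrame({"math_score": [79, 61, 60, 58, 82]})

# Boolean mask: True where the student passed (70 or higher).
passed = school_data_complete["math_score"] >= 70

# The mean of a boolean Series is the fraction of True values.
percent_passing = passed.mean() * 100
print(f"{percent_passing:.1f}% passed math")
```

Note that the snippets in this thread use > 70, which would exclude a score of exactly 70.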
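I am not sure I have fully understood the question, but for the first part, what you are looking for is a conditional check on the math_score column. The session further down shows a sample DataFrame taken from the given dataset, the rows returned by df[df['math_score'] > 70], and the same result obtained with df.loc[df.math_score > 70].
Another way is to add a boolean flag column to the DataFrame based on the comparison, as shown below: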
>>> df.assign(new_math=(df['math_score'] > 70))
   School ID  Student ID   budget  gender  grade  math_score  reading_score       school_name  size  student_name      type  new_math
0          0           0  1910635       M    9th          79             66  Huang High Shool  2917   Paul Bradly  District      True
1          0           1  1910635       M   12th          61             94  Huang High Shool  2917  Victor Smith  District     False
2          0           2  1910635       M   12th          60             90  Huang High Shool  2917      Kvin Rod  District     False
3          0           3  1910635       M   12th          58             67  Huang High Shool  2917   Dr. Richard  District     False
4          0           4  1910635       M   12th          82             71  Huang High Shool  2917       Nicol S  District      True
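passing_math = school_data_complete[['math_score'] > 70].copy()
Thanks. I got: TypeError: '>' not supported between instances of 'list' and 'int'
Can you show us your sample data?
OK, sure, added to the description.
passing_math = school_data_complete['math_score'] > 70
OK, that ran, but the value returned is "Student ID student name gender grade..."
@Mr. Demagorgon: (school_data_complete['math_score'] > 70).value_counts(normalize=True)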
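The two errors in this thread appear to have related causes: the dot before the bracket in school_data_complete.[...] is not valid Python, and ['math_score'] > 70 compares a plain list with an int before pandas is involved at all. A small sketch, using a hypothetical stand-in DataFrame, that reproduces the TypeError and shows a working form:

```python
import pandas as pd

# Hypothetical stand-in data for illustration only.
school_data_complete = pd.DataFrame(
    {"student_name": ["A", "B", "C"], "math_score": [79, 61, 82]}
)

try:
    # ['math_score'] is a Python list, so this comparison fails
    # before any DataFrame indexing happens.
    ['math_score'] > 70
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Working form: select the column first, compare it, then use the
# resulting boolean Series to index the DataFrame.
passing_math = school_data_complete[school_data_complete["math_score"] > 70].copy()
print(len(passing_math), "of", len(school_data_complete), "students passed")
```

Calling .sum() on the filtered DataFrame returns one value per column rather than a count, which is probably why the output listed the column names; len(passing_math) gives the number of passing students.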
passing_math = school_data_complete.query('math_score > 70')
(school_data_complete['math_score'] > 70).value_counts(normalize=True)
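These two calls answer different questions: query returns the matching rows, while value_counts(normalize=True) returns the fraction of True and False values in the mask. A minimal sketch on hypothetical stand-in data (the variable passing_share is introduced here for illustration):

```python
import pandas as pd

# Hypothetical stand-in for school_data_complete.
school_data_complete = pd.DataFrame({"math_score": [79, 61, 60, 58, 82]})

# query() filters rows using an expression string.
passing_math = school_data_complete.query("math_score > 70")

# value_counts(normalize=True) gives the share of each distinct value
# of the boolean mask instead of raw counts.
shares = (school_data_complete["math_score"] > 70).value_counts(normalize=True)

# If nobody passes, the True label is absent, so fall back to 0.0.
passing_share = shares.to_dict().get(True, 0.0)
print(len(passing_math), passing_share)
```

Multiplying passing_share by 100 gives the percentage the question asks for.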
>>> df
   School ID  Student ID   budget  gender  grade  math_score  reading_score       school_name  size  student_name      type
0          0           0  1910635       M    9th          79             66  Huang High Shool  2917   Paul Bradly  District
1          0           1  1910635       M   12th          61             94  Huang High Shool  2917  Victor Smith  District
2          0           2  1910635       M   12th          60             90  Huang High Shool  2917      Kvin Rod  District
3          0           3  1910635       M   12th          58             67  Huang High Shool  2917   Dr. Richard  District
4          0           4  1910635       M   12th          82             71  Huang High Shool  2917       Nicol S  District
>>> df[df['math_score'] > 70]
   School ID  Student ID   budget  gender  grade  math_score  reading_score       school_name  size  student_name      type
0          0           0  1910635       M    9th          79             66  Huang High Shool  2917   Paul Bradly  District
4          0           4  1910635       M   12th          82             71  Huang High Shool  2917       Nicol S  District
>>> df.loc[df.math_score > 70]
   School ID  Student ID   budget  gender  grade  math_score  reading_score       school_name  size  student_name      type
0          0           0  1910635       M    9th          79             66  Huang High Shool  2917   Paul Bradly  District
4          0           4  1910635       M   12th          82             71  Huang High Shool  2917       Nicol S  District