Python: error when trying to plot a correlation matrix
I am trying to plot a correlation matrix from a pandas DataFrame:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data_for_corelation.csv', delimiter=';')
df = pd.DataFrame(data,columns=['A','B'])
plt.matshow(df.corr())
plt.show()
But I get an error on this line:
plt.matshow(df.corr())
The error is:
/usr/local/lib/python3.6/dist-packages/matplotlib/figure.py in figaspect(arg)
2759 if isarray:
2760 nr, nc = arg.shape[:2]
-> 2761 arr_ratio = nr / nc
2762 else:
2763 arr_ratio = arg
ZeroDivisionError: division by zero
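The traceback shows figaspect dividing the row count by the column count, so the matrix passed to matshow must have had zero columns. The sketch below reproduces that situation under stated assumptions: the CSV text is a hypothetical stand-in for the real file, built from a few of the sample rows, and the select_dtypes call imitates what pandas releases before 2.0 did implicitly inside corr(), as far as I know (newer releases raise an error on string columns instead).

```python
import io

import pandas as pd

# Hypothetical stand-in for 'data_for_corelation.csv', built from the sample rows.
csv_text = (
    "A;B\n"
    "249,640704;1,019356\n"
    "242,324502;0,647166\n"
    "243,495232;0,644257\n"
)

# Without decimal=",", values such as "249,640704" cannot be parsed as numbers.
df = pd.read_csv(io.StringIO(csv_text), delimiter=";")
print(df.dtypes)  # both columns are non-numeric (strings)

# Keep only numeric columns, imitating the implicit filtering of older pandas.
numeric = df.select_dtypes(include="number")
corr = numeric.corr()

# No numeric columns remain, so the correlation matrix has no rows or columns.
print(corr.shape)
```

An empty matrix like this is what makes nr / nc in figaspect a division by zero.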
Sample data:
print(df.head(10))
A B
0 249,640704 1,019356
1 242,324502 0,647166
2 243,495232 0,644257
3 243,310156 0,81684
4 243,511297 1,050207
5 239,435233 1,340164
6 240,091439 1,836193
7 241,08975 1,540461
8 237,017175 1,244953
9 236,141326 1,210147
df.describe()
How can I fix this?
If I inspect a DataFrame built from your sample data (the output below has the format printed by df.info()), I get:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 A 10 non-null object
1 B 10 non-null object
dtypes: object(2)
memory usage: 144.0+ bytes
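Both columns have dtype object, meaning the values were read as strings rather than numbers. If I check the DataFrame again after the fix (converting the comma-decimal strings to numbers), I get: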
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 A 10 non-null float64
1 B 10 non-null float64
dtypes: float64(2)
memory usage: 224.0 bytes
As you can see, this time your columns are interpreted as numeric (float64). The correlation matrix can then be computed and plotted.
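A minimal sketch of one way to get there, assuming the file uses ";" as the separator and "," as the decimal mark, as the sample data suggests. The CSV text is a hypothetical stand-in for data_for_corelation.csv; with the real file you would pass its path to read_csv instead.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the real file, built from the sample rows.
csv_text = """A;B
249,640704;1,019356
242,324502;0,647166
243,495232;0,644257
243,310156;0,81684
243,511297;1,050207
239,435233;1,340164
240,091439;1,836193
241,08975;1,540461
237,017175;1,244953
236,141326;1,210147
"""

# decimal="," tells pandas to parse "249,640704" as the float 249.640704.
df = pd.read_csv(io.StringIO(csv_text), delimiter=";", decimal=",")
print(df.dtypes)  # both columns are now float64

corr = df.corr()
print(corr)

# Plot the 2x2 correlation matrix and label the axes with the column names.
plt.matshow(corr)
plt.xticks(range(len(corr.columns)), corr.columns)
plt.yticks(range(len(corr.columns)), corr.columns)
plt.colorbar()
```

In an interactive session, call plt.show() afterwards to display the figure.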
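Have you tried pd.read_csv(…, decimal=",") to account for your data using a decimal comma (instead of the default point)?
Did you check the dtypes of the DataFrame columns?
Did you try print(df.corr()) to check whether the result makes sense?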
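If the DataFrame is already in memory with comma-decimal strings and re-reading the file is inconvenient, the columns can also be converted in place. This is only a sketch: the small frame below is a hypothetical example using three of the sample rows, and it assumes every column holds such strings.

```python
import pandas as pd

# Hypothetical frame that was loaded with comma-decimal strings.
df = pd.DataFrame(
    {
        "A": ["249,640704", "242,324502", "243,495232"],
        "B": ["1,019356", "0,647166", "0,644257"],
    }
)

# Replace the decimal comma with a point, then parse each column as a number.
fixed = df.apply(
    lambda col: pd.to_numeric(col.str.replace(",", ".", regex=False))
)

print(fixed.dtypes)  # check the column types before computing anything
print(fixed.corr())  # check that the correlation matrix is not empty
```

Printing the dtypes and the matrix before plotting answers the last two questions directly.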