Python: Why is data reconstructed from sklearn PCA components different from the original data?

python machine-learning scikit-learn

I am trying to decompose my data with PCA.

Here is my code:

import xlrd
import numpy as np
from sklearn.decomposition import PCA

data = xlrd.open_workbook('x.xlsx')
sh = data.sheet_by_index(1)

# One array row per data row of the sheet; the first sheet row holds the labels
inputData = np.empty([sh.nrows - 1, sh.ncols])
for curr_row in range(1, sh.nrows):  # skip the first row because those are labels
    for col_ind, el in enumerate(sh.row(curr_row)):
        inputData[curr_row - 1, col_ind] = el.value

print(inputData.shape)
pca = PCA(n_components=3)
newData = pca.fit_transform(inputData)
# Attempted reconstruction: project back onto the components and compare
print(inputData - np.dot(newData, pca.components_))

I expected the difference between inputData and np.dot(newData, pca.components_) to be small, but the result is far from the original data.

Can you help me?

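You need to add the mean back. PCA subtracts the per-feature mean from the data before projecting it, so newData describes the centered data, not the original. To reconstruct: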

rec = np.dot(newData, pca.components_) + pca.mean_

print(inputData - rec)
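
To see the effect without the spreadsheet, here is a minimal sketch on synthetic data. The array X, its offsets, and the random seed are assumptions made for illustration; they stand in for inputData. The data is built with rank 3 on purpose so that three components capture it completely. On real data with more than three informative features, a residual will remain even after the mean is added back.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical stand-in for inputData: 200 samples, 5 features.
# The centered data has rank 3, and each feature has a nonzero offset.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 5))
X = latent @ mixing + np.array([20.0, -5.0, 3.0, 10.0, 0.5])

pca = PCA(n_components=3)
Z = pca.fit_transform(X)

# Reconstruction as in the question: the mean is missing
no_mean = np.dot(Z, pca.components_)
# Reconstruction as in the answer: the mean is added back
with_mean = np.dot(Z, pca.components_) + pca.mean_
# With the default whiten=False, inverse_transform does the same thing
builtin = pca.inverse_transform(Z)

err_no_mean = np.abs(X - no_mean).max()
err_with_mean = np.abs(X - with_mean).max()

print("max error without mean:", err_no_mean)
print("max error with mean:   ", err_with_mean)
print("matches inverse_transform:", np.allclose(builtin, with_mean))
```

Without the mean, the error is roughly the size of the largest feature mean. With it, the error drops to floating-point noise for this rank-3 example. In practice pca.inverse_transform(newData) is probably the more convenient way to write the reconstruction.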