How do I get "skbio" PCoA (Principal Coordinates Analysis) results?


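I am looking at the attributes of skbio's PCoA method (listed below). I am new to this API, and I want to get the eigenvectors and the original points projected onto the new axes, similar to .fit_transform in sklearn.decomposition.PCA, so that I can create PC_1 vs PC_2 style plots. I figured out how to get eigvals and proportion_explained, but features comes back as None.

Is that because it is in beta? If there are any tutorials that use this functionality, pointers would be greatly appreciated. I am a huge fan of scikit-learn and would like to start using more of the scikit products.
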
|  Attributes
 |  ----------
 |  short_method_name : str
 |      Abbreviated ordination method name.
 |  long_method_name : str
 |      Ordination method name.
 |  eigvals : pd.Series
 |      The resulting eigenvalues.  The index corresponds to the ordination
 |      axis labels
 |  samples : pd.DataFrame
 |      The position of the samples in the ordination space, row-indexed by the
 |      sample id.
 |  features : pd.DataFrame
 |      The position of the features in the ordination space, row-indexed by
 |      the feature id.
 |  biplot_scores : pd.DataFrame
 |      Correlation coefficients of the samples with respect to the features.
 |  sample_constraints : pd.DataFrame
 |      Site constraints (linear combinations of constraining variables):
 |      coordinates of the sites in the space of the explanatory variables X.
 |      These are the fitted site scores
 |  proportion_explained : pd.Series
 |      Proportion explained by each of the dimensions in the ordination space.
 |      The index corresponds to the ordination axis labels

Here is the code I use to generate the principal coordinates analysis (PCoA) object:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
import seaborn as sns; sns.set_style("whitegrid", {'axes.grid' : False})
import skbio
from scipy.spatial import distance

%matplotlib inline
np.random.seed(0)

# Iris dataset
DF_data = pd.DataFrame(load_iris().data, 
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       columns = load_iris().feature_names)
n,m = DF_data.shape
# print(n,m)
# 150 4

Se_targets = pd.Series(load_iris().target, 
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])], 
                       name = "Species")

# Scaling mean = 0, var = 1
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data), 
                           index = DF_data.index,
                           columns = DF_data.columns)

# Distance Matrix
Ar_dist = distance.squareform(distance.pdist(DF_standard.T, metric="braycurtis")) # (m x m) distance measure
DM_dist = skbio.stats.distance.DistanceMatrix(Ar_dist, ids=DF_standard.columns)
PCoA = skbio.stats.ordination.pcoa(DM_dist)

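You can access the transformed sample coordinates with OrdinationResults.samples. This returns a pandas.DataFrame row-indexed by sample ID (i.e. the IDs in your distance matrix). Because principal coordinates analysis operates on a distance matrix of samples, transformed feature coordinates (OrdinationResults.features) are not available. Other ordination methods in scikit-bio that accept a sample x feature table as input do provide transformed feature coordinates (e.g. CA, CCA, RDA).

Side note: the distance.squareform call is unnecessary, because skbio.DistanceMatrix accepts arrays in either square or vector form.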
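
I believe .samples was returning None. I can try again, and I will make sure to update my skbio. I have been reading about PCoA and a lot of the material is cryptic. Compared with PCA, are the steps the same, except that the eigendecomposition is done on a distance matrix instead of a covariance matrix?

The OrdinationResults produced by pcoa should have samples available. If you are still getting None, could you please post the issue on the site? My understanding is that PCoA is applied to a distance matrix, which allows non-Euclidean distance metrics, whereas PCA is applied to a feature table and uses Euclidean distance. So running PCoA on a Euclidean distance matrix is equivalent to PCA. A useful resource on ordination methods was also linked.

DF = skbio.OrdinationResults(long_method_name="TESTING", short_method_name="test", eigvals=PCoA.eigvals, samples=DF_data)
DF.samples returns the original, untransformed data. Am I doing this wrong?

Yes. You do not need to construct the skbio.OrdinationResults object directly; it only holds the results of an ordination method. Each ordination method in scikit-bio creates this results object for you, and you access the results from it. Use the pcoa function to run PCoA on your skbio.DistanceMatrix object. You will receive an skbio.OrdinationResults object, on which you can call .samples to retrieve the transformed sample coordinates.

No problem, happy to help!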
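
On the PCA vs PCoA question raised in the comments, the sketch below shows the classical PCoA steps in plain NumPy: square the distances, double-centre them, then eigendecompose. It is an illustration under stated assumptions, not scikit-bio's implementation; the function name classical_pcoa and the random data are hypothetical. With Euclidean distances the coordinates should match PCA scores up to the sign of each axis.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def classical_pcoa(dist, n_axes=2):
    """Classical PCoA of a square distance matrix (illustrative helper)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram-like matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the n_axes largest eigenvalues and their eigenvectors.
    order = np.argsort(eigvals)[::-1][:n_axes]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale eigenvectors by sqrt(eigenvalue); negative eigenvalues clip to 0.
    return eigvecs * np.sqrt(np.clip(eigvals, 0, None))

# Made-up data: 20 samples x 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))

pcoa_coords = classical_pcoa(squareform(pdist(X, metric="euclidean")))

# PCA scores from the SVD of the column-centred feature table.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_scores = U[:, :2] * S[:2]

# Each axis is only defined up to sign, so compare absolute values.
same = np.allclose(np.abs(pcoa_coords), np.abs(pca_scores), atol=1e-8)
print(same)
```

So the steps differ from PCA in one place: the matrix being decomposed is the double-centred squared distance matrix rather than the covariance matrix, which is what lets PCoA work with non-Euclidean metrics such as Bray-Curtis.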