Python: plotting decision tree splits on the plane


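I am fitting a regression tree on 2 variables with sklearn. I would like to visualize how the tree partitions the plane. What I have in mind is a set of tiles on the plane, one per leaf, each colored by the mean of the dependent variable in that leaf. Is there a ready-made library that does this? Otherwise, if you know how to draw tiles easily with matplotlib or a similar tool, an example would let me build the tiles I need.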

Try this example:


from sklearn.tree import plot_tree
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt
import numpy as np


X, y = make_classification(n_samples=1000,n_features=2, 
n_redundant=0, n_clusters_per_class=1, random_state=4)
labels = ['type_A', 'type_B']
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Parameters
n_classes = 2
plot_colors = "ryb"
plot_step = 0.02

# Plot the decision boundary
plt.figure()

x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                     np.arange(y_min, y_max, plot_step))
plt.tight_layout(h_pad=0.5, w_pad=0.5, pad=2.5)

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
cs = plt.contourf(xx, yy, Z, cmap=plt.cm.RdYlBu)

plt.xlabel('feature_1')
plt.ylabel('feature_2')

# Plot the training points
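for i, color in zip(range(n_classes), plot_colors):
    mask = y == i  # boolean mask selecting the samples of class i
    # label=labels[i] gives each class its own legend entry;
    # cmap is dropped because c is already a single named color
    plt.scatter(X[mask, 0], X[mask, 1], c=color, label=labels[i],
                edgecolor='black', s=15)
plt.legend()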

f, ax = plt.subplots(figsize=(15, 7))
plot_tree(clf, filled=True, feature_names=['feature_1', 'feature_2'],
          ax=ax, fontsize=6,
          class_names=labels)

plt.show()

Update:

For a regression problem:

from sklearn.tree import plot_tree
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor
from matplotlib import pyplot as plt
from matplotlib.pyplot import cm
import numpy as np


X, y = make_regression(n_samples=1000, n_features=2,n_informative=2,
                       random_state=0)
reg = DecisionTreeRegressor(max_depth=4).fit(X, y)

# Parameters
plot_step = 0.02

# Two panels: the tree diagram on the left, the partition of the plane on the right
f, axes = plt.subplots(ncols=2, figsize=(30, 7))

x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                     np.arange(y_min, y_max, plot_step))
plt.tight_layout(h_pad=0.5, w_pad=0.5, pad=2.5)

Z = reg.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
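# Draw explicitly on the right-hand panel instead of relying on the current axes
cs = axes[1].contourf(xx, yy, Z, cmap=plt.cm.Blues)
f.colorbar(cs, ax=axes[1], label='predicted target')

axes[1].set_xlabel('feature_1')
axes[1].set_ylabel('feature_2')

# Overlay the training points, colored by their target value
axes[1].scatter(X[:, 0], X[:, 1], c=y,
            cmap='Oranges', edgecolor='black', s=15)

# class_names only applies to classifiers, so it is omitted for the regressor
plot_tree(reg, filled=True, feature_names=['feature_1', 'feature_2'],
          ax=axes[0], fontsize=3)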

plt.show()
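
The contour plot above bins the predictions into a handful of color levels, so two leaves with similar means can end up with the same shade. A minimal sketch of the tile idea from the question follows, assuming the fitted tree's `tree_` arrays (`children_left`, `children_right`, `feature`, `threshold`, `value`) are used to recover one rectangle per leaf. The helper name `leaf_rectangles` and the 1-unit padding around the data are illustrative choices, not part of sklearn.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
from matplotlib.patches import Rectangle
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor


def leaf_rectangles(tree, bounds):
    """Return one (x_lo, x_hi, y_lo, y_hi, leaf_mean) tuple per leaf.

    `tree` is the `tree_` attribute of a regressor fitted on 2 features;
    `bounds` is (x_lo, x_hi, y_lo, y_hi) for the whole plotting area.
    """
    rects = []
    stack = [(0, bounds)]  # (node id, box of the plane that reaches the node)
    while stack:
        node, (x_lo, x_hi, y_lo, y_hi) = stack.pop()
        left = tree.children_left[node]
        right = tree.children_right[node]
        if left == -1:  # sklearn marks leaves with child id -1
            rects.append((x_lo, x_hi, y_lo, y_hi, tree.value[node][0, 0]))
            continue
        thr = tree.threshold[node]
        if tree.feature[node] == 0:  # split on the horizontal axis
            stack.append((left, (x_lo, thr, y_lo, y_hi)))
            stack.append((right, (thr, x_hi, y_lo, y_hi)))
        else:  # split on the vertical axis
            stack.append((left, (x_lo, x_hi, y_lo, thr)))
            stack.append((right, (x_lo, x_hi, thr, y_hi)))
    return rects


X, y = make_regression(n_samples=1000, n_features=2, n_informative=2,
                       random_state=0)
reg = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

bounds = (X[:, 0].min() - 1, X[:, 0].max() + 1,
          X[:, 1].min() - 1, X[:, 1].max() + 1)
rects = leaf_rectangles(reg.tree_, bounds)

# Map each leaf mean to a color on a shared scale
means = [r[4] for r in rects]
norm = Normalize(vmin=min(means), vmax=max(means))
cmap = plt.cm.Blues

fig, ax = plt.subplots(figsize=(8, 6))
for x_lo, x_hi, y_lo, y_hi, mean in rects:
    ax.add_patch(Rectangle((x_lo, y_lo), x_hi - x_lo, y_hi - y_lo,
                           facecolor=cmap(norm(mean)), edgecolor='black'))
ax.set_xlim(bounds[0], bounds[1])
ax.set_ylim(bounds[2], bounds[3])
ax.set_xlabel('feature_1')
ax.set_ylabel('feature_2')
fig.colorbar(plt.cm.ScalarMappable(norm=norm, cmap=cmap), ax=ax,
             label='leaf mean of y')
```

Call `plt.show()` afterwards to display the figure. Each tile is drawn once with its exact leaf mean, so no prediction grid is needed and the tile edges coincide with the split thresholds.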

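Comments:

There is built-in functionality for plotting decision trees. Note that a single decision tree has high variance and is likely to change depending on the subsample of the data. With many trees (think random forests) the variance goes down, but so does the value of inspecting thousands of trees graphically.

@SergeyBushmanov, thank you, but I don't think that helps...

Yes, that looks more like what I need, although in my case it is a regression tree (I have added that detail to the question). I believe plt.contour and np.meshgrid will help me a lot. I will look into all of this. Thanks.

Glad that helped. I have now added a regression example. Have a look. If it solves your problem, please accept the answer.

Perfect! Thanks @Venkatachalam

In my case I am using the play tennis dataset. My X has several attributes that determine y, whether tennis will be played. Since X has several attributes, how can I visualize it? In this example you used two features.

Use t-SNE to reduce the dimensionality.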
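
The t-SNE suggestion in the last comment could look roughly like the sketch below. It rests on several assumptions: synthetic numeric data from `make_classification` stands in for the play tennis dataset (whose categorical attributes would first need encoding), and the tree that gets plotted is a second tree fitted on the 2-D embedding, so it only approximates what a tree trained on the original features does. The variable names `X_2d` and `clf_2d` are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.manifold import TSNE
from sklearn.tree import DecisionTreeClassifier

# Stand-in data with more than two features (assumption: numeric inputs)
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)

# t-SNE has no transform() for unseen points, so all rows are embedded at once
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)

# Fit a tree on the embedded coordinates, for visualization only
clf_2d = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_2d, y)

# Evaluate the tree on a grid that covers the embedding
x_min, x_max = X_2d[:, 0].min() - 1, X_2d[:, 0].max() + 1
y_min, y_max = X_2d[:, 1].min() - 1, X_2d[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                     np.linspace(y_min, y_max, 300))
Z = clf_2d.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots()
ax.contourf(xx, yy, Z, cmap=plt.cm.RdYlBu, alpha=0.6)
ax.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap=plt.cm.RdYlBu,
           edgecolor='black', s=15)
ax.set_xlabel('t-SNE 1')
ax.set_ylabel('t-SNE 2')
```

Call `plt.show()` to display the result. The embedded axes have no direct meaning in terms of the original attributes, so the split lines are best read as a rough picture of how separable the classes are rather than as interpretable rules.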