
Python: using random_state and shuffle together


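I'm a bit confused about using random_state and shuffle together. I want to split my data without shuffling it. It seems that when I set shuffle to False, I get the same output no matter which number I choose for random_state (the split is identical for random_state 42, 2, 7, 17, and so on). Why?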

However, if shuffle is True, I get different outputs (splits) for different random_state values, which makes sense.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

If shuffle is set to False, train_test_split simply reads the data in its original order. The random_state parameter is therefore ignored completely.

For example:

from sklearn.model_selection import train_test_split

X = list(range(50))  # numbers from 0 to 49
y = X  # just for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, shuffle=False)

print(X_train)  # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]
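
To check the claim directly, the same unshuffled split can be repeated with the seeds mentioned in the question. This is only a quick sanity check, assuming scikit-learn is installed; the names seeds, train_sets and all_identical are introduced here for illustration.

```python
from sklearn.model_selection import train_test_split

X = list(range(50))
y = X  # labels mirror the features, just for testing

# Repeat the unshuffled split with each seed from the question
# and keep only the training part (the first returned element).
seeds = [42, 2, 7, 17]
train_sets = [
    train_test_split(X, y, test_size=0.25, random_state=seed, shuffle=False)[0]
    for seed in seeds
]

# With shuffle=False every seed should give the same ordered training set.
all_identical = all(ts == train_sets[0] for ts in train_sets)
print(all_identical)
```

If the seed really is ignored, all_identical is True and each training set is simply the first 37 items of X.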
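Once shuffle is set to True, random_state is used as the seed for the random number generator. As a result, your dataset is split randomly into a training set and a test set.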
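
The interaction between the two parameters can be pictured with a small standard-library sketch. This is not scikit-learn's implementation: sketch_split is a hypothetical helper that handles a single sequence and uses Python's random module, so its shuffled output will not match what train_test_split produces. It only illustrates that the seed is consulted in the shuffle branch and nowhere else.

```python
import math
import random


def sketch_split(data, test_size=0.25, random_state=None, shuffle=True):
    """Hypothetical, simplified stand-in for train_test_split (one sequence only)."""
    n_test = math.ceil(len(data) * test_size)  # e.g. 50 * 0.25 = 12.5, rounded up to 13
    n_train = len(data) - n_test
    indices = list(range(len(data)))
    if shuffle:
        # The seed matters only here: it fixes the permutation of the indices.
        random.Random(random_state).shuffle(indices)
    # Without shuffling, the indices keep their original order, so
    # random_state has no way of influencing the result.
    train = [data[i] for i in indices[:n_train]]
    test = [data[i] for i in indices[n_train:]]
    return train, test


data = list(range(50))
ordered_a, _ = sketch_split(data, random_state=42, shuffle=False)
ordered_b, _ = sketch_split(data, random_state=7, shuffle=False)
shuffled_a, held_out_a = sketch_split(data, random_state=42)
shuffled_b, _ = sketch_split(data, random_state=42)

print(ordered_a == ordered_b)    # unshuffled splits do not depend on the seed
print(shuffled_a == shuffled_b)  # the same seed reproduces the same shuffle
```

Both comparisons print True: the first because the seed is never used, the second because the same seed always generates the same permutation.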

Example with random_state=42:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, shuffle=True)

print(X_train)  # prints [8, 3, 6, 41, 46, 47, 15, 9, 16, 24, 34, 31, 0, 44, 27, 33, 5, 29, 11, 36, 1, 21, 2, 43, 35, 23, 40, 10, 22, 18, 49, 20, 7, 42, 14, 28, 38]

Example with random_state=44:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=44, shuffle=True)

print(X_train)  # prints [13, 11, 2, 12, 34, 41, 30, 16, 39, 28, 24, 8, 18, 9, 4, 10, 0, 19, 21, 29, 14, 1, 48, 38, 7, 43, 25, 22, 23, 42, 46, 49, 32, 3, 45, 35, 20]