Python: Importing a CSV and Reshaping a Variable's Array for Logistic Regression

python numpy statistics

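I hope everyone is staying safe during the COVID-19 pandemic. I am new to Python and have a quick question about importing data from a CSV file for a simple logistic regression in which the dependent variable is binary and the independent variable is continuous.

I imported a CSV file and want to use one variable (Active) as the independent variable and another variable (Smoke) as the response variable. I can load the CSV file into Python, but every time I try to fit a logistic regression model to predict Smoke from Exercise, I get an error saying that Exercise must be reshaped into a single column (two-dimensional) because it is currently one-dimensional.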

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
data = pd.read_csv('Pulse.csv') # Read the data from the CSV file
x = data['Active'] # Load the values from Exercise into the independent variable
x = np.array.reshape(-1,1)
y = data['Smoke'] # The dependent variable is set as Smoke

I keep getting the following error message:

ValueError: Expected 2D array, got 1D array instead: array=[97. 82. 88. 106. 78. 109. 66. 68. 100. 70. 98. 140. 105. 84. 134. 117. 100. 108. 76. 86. 110. 65. 85. 80. 87. 133. 125. 61. 117. 90. 110. 68. 102. 67. 112. 86. 85. 66. 73. 85. 110. 97. 93. 86. 80. 96. 74. 124. 78. 93. 80. 80. 92. 69. 82. 88. 74. 74. 75. 120. 105. 104. 99. 113. 67. 125. 133. 98. 80. 91. 76. 78. 94. 150. 92. 96. 68. 82. 102. 69. 65. 84. 86. 84. 116. 88. 65. 101. 89. 128. 68. 90. 80. 80. 98. 90. 82. 97. 90. 98. 88. 94. 92. 96. 80. 66. 110. 87. 88. 94. 96. 89. 74. 111. 81. 98. 99. 65. 95. 127. 76. 102. 88. 125. 72. 76. 112. 69. 101. 72. 112. 81. 90. 96. 66. 114. 71. 75. 102. 138. 85. 80. 107. 119. 98. 95. 95. 76. 96. 102. 82. 99. 80. 83. 102. 102. 106. 79. 80. 79. 110. 144. 80. 97. 60. 80. 108. 107. 51. 68. 80. 80. 60. 64. 87. 110. 110. 82. 154. 139. 86. 95. 112. 120. 79. 64. 84. 65. 60. 79. 79. 70. 75. 107. 78. 74. 80. 121. 120. 96. 75. 106. 88. 91. 98. 63. 95. 85. 83. 92. 81. 89. 103. 110. 78. 122. 122. 71. 65. 92. 93. 88. 90. 56. 95. 83. 97. 105. 82. 102. 87. 81.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

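Update (December 4, 2020): I could not paste the complete updated code and its error log into this post, so I copied them into a public Google Doc.

I have also shared the CSV file (Pulse.csv).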

Try this:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

data = pd.read_csv('Pulse.csv') # Read the data from the CSV file
x = data['Active'] # Load the values from Exercise into the independent variable
y = data['Smoke'] # The dependent variable is set as Smoke

lr = LogisticRegression().fit(x.values.reshape(-1,1), y)
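To see what `x.values.reshape(-1, 1)` changes, here is a minimal NumPy sketch. It uses only the first four values from the error message as a stand-in for `data['Active'].values`, so the variable names `values`, `column`, and `row` are illustrative rather than part of the original code. It also suggests why `np.array.reshape(-1,1)` cannot work: `np.array` is a function, not an array object.

```python
import numpy as np

# First four values from the error message, standing in for data['Active'].values.
values = np.array([97.0, 82.0, 88.0, 106.0])

# A 1D array: this is the shape scikit-learn rejects as a feature matrix.
print(values.shape)   # (4,)

# -1 lets NumPy infer the number of rows; 1 fixes a single column (one feature).
column = values.reshape(-1, 1)
print(column.shape)   # (4, 1)

# The other option named in the error message: one sample with four features.
row = values.reshape(1, -1)
print(row.shape)      # (1, 4)

# np.array is a constructor function, so it has no reshape attribute to call.
has_reshape = hasattr(np.array, "reshape")
print(has_reshape)    # False
```

For a single predictor such as Active, the `(-1, 1)` form is the one that matches "many samples, one feature".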

The code below should work:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
data = pd.read_csv('Pulse.csv')
x = pd.DataFrame(data['Active']) # Predictor as a one-column DataFrame (2D)
y = data['Smoke']
lr = LogisticRegression()
lr.fit(x,y)
p_pred = lr.predict_proba(x)
y_pred = lr.predict(x)
score_ = lr.score(x,y)
conf_m = confusion_matrix(y,y_pred)
report = classification_report(y,y_pred)

print(score_)
0.8836206896551724

print(conf_m)
[[204   2]
 [ 25   1]]
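As a quick check on the printed numbers, the score can be recomputed from the confusion matrix by hand. The sketch below assumes Smoke is coded 0 for non-smokers and 1 for smokers (the thread does not state the coding) and relies on scikit-learn's convention that rows are true classes and columns are predicted classes. The names `majority_baseline` and `recall_positive` are illustrative.

```python
import numpy as np

# Confusion matrix printed above: rows = true class, columns = predicted class.
conf_m = np.array([[204, 2],
                   [25, 1]])

total = int(conf_m.sum())                 # 232 observations
accuracy = float(np.trace(conf_m)) / total  # (204 + 1) / 232

# Accuracy of always predicting the majority class (row 0): 206 / 232.
majority_baseline = float(conf_m[0].sum()) / total

# Share of true class-1 rows that the model labels as class 1: 1 / 26.
recall_positive = float(conf_m[1, 1]) / float(conf_m[1].sum())

print(round(accuracy, 4))           # 0.8836
print(round(majority_baseline, 4))  # 0.8879
print(round(recall_positive, 4))    # 0.0385
```

The accuracy matches `score_`, but it sits slightly below the majority-class baseline, and only 1 of the 26 class-1 rows is recovered. With classes this imbalanced, `classification_report` is likely more informative than the score alone.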

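Please don't use this line:

x = np.array.reshape(-1,1)

Thank you for the suggestion. I tried it, but the result is the same: "ValueError: Expected 2D array, got 1D array instead."

Could you add the full code, including the model-fitting part?

Dear ManojK, thank you for your patience and support. I have updated the question with the full code I am using, and I copied the error log into a Google Doc (it was not accepted here when I tried to submit it). Any suggestions would be greatly appreciated.

Please check my answer below.

Dear Cristian, thank you very much for your continued support. I tried your suggestion but could not get rid of the error. I have updated the question with the entire code and the error log generated when I try to run it. Any suggestions would be greatly appreciated, thank you.

After entering the commands provided, I received the following error:

p_pred = lr.predict_proba(x)
y_pred = lr.predict(x)
score_ = lr.score(x, y)
conf_m = confusion_matrix(y, y_pred)
report = classification_report(y, y_pred)

That is not an error, it is code. Note that you have to use x.values.reshape(-1,1) instead of x.

Dear ManojK, thank you very much for your patience and continued support. Unfortunately, I could not get it to work. I have shared a public link to the PDF and the CSV file. Sincerely, ciel_azzuro

See my updated code, which now runs fine. I only changed this line:

x = pd.DataFrame(data['Active'])

x was a Series, which is why the error occurred; it is now converted to a DataFrame.

I really appreciate your valuable time and insight. The results of the analysis match those from another statistics package (SPSS). Thank you.

Dear ManojK, I just wanted to let you know that I keep referring back to this page and your advice for later logistic regression models, and it is still very helpful. Thanks again.

Great, let me know if you have any further questions.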
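The Series-versus-DataFrame point from the comments can be reproduced without the original file. The sketch below is only an illustration: it builds a synthetic stand-in for Pulse.csv (the values are made up and are not the real data) and compares fitting on a Series with fitting on a one-column DataFrame. `data[['Active']]` is used here as an equivalent shorthand for `pd.DataFrame(data['Active'])`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for Pulse.csv (hypothetical values, not the real data).
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Active": rng.normal(90, 18, size=200).round(),
    "Smoke": np.tile([0, 0, 0, 1], 50),  # both classes are guaranteed to appear
})

y = data["Smoke"]
x_series = data["Active"]    # Series: one-dimensional, shape (200,)
x_frame = data[["Active"]]   # DataFrame: two-dimensional, shape (200, 1)

# Fitting on the 1D Series is rejected by scikit-learn's input validation.
try:
    LogisticRegression().fit(x_series, y)
    series_accepted = True
except ValueError:
    series_accepted = False

# Fitting on the one-column DataFrame works: one coefficient for one feature.
model = LogisticRegression().fit(x_frame, y)

print(x_series.shape, x_frame.shape)  # (200,) (200, 1)
print(series_accepted)                # False
print(model.coef_.shape)              # (1, 1)
```

Either `x.values.reshape(-1, 1)` or the double-bracket selection yields the two-dimensional input that `fit` expects. The DataFrame form has the small advantage of keeping the column name attached to the feature.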