How do I use gradient descent in Python to find 2 parameters? (Python, NumPy, Gradient Descent)

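I have a few lines of code that do not converge. I would appreciate it if anyone knows why. The original equation is given by def f(x, y, b, m), and I need to find the parameters b and m.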

  import numpy as np

  np.random.seed(42)
  x = np.random.normal(0, 5, 100)
  y = 50 + 2 * x + np.random.normal(0, 2, len(x))

  def f(x, y, b, m):
      return (1/len(x))*np.sum((y - (b + m*x))**2) # it is supposed to be a sum operator

  def dfb(x, y, b, m): # partial derivative with respect to b
      return b - m*np.mean(x)+np.mean(y)

  def dfm(x, y, b, m): # partial derivative with respect to m
      return np.sum(x*y - b*x - m*x**2)

  b0 = np.mean(y)
  m0 = 0
  alpha = 0.0001
  beta = 0.0001
  epsilon = 0.01

  while True:

      b = b0 - alpha * dfb(x, y, b0, m0)
      m = m0 - alpha * dfm(x, y, b0, m0)

      if np.sum(np.abs(m-m0)) <= epsilon and np.sum(np.abs(b-b0)) <= epsilon:
          break
      else:
          m0 = m
          b0 = b
      print(m, f(x, y, b, m))

Both derivatives have some signs mixed up:

def dfb(x, y, b, m): # partial derivative with respect to b
  # return b - m*np.mean(x)+np.mean(y)
  #          ^-------------^------ these are incorrect
  return b + m*np.mean(x) - np.mean(y)

def dfm(x, y, b, m): # partial derivative with respect to m
  #      v------ this should be negative
  return -np.sum(x*y - b*x - m*x**2)
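In fact, these derivatives are still missing some constant factors:

  • dfb should be multiplied by 2
  • dfm should be multiplied by 2/len(x)

I suppose that is not too bad, since the gradient gets scaled by alpha anyway, but it can make convergence slower.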
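
For reference, both the signs and the constant factors follow from differentiating the cost f directly. A short derivation, writing n for len(x) and \bar{x}, \bar{y} for np.mean(x), np.mean(y) (this notation is introduced here for illustration and is not in the original post):

```latex
\begin{aligned}
f(b, m) &= \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (b + m x_i)\bigr)^2 \\
\frac{\partial f}{\partial b} &= -\frac{2}{n} \sum_{i=1}^{n} \bigl(y_i - b - m x_i\bigr) = 2\,(b + m\bar{x} - \bar{y}) \\
\frac{\partial f}{\partial m} &= -\frac{2}{n} \sum_{i=1}^{n} x_i \bigl(y_i - b - m x_i\bigr) = -\frac{2}{n} \sum_{i=1}^{n} \bigl(x_i y_i - b x_i - m x_i^2\bigr)
\end{aligned}
```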

If the correct derivatives are used, the code converges after one iteration:

def dfb(x, y, b, m): # partial derivative with respect to b
  return 2 * (b + m * np.mean(x) - np.mean(y))

def dfm(x, y, b, m): # partial derivative with respect to m
  # Used `mean` here since (2/len(x)) * np.sum(...)
  # is the same as 2 * np.mean(...)
  return -2 * np.mean(x * y - b * x - m * x**2)
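
Note that with alpha = 0.0001 and epsilon = 0.01, stopping after a single iteration means the first update was already no larger than epsilon, so m has moved at most 0.01 from its starting value of 0, even though the data were generated with slope 2. Below is a minimal sketch of the same loop with a larger step size, a tighter tolerance, and an iteration cap. The helper name fit_line, the values alpha=0.01, epsilon=1e-9, max_iter=100_000, and the np.polyfit comparison are all assumptions added for illustration; they are not part of the original answer.

```python
import numpy as np

def dfb(x, y, b, m):
    # Partial derivative of the mean squared error with respect to b
    return 2 * (b + m * np.mean(x) - np.mean(y))

def dfm(x, y, b, m):
    # Partial derivative of the mean squared error with respect to m
    return -2 * np.mean(x * y - b * x - m * x**2)

def fit_line(x, y, alpha, epsilon, max_iter):
    # Hypothetical helper: plain gradient descent with an iteration cap,
    # using the same starting point as the question (b = mean(y), m = 0)
    b, m = np.mean(y), 0.0
    for n_iter in range(1, max_iter + 1):
        # Both steps are computed from the old (b, m) before either is updated
        step_b = alpha * dfb(x, y, b, m)
        step_m = alpha * dfm(x, y, b, m)
        b, m = b - step_b, m - step_m
        # Stop once both parameters have effectively stopped moving
        if abs(step_b) <= epsilon and abs(step_m) <= epsilon:
            break
    return b, m, n_iter

np.random.seed(42)
x = np.random.normal(0, 5, 100)
y = 50 + 2 * x + np.random.normal(0, 2, len(x))

# Assumed settings, not from the question: larger step, tighter tolerance
b, m, n_iter = fit_line(x, y, alpha=0.01, epsilon=1e-9, max_iter=100_000)

# Closed-form least-squares line, used only as a reference to compare against
slope, intercept = np.polyfit(x, y, 1)

print("gradient descent:", n_iter, b, m)
print("np.polyfit:      ", intercept, slope)
```

Passing alpha=0.0001 and epsilon=0.01 to the same helper reproduces the original settings, which makes it easy to compare the iteration count and the fitted values between the two configurations.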

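Why do you think the iteration should converge?

That is how the method works, or so I was told... Maybe I did it wrong :(

I printed np.sum(np.abs(m-m0)) and np.sum(np.abs(b-b0)) on each iteration, and they quickly became NaN. In the earlier iterations the second sum fluctuated wildly, jumping from 0.006 to 2835422.576, and then from 1044.909 to 1.22e+16 (!). This may be a sign that the gradient is "bouncing" across some kind of "trench" in the function, either because the alpha coefficient is too large or because the gradient is incorrect. I think you should double-check the derivatives.

@ForceBru Thanks! The function that has to be minimized is a sum of squares, and I don't know how to write a summation operator in numpy. Can you confirm whether np.sum() works? Thanks.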
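
Both points raised in these comments can be checked numerically. The sketch below compares np.sum against an explicit Python loop, and compares the hand-written derivatives against central finite differences of f. The test point (b, m) = (40, 1) and the step h = 1e-5 are arbitrary choices made here for illustration, and the derivatives are the versions with the constant factors included.

```python
import numpy as np

np.random.seed(42)
x = np.random.normal(0, 5, 100)
y = 50 + 2 * x + np.random.normal(0, 2, len(x))

def f(x, y, b, m):
    # Mean of squared residuals; np.sum adds up all elements of the array
    return (1 / len(x)) * np.sum((y - (b + m * x)) ** 2)

def dfb(x, y, b, m):
    return 2 * (b + m * np.mean(x) - np.mean(y))

def dfm(x, y, b, m):
    return -2 * np.mean(x * y - b * x - m * x**2)

b, m = 40.0, 1.0  # arbitrary test point, chosen only for illustration
h = 1e-5          # finite-difference step (an assumption)

# The same cost written as an explicit loop over the data points
loop_value = sum((yi - (b + m * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)
sum_ok = np.isclose(f(x, y, b, m), loop_value)

# Central differences approximate the partial derivatives of f
num_dfb = (f(x, y, b + h, m) - f(x, y, b - h, m)) / (2 * h)
num_dfm = (f(x, y, b, m + h) - f(x, y, b, m - h)) / (2 * h)
grad_ok = (np.isclose(num_dfb, dfb(x, y, b, m))
           and np.isclose(num_dfm, dfm(x, y, b, m)))

print("np.sum matches the loop:", sum_ok)
print("derivatives match finite differences:", grad_ok)
```

Swapping in a different dfb or dfm and rerunning the comparison is a quick way to spot a wrong sign or a missing constant before starting the descent loop.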