Max and min with pandas - Python

Tags: python, pandas

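I have a table that shows the daily open, high, low, close, and percentage change of a particular stock from 2010 to today.

              Open    High     Low   Close  Percentage Chg
Date
2017-07-03  972.79  974.49  951.00  953.66          -1.481
2017-06-30  980.12  983.47  967.61  968.00          -0.813
2017-06-29  979.00  987.56  965.25  975.93          -1.454

I created an empty table.

I would like to add the following to the empty columns:

  • A column with the top 5 highest Percentage Chg values
  • A column with the bottom 5 lowest Percentage Chg values
  • A column with the variance of the "Close" column

Thanks,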

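    I wrote a script that seems to meet your 3 goals.

    I use pandas_datareader (version 0.4.0) to read Amazon stock prices.

    Here is the code, with explanations in the comments: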

    import datetime

    import pandas_datareader.data as web

    start = datetime.datetime(2010, 1, 1)
    end = datetime.datetime(2017, 7, 5)

    # download daily AMZN prices, indexed by Date
    df = web.DataReader("AMZN", 'google', start, end)

    # variance of the closing price over the whole period
    close_var = df['Close'].var()

    # make the variance a column (the same value is repeated on every row)
    df['close_var_col'] = close_var

    # selecting the first observation of the opening price from 2010
    # base_open = int(df['Open'].head(1))

    # percentage difference between each close and the close in the next row.
    # The downloaded dates are in ascending order, so shift(-1) is the FOLLOWING day;
    # use shift(1) or df['Close'].pct_change() * 100 for the change from the previous day.
    df['perc_chg'] = 100*(df['Close'] - df['Close'].shift(-1))/df['Close'].shift(-1)

    # drop the NaN row, then build one df of the top 5 and one of the bottom 5 by perc_chg
    df = df.dropna(subset=['perc_chg'])
    five_highest_df = df.sort_values(by='perc_chg').tail(5)
    five_lowest_df = df.sort_values(by='perc_chg').head(5)

    # turn the top 5 and bottom 5 values into strings
    # (the only way I can think of to fit a list into a column)
    list_of_top_5 = str(list(set(list(five_highest_df['perc_chg']))))
    list_of_bot_5 = str(list(set(list(five_lowest_df['perc_chg']))))

    # add the top 5 and bottom 5 strings to our dataframe (repeated on every row)
    df['list_of_top_5'] = list_of_top_5
    df['list_of_bot_5'] = list_of_bot_5

    print(df.tail(20))
    
    It is not very efficient, and I am not sure it is particularly useful as a DataFrame. This is the output you should get:

                       Open     High      Low    Close    Volume  close_var_col
        Date
        2017-06-07  1005.95  1010.25  1002.00  1010.07   2823041   53171.161256
        2017-06-08  1012.06  1013.61  1006.11  1010.27   2767857   53171.161256
        2017-06-09  1012.50  1012.99   927.00   978.31   7647692   53171.161256
        2017-06-12   967.00   975.95   945.00   964.91   9447233   53171.161256
        2017-06-13   977.99   984.50   966.10   980.79   4580011   53171.161256
        2017-06-14   988.59   990.34   966.71   976.47   3974900   53171.161256
        2017-06-15   958.70   965.73   950.86   964.17   5373865   53171.161256
        2017-06-16   996.00   999.75   982.00   987.71  11472662   53171.161256
        2017-06-19  1017.00  1017.00   989.90   995.17   5043408   53171.161256
        2017-06-20   998.00  1004.88   992.02   992.59   4076828   53171.161256
        2017-06-21   998.70  1002.72   992.65  1002.23   2922473   53171.161256
        2017-06-22  1002.23  1006.96   997.20  1001.30   2253433   53171.161256
        2017-06-23  1002.54  1004.62   998.02  1003.74   2879145   53171.161256
        2017-06-26  1008.50  1009.80   992.00   993.98   3386157   53171.161256
        2017-06-27   990.69   998.80   976.00   976.78   3782389   53171.161256
        2017-06-28   978.55   990.68   969.21   990.33   3737567   53171.161256
        2017-06-29   979.00   987.56   965.25   975.93   4302968   53171.161256
        2017-06-30   980.12   983.47   967.61   968.00   3390345   53171.161256
        2017-07-03   972.79   974.49   951.00   953.66   2909108   53171.161256
        2017-07-05   961.53   975.00   955.25   971.40   3652955   53171.161256
    
     perc_chg                                      list_of_top_5
    
    -0.699951  [12.356073489642865, 9.0991430362990329, 10.96...
    -0.019797  [12.356073489642865, 9.0991430362990329, 10.96...
     3.266858  [12.356073489642865, 9.0991430362990329, 10.96...
     1.388731  [12.356073489642865, 9.0991430362990329, 10.96...
    -1.619103  [12.356073489642865, 9.0991430362990329, 10.96...
     0.442410  [12.356073489642865, 9.0991430362990329, 10.96...
     1.275709  [12.356073489642865, 9.0991430362990329, 10.96...
    -2.383291  [12.356073489642865, 9.0991430362990329, 10.96...
    -0.749621  [12.356073489642865, 9.0991430362990329, 10.96...
     0.259926  [12.356073489642865, 9.0991430362990329, 10.96...
    -0.961855  [12.356073489642865, 9.0991430362990329, 10.96...
     0.092879  [12.356073489642865, 9.0991430362990329, 10.96...
    -0.243091  [12.356073489642865, 9.0991430362990329, 10.96...
     0.981911  [12.356073489642865, 9.0991430362990329, 10.96...
     1.760888  [12.356073489642865, 9.0991430362990329, 10.96...
    -1.368231  [12.356073489642865, 9.0991430362990329, 10.96...
     1.475516  [12.356073489642865, 9.0991430362990329, 10.96...
     0.819215  [12.356073489642865, 9.0991430362990329, 10.96...
     1.503681  [12.356073489642865, 9.0991430362990329, 10.96...
    -1.826230  [12.356073489642865, 9.0991430362990329, 10.96...
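
Since the answer itself doubts that repeating these values on every row is useful, here is a minimal sketch of an alternative that keeps the three results as separate objects. It assumes the percentage change is meant relative to the previous trading day (which matches the figures in the question), uses a handful of closing prices copied from the output above instead of downloading data, and the helper name `summarize` is made up for illustration.

```python
import pandas as pd

# Closing prices copied from the last rows of the output above (ascending dates).
close = pd.Series(
    [1003.74, 993.98, 976.78, 990.33, 975.93, 968.00, 953.66, 971.40],
    index=pd.to_datetime([
        "2017-06-23", "2017-06-26", "2017-06-27", "2017-06-28",
        "2017-06-29", "2017-06-30", "2017-07-03", "2017-07-05",
    ]),
    name="Close",
)


def summarize(close, n=5):
    """Hypothetical helper: top/bottom n daily % changes and the variance of Close."""
    # Change from the previous trading day, in percent; the first row is NaN.
    perc_chg = (close.pct_change() * 100).dropna()
    return {
        "top": perc_chg.nlargest(n),      # n highest changes, highest first
        "bottom": perc_chg.nsmallest(n),  # n lowest changes, lowest first
        "close_var": close.var(),         # sample variance (ddof=1)
    }


# n=3 only because the sample is tiny; the question asks for n=5 on the full history.
result = summarize(close, n=3)
print(result["top"])
print(result["bottom"])
print(result["close_var"])
```

`nlargest` and `nsmallest` keep the dates as the index, so each extreme value stays attached to the day it occurred on, which the string columns above lose.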
    


    Why do you want to put this in a table? These are three scalar values.

    Can you elaborate? I don't understand your question.

    Those are three values (numbers), so they can be stored in Python variables. DataFrames are generally not used to store a fixed number of values. They are meant for larger amounts of data with columns and rows.

    OK, suppose I don't want to store them in a table and store them in a variable instead. I still can't figure out the best way to find this information.

    Are you sure that when the OP asked for the pct change, he meant the change over the whole period and not just the daily change? If it is the daily change, pandas DataFrames have a method that does this...

    Do you mean something like df['Close'] - df['Close'].shift(-1)? Now that I reread the OP's question, I think you are right.

    Thanks @gionni, I have modified it.

    That works, but there is actually a method, DataFrame.pct_change(), which does the same thing except for the multiplication by 100 (which I personally don't like, but that really is personal preference).

    Now that I am familiar with it, I think DataFrame.pct_change() is better. I will leave it as it is, since the OP may prefer multiplying by 100.
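
To make the comparison in the comments concrete, here is a small sketch that puts the manual `shift` formula next to `pct_change()`. It assumes dates in ascending order and uses four closing prices copied from the output above. The variable names are illustrative only.

```python
import pandas as pd

# Four closing prices copied from the output above, in ascending date order.
close = pd.Series(
    [975.93, 968.00, 953.66, 971.40],
    index=pd.to_datetime(["2017-06-29", "2017-06-30", "2017-07-03", "2017-07-05"]),
    name="Close",
)

# Manual formula against the previous row (with ascending dates, shift(1) is the prior day).
manual = 100 * (close - close.shift(1)) / close.shift(1)

# Built-in equivalent: pct_change() returns a fraction, so scale by 100 for percent.
builtin = close.pct_change() * 100

# shift(-1) instead compares each close with the NEXT row's close.
vs_next = 100 * (close - close.shift(-1)) / close.shift(-1)

print(pd.DataFrame({"manual": manual, "builtin": builtin, "vs_next": vs_next}).round(3))
```

On this sample, the `builtin` value for 2017-07-03 rounds to -1.481, the figure shown in the question's table, while `vs_next` for the same day rounds to -1.826. The direction of the shift therefore matters whenever the rows are sorted by ascending date.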