Python: keep the last n rows of a DataFrame

python pandas

I have a dataset that looks like this:

         Date       Open       High        Low      Close  Adj Close   Volume
0  2010-01-04  22.453505  22.625179  22.267525  22.389128  20.755877  3815500
1  2010-01-05  22.324749  22.331903  22.002861  22.145924  20.530411  4186000
2  2010-01-06  22.067240  22.174536  22.002861  22.067240  20.457466  3243700
3  2010-01-07  22.017168  22.045780  21.816881  22.038626  20.430946  3095100
4  2010-01-08  21.917025  22.067240  21.745350  22.031473  20.424318  3733900

I want to keep only the last 250, 500, etc. rows (depending on the value of offset), using the following code:

def positive_return_days(portfolio,offset):
    positive_returns = pd.DataFrame(
    columns=['ticker', 'name', 'total positive', 'total days','percentage of positive days'])
    for asset in portfolio:
        print(asset.head())
        print("1. Asset name: ", asset.name)
        asset = asset.tail(offset)
        print("2. Asset name: ", asset.name)
        total_positive_days = (asset.Close - asset.Close.shift(1) > 0).sum()
        total_days = len(asset.index)
        percentage_of_positive_days = float(total_positive_days/total_days)
        print("count",(asset.Close - asset.Close.shift(1) > 0).sum())
        new_row = {'ticker':asset.name, 'name':asset.name, 'total positive':total_positive_days, 'total days':total_days,"percentage of positive days":percentage_of_positive_days}
        positive_returns = positive_returns.append(new_row, ignore_index=True)
        print("Asset: ", asset.name, "total positive days: ", total_positive_days, "total days:",len(asset.index),"percentage of positive days",percentage_of_positive_days)
    print(positive_returns.nlargest(50, 'percentage of positive days')[
                  ['ticker','percentage of positive days','total positive', 'total days']])
    print(positive_returns.loc[positive_returns['ticker']=='AAPL'])
    return positive_returns

But I get this error:

AttributeError: 'DataFrame' object has no attribute 'name'

It is raised after I call the tail function. How can I fix this?

You can use:

df.tail(250)

This keeps the last 250 rows of the DataFrame. No further code is needed.

For example:

df = pd.DataFrame({'day_1': [0,1,1,0,1,1,0], 'day_2': [0,0,1,1,1,1,0], 'day_3': [0,1,1,1,0,0,0], 'day_4': [0,1,0,1,0,1,0], 'day_5': [0,0,1,1,1,0,0]})

   day_1  day_2  day_3  day_4  day_5
0      0      0      0      0      0
1      1      0      1      1      0
2      1      1      1      0      1
3      0      1      1      1      1
4      1      1      0      0      1
5      1      1      0      1      0
6      0      0      0      0      0

df.tail(3)

   day_1  day_2  day_3  day_4  day_5
4      1      1      0      0      1
5      1      1      0      1      0
6      0      0      0      0      0

I think so? This looks a bit more complicated than it needs to be, so let me try to rewrite it. As for your question:

df.tail(250)

is the obvious choice, or

df.iloc[-250:]
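A quick way to compare the slicing options is to run them on a small frame. The sketch below uses a made-up ten-row frame and n = 3 rather than 250. Note that a slice ending at -1 stops before the last row, so it returns one row fewer than tail.

```python
import pandas as pd

# Toy frame with a made-up Close column (values 0..9), for illustration only.
df = pd.DataFrame({"Close": range(10)})
n = 3

last_n_tail = df.tail(n)        # rows 7, 8, 9
last_n_iloc = df.iloc[-n:]      # same rows as tail(n) for n > 0
short_by_one = df.iloc[-n:-1]   # rows 7, 8 only: the final row is excluded

print(last_n_tail.equals(last_n_iloc))
print(len(short_by_one))
```

One difference worth knowing: df.tail(0) returns an empty frame, while df.iloc[-0:] is the same as df.iloc[0:] and returns every row, so tail is the safer choice when the offset can be zero.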

Also, since the problem is about the name, I would try writing it out in pandas as

df['name']['close']

instead of df.name.close; the attribute-style syntax sometimes gets you into trouble.

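For this problem, I would build the result with new columns. That is hard to do without the actual dataset, but I would simply add new columns, for example: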

df['up_days'] = np.where(df['name']['close'] > df['name']['close'].shift(1),1,0).cumsum()
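To make the idea concrete, here is a minimal sketch of the same running count on a single flat frame. It assumes the column is called Close, as in the question's data, rather than the nested df['name']['close'] lookup. The first five values are rounded from the question and the last two are made up for illustration.

```python
import numpy as np
import pandas as pd

# First five closes rounded from the question; the last two are made-up values.
df = pd.DataFrame({"Close": [22.39, 22.15, 22.07, 22.04, 22.03, 22.10, 22.25]})

# 1 where the close is above the previous close, else 0.
# The first row is compared against NaN, which evaluates to False.
is_up = np.where(df["Close"] > df["Close"].shift(1), 1, 0)

# Running count of up days; the last entry is the total.
df["up_days"] = is_up.cumsum()
total_up = int(df["up_days"].iloc[-1])

# Equivalent total without the helper column.
total_up_alt = int((df["Close"].diff() > 0).sum())

print(df)
print(total_up, total_up_alt)
```

The cumulative column is handy for inspecting how the count builds up over time. If only the total is needed, the diff-based one-liner avoids adding a column.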

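I would be interested in seeing the full output. Based on your code, portfolio should be a DataFrame-type object with a name attribute available. If portfolio holds the data you showed, it does not contain a name attribute. That would explain the error you are seeing.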
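One likely reason the attribute disappears only after tail: a name set as a plain Python attribute lives on that one DataFrame object, and tail returns a new object that does not carry it. A minimal sketch of a workaround, assuming each asset had its name assigned that way (the question does not show how), is to read the name before slicing. The sketch also collects rows in a list, because DataFrame.append was removed in pandas 2.0. The sample aapl frame is hypothetical.

```python
import pandas as pd

def positive_return_days(portfolio, offset):
    """Summarise the share of up days over the last `offset` rows of each asset."""
    rows = []
    for asset in portfolio:
        # Read the custom attribute before slicing: the frame returned by
        # tail() is a new object and does not carry it.
        name = getattr(asset, "name", None)
        recent = asset.tail(offset)
        total_positive = int((recent["Close"].diff() > 0).sum())
        total_days = len(recent)
        rows.append({
            "ticker": name,
            "total positive": total_positive,
            "total days": total_days,
            "percentage of positive days":
                total_positive / total_days if total_days else float("nan"),
        })
    # Build the result once instead of appending row by row.
    return pd.DataFrame(rows)

# Hypothetical sample asset with made-up prices.
aapl = pd.DataFrame({"Close": [10.0, 11.0, 10.5, 10.8, 11.2]})
aapl.name = "AAPL"

result = positive_return_days([aapl], offset=3)
print(result)
```

As in the original function, the first row of each slice has no previous close inside the window, so it is never counted as an up day.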

See columns=['name', ...]: change the column names.