Python pandas: finding an increasing trend across the columns of a row

python pandas

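I have a dataset of milk production in India. Using pandas, I am trying to get a list of 5 states (if there are any) whose total milk production has increased over the past 3 years.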

State        Total10-11 Total11-12  Total13-14  Total14-15  Total15-16
Andhra Pradesh    11204      12088       13007         9656      10817
Arunachal Pradesh    28         22          43           46         50
Assam               790        797         814          829        844
Bihar              6517       6643        7197         7775       8289
Chhattisgarh       1030       1118        1208         1232       1278
Goa                  60         60          68           66         54
Gujarat            9322       4089       11112        11690      12262
Haryana            6268       6661        7442         7902       8381
Himachal Pradesh   1102       1120        1151         1173       1283

Expected output:

State
Assam
Bihar
Chhattisgarh
Haryana
Himachal Pradesh

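I want to find the states whose milk production shows an increasing trend every year. Production in a later year should not fall below that of an earlier year. The states in the expected output have production that rises in sequence, without a single drop. I am stuck on this problem; I have tried several approaches, but none came close to the correct answer. What is the solution? Thanks in advance.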

If you are only looking for a trend, I think visualization is the answer.

You can do it like this:

import matplotlib.pyplot as plt
import pandas as pd

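df = df.set_index('State')  # column name as shown in the question's table
df.T.plot(figsize=(10,15))  # one line per state, years on the x-axis
plt.show()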

Or look at each state separately:

df.T.plot(figsize=(15,20), subplots=True,layout=(3,3))

If you want the states whose values always increase, you can use diff > 0 together with cumsum, i.e.:

df = df.set_index("State/UT Name")

temp = (df.T.diff() > 0).cumsum()
# Values will increment if the difference between past and present is positive 
State/UT Name  Andhra Pradesh  Arunachal Pradesh  Assam  Bihar  Chhattisgarh  \
Total10-11                  0                  0      0      0             0   
Total11-12                  1                  0      1      1             1   
Total13-14                  2                  1      2      2             2   
Total14-15                  2                  2      3      3             3   
Total15-16                  3                  3      4      4             4   

State/UT Name  Goa  Gujarat  Haryana  Himachal Pradesh  
Total10-11       0        0        0                 0  
Total11-12       0        0        1                 1  
Total13-14       1        1        2                 2  
Total14-15       1        2        3                 3  
Total15-16       1        3        4                 4  

# The one with max sum is the one that kept increasing over time 
temp.sum().nlargest(10)

State/UT Name
Assam                10
Bihar                10
Chhattisgarh         10
Haryana              10
Himachal Pradesh     10
Andhra Pradesh        8
Arunachal Pradesh     6
Gujarat               6
Goa                   3
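The top score of 10 is not arbitrary: a state that rises in every one of n yearly columns gets cumulative counts 0, 1, ..., n-1, which sum to n(n-1)/2, or 10 for five columns. As a minimal sketch (using three rows copied from the question's table, with the names `full_score` and `always_up` chosen here only for illustration), one could filter on that exact score instead of taking a fixed number of top entries:

```python
import pandas as pd

# Three sample rows copied from the question's table
cols = ["Total10-11", "Total11-12", "Total13-14", "Total14-15", "Total15-16"]
data = {
    "Assam": [790, 797, 814, 829, 844],
    "Goa": [60, 60, 68, 66, 54],
    "Gujarat": [9322, 4089, 11112, 11690, 12262],
}
df = pd.DataFrame.from_dict(data, orient="index", columns=cols)

# Same scoring as in the answer: running count of year-over-year increases
temp = (df.T.diff() > 0).cumsum()
scores = temp.sum()

# A state that rose every year scores 0 + 1 + ... + (n - 1)
n = df.shape[1]
full_score = n * (n - 1) // 2

always_up = scores[scores == full_score].index.tolist()
print(always_up)
```

Filtering on the full score returns however many states qualify, which may be fewer or more than five on other data.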

If you want the state names:

states = temp.sum().nlargest(5).index.tolist()

['Assam', 'Bihar', 'Chhattisgarh', 'Haryana', 'Himachal Pradesh']
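Note that nlargest(5) always returns five names, whether or not all five rose every year. An alternative, sketched here and not part of the original answer, is to test every year-over-year change directly. The sketch assumes the nine rows shown in the question and uses the state names as the index; `strictly_up` and `never_down` are illustrative names.

```python
import pandas as pd

# Rows copied from the question's table
cols = ["Total10-11", "Total11-12", "Total13-14", "Total14-15", "Total15-16"]
data = {
    "Andhra Pradesh": [11204, 12088, 13007, 9656, 10817],
    "Arunachal Pradesh": [28, 22, 43, 46, 50],
    "Assam": [790, 797, 814, 829, 844],
    "Bihar": [6517, 6643, 7197, 7775, 8289],
    "Chhattisgarh": [1030, 1118, 1208, 1232, 1278],
    "Goa": [60, 60, 68, 66, 54],
    "Gujarat": [9322, 4089, 11112, 11690, 12262],
    "Haryana": [6268, 6661, 7442, 7902, 8381],
    "Himachal Pradesh": [1102, 1120, 1151, 1173, 1283],
}
df = pd.DataFrame.from_dict(data, orient="index", columns=cols)

# Change from each column to the next; the first column has no
# predecessor (all NaN), so it is dropped before testing.
changes = df.diff(axis=1).iloc[:, 1:]

# Strictly increasing: every change is positive
strictly_up = df.index[(changes > 0).all(axis=1)].tolist()

# Never decreasing: flat years are allowed, drops are not
never_down = df.index[(changes >= 0).all(axis=1)].tolist()

print(strictly_up)
print(never_down)
```

On this sample both lists contain the same five states as the expected output. The two conditions differ only when a state has a flat year and no drop.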

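Comments:

Please add what you have tried and your expected output for more clarity. What do you mean by an increasing trend here?

I mean that the total milk production increases every year (with no decrease in any later year). For example, in Arunachal Pradesh and Assam, milk production has increased every year.

To be clear, I am looking for the names of the states whose production never dropped in any year and only kept rising compared with previous years.

The index holds your state names. @SrinidhiPatil check the edit, I hope you understand what I did. This is my approach; there may be a better one.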