Python: create a new column in an existing CSV using percentage values obtained by dividing pivot-table vector values?

Tags: python, pandas, numpy, pivot-table

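I want to create a new column in an existing CSV. The column is a percentage obtained by a division and then multiplied by 100, as follows (see the commented arrow in the full code):

dfb['cm_target_perc']=cm_inc/[dfb['cm_target']*100*len(cm_inc)

What I want is a new column in which each value is obtained by dividing the pivot-table vector cm_inc by dfb['cm_target'], whose value is 40 in every row, and multiplying by 100.

Here is the full code from my Jupyter notebook: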

from plotly.offline import init_notebook_mode, iplot
from plotly import graph_objs as go
init_notebook_mode(connected = True)
import pandas as pd
import numpy as np
from datetime import timedelta, datetime, tzinfo
import time
from datetime import datetime as dt


dfb=pd.read_csv('https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1', encoding="latin-1", infer_datetime_format=True, parse_dates=['date'], skipinitialspace=True)
dfb["date"]=pd.to_datetime(dfb['date']) 

dfb["site"]=dfb["site"].astype("category")
cm_inc=dfb[dfb.site == 5].pivot_table(index='date', values = 'site', aggfunc = {  'site' : 'count' }  )
dfb['cm_target'] = [40]*len(dfb)

#===>>>#dfb['cm_target_perc']=cm_inc/[dfb['cm_target']*100*len(cm_inc)

dfb.to_csv('test_data.csv', index=False)


indexes =pd.to_datetime(cm_inc.index) 

dates_indexes = pd.to_datetime(cm_inc.index) 

data = [
    go.Bar(x=indexes, 
           y=dfb['cm_target'],
           text=dfb['cm_target'],
           textposition = 'auto',
           name='Target Site A', 
           base=0
          ),
    go.Bar(x=indexes, 
           y=cm_inc['site'],
           text=cm_inc['site'],
           textposition = 'auto',
           name='Enroll Site A', 
           base=0,
           #width=2  # Width value varies depending on number of samples in data
           )
]

layout = go.Layout(
    barmode='stack',
    xaxis=dict(
        showticklabels=True,
        ticktext=dates_indexes,
        tickvals=[i for i in indexes],
    )
)

fig = dict(data = data, layout = layout)
iplot(fig, show_link=False)
Question: how do I change this line to fix the following error?

ValueError: Wrong number of items passed 1239, placement implies 1

Thanks in advance.
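For context, here is a minimal sketch on made-up data of why the two operands in the commented line do not line up. The frame `toy` and the name `counts` are assumptions for illustration and do not come from the asker's CSV. The pivot table has one row per date, while the full frame has one row per record.

```python
import pandas as pd

# Toy stand-in for dfb (hypothetical data, not the asker's CSV).
toy = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-07-10", "2018-07-10", "2018-07-11", "2018-07-11", "2018-07-11"]
    ),
    "site": [5, 5, 5, 2, 5],
})
toy["cm_target"] = 40

# Same kind of pivot as in the question: number of site-5 records per date.
counts = toy[toy.site == 5].pivot_table(index="date", values="site", aggfunc="count")

print(counts.shape)                    # one row per date, one column
print(toy.shape)                       # one row per record
print(counts.index.equals(toy.index))  # date index vs. default integer index
```

Because the shapes and indexes differ, the per-date counts probably need to be looked up per row (for example through the date column) before they can be stored as a single column.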

Although this is not a new column, the following gives the desired result:

cm_achived_perc=cm_inc.loc[:]/40*100
%matplotlib inline
cm_achived_perc.plot(kind = 'bar')
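The same per-date calculation can be checked without plotting. This is a minimal sketch that assumes hypothetical counts shaped like cm_inc. The values 20 and 10 and the name `target` are invented for illustration.

```python
import pandas as pd

# Hypothetical per-date enrollment counts, shaped like cm_inc in the question.
cm_inc = pd.DataFrame(
    {"site": [20, 10]},
    index=pd.DatetimeIndex(["2018-07-10", "2018-07-11"], name="date"),
)

target = 40  # constant target per date, as stated in the question

# Percentage of the target reached on each date.
cm_achieved_perc = cm_inc / target * 100
print(cm_achieved_perc)
```

The result keeps the date index of cm_inc, so it has one value per date rather than one per row of dfb.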
Is this what you want?

Replacing your lines

dfb['cm_target'] = [40]*len(dfb)
dfb['cm_target_perc']=cm_inc/[dfb['cm_target']*100*len(cm_inc)

gives me this dfb:

           site         received             sent  cm_target  cm_inc  cm_target_perc
date                                                                                
2018-07-10    2              NaN              NaN         58    20.0       34.482759
2018-07-10    2              NaN              NaN         63    20.0       31.746032
2018-07-11    2              NaN              NaN         67    20.0       29.850746
2018-07-11    2              NaN              NaN        100    20.0       20.000000
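The replacement code itself is not shown here, so the following is only a sketch of one way a table with these columns could be produced. It is an assumption, not necessarily what the answer did. The toy values mirror the printed rows, and `counts` is a hypothetical stand-in for cm_inc['site'].

```python
import pandas as pd

# Toy stand-in for dfb; values chosen to mirror the printed rows (hypothetical).
dfb = pd.DataFrame({
    "date": pd.to_datetime(["2018-07-10", "2018-07-10", "2018-07-11", "2018-07-11"]),
    "site": [2, 2, 2, 2],
    "cm_target": [58, 63, 67, 100],
})

# Hypothetical per-date counts, shaped like cm_inc["site"].
counts = pd.Series(
    [20, 20],
    index=pd.DatetimeIndex(["2018-07-10", "2018-07-11"], name="date"),
)

# Look up each row's date in the per-date counts, then compute the
# percentage row by row.
dfb["cm_inc"] = dfb["date"].map(counts)
dfb["cm_target_perc"] = dfb["cm_inc"] / dfb["cm_target"] * 100
print(dfb.set_index("date"))
```

Mapping through the date column gives every row the count for its own date, which is what lets the division run one row at a time.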

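If your dfb['cm_target'] is all 40, why not simply use dfb['cm_target_perc']=cm_inc/40*100*len(cm_inc) instead?

@iamanigeeit, cm_inc is a pivot_table vector. You say it is a DataFrame, so how do I refer to its column? Thanks.

You can use cm_inc.site or cm_inc['site'] (cm_inc is a DataFrame with one column).

@iamanigeet, that is accepted, but it only creates a new column with empty data. And I need to plot two bar charts, not just this one.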
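One likely cause of the empty column is index alignment: pandas matches a Series to a DataFrame by index labels when it is assigned as a column. The sketch below uses invented data, and the column name `misaligned` is hypothetical. It shows both ways of selecting the column, and the difference between direct assignment and a lookup through the date column.

```python
import pandas as pd

# Hypothetical per-date counts, shaped like cm_inc in the question.
cm_inc = pd.DataFrame(
    {"site": [20, 10]},
    index=pd.DatetimeIndex(["2018-07-10", "2018-07-11"], name="date"),
)

# Both spellings select the same single column.
same_column = cm_inc.site.equals(cm_inc["site"])

# Toy dfb with the default integer index 0, 1, 2.
dfb = pd.DataFrame(
    {"date": pd.to_datetime(["2018-07-10", "2018-07-11", "2018-07-11"])}
)

# Direct assignment aligns on index labels; the integer labels never match
# the dates, so every value comes out as NaN.
dfb["misaligned"] = cm_inc["site"] / 40 * 100

# Mapping through the date column aligns on dates instead.
dfb["cm_target_perc"] = dfb["date"].map(cm_inc["site"]) / 40 * 100
print(dfb)
```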