Python: add or derive two columns from one column

Tags: python, sql, dataframe, pyspark

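Given a DataFrame with the columns Customer_ID, Project_ID, QUESTION_TYP and ANSWER, how can I add two more columns to it?

For example, col_1 should hold the count of "YES" answers and col_2 the count of "NO" answers: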

+-----------+----------+------------+------+-----+-----+
|Customer_ID|Project_ID|QUESTION_TYP|ANSWER|col_1|col_2|
+-----------+----------+------------+------+-----+-----+
|          1|         1|    2nd QUES|   YES|    2|    0|
|          1|         2|    2nd QUES|    NO|    0|    1|
|          1|         2|    2nd Ques|   Yes|    1|    0|
+-----------+----------+------------+------+-----+-----+

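Please help. I have tried most of the solutions I could find, but they give me a single aggregated yes-or-no result, whereas I want the counts on every row.

This looks like a job for window functions. Assuming you want the counts per customer_id: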

select t.*,
       sum(case when answer = 'YES' then 1 else 0 end) over (partition by customer_id) as col_1,
       sum(case when answer = 'NO' then 1 else 0 end) over (partition by customer_id) as col_2
from t;

Or, if you want the counts over all of the data, simply replace (partition by customer_id) with ().
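
To see the difference between the two window specifications, here is a minimal sketch that runs the query above against an in-memory SQLite database using Python's standard library. SQLite stands in for Spark SQL here purely so the example is self-contained; the sample rows, the TEMPLATE string and the run_counts helper are assumptions made for illustration and are not part of the original question. It needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical rows: (customer_id, project_id, question_typ, answer).
ROWS = [
    (1, 1, "2nd QUES", "YES"),
    (1, 2, "2nd QUES", "NO"),
    (2, 1, "2nd QUES", "YES"),
]

# {window} is replaced by the window specification to compare.
TEMPLATE = """
select customer_id, project_id, answer,
       sum(case when answer = 'YES' then 1 else 0 end) over ({window}) as col_1,
       sum(case when answer = 'NO' then 1 else 0 end) over ({window}) as col_2
from t
order by customer_id, project_id
"""


def run_counts(window):
    """Create table t in memory, load ROWS and run the query with the given window."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t (customer_id integer, project_id integer, "
            "question_typ text, answer text)"
        )
        conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)
        return conn.execute(TEMPLATE.format(window=window)).fetchall()
    finally:
        conn.close()


# Counts restricted to each customer's own rows.
per_customer = run_counts("partition by customer_id")
# An empty window, over (), counts across the whole table.
overall = run_counts("")

print(per_customer)
print(overall)
```

With these rows, customer 2 gets col_2 = 0 in the partitioned version but col_2 = 1 with the empty window, because the single NO answer belongs to customer 1. In PySpark, the same statement could presumably be run through spark.sql after registering the DataFrame with createOrReplaceTempView("t").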

Note: this puts both counts on every row. If you only want the count to appear in the column that matches the answer on that row:

select t.*,
       (case when answer = 'YES'
             then sum(case when answer = 'YES' then 1 else 0 end) over (partition by customer_id)
             else 0
        end) as col_1,
       (case when answer = 'NO'
             then sum(case when answer = 'NO' then 1 else 0 end) over (partition by customer_id)
             else 0
        end) as col_2
from t;
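
As a quick check of what this second query produces, the sketch below runs it with Python's built-in sqlite3 module. Again, SQLite is only a stand-in for Spark SQL, and the sample rows and the matching_counts helper are hypothetical, chosen so that one customer has two YES answers and one NO answer. It needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical rows: (customer_id, project_id, question_typ, answer).
SAMPLE_ROWS = [
    (1, 1, "2nd QUES", "YES"),
    (1, 2, "2nd QUES", "NO"),
    (1, 2, "2nd QUES", "YES"),
]

# Same query as above, with an explicit column list and an order by
# added so that the output order is deterministic.
QUERY = """
select customer_id, project_id, answer,
       (case when answer = 'YES'
             then sum(case when answer = 'YES' then 1 else 0 end) over (partition by customer_id)
             else 0
        end) as col_1,
       (case when answer = 'NO'
             then sum(case when answer = 'NO' then 1 else 0 end) over (partition by customer_id)
             else 0
        end) as col_2
from t
order by customer_id, project_id, answer
"""


def matching_counts(rows):
    """Load rows into an in-memory table t and run the conditional window query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t (customer_id integer, project_id integer, "
            "question_typ text, answer text)"
        )
        conn.executemany("insert into t values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = matching_counts(SAMPLE_ROWS)
for row in result:
    print(row)
```

Each YES row carries the customer's YES count in col_1 and 0 in col_2, and the NO row the reverse. Note that the comparison answer = 'YES' is case-sensitive, so a value such as 'Yes' would not be counted unless it is normalized first, for example with upper(answer).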

However, I would think that having both counts on every row makes sense.

What have you tried?