Python 3.x: How do I find the distribution of a PySpark DataFrame column across all the unique values in that column?

Tags: python-3.x, pyspark, apache-spark-sql, pyspark-dataframes

I have a PySpark DataFrame:

df = spark.createDataFrame([
    ("u1", 0),
    ("u2", 0),
    ("u3", 1),
    ("u4", 2),
    ("u5", 3),
    ("u6", 2),],
    ['user_id', 'medals'])

df.show()
Output:

+-------+------+
|user_id|medals|
+-------+------+
|     u1|     0|
|     u2|     0|
|     u3|     1|
|     u4|     2|
|     u5|     3|
|     u6|     2|
+-------+------+
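I want the distribution of the medals column across all users. So if the medals column contains n unique values, I want n columns in the output DataFrame, each holding the number of users who received that many medals.

For the data above, the output should look like this: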

+--------+--------+--------+--------+
|medals_0|medals_1|medals_2|medals_3|
+--------+--------+--------+--------+
|       2|       1|       2|       1|
+--------+--------+--------+--------+
How can I achieve this?

This can be done with a simple pivot:

df.groupBy().pivot("medals").count().show()
+---+---+---+---+
|  0|  1|  2|  3|
+---+---+---+---+
|  2|  1|  2|  1|
+---+---+---+---+

If you want the cosmetic touch of prefixing the column names with the word medals, you can do the following:

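medals_df = df.groupBy().pivot("medals").count()
# Rename each pivoted column ("0", "1", ...) to "medals_0", "medals_1", ...
for col in medals_df.columns:
    medals_df = medals_df.withColumnRenamed(col, "medals_{}".format(col))
medals_df.show()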
+--------+--------+--------+--------+
|medals_0|medals_1|medals_2|medals_3|
+--------+--------+--------+--------+
|       2|       1|       2|       1|
+--------+--------+--------+--------+
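
As a quick sanity check on those numbers, the same count-and-rename logic can be sketched in plain Python, without a Spark session. This is only an illustration of what the pivot computes on the sample data, not a replacement for the PySpark code; `medal_distribution` and its `prefix` parameter are hypothetical names introduced here.

```python
from collections import Counter

# Same sample data as in the question: (user_id, medals) pairs.
rows = [("u1", 0), ("u2", 0), ("u3", 1), ("u4", 2), ("u5", 3), ("u6", 2)]


def medal_distribution(rows, prefix="medals_"):
    """Count users per distinct medal value (hypothetical helper).

    Keys mirror the renamed pivot columns, e.g. "medals_0".
    """
    counts = Counter(medals for _, medals in rows)
    # Sort the distinct values so the keys come out in ascending order.
    return {f"{prefix}{value}": counts[value] for value in sorted(counts)}


distribution = medal_distribution(rows)
print(distribution)
# {'medals_0': 2, 'medals_1': 1, 'medals_2': 2, 'medals_3': 1}
```

Each key corresponds to one column of the pivoted DataFrame, and each value to the single row of counts beneath it.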