
Python: renaming DataFrame column names


I have two dataframes, df_one and df_two, as below:

df_one.show()

-------------
|Column_Name|
-------------
|NAME       |
|ID         | 
|COUNTRY    |
-------------

df_two.show()

-------------   
|_c0|_c1|_c2|
-------------
|AAA|001|US |
|BBB|002|UK |
|CCC|003|IN |
|DDD|004|FR |
-------------
I am trying to rename the columns of dataframe df_two so that it looks like this:

------------------
|NAME|ID |COUNTRY|
------------------
|AAA |001| US    |
|BBB |002| UK    |
|CCC |003| IN    |
|DDD |004| FR    |
------------------
Currently I create the Seq by hand, which gives the result above:

val newColumn = Seq("NAME", "ID", "COUNTRY")
val df = df_two.toDF(newColumn: _*)
But now I need to read the column names (Column_Name) from df_one and use them to rename the columns of dataframe df_two.

I also tried reading the column values from df_one, but that returns Seq[Any], and I need Seq[String].

Could you provide some code for this?

Try:

names = [row['Column_Name'] for row in df_one.collect()]
df_two = df_two.toDF(*names)

Here is a solution in Scala.

Since df_one is a small dataset (even if the total column count runs into the thousands), it can safely be collect-ed to the driver as an array. collect-ing the dataframe produces an Array of Rows:

df_one.collect
// res1: Array[org.apache.spark.sql.Row] = Array([NAME], [ID], [COUNTRY])
To unwrap the Rows (each holding a single String), simply apply the getString method:
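For example, continuing from the collected array above (this is the same map call used in the complete solution below):

df_one.collect.map(_.getString(0))
// Array[String] = Array(NAME, ID, COUNTRY)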

Putting it all together:

import spark.implicits._  // already in scope when using spark-shell

val df_one = Seq(
  "NAME", "ID", "COUNTRY"
).toDF("Column_Name")

val df_two = Seq(
  ("AAA", "001", "US"),
  ("BBB", "002", "UK"),
  ("CCC", "003", "IN"),
  ("DDD", "004", "FR")
).toDF("_c0", "_c1", "_c2")

val colNames = df_one.collect.map(_.getString(0))

df_two.toDF(colNames: _*).show
// +----+---+-------+
// |NAME| ID|COUNTRY|
// +----+---+-------+
// | AAA|001|     US|
// | BBB|002|     UK|
// | CCC|003|     IN|
// | DDD|004|     FR|
// +----+---+-------+
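
As a variant, here is a minimal sketch (assuming spark.implicits._ is in scope, as in spark-shell): converting df_one to a typed Dataset[String] makes collect return an Array[String] directly, so no Row unwrapping is needed. The val name colNamesTyped is just illustrative:

val colNamesTyped = df_one.as[String].collect  // Array(NAME, ID, COUNTRY)
df_two.toDF(colNamesTyped: _*).show            // same table as above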