Apache Spark: select or drop duplicate columns from a Spark DataFrame


apache-spark pyspark


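Given a Spark DataFrame with duplicate column names (for example, a) whose upstream source I cannot modify, how do I select, drop, or rename one of the columns so that I can retrieve its values?

df.select('A') raises an ambiguous column error, as do filter, drop, and withColumnRenamed. How do I select one of the columns?

The only way I found after hours of research is to rename the full set of columns and then create another DataFrame that uses the new set as its header.
For example, if you have: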

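>>> import pyspark
>>> from pyspark.sql import SQLContext
>>>
>>> sc = pyspark.SparkContext()
>>> sqlContext = SQLContext(sc)
>>> # Build a DataFrame whose first and third columns are both named 'a'
>>> df = sqlContext.createDataFrame([(1, 2, 3), (4, 5, 6)], ['a', 'b', 'a'])
>>> df
DataFrame[a: bigint, b: bigint, a: bigint]
>>> df.columns
['a', 'b', 'a']
>>> # toDF assigns new names by position, so the duplicate becomes 'c'
>>> df2 = df.toDF('a', 'b', 'c')
>>> df2.columns
['a', 'b', 'c']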
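You can get the list of columns with df.columns and then use a loop to rename any duplicates, which gives you a new column list. Remember to pass *new_col_list rather than new_col_list to the toDF function; toDF expects one argument per column, so passing the list itself raises an invalid count error.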
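
The loop described above could be sketched roughly as follows. The helper names dedupe_names and rename_duplicates are hypothetical and not part of PySpark, and the numeric-suffix naming scheme is only one possible choice. Names are compared case-insensitively on the assumption that Spark is running with its default setting (spark.sql.caseSensitive=false), under which 'A' and 'a' are also ambiguous.

```python
def dedupe_names(columns):
    """Return a new list of column names in which every name is unique.

    Hypothetical helper: a repeated name gets a numeric suffix
    (a, a_1, a_2, ...). Comparison is case-insensitive, assuming
    Spark's default column resolution.
    """
    used = set()       # lower-cased names already handed out
    new_cols = []
    for name in columns:
        candidate = name
        suffix = 1
        # Keep trying suffixes until the candidate no longer clashes
        while candidate.lower() in used:
            candidate = f"{name}_{suffix}"
            suffix += 1
        used.add(candidate.lower())
        new_cols.append(candidate)
    return new_cols


def rename_duplicates(df):
    """Return a DataFrame with the same data and unique column names.

    toDF takes one argument per column, so the list is unpacked with *.
    """
    return df.toDF(*dedupe_names(df.columns))


print(dedupe_names(['a', 'b', 'a']))
```

With the example DataFrame above, rename_duplicates(df).columns would contain three distinct names, and each column can then be selected, dropped, or renamed by its new name.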
