How to get a sequential id in PySpark

python python-3.x apache-spark pyspark


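I have a PySpark DataFrame whose ids are duplicated and not consecutive. I want to add a column of sequential ids, i.e. the second column below: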

id | seq
1  |1
3  |2
7  |3
3  |2
3  |2

The only way I have found to do this is:

from pyspark.sql import Window
from pyspark.sql import functions as F

window = Window.orderBy(F.col('id'))
df1 = df.select('id').distinct().withColumn('seq', F.row_number().over(window))
result = df.join(df1, on='id')

But this does not seem like the best approach. Is there a faster way to accomplish this?

Use the dense_rank window function.
Example:

from pyspark.sql.window import Window
from pyspark.sql.functions import dense_rank

w = Window.orderBy('id')

df.show()
#+---+
#| id|
#+---+
#|  1|
#|  3|
#|  3|
#|  3|
#|  7|
#+---+

df.withColumn("seq", dense_rank().over(w)).show()
#+---+---+
#| id|seq|
#+---+---+
#|  1|  1|
#|  3|  2|
#|  3|  2|
#|  3|  2|
#|  7|  3|
#+---+---+
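To see why dense_rank, rather than row_number or rank, produces the column asked for in the question, here is a rough plain-Python sketch of the three ranking semantics. It does not use Spark: the helper functions are hypothetical stand-ins that only borrow the names of the Spark functions, and the input list is the id column from the question.

```python
def row_number(values):
    """Number rows 1..n in sorted order; tied values still get distinct numbers.

    Ties are broken here by original position. In Spark the order among
    ties is not guaranteed, so this is only one possible outcome.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank(values):
    """Tied values share a rank, and the following rank skips ahead (gaps)."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]


def dense_rank(values):
    """Tied values share a rank, and ranks stay consecutive (no gaps)."""
    mapping = {value: r for r, value in enumerate(sorted(set(values)), start=1)}
    return [mapping[value] for value in values]


ids = [1, 3, 7, 3, 3]        # the id column from the question
print(row_number(ids))       # [1, 2, 5, 3, 4]
print(rank(ids))             # [1, 2, 5, 2, 2]
print(dense_rank(ids))       # [1, 2, 3, 2, 2]  <- the desired seq column
```

Only dense_rank gives every duplicate id the same value without leaving gaps, which is why the answer needs neither distinct() nor a join. One caveat that applies to both the question's code and the answer: a window defined with orderBy but no partitionBy makes Spark move all rows into a single partition, so on a very large DataFrame it may be worth measuring both approaches.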
