Python: How to modify a subset of rows based on a condition in PySpark - Fatal编程技术网



python pandas pyspark


I am trying to convert all MPa values to Pa. The code I use in pandas is shown below. How can I translate it to PySpark?

file_df.loc[file_df['Unit'] == 'MPa', 'Value'] = file_df['Value'] * 1000000  # converts Value from MPa to Pa
file_df.loc[file_df['Unit'] == 'MPa', 'Unit'] = 'Pa'  # replace the MPa label with Pa

You can replicate these in-place assignments with when/otherwise, as shown below:

from pyspark.sql.functions import when, col, lit

# Rows whose Unit is 'MPa'; 1 MPa = 1,000,000 Pa
m = sparkdf.Unit == 'MPa'
# withColumn returns a new DataFrame, so assign the result if you want to keep it
(sparkdf.withColumn("Value", when(m, col('Value')*1000000).otherwise(col('Value')))
        .withColumn("Unit",  when(m, lit('Pa')).otherwise(col('Unit'))))


A small working example:

import pandas as pd

df = pd.DataFrame({'Unit':['MPa', 'MPb', 'MPc'],
                   'Value':[5, 4, 3]})

# Assumes an active SparkSession named `spark`
sparkdf = spark.createDataFrame(df)
m = sparkdf.Unit == 'MPa'

(sparkdf.withColumn("Value",  when(m, col('Value')*1000000).otherwise(col('Value')))
        .withColumn("Unit",  when(m, lit('Pa')).otherwise(col('Unit')))).show()

+----+-------+
|Unit|  Value|
+----+-------+
|  Pa|5000000|
| MPb|      4|
| MPc|      3|
+----+-------+
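
If it helps to see the per-row logic without Spark, here is a minimal plain-Python sketch of what the conditional column update expresses: rows that match the condition get a new Value and Unit, and all other rows pass through unchanged. The helper `convert_row` and the constant `MPA_TO_PA` are hypothetical names used only for illustration; they are not part of PySpark or pandas.

```python
# Plain-Python illustration of the row-wise conditional update.
# `convert_row` and `MPA_TO_PA` are hypothetical names, not library APIs.
MPA_TO_PA = 1_000_000  # 1 MPa = 1,000,000 Pa


def convert_row(row):
    """Return a new row dict; MPa rows are converted to Pa, others are copied."""
    if row["Unit"] == "MPa":
        # Matching branch: scale the value and relabel the unit.
        return {"Unit": "Pa", "Value": row["Value"] * MPA_TO_PA}
    # Non-matching branch: keep the row as it is.
    return dict(row)


rows = [
    {"Unit": "MPa", "Value": 5},
    {"Unit": "MPb", "Value": 4},
    {"Unit": "MPc", "Value": 3},
]

# Build a new list instead of mutating `rows`, mirroring how the
# DataFrame version produces a new result rather than editing in place.
converted = [convert_row(r) for r in rows]

for r in converted:
    print(r)
```

This is only a mental model for small data; on a real Spark DataFrame the column-expression form is the one to use, since it runs inside Spark rather than in a Python loop.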




