How do I return empty (null?) items from a map method in PySpark?


python apache-spark pyspark


I am using

RDD.map(lambda line: my_method(line))

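Based on a specific condition inside my_method (say, the line starts with "a"), I want to return a specific value, and otherwise ignore the item altogether.

For now, I return -1 when the item does not meet the condition, and then use a separate RDD.filter() call to remove all the items equal to -1.

Is there a better way to ignore these items, for example by returning null from my_method?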
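The approach described in the question can be sketched in plain Python as follows. This is only an illustration: the lowercasing transformation is an assumption taken from the output shown in the answer, SENTINEL is a hypothetical name, and the list comprehensions stand in for RDD.map and RDD.filter so that the snippet runs without a Spark cluster.

```python
SENTINEL = -1  # hypothetical name for the marker value used in the question

def my_method(line):
    # Assumed transformation: lowercase lines that start with "a",
    # and return the sentinel for everything else.
    return line.lower() if line.startswith("a") else SENTINEL

lines = ["aDSd", "CDd", "aCVED"]

# Stand-in for: rdd.map(lambda line: my_method(line))
mapped = [my_method(line) for line in lines]

# Stand-in for: .filter(lambda x: x != -1)
kept = [item for item in mapped if item != SENTINEL]
```

The weakness of this pattern is that the sentinel has to be a value my_method can never legitimately produce.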

In this case, flatMap is your friend: adjust my_method so that it returns either a single-element list or an empty list (or create a wrapper that does this).
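The definition of my_method did not survive on this page, so the following is a minimal sketch of what such a function might look like. It assumes the transformation is lowercasing, which is inferred from the output ['adsd', 'acved'] shown further down.

```python
def my_method(line):
    # Return a one-element list for lines that start with "a"
    # (lowercased here as an assumed transformation), or an empty
    # list so that flatMap emits nothing for the item.
    return [line.lower()] if line.startswith("a") else []
```

Returning a list instead of a bare value is what lets flatMap drop the unwanted items: an empty list contributes no elements to the result.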

Then use flatMap:

rdd = sc.parallelize(["aDSd", "CDd", "aCVED"])

rdd.flatMap(lambda line: my_method(line)).collect()
## ['adsd', 'acved']
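To see why the empty lists disappear, flatMap can be modelled in plain Python as a map followed by flattening one level. The sketch below runs without Spark; local_flat_map is a hypothetical helper, not part of PySpark, and my_method again assumes lowercasing as the transformation.

```python
from itertools import chain

def my_method(line):
    # One-element list for matching lines, empty list otherwise.
    return [line.lower()] if line.startswith("a") else []

def local_flat_map(func, items):
    # Apply func to every item, then flatten the returned iterables
    # by one level; empty iterables contribute nothing.
    return list(chain.from_iterable(func(item) for item in items))

result = local_flat_map(my_method, ["aDSd", "CDd", "aCVED"])
```

Here "CDd" maps to an empty list, so it leaves no trace in result, and no sentinel value or second filtering pass is needed.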

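If you want to ignore items based on some condition, why not just use filter on its own? Why combine it with map? If you want to transform the items, you can apply map to the output of filter.

filter is a transformation method. It is a high-cost operation because it creates a new RDD.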
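The filter-then-map alternative suggested in the first comment could look roughly like this. The sketch uses Python's built-in filter and map as stand-ins for RDD.filter and RDD.map so it runs without Spark, and it assumes the same condition and lowercasing transformation as above.

```python
lines = ["aDSd", "CDd", "aCVED"]

# Stand-in for: rdd.filter(lambda line: line.startswith("a"))
matching = filter(lambda line: line.startswith("a"), lines)

# Stand-in for: .map(lambda line: line.lower())
result = list(map(lambda line: line.lower(), matching))
```

This keeps the condition and the transformation in separate steps. In Spark, filter and map are both lazy, narrow transformations, so chaining them does not by itself cause a shuffle.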



