Python 3.x: Handling negative test cases in a Spark application (PySpark) - Fatal编程技术网


python-3.x apache-spark pyspark apache-spark-sql pyspark-dataframes


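I have a Spark application that performs ETL: it reads from a Kafka topic (Structured Streaming) into a DataFrame, with each message in the topic read as a string. Column fields are extracted from the string using regexes, and some aggregations are then applied to those fields.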

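This works well when the messages in the Kafka topic follow the expected format, but if some fields are missing they are stored as null. How can I make the program report the exact problem instead of one huge error?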
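One way to surface the exact problem is to check each parsed message for missing fields and build a short, specific description. The following is a minimal sketch in plain Python, not the asker's actual code: the `key=value` message layout, the field names (`user`, `amount`, `ts`), and the helper names are all assumptions made for illustration.

```python
import re

# Hypothetical message layout, e.g. "user=alice amount=12.5 ts=2024-01-01".
# The field names and patterns are assumptions made for illustration only.
FIELD_PATTERNS = {
    "user": re.compile(r"\buser=(\w+)"),
    "amount": re.compile(r"\bamount=(\d+(?:\.\d+)?)"),
    "ts": re.compile(r"\bts=(\d{4}-\d{2}-\d{2})"),
}


def extract_fields(message):
    """Return a dict of field -> extracted value, or None when the pattern does not match."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(message)
        result[name] = match.group(1) if match else None
    return result


def missing_fields(message):
    """List the fields that could not be extracted, in declaration order."""
    return [name for name, value in extract_fields(message).items() if value is None]


def describe_problem(message):
    """Return a one-line description of what is wrong, or None if the message is complete."""
    missing = missing_fields(message)
    if not missing:
        return None
    return f"message {message!r} is missing field(s): {', '.join(missing)}"
```

Applied to a sample of raw messages, `describe_problem` names the offending message and the absent fields, which is usually easier to act on than a long stack trace from a later aggregation step.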

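I am also trying to add input validation to other units of the script, for example having the aggregation functions check that the required columns exist, checking that cells in the DataFrame match a specific format, and so on.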
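The two checks mentioned here (required columns present, values in a given format) could be sketched as small helpers that raise or report precisely. This is only an illustration under stated assumptions: `MissingColumnsError`, `require_columns`, and `find_malformed` are hypothetical names, and the helpers work on plain Python lists so that they can be tried without a Spark session.

```python
import re


class MissingColumnsError(ValueError):
    """Raised when a DataFrame lacks columns that a step depends on."""


def require_columns(available, required, step="aggregation"):
    """Raise MissingColumnsError naming every required column that is absent.

    `available` is any iterable of column names; with a PySpark DataFrame
    this would typically be `df.columns`.
    """
    available = set(available)
    missing = [name for name in required if name not in available]
    if missing:
        raise MissingColumnsError(
            f"{step}: missing required column(s) {missing}; "
            f"available columns are {sorted(available)}"
        )


def find_malformed(values, pattern):
    """Return the values that are None or do not fully match `pattern`."""
    regex = re.compile(pattern)
    return [v for v in values if v is None or not regex.fullmatch(v)]
```

Calling `require_columns(df.columns, ["user", "amount"], step="sum_by_user")` at the top of an aggregation function would fail early with the names of the missing columns. On a DataFrame, a format check analogous to `find_malformed` might filter with `Column.rlike`, bearing in mind that `rlike` matches anywhere in the string unless the pattern is anchored.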

What is the best way to handle input validation here? Should I use try/except, or assertions before applying the aggregation functions?
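One consideration when choosing between the two: `assert` statements are skipped when Python runs with the `-O` flag, so they tend to suit internal sanity checks better than validation of external input. A possible alternative, sketched below, is to raise an explicit exception in the validation step and catch only that exception type once, near the top of the job. The names `ValidationError`, `validate_row`, and `run_job` are hypothetical, and rows are modelled as plain dicts rather than Spark rows.

```python
import sys


class ValidationError(Exception):
    """Signals bad input, as opposed to a bug in the program."""


def validate_row(row, required=("user", "amount")):
    """Raise ValidationError if a required field is absent or None."""
    for field in required:
        if row.get(field) is None:
            raise ValidationError(f"field {field!r} is missing or null in row {row}")
    return row


def run_job(rows, out=sys.stderr):
    """Validate rows, then aggregate; report input problems in one line.

    Returns the total amount, or None when validation fails.
    """
    try:
        valid = [validate_row(row) for row in rows]
    except ValidationError as exc:
        # Only the expected error type is caught; unexpected errors still
        # propagate with a full traceback, which is useful for real bugs.
        print(f"input validation failed: {exc}", file=out)
        return None
    return sum(float(row["amount"]) for row in valid)
```

With this layout the validation functions stay free of printing or logging concerns, and the caller decides whether a bad input should stop the job, be logged, or be routed elsewhere.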











                                        

                                        
                                        


                                                
                                                        
                                                

                                                


                

                        
						
                        
                                
                                        
                                                
                                                        