SparkR window functions

Tags: r, apache-spark, apache-spark-sql, window-functions, sparkr


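I found that SparkR 1.6 already implements window functions, including lag and rank, but the over function is not implemented yet. How can I use a window function such as lag in SparkR without over (that is, not the Spark SQL way)? Can anyone give an example?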

Spark 2.0.0+

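SparkR provides DSL wrappers with over, window.partitionBy / partitionBy, window.orderBy / orderBy and rowsBetween / rangeBetween functions. (The current SparkR documentation lists the two constructors as windowPartitionBy and windowOrderBy.)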
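
Spark 1.6

In Spark 1.6, where SparkR has no over, the same result can be obtained with raw SQL through a HiveContext:
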
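library(SparkR)
library(magrittr)  # provides %>%

set.seed(1)

# Spark 1.x entry points; SQL window functions need a HiveContext
sc <- sparkR.init()
hc <- sparkRHive.init(sc)
sdf <- createDataFrame(hc, data.frame(x=1:12, y=1:3, z=rnorm(12)))
registerTempTable(sdf, "sdf")

# Previous value of z within each y group, ordered by x
sql(hc, "SELECT x, y, z, LAG(z) OVER (PARTITION BY y ORDER BY x) FROM sdf") %>%
  head()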

##    x y          z        _c3
## 1  1 1 -0.6264538         NA
## 2  4 1  1.5952808 -0.6264538
## 3  7 1  0.4874291  1.5952808
## 4 10 1 -0.3053884  0.4874291
## 5  2 2  0.1836433         NA
## 6  5 2  0.3295078  0.1836433

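# Spark 2.0.0+ DSL equivalent of the SQL query above.
# SparkR exports the window constructor as windowPartitionBy.
w <- windowPartitionBy("y") %>% orderBy("x")
select(sdf, over(lag(sdf$z), w))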
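
For reference, a self-contained version of the DSL query on Spark 2.x might look like the sketch below. It assumes a local Spark installation with the SparkR package on the library path, and it uses sparkR.session (the 2.x entry point) in place of sparkRHive.init. The column name z_prev and the local[1] master are illustrative choices, not part of the original answer.

```r
library(SparkR)

# Assumption: Spark 2.x or later is installed locally and SparkR can find it.
sparkR.session(master = "local[1]")

set.seed(1)
sdf <- createDataFrame(data.frame(x = 1:12, y = 1:3, z = rnorm(12)))

# Window specification: one partition per value of y, rows ordered by x.
w <- orderBy(windowPartitionBy("y"), "x")

# lag(z) evaluated over the window: the previous z in the same y group,
# NA for the first row of each group. "z_prev" is a made-up column name.
res <- select(sdf, sdf$x, sdf$y, sdf$z, alias(over(lag(sdf$z), w), "z_prev"))

# Bring the result back to a local data.frame in a deterministic order.
local_res <- collect(arrange(res, "y", "x"))
head(local_res)

sparkR.session.stop()
```

The nested call orderBy(windowPartitionBy("y"), "x") avoids the magrittr dependency; with magrittr loaded, windowPartitionBy("y") %>% orderBy("x") should build the same window specification.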
