R: Removing stop words based on a long list (R, tm)


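I have a data frame containing 60,000 rows/phrases that I want to use as stop words and remove from my text.

I am using the tm package. After reading the CSV file and the stop word list, I use this line: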

corpus <- tm_map(corpus, removeWords, df$mylistofstopwords)

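The list is too large. What is the problem, and is there anything I can do to fix it?

You can work around the problem by splitting the stop word list into several parts, like this: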

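library(tm)

# Convert to character in case the column was read in as a factor
stop_words <- as.character(df$mylistofstopwords)

chunk <- 1000   # number of stop words removed per pass
i <- 0
n <- length(stop_words)
while (i < n) {
    i2 <- min(i + chunk, n)
    # Remove the next slice of stop words from the corpus
    corpus <- tm_map(corpus, removeWords, stop_words[(i + 1):i2])
    i <- i2
}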
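To illustrate why chunking helps, here is a minimal base R sketch of the same idea without tm. It assumes, as I understand removeWords to work, that the words are joined into a single regular expression alternation, so a very long list produces a very large pattern. The helper name `remove_words_chunked` and the sample data are hypothetical, and the words are treated as regular expressions (no escaping), which is an assumption of this sketch.

```r
# Hypothetical helper: remove `words` from `text` in chunks of `chunk_size`,
# so that no single regular expression grows too large.
remove_words_chunked <- function(text, words, chunk_size = 1000) {
  words <- unique(as.character(words))
  words <- words[!is.na(words) & nzchar(words)]
  # Group the words into consecutive chunks of at most `chunk_size`
  chunks <- split(words, ceiling(seq_along(words) / chunk_size))
  for (chunk in chunks) {
    # Put longer entries first so phrases are tried before their prefixes
    chunk <- chunk[order(nchar(chunk), decreasing = TRUE)]
    pattern <- paste0("\\b(", paste(chunk, collapse = "|"), ")\\b")
    text <- gsub(pattern, "", text, perl = TRUE)
  }
  text
}

# Made-up example data
text <- c("the quick brown fox", "a lazy dog sleeps")
stop_list <- c("the", "a", "lazy")
result <- remove_words_chunked(text, stop_list, chunk_size = 2)
print(result)
```

The removed words leave their surrounding whitespace behind, so a whitespace clean-up step (for example tm's stripWhitespace) is usually applied afterwards.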

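Chunk the list and call tm_map twice with two different removeWords lists?

@MrFlick I tried splitting the whole list, but the problem is still the same. With only the first 2000 rows it works fine. I'm just wondering whether there is a more efficient solution, or a possible shortcut.

Are these really long words? What is range(nchar(df$mylistofstopwords))?

@MrFlick I tried range(nchar(df$mylistofstopwords)), but I got this error: Error in nchar(df$mylistofstopwords) : 'nchar()' requires a character vector

Is that column not of class character? What does class(df$mylistofstopwords) return? Maybe try range(nchar(as.character(df$mylistofstopwords))) or mean().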
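The check suggested in the comments can be sketched as follows. The data frame here is a made-up stand-in for the one read from the CSV file, built with stringsAsFactors = TRUE to reproduce a factor column, which is one likely cause of the nchar() error; whether that is the actual cause in the question is an assumption.

```r
# Hypothetical stand-in for the data frame read from the CSV file
df <- data.frame(mylistofstopwords = c("the", "of", "in spite of"),
                 stringsAsFactors = TRUE)

# A factor column is what makes nchar() complain
col_class <- class(df$mylistofstopwords)
print(col_class)

# Convert to character before measuring or using the entries
stop_words <- as.character(df$mylistofstopwords)
len_range <- range(nchar(stop_words))
len_mean <- mean(nchar(stop_words))
print(len_range)
print(len_mean)
```

Passing stringsAsFactors = FALSE to read.csv avoids the factor conversion at the point where the file is read.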
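Alternatively, the corpus package can drop the stop words while building the term matrix:

library(corpus)
# `corpus` below is the tm corpus object from the question
x <- term_matrix(corpus, drop = df$mylistofstopwords)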
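The general idea of dropping terms at the token level, instead of matching one large regular expression, can be sketched in base R. This is not how the corpus package is implemented; the helper `drop_tokens` and the sample documents are hypothetical, and the sketch only handles single-word entries separated by whitespace (no multi-word phrases, no punctuation handling).

```r
# Hypothetical helper: drop stop words by tokenising on whitespace and
# filtering with %in%, avoiding a large regular expression altogether.
drop_tokens <- function(text, words) {
  words <- tolower(as.character(words))
  tokens <- strsplit(text, "\\s+")
  vapply(tokens, function(tok) {
    # Keep only tokens that are not in the stop word list (case-insensitive)
    keep <- !(tolower(tok) %in% words)
    paste(tok[keep], collapse = " ")
  }, character(1))
}

# Made-up example data
docs <- c("The quick brown fox", "a lazy dog sleeps")
cleaned <- drop_tokens(docs, c("the", "a", "lazy"))
print(cleaned)
```

Because %in% uses a lookup rather than a pattern match, the cost per token does not grow with the pattern size the way a long alternation does, which is why this style tends to cope better with tens of thousands of entries.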



