Python: How to remove stopwords in gensim? (Python, Gensim) - Fatal编程技术网


I tried this on the "message" column of my DataFrame, but got an error:

df_clean['message'] = df_clean['message'].apply(lambda x: gensim.parsing.preprocessing.remove_stopwords(x))

Apparently, the df_clean["message"] column contains lists of words rather than strings, which is why you get this error:

TypeError: decoding to str: need a bytes-like object, list found

To fix this, join each list back into a string, like so:

df_clean['message'] = df_clean['message'].apply(lambda x: gensim.parsing.preprocessing.remove_stopwords(" ".join(x)))

Note that df_clean["message"] will contain string objects after running this line.

This is not really a gensim problem. The error comes from the data in your pandas column: at least one value in the message column is of type list rather than string. Here is a minimal pandas example:

import pandas as pd
from gensim.parsing.preprocessing import remove_stopwords

df = pd.DataFrame([['one', 'two'], ['three', ['four']]], columns=['A', 'B'])
df.A.apply(remove_stopwords)  # works fine
df.B.apply(remove_stopwords)  # TypeError: decoding to str: need a bytes-like object, list found

The error occurs because remove_stopwords expects a string and you are passing a list. Before removing stop words, check that every value in the column is a string.
