Python 2.7: Reading documents from files when using sklearn.feature_extraction.text CountVectorizer

python-2.7 scikit-learn


I was able to use the code from the documentation example, where the input to fit_transform() is a list of sentences, i.e.:

corpus = [
   'this is the first document',
   'this is the second second document',
   'and the third one',
   'is this the first document?'
]

X = vectorizer.fit_transform(corpus)

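and got the expected results. But when I try to replace the corpus with a list of files or file objects (which the documentation suggests is possible):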

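" fit(raw_documents, y=None)

Learn a vocabulary dictionary of all tokens in the raw documents.
Parameters :
raw_documents : iterable
    An iterable which yields either str, unicode or file objects.
Returns :
self :

"

...so I think I am missing something in my understanding of the pipeline. Given a directory of files that I want to count-vectorize, how should I do it?

If I try passing a list of file objects, such as [open(file, 'r') ...], the error message I get says that the file object has no lower() function.

Set the vectorizer's input parameter to 'filename' or 'file'. Its default value is 'content', which assumes that you have already read the files into memory.

Thanks, that is where I got lost when interpreting the documentation. I had actually been feeding the files directly to the constructor, but I did not receive any warning, so I did not notice.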
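The suggestion to set input to 'filename' or 'file' could look roughly like the sketch below. The file names, their contents, and the temporary directory are made up for illustration; a real directory of documents would take their place. The files are opened in binary mode on the assumption that they are UTF-8 text, which the vectorizer decodes through its encoding parameter.

```python
import os
import tempfile

from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample documents, written to a temporary directory so the
# sketch is self-contained; in practice the directory already exists.
docs = {
    "doc1.txt": "this is the first document",
    "doc2.txt": "this is the second second document",
    "doc3.txt": "and the third one",
}
doc_dir = tempfile.mkdtemp()
for name, text in docs.items():
    with open(os.path.join(doc_dir, name), "w") as handle:
        handle.write(text)

# One path per document, sorted so the row order is predictable.
paths = sorted(os.path.join(doc_dir, name) for name in os.listdir(doc_dir))

# input="filename": the vectorizer opens and reads each path itself.
by_name = CountVectorizer(input="filename")
X_name = by_name.fit_transform(paths)

# input="file": the vectorizer calls .read() on each open file object.
handles = [open(path, "rb") for path in paths]
try:
    by_file = CountVectorizer(input="file")
    X_file = by_file.fit_transform(handles)
finally:
    for handle in handles:
        handle.close()

print(X_name.shape)
print(sorted(by_name.vocabulary_))
```

With the default input='content', each item is treated as the text itself, which would explain the complaint about lower() when file objects are passed. For a large directory, input='filename' is probably the more convenient choice, since only one file needs to be open at a time.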




Copyright © 2024. All Rights Reserved by Fatal编程技术网