Python: using str.split(expand=True) on a DataFrame for massive datasets - Fatal编程技术网


python python-3.x string pandas dataframe


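I am working with a dataset that is read from a .txt file into a DataFrame, roughly 80,000,000 to 100,000,000 rows.
The DataFrame is read in as a single column, and I have to expand each value with df[column_name].str.split(expand=True). This gives every whitespace-separated value its own column.
One caveat of this dataset is that the number of values read into that single column can vary from row to row, but if any values are "missing", they are missing from the end of the sequence, never from the middle.
For example: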

df_in
                0
0  123 203 113 32
1      555 22 155
2          670 12
I then run
df_out = df_in['0'].str.split(expand=True)
This worked fine until I received these huge datasets, where I now run into a MemoryError.
Is there a way to handle these larger datasets? Perhaps with multiprocessing?

Note that preserving the DataFrame's index is important.
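One possible direction, sketched under assumptions rather than taken from the question: split the column one slice at a time so that the intermediate objects created by str.split stay bounded, and concatenate the pieces afterwards. The helper name split_in_chunks and the chunk_size value are hypothetical, and the conversion with pd.to_numeric assumes every token is numeric, as in the example rows. The final frame still has to fit in memory; only the intermediate work is reduced.

```python
import pandas as pd


def split_in_chunks(series, chunk_size=1_000_000):
    """Split a whitespace-delimited string Series slice by slice.

    chunk_size is a hypothetical tuning knob, not a value from the question.
    """
    parts = []
    for start in range(0, len(series), chunk_size):
        chunk = series.iloc[start:start + chunk_size]
        # str.split keeps the index labels of the rows in this slice.
        wide = chunk.str.split(expand=True)
        # Assumption: all tokens are numeric, so the strings can be replaced
        # by numbers (missing trailing values become NaN).
        parts.append(wide.apply(pd.to_numeric))
    # concat aligns on column labels; slices with fewer columns are padded
    # with NaN, and the original index labels are carried through.
    return pd.concat(parts)


# Small stand-in for the real data, with a non-default index on purpose.
example = pd.Series(
    ["123 203 113 32", "555 22 155", "670 12"],
    index=[10, 20, 30],
)
result = split_in_chunks(example, chunk_size=2)
print(result)
```

Because each slice is split independently, a slice whose rows are all short produces fewer columns; pd.concat fills those gaps, which matches the "values are only missing at the end" property described above.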
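Please share at least part of the data, and more of the program (see: ). There is most likely a way to solve this. Why not split the lines into columns up front, when reading the text file?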
Expected result for the example above:
df_out
     0    1     2     3
0  123  203   113    32
1  555   22   155  None
2  670   12  None  None
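The suggestion to split the lines into columns while reading the file could look roughly like the sketch below. It assumes the maximum number of values per line is known in advance (MAX_COLS is a hypothetical constant) and uses an in-memory string as a stand-in for the real .txt file. Passing explicit column names lets pd.read_csv pad short rows with NaN instead of raising on ragged lines.

```python
import io

import pandas as pd

MAX_COLS = 4  # assumption: the widest line has four values

# Stand-in for the real .txt file; a file path could be passed instead.
raw_text = "123 203 113 32\n555 22 155\n670 12\n"

df_out = pd.read_csv(
    io.StringIO(raw_text),
    sep=r"\s+",                   # any run of whitespace separates values
    header=None,                  # the file has no header row
    names=list(range(MAX_COLS)),  # fixed column labels 0..MAX_COLS-1
)
print(df_out)
```

Here the index is the default row number, which matches the original frame only if that frame also used the default index. For very large files, pd.read_csv also accepts a chunksize argument that yields the data in pieces rather than all at once.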

