Python: save each row of a DataFrame to a txt file

So, I open a dataset from an HDF5 file like this:

import pandas as pd
import numpy as np

data1 = pd.read_hdf('sport.hdf5', usecols=['category','title','images','link','date','desc'])
It gives me output like this:

    category                                              title  images  \
0      raket  Kevin/Marcus Langsung Fokus ke Kejuaraan Dunia...     NaN   
1         f1         Vettel Menangi GP Inggris yang Penuh Drama     NaN   
2     others  Semangat 'Semakin di Depan' Warnai Kejuaraan M...     NaN   
5  sepakbola             Roberto Martinez Mengejar Status Elite     NaN   
6  sepakbola  Nyaris Separuh Gol Piala Dunia 2018 Lahir dari...     NaN   

                                                link  \
0  https://sport.detik.com/raket/d-4104834/kevinm...   
1  https://sport.detik.com/f1/d-4104788/vettel-me...   
2  https://sport.detik.com/sport-lain/d-4105193/s...   
5  https://sport.detik.com/sepakbola/berita/d-410...   
6  https://sport.detik.com/sepakbola/berita/d-410...   

                             date  \
0   Senin 09 Juli 2018, 00:31 WIB   
1  Minggu 08 Juli 2018, 22:35 WIB   
2   Senin 09 Juli 2018, 11:15 WIB   
5   Senin 09 Juli 2018, 12:35 WIB   
6   Senin 09 Juli 2018, 12:51 WIB   

                                                desc  
0   - Setelah  , Kevin Sanjaya/Marcus Gideon suda...  
1   - Driver Ferrari   keluar sebagai pemenang Gr...  
2   - Kejuaraan Dunia Motocross Grand Prix (MXGP)...  
5             -   bisa jadi mulai kerap diperbinc...  
6             - Berakhirnya perempatfinal Piala D... 

Now I need to save the desc of each row to a file named after that row's title. I use the code below:

np.savetxt(data1['title']+'.txt', data1['desc'], fmt='%s')
However, the result is this:

Traceback (most recent call last):
  File "index.py", line 23, in <module>
    np.savetxt(data1['title']+'.txt', data1['desc'], fmt='%s')
  File "/home/adminsvr/tf-py3/lib/python3.5/site-packages/numpy/lib/npyio.py", line 1187, in savetxt
    if fname.endswith('.gz'):
  File "/home/adminsvr/tf-py3/lib/python3.5/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'endswith'

Any solution or idea?

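After working on it for several hours, here is an approach that solves the problem:

First, iterate over the rows of the data1 DataFrame. Don't forget to call iterrows(), which yields the rows one at a time, and to unpack both the index and the row.

To create a file for each row, build the path from a directory followed by row['title'], so that the file name is dynamic.

However, the directory result/ does not exist yet. Create it with os.makedirs.

Finally, write row['desc'] to the txt file.

Here we go: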

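import os

for idx, row in data1.iterrows():
    # Build the output path from this row's title.
    filename = "result/" + str(row['title']) + ".txt"
    # Create result/ (and any subdirectory implied by a "/" in the title) if missing.
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    # The with-block closes the file on exit, so no explicit close() is needed.
    with open(filename, "w") as f:
        # str() guards against non-string values such as NaN.
        f.write(str(row['desc']))

    print(idx)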
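
One thing the loop above does not handle is a title that is not a valid file name. The title of row 0 in the question contains a "/" (Kevin/Marcus ...), so its file ends up in a result/Kevin/ subdirectory rather than directly in result/. Below is a minimal sketch of one way around that, assuming it is acceptable to replace such characters with an underscore. The helper names safe_name and save_rows, and the replacement rule itself, are illustrative assumptions and not part of the original answer:

```python
import re
from pathlib import Path


def safe_name(title):
    """Replace characters that are unsafe in file names with "_" (assumed rule)."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", str(title)).strip()
    return cleaned or "untitled"


def save_rows(pairs, out_dir="result"):
    """Write each (title, desc) pair to <out_dir>/<safe title>.txt; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the directory once, up front
    written = []
    for title, desc in pairs:
        path = out / (safe_name(title) + ".txt")
        # str() so that a missing value (e.g. NaN) does not raise a TypeError
        path.write_text(str(desc), encoding="utf-8")
        written.append(path)
    return written
```

With the DataFrame from the question, this could be called as save_rows(zip(data1['title'], data1['desc'])). Two rows whose titles clean to the same name would overwrite each other; the sketch does not try to handle that case.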

Please post the output of print(data1.head()).
Edited to show the result of data1.head().