Python: split a DataFrame and save each part as a txt file


I have a DataFrame like this:

  Histogram           DN     Npts    Total   Percent   Acc Pct
  Band 1       -0.054741        1        1    0.0250    0.0250
  Bin=0.00233  -0.052404        0        1    0.0000    0.0250
               -0.050067        0        1    0.0000    0.0250
               -0.047730        0        1    0.0000    0.0250
               -0.045393        0        1    0.0000    0.0250
               -0.043056        0        1    0.0000    0.0250
               -0.040719        0        1    0.0000    0.0250
  Histogram           DN     Npts    Total   Percent   Acc Pct
  Band 2        0.000000      346      346    9.5186    9.5186
  Bin=0.00203   0.002038        0      346    0.0000    9.5186
                0.004076        0      346    0.0000    9.5186
                0.006114        0      346    0.0000    9.5186
                0.008152        0      346    0.0000    9.5186
                0.010189        0      346    0.0000    9.5186
                0.012227        0      346    0.0000    9.5186
I would like to split it wherever the word Histogram appears (in this example, every 8 rows). I can split it like this:

np.array_split(df,8)
But I would prefer a way to split on the keyword itself. I then want to save each split to its own text file. Is there a way to do this?

df.head().to_json()
returns:

{"Histogram  ":{"0":"Band 1     ","1":"Bin=0.00233","2":"           ","3":"           ","4":"           "},"       DN":{"0":"-0.054741","1":"-0.052404","2":"-0.050067","3":"-0.047730","4":"-0.045393"},"   Npts":{"0":"      1","1":"      0","2":"      0","3":"      0","4":"      0"},"  Total":{"0":"      1","1":"      1","2":"      1","3":"      1","4":"      1"}," Percent":{"0":"  0.0250","1":"  0.0000","2":"  0.0000","3":"  0.0000","4":"  0.0000"}," Acc Pct":{"0":"  0.0250","1":"  0.0250","2":"  0.0250","3":"  0.0250","4":"  0.0250"}}

First, you should normalize the column names; at the moment they contain whitespace (which explains the KeyError you saw earlier):
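A minimal sketch of that normalization, assuming the padded names and values shown in the to_json() output above. The three-row df here is a stand-in built for illustration, and stripping the cell values as well is an extra step not mentioned in the answer:

```python
import pandas as pd

# Stand-in frame: column names and string cells are padded with spaces,
# as in the to_json() output of the question
df = pd.DataFrame({
    "Histogram  ": ["Band 1     ", "Bin=0.00233", "           "],
    "       DN": ["-0.054741", "-0.052404", "-0.050067"],
    "   Npts": ["      1", "      0", "      0"],
})

# Strip leading/trailing whitespace from every column name
df.columns = df.columns.str.strip()

# The cell values are padded strings too, so strip them before matching on them
df["Histogram"] = df["Histogram"].str.strip()

print(df.columns.tolist())  # ['Histogram', 'DN', 'Npts']
```

After this, df["Histogram"] can be looked up by its plain name.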

To group by band, I would use cumsum:

In [14]: df1  # similar to your example
Out[14]:
         DN  Npts  Total  Acc Pct  Percent    Histogram
0 -0.054741     1      1    0.025    0.025  Band 1
1 -0.052404     0      1    0.025    0.000  Bin=0.00233
2 -0.050067     0      1    0.025    0.000
3 -0.047730     0      1    0.025    0.000
4 -0.045393     0      1    0.025    0.000
5 -0.054741     1      1    0.025    0.025  Band 2
6 -0.052404     0      1    0.025    0.000  Bin=0.00233
7 -0.050067     0      1    0.025    0.000
8 -0.047730     0      1    0.025    0.000
9 -0.045393     0      1    0.025    0.000

In [15]: df1["Histogram"].str.startswith("Band").cumsum()
Out[15]:
0    1
1    1
2    1
3    1
4    1
5    2
6    2
7    2
8    2
9    2
Name: Histogram, dtype: int64
You can use this in a groupby (this is the way you want to split):
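A minimal sketch of that groupby step, assuming the cumsum result is passed directly as the grouping key. The name g matches the session output that follows; the two-column df1 is a reduced stand-in for illustration:

```python
import pandas as pd

# Reduced stand-in for df1: two bands of five rows each
df1 = pd.DataFrame({
    "DN": [-0.054741, -0.052404, -0.050067, -0.047730, -0.045393] * 2,
    "Histogram": ["Band 1", "Bin=0.00233", "", "", "",
                  "Band 2", "Bin=0.00233", "", "", ""],
})

# True on each row that opens a new band; the running sum labels its block
key = df1["Histogram"].str.startswith("Band").cumsum()

# Rows sharing a label end up in the same group
g = df1.groupby(key)

print(g.ngroups)  # 2
```

Because the key is computed from the data, the blocks do not need to have the same number of rows.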

Now you can extract/clean the groups at your leisure:

In [21]: g.get_group(1)
Out[21]:
         DN  Npts  Total  Acc Pct  Percent    Histogram
0 -0.054741     1      1    0.025    0.025  Band 1
1 -0.052404     0      1    0.025    0.000  Bin=0.00233
2 -0.050067     0      1    0.025    0.000
3 -0.047730     0      1    0.025    0.000
4 -0.045393     0      1    0.025    0.000

In [22]: [x for _, x in g]
Out[22]:
[         DN  Npts  Total  Acc Pct  Percent    Histogram
 0 -0.054741     1      1    0.025    0.025  Band 1
 1 -0.052404     0      1    0.025    0.000  Bin=0.00233
 2 -0.050067     0      1    0.025    0.000
 3 -0.047730     0      1    0.025    0.000
 4 -0.045393     0      1    0.025    0.000             ,
          DN  Npts  Total  Acc Pct  Percent    Histogram
 5 -0.054741     1      1    0.025    0.025  Band 2
 6 -0.052404     0      1    0.025    0.000  Bin=0.00233
 7 -0.050067     0      1    0.025    0.000
 8 -0.047730     0      1    0.025    0.000
 9 -0.045393     0      1    0.025    0.000             ]
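The question also asks to save each split to its own text file. One way this might be done is to loop over the groups and call to_csv on each. This is only a sketch: the band_N.txt file names, the tab separator, and the temporary output directory are all assumptions chosen for illustration.

```python
import os
import tempfile

import pandas as pd

# Reduced stand-in for df1: two bands of five rows each
df1 = pd.DataFrame({
    "DN": [-0.054741, -0.052404, -0.050067, -0.047730, -0.045393] * 2,
    "Histogram": ["Band 1", "Bin=0.00233", "", "", "",
                  "Band 2", "Bin=0.00233", "", "", ""],
})
g = df1.groupby(df1["Histogram"].str.startswith("Band").cumsum())

out_dir = tempfile.mkdtemp()  # placeholder; point this at your own directory

written = []
for band_no, part in g:
    # Hypothetical naming scheme: band_1.txt, band_2.txt, ...
    path = os.path.join(out_dir, "band_{}.txt".format(band_no))
    # Tab-separated text without the integer row index
    part.to_csv(path, sep="\t", index=False)
    written.append(path)
```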

This reads the text file the DataFrame came from and creates a new txt file for each histogram:

count = 1
# used in the naming of the new txt files

txtFile = "his.txt"
# histogram text file

splitTxt = " Histogram           DN     Npts    Total   Percent   Acc Pct"
# header line used to split the file into one block per histogram

with open(txtFile, "r") as myResults:
    blocks = myResults.read()

# the chunk before the first header is skipped
for contents in blocks.split(splitTxt)[1:]:
    lines = contents.split('\n')
    with open('Results_{}.txt'.format(count), 'w') as op:
        op.write(splitTxt)
        # lines[0] is the rest of the header line, so this writes the header's
        # newline followed by up to 7 data rows; slicing avoids an IndexError
        # when the last block is shorter
        for line in lines[:8]:
            op.write('{}\n'.format(line))
    count = count + 1

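Do you have this data as text? If so, it is easy.

The data originally came from a text file.

Careful: if you keep deleting and reposting, you will automatically be banned from asking questions.

Sorry, I am trying to find a different way to do this because I could not find another way. I thought that saving everything to text and then binning it might work, so this question is actually a bit different.

Could you read all the lines into memory, loop over them, and use something like "if Histogram in line:"?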
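The line-by-line idea from the last comment could be sketched roughly as follows. The function name split_on_keyword and the sample lines are hypothetical and only illustrate the approach:

```python
def split_on_keyword(lines, keyword="Histogram"):
    """Group lines into blocks; each line containing keyword opens a new block."""
    blocks = []
    for line in lines:
        if keyword in line:
            blocks.append([line])       # header line starts a new block
        elif blocks:
            blocks[-1].append(line)     # data line joins the current block
        # lines before the first header are ignored
    return blocks


# Sample lines in the layout shown in the question
sample = [
    "  Histogram           DN     Npts    Total   Percent   Acc Pct",
    "  Band 1       -0.054741        1        1    0.0250    0.0250",
    "  Bin=0.00233  -0.052404        0        1    0.0000    0.0250",
    "  Histogram           DN     Npts    Total   Percent   Acc Pct",
    "  Band 2        0.000000      346      346    9.5186    9.5186",
]

blocks = split_on_keyword(sample)
print(len(blocks))  # 2
```

Each block is a list of lines, so it can be written to its own file by joining the lines with newlines. Unlike a fixed range(8), this does not depend on every histogram having the same number of rows.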