Python: writing sections of a file to two other files

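I have a file containing the combined instances (lists of numbers) for a program I am writing. I first copy every line that starts with @ into both the training and testing files. Now I want to put 28709 instances into my training file and the remaining instances into the testing file.

When I try to do this with the following code: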

import itertools

# Splits the training and testing instances
# with the newly reduced attributes

training = open('training.txt', 'w')
testing = open('testing.txt', 'w')

linecount = 0

with open('combined.txt', 'r') as f:
    for l in f:
        if not l.startswith('@'):
            break
        else:
            training.write(l)
            testing.write(l)
            linecount += 1

with open('combined.txt', 'r') as f:
    newcount = 0
    for l in f:
        while(newcount < linecount):
            f.next()
            newcount += 1

        if linecount > (linecount + 28709):
            testing.write(l)
        else:
            training.write(l)
        linecount += 1
    '''# Write 28,709 instances to training set
    for l in itertools.islice(f, linecount, linecount + 28709):
        training.write(l)
    # Write rest of instances to testing set
    for i in xrange(linecount + 28710):
        f.next()
    for l in f:
        testing.write(l)'''
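...it does not write all of the instances to the training set, and it writes nothing at all to the testing set. The original combined file can be found here (it is too large to paste here):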

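Edit: all of the @ lines should be in both files. Then the first 28709 lines after the last '@' should be in the training file, and the rest should be in the testing file.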


Thanks

This should do what you need. I added comments in the code to explain the changes I made.

# Splits the training and testing instances
# with the newly reduced attributes

training = open('training.txt', 'w')
testing = open('testing.txt', 'w')

linecount = 0

with open('combined.txt', 'r') as f:
    for l in f:
        if not l.startswith('@'):
            break
        else:
            training.write(l)
            testing.write(l)
        # count each '@' header line so the second pass knows how many lines to skip
        # (the break above fires on the first non-'@' line, before this increment)
        linecount += 1

val = 28709

with open('combined.txt', 'r') as f:
    # skip first n lines up to last '@' symbol
    for _ in range(linecount):
        next(f)  # built-in next() works in Python 2.6+ and Python 3

    # write first 28709 lines after last '@' symbol to training file
    new_linecount = 0
    for l in f:
        if new_linecount >= val:
            testing.write(l)
        else:
            training.write(l)
        new_linecount += 1

# close the output files so everything written is flushed to disk
training.close()
testing.close()
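
The same split can also be done in a single pass over the file. What follows is only a sketch for Python 3, not the code from the answer above: the function name `split_instances` and its parameters are made up for illustration, and it assumes (as the code in the question does) that all '@' lines sit in one block at the top of the file.

```python
from itertools import chain, islice


def split_instances(combined_path, training_path, testing_path, n_train):
    """Copy leading '@' lines to both outputs, then split the remaining lines.

    Hypothetical helper: the first n_train lines after the '@' header block
    go to the training file, every later line goes to the testing file.
    """
    with open(combined_path) as src, \
            open(training_path, 'w') as training, \
            open(testing_path, 'w') as testing:
        pending = []  # holds the first non-'@' line, which the loop consumes
        for line in src:
            if not line.startswith('@'):
                pending.append(line)
                break
            # header lines belong in both output files
            training.write(line)
            testing.write(line)

        # put the consumed line back in front of the unread part of the file
        rest = chain(pending, src)
        # islice takes n_train lines from the shared iterator and then stops
        training.writelines(islice(rest, n_train))
        # whatever is still unread goes to the testing file
        testing.writelines(rest)
```

For the files in the question, the call would be `split_instances('combined.txt', 'training.txt', 'testing.txt', 28709)`. The `with` statement closes all three files even if an error occurs partway through, and because `islice` and the final `writelines` share one iterator, no line is read twice or skipped.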

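I'm confused about what should go in the testing and training files. All of the @ lines should be in both. So do you want the lines without an @ symbol to be only in the testing file?

@AlexF All of the @ lines should be in both files. Then the first 28709 lines after the last '@' should be in the training file, and the rest should be in the testing file.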