Python: writing sections of a file to two other files

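So I have a file that contains the combined instances (lists of numbers) for a program I'm writing. I first put all the lines that start with @ into both the training and testing files. Now I want to put 28709 instances into my training file and the rest of the file's instances into the testing file.

When I do this with the following code: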

import itertools

# Splits the training and testing instances
# with the newly reduced attributes

training = open('training.txt', 'w')
testing = open('testing.txt', 'w')

linecount = 0

with open('combined.txt', 'r') as f:
    for l in f:
        if not l.startswith('@'):
            break
        else:
            training.write(l)
            testing.write(l)
            linecount += 1

with open('combined.txt', 'r') as f:
    newcount = 0
    for l in f:
        while(newcount < linecount):
            f.next()
            newcount += 1

        if linecount > (linecount + 28709):
            testing.write(l)
        else:
            training.write(l)
        linecount += 1
    '''# Write 28,709 instances to training set
    for l in itertools.islice(f, linecount, linecount + 28709):
        training.write(l)
    # Write rest of instances to testing set
    for i in xrange(linecount + 28710):
        f.next()
    for l in f:
        testing.write(l)'''

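...it doesn't write all of the instances to the training set, and it writes nothing at all to the testing set. The original combined file can be found here (it's too large to paste here):

Edit: All the lines with the @ symbol should be in both files. Then the first 28709 lines after the last '@' should go in the training file, and the rest should go in the testing file.

Thanks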

This should do what you need. I've added comments in the code to explain the changes I made.

# Splits the training and testing instances
# with the newly reduced attributes

training = open('training.txt', 'w')
testing = open('testing.txt', 'w')

linecount = 0

with open('combined.txt', 'r') as f:
    for l in f:
        if not l.startswith('@'):
            break
        else:
            training.write(l)
            testing.write(l)
        # increment on every iteration so linecount ends up as the number of
        # leading '@' lines (the loop stops at the first line without '@')
        linecount += 1

val = 28709

with open('combined.txt', 'r') as f:
    # skip the first n lines, i.e. everything up to the last '@' symbol
    for _ in range(linecount):
        next(f)  # next(f) works on Python 2.6+ and 3; f.next() is Python 2 only

    # write the first 28709 lines after the last '@' symbol to the training
    # file, and every line after that to the testing file
    new_linecount = 0
    for l in f:
        if new_linecount >= val:
            testing.write(l)
        else:
            training.write(l)
        new_linecount += 1

# close the output files so any buffered data is flushed to disk
training.close()
testing.close()
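
Since the question already imports itertools and tried islice, here is a rough single-pass sketch of the same split that reads combined.txt only once. The function name split_instances and its parameters are made up for illustration, and it assumes the '@' lines form one contiguous block at the top of the file, as in the code above.

```python
import itertools


def split_instances(src_path, train_path, test_path, n_train):
    """Copy leading '@' lines to both outputs, then split the remaining lines.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with open(src_path) as src, \
            open(train_path, 'w') as training, \
            open(test_path, 'w') as testing:
        first_data_line = None
        for line in src:
            if not line.startswith('@'):
                # Remember the line that ended the header so it is not lost.
                first_data_line = line
                break
            training.write(line)
            testing.write(line)

        if first_data_line is None:
            return  # the file contained only '@' lines

        # Put the remembered line back in front of the unread remainder.
        data = itertools.chain([first_data_line], src)
        # islice pulls exactly n_train lines; the rest stay in `data`.
        training.writelines(itertools.islice(data, n_train))
        testing.writelines(data)
```

For the files in the question this would be called as split_instances('combined.txt', 'training.txt', 'testing.txt', 28709). Using with for all three files also means they are closed even if an error occurs partway through.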

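I'm confused about what should end up in the testing and training files. All the @ symbol lines should be in both. So do you want the lines without the @ symbol to go only in the testing file?

@AlexF All the @ symbol lines should be in both files. Then the first 28709 lines after the last '@' should go in the training file, and the rest should go in the testing file.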
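
One way to check that the output files follow the rule described in these comments is to count the '@' lines and the remaining lines in each file. This is only a small sketch; count_instances is a hypothetical helper, not something from the original post.

```python
def count_instances(path):
    """Return (header_lines, instance_lines) for the file at `path`.

    Header lines are the ones starting with '@'; every other line is
    counted as an instance. Hypothetical helper for illustration.
    """
    header = 0
    instances = 0
    with open(path) as f:
        for line in f:
            if line.startswith('@'):
                header += 1
            else:
                instances += 1
    return header, instances
```

If the split worked, count_instances('training.txt') should report 28709 instances (assuming combined.txt has at least that many), both output files should report the same number of header lines, and the two instance counts should add up to the instance count of combined.txt.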