Python 3.x: I'm trying to parse phrases from Visual Genome regions and split the data into training and test sets


python-3.x


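I get an error when running the following script (the failing line and the message are quoted after the code):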

import argparse
import json

from sklearn.model_selection import train_test_split
    
parser = argparse.ArgumentParser(description='Splits visual_genome file into training and test sets.')
parser.add_argument('phrase', metavar='phrase', type=str, help='Path to visual_genome annotations file.')
parser.add_argument('train', type=str, help='Where to store visual_genome training annotations')
parser.add_argument('test', type=str, help='Where to store visual_genome test annotations')
parser.add_argument('-s', dest='split', type=float, required=True, help="A percentage of a split; a number in (0, 1)")
args = parser.parse_args()
    
def save_visual_genome(file, id, x, y, width, height, phrase, image):
    # Write one annotations file; the last parameter is named image to match the dict below.
    with open(file, 'wt', encoding='UTF-8') as vg:
        json.dump({'id': id, 'x': x, 'y': y, 'width': width, 'height': height, 'phrase': phrase, 'image': image}, vg, indent=2, sort_keys=True)
    
def main(args):
    with open(args.phrase, 'rt', encoding='UTF-8') as phrase:
        vg = json.load(phrase)
        id = vg['id']  # this line throws the TypeError
        x = vg['x']
        y = vg['y']
        width = vg['width']
        height = vg['height']
        phrase = vg['phrase']
        image = vg['image']
    
        a,b = train_test_split(phrase, train_size=args.split)
    
        save_visual_genome(args.train, id, x, y, width, height,a, image)
        save_visual_genome(args.test, id, x, y, width, height, b, image)
    
        print("Saved {} entries in {} and {} in {}".format(len(a), args.train, len(b), args.test))
    
if __name__ == "__main__":
    main(args)

What could be the problem here? Is there a more efficient or simpler way to do the same task?

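The error is raised at this line:

id = vg['id']    # this line throws the error
TypeError: list indices must be integers or slices, not str

Try replacing

vg = json.load(phrase)

with

vg = json.loads(phrase.read())

Note that both calls return the same parsed object, so if the top-level JSON value is an array, vg is still a list and has to be indexed by integer or iterated over, not indexed with a string key such as 'id'.

Also:

train_test_split takes indexable sequences such as lists or NumPy arrays as input. Check the documentation.
Be careful when assigning the same variable name twice (here, phrase is first the open file and then vg['phrase']).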
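The TypeError suggests that the top-level JSON value is a list, so a possible way forward is to flatten it into one list of region records and split that list. The following is only a minimal sketch under assumptions: that each list element is a per-image dict holding its regions under a "regions" key, and that each region carries its own x, y, width, height and phrase. The helper names flatten_regions and split_records are hypothetical, the sample data is made up, and the split uses the standard library's random module as a stand-in for train_test_split so the sketch has no dependencies.

```python
import json
import random


def flatten_regions(data):
    """Return one flat list of region dicts from the parsed JSON.

    Assumption: data is a list of per-image dicts, each holding its
    regions under a "regions" key. Adjust the key to match the real file.
    """
    if not isinstance(data, list):
        raise TypeError("expected a JSON array at the top level, got "
                        + type(data).__name__)
    regions = []
    for image_entry in data:
        regions.extend(image_entry.get("regions", []))
    return regions


def split_records(records, train_fraction, seed=0):
    """Shuffle a copy of records and cut it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be in (0, 1)")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Tiny in-memory stand-in for the annotations file (made-up values).
sample = json.loads("""
[
  {"id": 1, "regions": [
    {"region_id": 10, "x": 0, "y": 0, "width": 5, "height": 5, "phrase": "a red car"},
    {"region_id": 11, "x": 3, "y": 4, "width": 2, "height": 2, "phrase": "a wheel"}
  ]},
  {"id": 2, "regions": [
    {"region_id": 20, "x": 1, "y": 1, "width": 9, "height": 9, "phrase": "a tall tree"},
    {"region_id": 21, "x": 2, "y": 2, "width": 4, "height": 4, "phrase": "green leaves"}
  ]}
]
""")

regions = flatten_regions(sample)
train, test = split_records(regions, train_fraction=0.75)
print("Split into {} training and {} test regions".format(len(train), len(test)))
```

Because every record keeps its own box and phrase together, writing train and test with json.dump preserves the pairing between a phrase and its coordinates. train_test_split also accepts a plain list, so train_test_split(regions, train_size=args.split) could be used in place of split_records.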