Keras flow_from_dataframe raises a TypeError when y_col is given multiple column names


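I am using flow_from_dataframe for a multi-label classification problem with 14 possible labels. All the column names are placed in a list as strings, for example: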

columns = ["No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly", "Lung Opacity", "Lung Lesion", "Edema", "Consolidation", "Pneumonia", "Atelectasis", "Pneumothorax", "Pleural Effusion", "Pleural Other", "Fracture", "Support Devices"]
The list (columns) is then passed to y_col, for example:

train_generator = datagen.flow_from_dataframe(
    dataframe=df[:178731],
    directory='/home/admin1/Downloads/',
    x_col='Path',
    y_col=columns,
    batch_size=batch_size,
    seed=42,
    shuffle=True,
    target_size=(224, 224))
I get this error:

TypeError: If class_mode="categorical", y_col="['No Finding', 'Enlarged Cardiomediastinum', 'Cardiomegaly', 'Lung Opacity', 'Lung Lesion', 'Edema', 'Consolidation', 'Pneumonia', 'Atelectasis', 'Pneumothorax', 'Pleural Effusion', 'Pleural Other', 'Fracture', 'Support Devices']" column values must be type string, list or tuple.
I have already tried the solution suggested previously:

df['No Finding'] = df['No Finding'].astype(str)
df['Enlarged Cardiomediastinum'] = df['Enlarged Cardiomediastinum'].astype(str)
df['Cardiomegaly'] = df['Cardiomegaly'].astype(str)
df['Lung Opacity'] = df['Lung Opacity'].astype(str)
df['Lung Lesion'] = df['Lung Lesion'].astype(str)
df['Edema'] = df['Edema'].astype(str)
df['Consolidation'] = df['Consolidation'].astype(str)
df['Pneumonia'] = df['Pneumonia'].astype(str)
df['Atelectasis'] = df['Atelectasis'].astype(str)
df['Pneumothorax'] = df['Pneumothorax'].astype(str)
df['Pleural Effusion'] = df['Pleural Effusion'].astype(str)
df['Pleural Other'] = df['Pleural Other'].astype(str)
df['Fracture'] = df['Fracture'].astype(str)
df['Support Devices'] = df['Support Devices'].astype(str)
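It only works when I pass a single column name to y_col. I am using keras 2.2.4, and I have uninstalled keras.preprocessing and installed the GitHub version. With the default class_mode of "categorical", flow_from_dataframe does not seem to support passing multiple column names to y_col as a list, which is what I need because this is a multi-label classification problem. I suspect the type problem arises because pandas only converts the DataFrame values to object, while the keras preprocessing DataFrame iterator code only allows string, list, or tuple; pandas does not convert directly to string, only to object. Here is my code: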

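# Imports are not shown in the original listing; the Keras import path is assumed.
import numpy as np
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator

df=pd.read_csv('/home/admin1/Downloads/CheXpert-v1.0/train.csv')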

df = df.replace(np.nan, 0)
df['No Finding'].head()

df['No Finding'] = df['No Finding'].astype(str)
df['Enlarged Cardiomediastinum'] = df['Enlarged Cardiomediastinum'].astype(str)
df['Cardiomegaly'] = df['Cardiomegaly'].astype(str)
df['Lung Opacity'] = df['Lung Opacity'].astype(str)
df['Lung Lesion'] = df['Lung Lesion'].astype(str)
df['Edema'] = df['Edema'].astype(str)
df['Consolidation'] = df['Consolidation'].astype(str)
df['Pneumonia'] = df['Pneumonia'].astype(str)
df['Atelectasis'] = df['Atelectasis'].astype(str)
df['Pneumothorax'] = df['Pneumothorax'].astype(str)
df['Pleural Effusion'] = df['Pleural Effusion'].astype(str)
df['Pleural Other'] = df['Pleural Other'].astype(str)
df['Fracture'] = df['Fracture'].astype(str)
df['Support Devices'] = df['Support Devices'].astype(str)
df['Age'] = df['Age'].astype(str)

df.dtypes

columns=["No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly", "Lung Opacity",
"Lung Lesion","Edema", "Consolidation", "Pneumonia", "Atelectasis",
"Pneumothorax", "Pleural Effusion", "Pleural Other", "Fracture",
"Support Devices"]

datagen=ImageDataGenerator(rescale=1./255.)
test_datagen=ImageDataGenerator(rescale=1./255.)

train_generator = datagen.flow_from_dataframe(
    dataframe=df[:178731],
    directory='/home/admin1/Downloads/',
    x_col='Path',
    y_col=columns,
    batch_size=batch_size,
    seed=42,
    shuffle=True,
    target_size=(224, 224))
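
One possible reading of the error, sketched with pandas only. The message suggests the iterator tests `df[y_col]` with an `isinstance` check against `(str, list, tuple)`. When `y_col` is a list, `df[y_col]` is a DataFrame, and `DataFrame.apply` hands each whole column (a Series) to the function, so the test fails whatever the cell types are. The check below is an assumed reconstruction inferred from the error text, not code copied from keras-preprocessing; the toy frame and the helper name `passes_categorical_check` are hypothetical.

```python
import pandas as pd

# Toy stand-in for the CheXpert frame (hypothetical values).
df = pd.DataFrame({
    "Path": ["a.jpg", "b.jpg"],
    "Edema": [0.0, 1.0],
    "Pneumonia": [1.0, 0.0],
})
label_cols = ["Edema", "Pneumonia"]

# Same conversion as in the question, one column at a time.
for col in label_cols:
    df[col] = df[col].astype(str)

# After astype(str) every cell is a Python str, whatever dtype pandas reports.
cells_are_str = all(isinstance(v, str) for col in label_cols for v in df[col])


def passes_categorical_check(frame, y_col):
    """Assumed shape of the validation, inferred from the error message."""
    allowed = (str, list, tuple)
    return bool(all(frame[y_col].apply(lambda x: isinstance(x, allowed))))


# Single name: Series.apply runs per cell, and each cell is a str.
single_ok = passes_categorical_check(df, "Edema")
# List of names: DataFrame.apply runs per column, and each column is a Series.
multi_ok = passes_categorical_check(df, label_cols)

print(cells_are_str, single_ok, multi_ok)
```

If the library check has this shape, the object dtype is not the cause: the cells already are strings, and the failure comes from passing a list to `y_col` in "categorical" mode.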

I had the same problem and was able to solve it by changing the class_mode argument to "other". I found this after following a few links from the TensorFlow documentation for flow_from_dataframe().

Based on that, you can simply set class_mode to "other" (now "raw"; see the edit below) and it will work:

train_generator = datagen.flow_from_dataframe(
    dataframe=df[:178731],
    directory='/home/admin1/Downloads/',
    x_col='Path',
    y_col=columns,
    batch_size=batch_size,
    class_mode='raw',
    seed=42,
    shuffle=True,
    target_size=(224, 224))
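I should say, though, that I did not see the class_mode "other" mentioned anywhere in the TensorFlow or Keras documentation. It does seem to work, however, so I am using it for now.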


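Edit: I have since realized that "other" is deprecated in current versions of Keras. I have updated the code above to reflect the new, correct class_mode, which should be "raw".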
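
A small sketch of how the label columns might be prepared for class_mode="raw". It assumes, as I understand raw mode, that the generator passes the values of the y_col columns through unchanged, so they should be numeric rather than the strings produced by astype(str). The toy frame is hypothetical, and the NaN handling mirrors the replace(np.nan, 0) step from the question.

```python
import numpy as np
import pandas as pd

# Toy stand-in for train.csv (hypothetical values); NaN marks a missing label.
df = pd.DataFrame({
    "Path": ["a.jpg", "b.jpg", "c.jpg"],
    "Edema": [1.0, np.nan, 0.0],
    "Pneumonia": [np.nan, 1.0, 0.0],
})
label_cols = ["Edema", "Pneumonia"]

# Fill missing labels with 0 and keep them numeric (no astype(str)).
targets = df[label_cols].fillna(0).astype("float32").to_numpy()

# One row per image, one column per label.
print(targets.shape, targets.dtype)
```

With the labels left numeric in the DataFrame, the flow_from_dataframe call above can take y_col=columns as written.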

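I added object to the list of types in the preprocessing file, but now I get a KeyError: KeyError: ['No Finding', 'Enlarged Cardiomediastinum', 'Cardiomegaly', 'Lung Opacity', 'Lung Lesion', 'Edema', 'Consolidation', 'Pneumonia', 'Atelectasis', 'Pneumothorax', 'Pleural Effusion', 'Pleural Other', 'Fracture', 'Support Devices']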
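
Rather than patching the preprocessing file, another option that may work within class_mode="categorical" is to collapse the label columns into a single column whose cells are lists of the positive label names, since the error message itself names list and tuple as accepted cell types. This is a sketch with pandas only; the column name `labels` and the toy data are hypothetical, and whether the installed Keras version treats list cells as multi-label should be verified before relying on it.

```python
import pandas as pd

# Toy stand-in for the CheXpert frame (hypothetical values).
df = pd.DataFrame({
    "Path": ["a.jpg", "b.jpg", "c.jpg"],
    "Edema": [1.0, 0.0, 0.0],
    "Pneumonia": [1.0, 1.0, 0.0],
})
label_cols = ["Edema", "Pneumonia"]

# Boolean mask of positive labels, one row per image.
positive = df[label_cols].to_numpy() == 1

# One list per row holding the names of the labels equal to 1.
label_lists = [
    [name for name, hit in zip(label_cols, row) if hit]
    for row in positive
]
df["labels"] = pd.Series(label_lists, index=df.index)

print(df["labels"].tolist())
```

The generator would then be called with y_col="labels" instead of the list of column names.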