Python: putting nested category lists into a DataFrame


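I have three levels of categorical data that I need to convert into a DataFrame, with the labels of the upper categories repeated on each row. I have the following lists for the "main", "sub", and "tertiary" levels: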

main_labels = ['Certain infectious and parasitic diseases','Neoplasms']
main_icds = ['A00-B99','C00-D49']
sub_labels = ['Intestinal infectious diseases','Tuberculosis','Malignant neoplasms of lip, oral cavity and pharynx','Malignant neoplasms of digestive organs']
sub_icds = ['A00-A09','A15-A19','C00-C14','C15-C26']
ter_labels = ['Cholera','Typhoid and paratyphoid fevers','Respiratory tuberculosis','Tuberculosis of nervous system','Malignant neoplasm of lip','Malignant neoplasm of base of tongue','Malignant neoplasm of esophagus','Malignant neoplasm of stomach']
ter_icds = ['A00','A01','A15','A17','C00','C01','C15','C16']
For illustration, I need them to look like the following in a pandas DataFrame. If I can get that far, I can add the label values myself.


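It seems like it should be easy, but I'm stumped. Any help would be greatly appreciated. I searched earlier posts but had trouble finding the right keywords for what I'm trying to do. Thanks.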

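I think the best approach is to start from the tertiary classification and then find the sub and main category each code belongs to. Python supports inequality comparisons on alphanumeric strings (they are compared character by character), so this should be fairly robust as long as all codes share the same fixed-width format.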
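To make the string-comparison idea concrete, here is a minimal sketch. The helper name `in_range` is hypothetical and introduced only for illustration; it assumes every code has the same shape (one letter followed by two digits), as in the lists above.

```python
def in_range(code, bounds):
    """Return True if `code` lies within a 'LOW-HIGH' range string.

    Relies on Python's lexicographic string comparison, which matches
    the intended ordering only when all codes have the same width.
    """
    low, high = bounds.split('-')
    return low <= code <= high


print(in_range('A17', 'A15-A19'))  # True
print(in_range('C01', 'A00-B99'))  # False

# Caveat: with codes of different widths the ordering breaks down,
# because '9' sorts after '1' character by character.
print('A9' > 'A10')  # True
```

With fixed-width codes such as 'A00' or 'C16' this caveat does not apply, which is why the comparison works for the data in the question.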

import pandas as pd

main_icds = ['A00-B99','C00-D49']
sub_icds = ['A00-A09','A15-A19','C00-C14','C15-C26']
ter_icds = ['A00','A01','A15','A17','C00','C01','C15','C16']

# Split each range on '-' to get its [lower, upper] bounds
subs = [sub.split('-') for sub in sub_icds]
mains = [main.split('-') for main in main_icds]

# One row per tertiary code; for each code, pick the range that contains it.
# This assumes every code falls in exactly one range at each level; otherwise
# the list lengths will not match the number of rows.
df = pd.DataFrame({'ter_icd': ter_icds})
df['sub_icd'] = [sub_icd for ter in ter_icds
                 for sub_icd, sub in zip(sub_icds, subs)
                 if sub[0] <= ter <= sub[1]]
df['main_icd'] = [main_icd for ter in ter_icds
                  for main_icd, main in zip(main_icds, mains)
                  if main[0] <= ter <= main[1]]
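The question also mentions adding the label values afterwards. One possible way to do that is sketched below. The helper `find_parent`, the column names, and the column order are assumptions made for illustration, since the intended layout is not shown in the post. Unlike the list comprehensions above, a code with no matching range yields a missing value instead of a length mismatch.

```python
import pandas as pd

main_labels = ['Certain infectious and parasitic diseases', 'Neoplasms']
main_icds = ['A00-B99', 'C00-D49']
sub_labels = ['Intestinal infectious diseases', 'Tuberculosis',
              'Malignant neoplasms of lip, oral cavity and pharynx',
              'Malignant neoplasms of digestive organs']
sub_icds = ['A00-A09', 'A15-A19', 'C00-C14', 'C15-C26']
ter_labels = ['Cholera', 'Typhoid and paratyphoid fevers',
              'Respiratory tuberculosis', 'Tuberculosis of nervous system',
              'Malignant neoplasm of lip', 'Malignant neoplasm of base of tongue',
              'Malignant neoplasm of esophagus', 'Malignant neoplasm of stomach']
ter_icds = ['A00', 'A01', 'A15', 'A17', 'C00', 'C01', 'C15', 'C16']


def find_parent(code, ranges):
    """Return the first 'LOW-HIGH' range containing `code`, or None."""
    for rng in ranges:
        low, high = rng.split('-')
        if low <= code <= high:
            return rng
    return None


# Start from the tertiary level: one row per tertiary code.
df = pd.DataFrame({'ter_icd': ter_icds, 'ter_label': ter_labels})

# Look up the enclosing range at each higher level.
df['sub_icd'] = df['ter_icd'].map(lambda c: find_parent(c, sub_icds))
df['main_icd'] = df['ter_icd'].map(lambda c: find_parent(c, main_icds))

# Translate each range string into its label via a dict lookup.
df['sub_label'] = df['sub_icd'].map(dict(zip(sub_icds, sub_labels)))
df['main_label'] = df['main_icd'].map(dict(zip(main_icds, main_labels)))

# Order the columns from the broadest level to the most specific.
df = df[['main_icd', 'main_label', 'sub_icd', 'sub_label',
         'ter_icd', 'ter_label']]
print(df)
```

Each row then carries its tertiary code together with the repeated sub and main codes and labels.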
Excellent!! Thank you!!