Python: How do I display all keywords together with their categories?

python pandas dataframe nlp

This is my dataset in the file bank1.txt:

Keyword:Category
ccn:fintech
credit:fintech
smart:fintech

This is my dataset in the file bank2.txt:

Keyword:Category
mcm:mcm
switching:switching
pul-sim:pulsa
transfer:transfer
debit sms:money transfer

What I want to get:

 Keyword     Category_all
 mcm           mcm
 switching     switching
 pul-sim       pulsa
 transfer      transfer
 debit sms     money transfer
 ccn           fintech
 credit        fintech
 smart         fintech

What I did:

import pandas as pd

with open('entity_dict.txt') as f:  # bank.txt
    # One entry per line, surrounding whitespace removed
    content = [x.strip() for x in f]

def ambil(inp):
    # Collect every entry that occurs as a substring of the description
    try:
        out = [x for x in content if x in inp]
    except TypeError:  # description is not a string (e.g. NaN)
        return 'other'
    return ' '.join(out) if out else 'other'

frame_institution['Keyword'] = frame_institution['description'].apply(ambil)
fintech = pd.read_csv('bank.txt', sep=":")
frame_Keyword = pd.merge(frame_institution, fintech, on='Keyword')

For bank2.txt, the code is:

with open('entity_dict2.txt') as f:
    # One entry per line, surrounding whitespace removed
    content2 = [x.strip() for x in f]

def ambil2(inp):
    # Collect every entry that occurs as a substring of the description
    try:
        out = [x for x in content2 if x in inp]
    except TypeError:  # description is not a string (e.g. NaN)
        return 'other'
    return ' '.join(out) if out else 'other'

frame_institution['Keyword2'] = frame_institution['description'].apply(ambil2)
fintech2 = pd.read_csv('bank2.txt', sep=":")
# Both merges use the default how='inner', so rows without a match are dropped
frame_Keyword2 = pd.merge(frame_institution, fintech, on='Keyword')
frame_Keyword2 = pd.merge(frame_Keyword2, fintech2, on='Keyword2')

Then I filter on some keywords:

frame_Keyword2[frame_Keyword2['category_all'] == 'pulsa']

The actual result is:

 Keyword     Category_all
 mcm           mcm
 switching     switching
 ccn           fintech
 credit        fintech
 smart         fintech

However, 'pulsa', 'transfer', and 'money transfer' do not appear in the category column. I think there must be a better way to solve this.
Just try a merge.
The two DataFrames:
>>> df1
  Keyword Category
0     ccn  fintech
1  credit  fintech
2   smart  fintech

>>> df2
     Keyword        Category
0        mcm             mcm
1  switching       switching
2    pul-sim           pulsa
3   transfer        transfer
4  debit sms  money transfer

Result of an outer merge:
>>> pd.merge(df1, df2, how='outer')
     Keyword        Category
0        ccn         fintech
1     credit         fintech
2      smart         fintech
3        mcm             mcm
4  switching       switching
5    pul-sim           pulsa
6   transfer        transfer
7  debit sms  money transfer
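
A minimal sketch of why the outer merge keeps every row while the default merge does not. The frames are rebuilt inline so the snippet runs on its own; the names `inner`, `outer`, and `pulsa` are illustrative and do not come from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Keyword": ["ccn", "credit", "smart"],
    "Category": ["fintech", "fintech", "fintech"],
})
df2 = pd.DataFrame({
    "Keyword": ["mcm", "switching", "pul-sim", "transfer", "debit sms"],
    "Category": ["mcm", "switching", "pulsa", "transfer", "money transfer"],
})

# With no `on=` argument, merge joins on every column the frames share
# (here both Keyword and Category).
inner = pd.merge(df1, df2)               # default how='inner': only matching rows
outer = pd.merge(df1, df2, how="outer")  # union of rows from both frames

print(len(inner), len(outer))

# Filtering by category now finds the rows that came from df2.
pulsa = outer[outer["Category"] == "pulsa"]
print(pulsa)
```

The two frames have no row in common, so the inner merge is empty and the outer merge carries all eight rows. Row order in the outer result may differ between pandas versions, so sort explicitly if the order matters.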

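In case anyone lands here with a similar question, here is another solution, added for posterity.
Using the DataFrame.append() method (deprecated in pandas 1.4 and removed in pandas 2.0, so it only works on older versions):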
df1.append(df2, ignore_index=True)

With pd.concat(): build a list of frames, then concatenate them:
frames = [df1,df2]
pd.concat(frames, ignore_index=True)
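
The concat approach can be sketched end to end as follows. The helper name `combine_keyword_tables` and the rename to `Category_all` are assumptions made to match the desired output in the question; they are not part of the original answer.

```python
import pandas as pd

def combine_keyword_tables(frames):
    """Stack keyword tables row-wise and rename Category to Category_all.

    Hypothetical helper for illustration only.
    """
    combined = pd.concat(frames, ignore_index=True)
    return combined.rename(columns={"Category": "Category_all"})

df1 = pd.DataFrame({
    "Keyword": ["ccn", "credit", "smart"],
    "Category": ["fintech", "fintech", "fintech"],
})
df2 = pd.DataFrame({
    "Keyword": ["mcm", "switching", "pul-sim", "transfer", "debit sms"],
    "Category": ["mcm", "switching", "pulsa", "transfer", "money transfer"],
})

# df2 goes first so the rows follow the order shown in the question.
result = combine_keyword_tables([df2, df1])
print(result)

# Filtering by category works on the combined table.
pulsa_rows = result[result["Category_all"] == "pulsa"]
print(pulsa_rows)
```

Unlike a merge, `pd.concat` never matches rows against each other; it keeps the input order and, with `ignore_index=True`, renumbers the index from 0.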

Try reading your dataset with df1 = pd.read_csv('bank1.txt', sep=':'), and do the same for the other dataset. Then use the duplicate link. I have corrected your indentation, please verify.
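
A small sketch of the loading step suggested in the comment. To keep it self-contained, `io.StringIO` stands in for the files bank1.txt and bank2.txt (an assumption for illustration); with real files, pass the path to `pd.read_csv` instead.

```python
import io
import pandas as pd

# In-memory stand-ins for bank1.txt and bank2.txt, copied from the question.
bank1_text = "Keyword:Category\nccn:fintech\ncredit:fintech\nsmart:fintech\n"
bank2_text = (
    "Keyword:Category\n"
    "mcm:mcm\n"
    "switching:switching\n"
    "pul-sim:pulsa\n"
    "transfer:transfer\n"
    "debit sms:money transfer\n"
)

# sep=":" splits each line into the Keyword and Category columns;
# the first line of each file is used as the header.
df1 = pd.read_csv(io.StringIO(bank1_text), sep=":")
df2 = pd.read_csv(io.StringIO(bank2_text), sep=":")

print(df1)
print(df2)
```

Values containing spaces, such as "debit sms" and "money transfer", stay intact because only the colon acts as a separator.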