In pandas, how do I search for words and phrases to create a new DataFrame? (python, pandas)

In Python 3 with pandas, I have the following DataFrame:

bens_gerais_candidatos_2014.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 6400 entries, 0 to 6399
Data columns (total 12 columns):
uf_x               6400 non-null object
cargo              6400 non-null object
nome_completo      6400 non-null object
sequencial         6400 non-null object
cpf                6400 non-null object
nome_urna          6400 non-null object
partido_eleicao    6400 non-null object
situacao           6400 non-null object
uf_y               6400 non-null object
descricao          6400 non-null object
detalhe            6400 non-null object
valor              6400 non-null float64
dtypes: float64(1), object(11)
memory usage: 650.0+ KB

And so on. I then combine those pieces with several merge calls:

areas1 = pd.merge(parte1, parte2, left_on='cpf', right_on='cpf', how='outer')
areas2 = pd.merge(areas1, parte3, left_on='cpf', right_on='cpf', how='outer')

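Is there a simpler way to search for words and phrases and build the new DataFrame without duplicating rows? For example, in some rows "LOTE RURAL" appears alone, in others "LOTE RURAL" appears together with "FAZENDA", and in others only "FAZENDA" appears. Like this: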

"LOTE RURAL 42"
"LOTE RURAL 38, DENOMINADO FAZENDA CATARINA"
"FAZENDA ÁGUA VERMELHA"

I think you can do this:

str_choice = "LOTE RURAL|FAZENDA|IMOVEL RURAL" 
bens_gerais_candidatos_2014[bens_gerais_candidatos_2014['detalhe'].\
                               str.contains(str_choice, na=False)]
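
The | symbol in str_choice means "or", so the pattern matches a row if any one of the listed words or phrases appears in it. To search for more terms, append them to the string separated by |.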
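
To see why this avoids duplicated rows, here is a small sketch. The sample frame, its cpf values, and the "APARTAMENTO 101" row are made up for illustration; only the detalhe strings and str_choice come from the question and answer. Because str.contains produces a single True/False per row, a row that mentions both "LOTE RURAL" and "FAZENDA" is still selected only once.

```python
import pandas as pd

# Hypothetical stand-in for bens_gerais_candidatos_2014 (values are illustrative)
df = pd.DataFrame({
    "cpf": ["111", "222", "333", "444"],
    "detalhe": [
        "LOTE RURAL 42",
        "LOTE RURAL 38, DENOMINADO FAZENDA CATARINA",
        "FAZENDA ÁGUA VERMELHA",
        "APARTAMENTO 101",
    ],
})

str_choice = "LOTE RURAL|FAZENDA|IMOVEL RURAL"

# One boolean per row: True when any alternative in the pattern matches
mask = df["detalhe"].str.contains(str_choice, na=False)

# Boolean indexing keeps each matching row exactly once, so no merge is needed
areas = df[mask]
print(areas)
```

Filtering once with a combined pattern replaces the separate per-term filters and the outer merges on cpf.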

You can try the following code:

search_list = ["LOTE RURAL","FAZENDA","IMOVEL RURAL","GLEBA","AREA RURAL","AREA NO LOTEAMENTO"]

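# Join the terms into one regex; na=False treats missing values as non-matches
mask = bens_gerais_candidatos_2014['detalhe'].str.contains('|'.join(search_list), na=False)
# Keep only the matching rows (the name "areas" is illustrative)
areas = bens_gerais_candidatos_2014[mask]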
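
One caveat with joining a list using '|': the joined string is interpreted as a regular expression, so a search term containing characters such as ( ) . or + would not be matched literally. The sketch below is one possible way to guard against that, using re.escape from the standard library, with case=False added for case-insensitive matching. The sample data and the search terms here are hypothetical and not taken from the question.

```python
import re
import pandas as pd

# Hypothetical data, including a missing value and mixed case
df = pd.DataFrame({"detalhe": ["Gleba 7", "AREA RURAL (3 HA)", "CASA", None]})

# Hypothetical search terms; the second contains regex metacharacters
search_list = ["GLEBA", "AREA RURAL (3 HA)"]

# Escape each term so it is matched literally, then join with "|" (regex "or")
pattern = "|".join(re.escape(term) for term in search_list)

# case=False ignores case; na=False turns missing values into False
mask = df["detalhe"].str.contains(pattern, case=False, na=False)
result = df[mask]
print(result)
```

If the search terms are plain words, as in the list above, the escaping changes nothing and the simpler join works just as well.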
