Python: merging two DataFrames on an IP address column


I have two DataFrames, df1 and df2, structured as follows:

ip_address            property_A
1.1.1.1               AAA
1.2.2.2               BBB
1.3.3.3               CCC
...                   ...
1.255.255.255.255     ZZZ

ip_address            property_B
1.1.1.1               YRG
1.2.2.2               HJK
1.3.3.3               KJH
...                   ...
1.255.255.255.255     TYU
I want to merge them on the 'ip_address' column. Because of the nature of the data in that column, this command fails:

pd.merge(df1, df2, on='ip_address', how='inner')

>> dtype: object does not appear to be an IPv4 or IPv6 address
>> AddressValueError: Expected 4 octets in [...]
One possible solution is to use the ipaddress module to convert each IP address to an integer, as in this example:

import ipaddress
int(ipaddress.IPv4Address('192.168.0.1'))

>> 3232235521
To do this efficiently, I tried the following commands:

import numpy as np
import pandas as pd
df1['int_ip'] = np.nan
df1.int_ip = int(ipaddress.IPv4Address(df1.ip_address))
However, even this command fails:

pd.merge(df1, df2, on='ip_address', how='inner')

>> dtype: object does not appear to be an IPv4 or IPv6 address
>> AddressValueError: Expected 4 octets in [...]
The only approach that works is:

for i in range(0, df1.shape[0]):
    df1.int_ip[i] = int(ipaddress.IPv4Address(df1.ip_address[i]))
But this is extremely inefficient.

Is there a better way?

d = {'ip_address': ['1.1.1.1', '2.2.2.2','3.3.3.3','1.255.255.255'], 'property_A': ['AAA','BBB','CCC','ZZZ']}
df1 = pd.DataFrame(data=d)
b = {'ip_address': ['1.1.1.1', '2.2.2.2','3.3.3.3','1.255.255.255'], 'property_B': ['YRG','HJK','KJH','TYU']}
df2 = pd.DataFrame(data=b)
I would try this:

# Join df1's 'ip_address' column against df2's index (also the IP address).
df3 = df1.merge(df2.set_index('ip_address'),
                left_on='ip_address',
                right_index=True)

df1
    ip_address    property_A
0   1.1.1.1       AAA
1   2.2.2.2       BBB
2   3.3.3.3       CCC
3   1.255.255.255 ZZZ

df2    
    ip_address    property_B
0   1.1.1.1       YRG
1   2.2.2.2       HJK
2   3.3.3.3       KJH
3   1.255.255.255 TYU

df3
    ip_address    property_A    property_B
0   1.1.1.1       AAA           YRG
1   2.2.2.2       BBB           HJK
2   3.3.3.3       CCC           KJH
3   1.255.255.255 ZZZ           TYU
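
If integer keys are still wanted, as the question suggests, the conversion can probably be done without a Python-level loop. The following is a minimal sketch, not part of the original answer; the helper name ipv4_to_int is hypothetical, and it assumes every value is a well-formed dotted-quad IPv4 string (it does no validation).

```python
import numpy as np
import pandas as pd


def ipv4_to_int(addresses: pd.Series) -> pd.Series:
    """Hypothetical helper: convert dotted-quad IPv4 strings to integers."""
    # Split "a.b.c.d" into four columns and parse each octet as an integer.
    octets = addresses.str.split('.', expand=True).astype(np.int64)
    # Weight the octets by 256**3, 256**2, 256 and 1, then sum per row.
    weights = np.array([256**3, 256**2, 256, 1], dtype=np.int64)
    return pd.Series(octets.to_numpy() @ weights, index=addresses.index)


df1 = pd.DataFrame({'ip_address': ['1.1.1.1', '192.168.0.1'],
                    'property_A': ['AAA', 'BBB']})
df1['int_ip'] = ipv4_to_int(df1['ip_address'])
print(df1)
```

The second row gives 3232235521, the same value as int(ipaddress.IPv4Address('192.168.0.1')) in the question.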

Try using the apply function. Something like df1.ip_address.apply(lambda x: ipaddress.IPv4Address(x))
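
A minimal sketch of the apply suggestion, extended to produce the integer keys the question asks for and then merge on them. The helper name to_int and the int_ip column layout are assumptions for illustration; the sample data is taken from the answer above.

```python
import ipaddress

import pandas as pd

df1 = pd.DataFrame({'ip_address': ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.255.255.255'],
                    'property_A': ['AAA', 'BBB', 'CCC', 'ZZZ']})
df2 = pd.DataFrame({'ip_address': ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.255.255.255'],
                    'property_B': ['YRG', 'HJK', 'KJH', 'TYU']})


def to_int(ip: str) -> int:
    """Hypothetical helper: parse one IPv4 string and return its integer value."""
    return int(ipaddress.IPv4Address(ip))


# apply calls to_int once per element, so IPv4Address never receives a whole Series.
df1['int_ip'] = df1['ip_address'].apply(to_int)
df2['int_ip'] = df2['ip_address'].apply(to_int)

# Merge on the integer key; take only the needed columns from df2
# so that ip_address is not duplicated in the result.
merged = pd.merge(df1, df2[['int_ip', 'property_B']], on='int_ip', how='inner')
print(merged)
```

apply still runs Python code per row, but it avoids the element-by-element assignment in the question's loop.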
What is the dtype of ip_address? pd.merge() works for me if it is a string. df1.loc[:, 'ip_address'].astype(str) should work.
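
A minimal sketch of the astype(str) suggestion, assuming the merge problem comes from the key columns not holding plain, clean strings. The sample data is made up for illustration: one frame stores IPv4Address objects and the other stores strings with stray whitespace. The extra str.strip() step is an assumption on top of the comment.

```python
import ipaddress

import pandas as pd

# Illustrative data: df1 holds IPv4Address objects, df2 holds padded strings.
df1 = pd.DataFrame({'ip_address': [ipaddress.IPv4Address('1.1.1.1'),
                                   ipaddress.IPv4Address('2.2.2.2')],
                    'property_A': ['AAA', 'BBB']})
df2 = pd.DataFrame({'ip_address': ['1.1.1.1', ' 2.2.2.2 '],
                    'property_B': ['YRG', 'HJK']})

# Inspect what the key columns actually contain before merging.
print(df1['ip_address'].map(type).unique())
print(df2['ip_address'].map(type).unique())

# Normalize both key columns to stripped strings.
for df in (df1, df2):
    df['ip_address'] = df['ip_address'].astype(str).str.strip()

merged = pd.merge(df1, df2, on='ip_address', how='inner')
print(merged)
```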