Python: Parsing a user agent column into multiple columns

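I have a DataFrame of HTTP request logs. The only relevant column is the userAgent column, which I am trying to parse. I'm using ua_parser, which converts each userAgent string into a nested dictionary like this: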

>>> from ua_parser import user_agent_parser
>>> user_agent_parser.Parse('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36')
{
     'device': {'brand': None, 
                'model': None, 
                'family': 'Other'}, 
     'os': {'major': '10', 
            'patch_minor': None, 
            'minor': '10', 
            'family': 'Mac OS X', 
            'patch': '5'}, 
     'user_agent': {'major': '55', 
                    'minor': '0', 
                    'family': 'Chrome', 
                    'patch': '2883'}, 
     'string': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
}
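I'm trying to use the results of user_agent_parser to create 4 additional columns on the logs DataFrame. I want device brand, device model, OS family, and user agent family.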

Unfortunately, when I store the parsed results as a numpy array, I can't access the dictionary keys:

>>> parsed_ua = logs['userAgent'].apply(user_agent_parser.Parse)
>>> logs['device_brand'] = parsed_ua['device']['brand']
KeyError: 'device'
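I tried converting it to a DataFrame so that I could merge parsed_ua with logs. Unfortunately, this writes each whole dictionary into a single column: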

>>> pd.DataFrame(parsed_ua)
userAgent
0   {u'device': {u'brand': None, u'model': None, u...
1   {u'device': {u'brand': None, u'model': None, u...
2   {u'device': {u'brand': None, u'model': None, u...
3   {u'device': {u'brand': None, u'model': None, u...
4   {u'device': {u'brand': None, u'model': None, u...
How can I parse the userAgent column and write the results to multiple columns?

You can use pd.io.json.json_normalize:

In [146]: pd.io.json.json_normalize(parsed_ua)
Out[146]:
  device.brand device.family device.model os.family os.major os.minor  \
0         None         Other         None  Mac OS X       10       10

  os.patch os.patch_minor                                   string  \
0        5           None  Mozilla/5.0 (Macintosh; Intel Mac OS...

  user_agent.family user_agent.major user_agent.minor user_agent.patch
0            Chrome               55                0             2883

In addition to what you have already done, you can apply a lambda for each field you want:

ua = logs['userAgent'].apply(lambda ua: user_agent_parser.Parse(ua))

logs['device_brand'] = ua.apply(lambda x: x['device']['brand'])
logs['device_model'] = ua.apply(lambda x: x['device']['model'])
logs['os_family'] = ua.apply(lambda x: x['os']['family'])
logs['user_agent_family'] = ua.apply(lambda x: x['user_agent']['family'])

Thanks, this is a great solution and far less roundabout. Much appreciated.
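
For what it's worth, here is a minimal sketch of how the json_normalize approach could be carried through to the four columns asked for. It assumes pandas 1.0 or later, where the function is exposed as pd.json_normalize. The two dictionaries and the index values are made-up stand-ins for the output of logs['userAgent'].apply(user_agent_parser.Parse), so that the snippet runs without ua_parser installed; the `wanted` mapping is just an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in data: hand-written dicts shaped like the parser
# output shown in the question (only the keys needed here are included).
logs = pd.DataFrame({"userAgent": ["ua-string-1", "ua-string-2"]}, index=[10, 11])
parsed_ua = pd.Series(
    [
        {"device": {"brand": None, "model": None, "family": "Other"},
         "os": {"family": "Mac OS X"},
         "user_agent": {"family": "Chrome"}},
        {"device": {"brand": "Apple", "model": "iPhone", "family": "iPhone"},
         "os": {"family": "iOS"},
         "user_agent": {"family": "Mobile Safari"}},
    ],
    index=logs.index,
)

# Flatten the nested dicts; nested keys become dotted column names
# such as 'device.brand' and 'os.family'.
flat = pd.json_normalize(parsed_ua.tolist())

# json_normalize returns a fresh 0..n-1 index, so realign it with logs
# before joining.
flat.index = logs.index

# Keep only the four fields of interest and give them flat names.
wanted = {
    "device.brand": "device_brand",
    "device.model": "device_model",
    "os.family": "os_family",
    "user_agent.family": "user_agent_family",
}
logs = logs.join(flat[list(wanted)].rename(columns=wanted))
```

Compared with one apply per field, this walks the parsed dictionaries once, and adding another field only means adding an entry to the mapping.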