Storing a list of strings with sub-lists in a JSON file in Python

Tags: python, json, python-2.7

I am using Python and I have data like the following:

RedHat Enterprise Linux ES 2.1 IA64
RedHat Enterprise Linux ES 2.1
Red Hat Enterprise Linux AS 2.1
Linux kernel 2.6.9 
Linux kernel 2.6.8 rc3
Linux kernel 2.6.8 rc1
    + Ubuntu Ubuntu Linux 4.1 ppc
    + Ubuntu Ubuntu Linux 4.1 ia64
Linux kernel 2.6.8 
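I want to store this information in a JSON file, but I don't know how to structure it. For example, I have a list of RedHat and Linux entries, and the Ubuntu entries are a sub-list of Linux kernel 2.6.8 rc1 in my list, something like this: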

{"RedHat Enterprise Linux ES 2.1 IA64":{} ,"RedHat Enterprise Linux ES 2.1":{} ,"Red Hat Enterprise":{"Linux AS 2.1","Linux kernel 2.6.9","Linux kernel 2.6.8 rc3","Linux kernel 2.6.8 rc1"},"Linux kernel 2.6.8":{}}
This is my whole string:

'RedHat Enterprise Linux WS  2.1 IA64RedHat Enterprise Linux WS  2.1RedHat Enterprise Linux ES  2.1 IA64RedHat Enterprise Linux ES  2.1Red Hat Enterprise Linux AS  2.1 IA64Red Hat Enterprise Linux AS  2.1Linux kernel 2.6.9 Linux kernel 2.6.8 rc3Linux kernel 2.6.8 rc2Linux kernel 2.6.8 rc1+ Ubuntu Ubuntu Linux 4.1 ppc+ Ubuntu Ubuntu Linux 4.1 ia64+ Ubuntu Ubuntu Linux 4.1 ia32Linux kernel 2.6.8 Linux kernel 2.6.7 rc1Linux kernel 2.6.7 Linux kernel 2.6.6 rc1Linux kernel 2.6.6 Linux kernel 2.6.5 Linux kernel 2.6.4 Linux kernel 2.6.3 Linux kernel 2.6.2 Linux kernel 2.6.1 -rc2Linux kernel 2.6.1 -rc1Linux kernel 2.6.1 Linux kernel 2.6 .10Linux kernel 2.6 -test9-CVSLinux kernel 2.6 -test9Linux kernel 2.6 -test8Linux kernel 2.6 -test7Linux kernel 2.6 -test6Linux kernel 2.6 -test5Linux kernel 2.6 -test4Linux kernel 2.6 -test3Linux kernel 2.6 -test2Linux kernel 2.6 -test11Linux kernel 2.6 -test10Linux kernel 2.6 -test1Linux kernel 2.6 Linux kernel 2.4.28 + Trustix Secure Enterprise Linux 2.0 + Trustix Secure Linux 2.2 + Trustix Secure Linux 2.1 + Trustix Secure Linux 2.0 Linux kernel 2.4.27 -pre5Linux kernel 2.4.27 -pre4Linux kernel 2.4.27 -pre3Linux kernel 2.4.27 -pre2Linux kernel 2.4.27 -pre1Linux kernel 2.4.27 Linux kernel 2.4.26 Linux kernel 2.4.25 Linux kernel 2.4.24 -ow1Linux kernel 2.4.24 Linux kernel 2.4.23 -pre9Linux kernel 2.4.23 -ow2Linux kernel 2.4.23 + Trustix Secure Linux 2.0 Linux kernel 2.4.22 + Devil-Linux Devil-Linux 1.0.5 + Devil-Linux Devil-Linux 1.0.4 + Mandriva Linux Mandrake 9.2  amd64+ Mandriva Linux Mandrake 9.2 + Red Hat Fedora  Core1+ Slackware Linux 9.1 Linux kernel 2.4.21 pre7Linux kernel 2.4.21 pre4Linux kernel 2.4.21 pre1Linux kernel 2.4.21 + Conectiva Linux 9.0 + Mandriva Linux Mandrake 9.1 ppc+ Mandriva Linux Mandrake 9.1 + Red Hat Enterprise Linux AS  3+ RedHat Desktop 3.0 + RedHat Enterprise Linux ES  3+ RedHat Enterprise Linux WS  3+ S.u.S.E. Linux Personal 9.0 x86_64+ S.u.S.E. 
Linux Personal 9.0 + SuSE SUSE Linux Enterprise Server  8Linux kernel 2.4.20 Linux kernel 2.4.19 -pre6Linux kernel 2.4.19 -pre5Linux kernel 2.4.19 -pre4Linux kernel 2.4.19 -pre3Linux kernel 2.4.19 -pre2Linux kernel 2.4.19 -pre1Linux kernel 2.4.19 + Conectiva Linux 8.0 + Conectiva Linux Enterprise Edition 1.0 + MandrakeSoft Corporate Server 2.1  x86_64+ MandrakeSoft Corporate Server 2.1 + MandrakeSoft Multi Network Firewall 2.0 + Mandriva Linux Mandrake 9.0 + S.u.S.E. Linux 8.1 + Slackware Linux  -current+ SuSE SUSE Linux Enterprise Server  8+ SuSE SUSE Linux Enterprise Server  7Linux kernel 2.4.18 pre-8Linux kernel 2.4.18 pre-7Linux kernel 2.4.18 pre-6Linux kernel 2.4.18 pre-5Linux kernel 2.4.18 pre-4Linux kernel 2.4.18 pre-3Linux kernel 2.4.18 pre-2Linux kernel 2.4.18 pre-1Linux kernel 2.4.18  x86Linux kernel 2.4.18 + Astaro Security Linux 2.0 23+ Astaro Security Linux 2.0 16+ Debian Linux 3.0  sparc+ Debian Linux 3.0  s/390+ Debian Linux 3.0  ppc+ Debian Linux 3.0  mipsel+ Debian Linux 3.0  mips+ Debian Linux 3.0  m68k+ Debian Linux 3.0  ia-64+ Debian Linux 3.0  ia-32+ Debian Linux 3.0  hppa+ Debian Linux 3.0  arm+ Debian Linux 3.0  alpha+ Mandriva Linux Mandrake 8.2 + Mandriva Linux Mandrake 8.1 + Mandriva Linux Mandrake 8.0 + Red Hat Enterprise Linux AS  2.1 IA64+ RedHat Advanced Workstation for the Itanium Processor 2.1 IA64+ RedHat Advanced Workstation for the Itanium Processor 2.1 + RedHat Linux 8.0 + RedHat Linux 7.3 + S.u.S.E. Linux 8.1 + S.u.S.E. Linux 8.0 + S.u.S.E. Linux 7.3 + S.u.S.E. Linux 7.2 + S.u.S.E. Linux 7.1 + S.u.S.E. Linux Connectivity Server  + S.u.S.E. Linux Database Server  0+ S.u.S.E. Linux Firewall on CD  + S.u.S.E. Linux Office Server  + S.u.S.E. Linux Openexchange Server  + S.u.S.E. Linux Personal 8.2 + S.u.S.E. SuSE eMail Server 3.1 + S.u.S.E. 
SuSE eMail Server III  + SuSE SUSE Linux Enterprise Server  8+ SuSE SUSE Linux Enterprise Server  7+ Turbolinux Turbolinux Server 8.0 + Turbolinux Turbolinux Server 7.0 + Turbolinux Turbolinux Workstation 8.0 + Turbolinux Turbolinux Workstation 7.0 Linux kernel 2.4.17 Linux kernel 2.4.16 Linux kernel 2.4.15 Linux kernel 2.4.14 Linux kernel 2.4.13 + Caldera OpenLinux Server 3.1.1 + Caldera OpenLinux Workstation 3.1.1 Linux kernel 2.4.12 + Conectiva Linux 7.0 Linux kernel 2.4.11 Linux kernel 2.4.10 Linux kernel 2.4.9 + Red Hat Enterprise Linux AS  2.1 IA64+ Red Hat Enterprise Linux AS  2.1+ RedHat Enterprise Linux ES  2.1 IA64+ RedHat Enterprise Linux ES  2.1+ RedHat Enterprise Linux WS  2.1 IA64+ RedHat Enterprise Linux WS  2.1+ RedHat Linux 7.2  ia64+ RedHat Linux 7.2  i386+ RedHat Linux 7.2  alpha+ RedHat Linux 7.1  ia64+ RedHat Linux 7.1  i386+ RedHat Linux 7.1  alpha+ Sun Linux 5.0.5 + Sun Linux 5.0.3 + Sun Linux 5.0 Linux kernel 2.4.8 + Mandriva Linux Mandrake 8.2 + Mandriva Linux Mandrake 8.1 + Mandriva Linux Mandrake 8.0 Linux kernel 2.4.7 + RedHat Linux 7.2 + S.u.S.E. Linux 7.2 + S.u.S.E. Linux 7.1 Linux kernel 2.4.6 Linux kernel 2.4.5 + Slackware Linux 8.0 Linux kernel 2.4.4 + S.u.S.E. 
Linux 7.2 Linux kernel 2.4.3 + Mandriva Linux Mandrake 8.0  ppc+ Mandriva Linux Mandrake 8.0 Linux kernel 2.4.2 Linux kernel 2.4.1 Linux kernel 2.4 .0-test9Linux kernel 2.4 .0-test8Linux kernel 2.4 .0-test7Linux kernel 2.4 .0-test6Linux kernel 2.4 .0-test5Linux kernel 2.4 .0-test4Linux kernel 2.4 .0-test3Linux kernel 2.4 .0-test2Linux kernel 2.4 .0-test12Linux kernel 2.4 .0-test11Linux kernel 2.4 .0-test10Linux kernel 2.4 .0-test1Linux kernel 2.4 Debian Linux 3.1  sparcDebian Linux 3.1  s/390Debian Linux 3.1  ppcDebian Linux 3.1  mipselDebian Linux 3.1  mipsDebian Linux 3.1  m68kDebian Linux 3.1  ia-64Debian Linux 3.1  ia-32Debian Linux 3.1  hppaDebian Linux 3.1  armDebian Linux 3.1  amd64Debian Linux 3.1  alphaDebian Linux 3.1 Debian Linux 3.0  sparcDebian Linux 3.0  s/390Debian Linux 3.0  ppcDebian Linux 3.0  mipselDebian Linux 3.0  mipsDebian Linux 3.0  m68kDebian Linux 3.0  ia-64Debian Linux 3.0  ia-32Debian Linux 3.0  hppaDebian Linux 3.0  armDebian Linux 3.0  alphaDebian Linux 3.0'

I need to parse it, where + marks a sub-item.

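I looked at the problem you are trying to solve and at a sample SecurityFocus bid page (in this case securityfocus.com/bid/20959). The idea is to use a scraper such as BeautifulSoup to extract the text from the web page. That text can then be parsed, converted into a JSON object, and dumped to a file. The information on the SecurityFocus page contains the list of all vulnerable operating systems in a single tag. Entries related to an item appear directly below it, each preceded by a + sign (for example, + Ubuntu Ubuntu Linux 4.1 ppc under Linux kernel 2.6.8 rc1). The + sign is not actually a bare + but something like \n\t\t\t\t\t+, so the string has to be cleaned up before it is converted to JSON. The code snippet below performs this task for that URL.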


I hope this gives you an idea of how to perform the task.
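The grouping step on its own can be tried without fetching the page. The following is a minimal sketch, assuming the entries have already been extracted as a list of strings; the function name `group_entries` and the `sample` list (taken from the lines in the question, with the tab-prefixed + form described above) are illustrative and not part of the original answer.

```python
import json


def group_entries(entries):
    """Map each top-level entry to the list of '+' entries that follow it."""
    mapping = {}
    parent = None
    for entry in entries:
        # Collapse newlines, tabs and repeated spaces into single spaces.
        text = " ".join(entry.split())
        if not text:
            continue
        if text.startswith('+'):
            # A sub-entry belongs to the most recent top-level entry.
            if parent is not None:
                mapping[parent].append(text[1:].strip())
        else:
            parent = text
            mapping.setdefault(parent, [])
    return mapping


# Sample lines from the question; the '+' entries carry leading whitespace.
sample = [
    "Linux kernel 2.6.9 ",
    "Linux kernel 2.6.8 rc3",
    "Linux kernel 2.6.8 rc1",
    "\n\t\t\t\t\t+ Ubuntu Ubuntu Linux 4.1 ppc",
    "\n\t\t\t\t\t+ Ubuntu Ubuntu Linux 4.1 ia64",
    "Linux kernel 2.6.8 ",
]

grouped = group_entries(sample)
print(json.dumps(grouped, indent=2, sort_keys=True))
```

Tracking the most recent top-level entry avoids any index arithmetic, and it does not rely on a separator character such as '-' that can also occur inside names like "Devil-Linux".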

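What is your question? Where do you retrieve the data from? A DB or a flat file?
@Hyperboreus He wants to put everything in a JSON file.
Please edit the question to include the input string, because it has to be split in order to create the required JSON.
Unfortunately, I can't see any logic between the input and the output: some lines are split, others are not. The Ubuntus vanish completely. Linux kernel 2.6.9 through 2.6.8 rc1 are placed under RHEL AS 2.1, but 2.6.8 is not. In what way is your output related to your input?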
from bs4 import BeautifulSoup
import urllib2  # Python 2.7 module, matching the question's python-2.7 tag
import json

# Fetch the SecurityFocus page for the bid.
response = urllib2.urlopen('http://www.securityfocus.com/bid/20959')
html = response.read()
soup = BeautifulSoup(html, 'html.parser')

# The OS list is read from the second <td> of the second valign="top" element.
div_element = soup.find(id="vulnerability")
tr_elements = div_element.find_all(valign="top")
td_elements = tr_elements[1].find_all("td")

# stripped_strings yields each text fragment with surrounding whitespace removed,
# so sub-entries are expected to arrive as strings starting with '+'.
os_names_list = list(td_elements[1].stripped_strings)

# Map each top-level entry to the list of '+' entries that follow it.
vulnerability_os_mapping = {}
current_parent = None
for os_name in os_names_list:
    if os_name.startswith('+'):
        if current_parent is not None:
            # Drop the '+' and normalise internal whitespace.
            related_name = " ".join(os_name[1:].split())
            vulnerability_os_mapping[current_parent].append(related_name)
    else:
        current_parent = os_name
        vulnerability_os_mapping.setdefault(current_parent, [])

# Create a file with the template name vulnerability_list_<bid_id>.json
with open('vulnerability_list_20959.json', 'w') as vulnerability_list_file:
    json.dump(vulnerability_os_mapping, vulnerability_list_file)
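
To check the result, the file can be read back with `json.load`. The sketch below is only an illustration: it uses a small hypothetical mapping and a temporary file (the name `vulnerability_list_example.json` is made up) rather than the real output. It also shows why the sub-entries are stored as lists: the `{"a", "b"}` form used in the question is a Python set, which the `json` module cannot serialize.

```python
import json
import os
import tempfile

# Hypothetical mapping: parent entry -> list of sub-entries.
mapping = {
    "Linux kernel 2.6.8 rc1": [
        "Ubuntu Ubuntu Linux 4.1 ppc",
        "Ubuntu Ubuntu Linux 4.1 ia64",
    ],
    "Linux kernel 2.6.8": [],
}

# Write to a temporary location so the real output file is not touched.
path = os.path.join(tempfile.mkdtemp(), "vulnerability_list_example.json")
with open(path, "w") as handle:
    json.dump(mapping, handle, indent=2)

# Read the file back and compare with what was written.
with open(path) as handle:
    loaded = json.load(handle)
round_trip_ok = (loaded == mapping)

# A set value raises TypeError, so sub-entries have to be a list.
try:
    json.dumps({"Linux kernel 2.6.8 rc1": {"Ubuntu Ubuntu Linux 4.1 ppc"}})
    set_is_serializable = True
except TypeError:
    set_is_serializable = False
```

An empty list for entries without sub-items keeps every value the same type, which makes the file easier to consume than mixing `{}` and lists.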