How do I search multiple files for 3 separate strings and print them to an Excel file with Python?


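I have a script that looks for 4 separate strings before printing them to an Excel file. The first 3 are strings that each sit on a single line, which I search for with regular expressions; the 4th is a block of code, which I search for with Beautiful Soup 4. I can get the text for the Beautiful Soup part, but for some reason not the first 3.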

import xlwt 
from xlwt import Workbook 
import os
from bs4 import BeautifulSoup
import re
from os import listdir

fileNumber = 1
cve = ""
titlePrint = ""
titleStrip = ""
date = ""
code = ""
col = 0
row = 0


directory = "/Users/Documents/databasescript/web_scrape_db/exploits_test_folder"

for filename in os.listdir(directory):
    if filename.endswith(".txt"):
        with open(str(fileNumber) + ".txt") as f:

            for line in f:
                #CVE

                if re.search(r'https://nvd.nist.gov/vuln/detail/', line):
                    cve = line[118:131]
                    print"found 'https://nvd.nist.gov/vuln/detail/'"

                #Title
                if '<h1 class="card-title text-secondary text-center"' in line:
                    titlePrint = f.next().translate(None, '&#039;').strip()
                    print "found title"
                #Date
                if '<meta property="article:published_time"' in line:
                    date = line[53:63]
                    print "found date"

                if fileNumber == 6:
                    break       
                #Source Code
                soup = BeautifulSoup(open("/Users/Documents/databasescript/web_scrape_db/exploits_test_folder/"+(str(fileNumber))+".txt"), "html.parser")




                #increment file number      
                fileNumber+=1

The script is probably hitting the fileNumber == 6 condition and breaking out of the loop before it reaches the 3 strings you are looking for.

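Your script searches only the first 6 lines of the first .txt file, and then a single line for each .txt file after that. This is because fileNumber is incremented once per line rather than once per file.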


Maybe you want to move fileNumber += 1 back out to the level of the if filename.endswith(".txt"): condition, so that it runs once per file?
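
One way to apply that suggestion is sketched below in Python 3. This is a minimal sketch, not the asker's actual fix: the helper names scan_file and scan_directory, the CVE and date regular expressions, and the assumption that the title text sits on the line right after the <h1> tag are all illustrative guesses about the page layout. It opens each file by its own name instead of a counter, so neither fileNumber nor the break is needed.

```python
import os
import re

# Assumed patterns: the question only shows fixed slices (line[118:131], line[53:63]),
# so these regular expressions are a guess at what those slices were meant to capture.
CVE_RE = re.compile(r"https://nvd\.nist\.gov/vuln/detail/(CVE-\d{4}-\d+)")
DATE_RE = re.compile(
    r'<meta property="article:published_time"\s+content="(\d{4}-\d{2}-\d{2})'
)
TITLE_MARKER = '<h1 class="card-title text-secondary text-center"'


def scan_file(path):
    """Return (cve, title, date) for one file; fields that are not found stay ''."""
    cve = title = date = ""
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            match = CVE_RE.search(line)
            if match:
                cve = match.group(1)
            if TITLE_MARKER in line:
                # Assumption: the title is on the line that follows the <h1> tag.
                title = next(f, "").replace("&#039;", "'").strip()
            match = DATE_RE.search(line)
            if match:
                date = match.group(1)
    return cve, title, date


def scan_directory(directory):
    """Scan every .txt file in `directory`, reading each file to its last line."""
    results = {}
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(".txt"):
            # Open the file that was actually listed, not str(counter) + ".txt".
            results[filename] = scan_file(os.path.join(directory, filename))
    return results
```

Calling scan_directory(directory) returns a dict that maps each file name to its (cve, title, date) tuple, which can then be written to the spreadsheet row by row. Note that replace("&#039;", "'") decodes the entity, whereas translate(None, '&#039;') in the Python 2 original deletes every one of those characters, including the digits 0, 3 and 9, from the title.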

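Consider simplifying this. Is this a question about writing to xlsx or about finding strings?
@user1558604 Finding strings, because it can't find them. I can write to the Excel file, but it can't find the first 3 strings: CVE, Title, and Date.
Please edit your question and provide a minimal, reproducible example that doesn't contain any xlsx code.
@user1558604 I edited it.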