Python: converting a list of strings into a list of lists


I have a list of strings that I am trying to convert into a list of lists. My list of strings looks like this:

['[[try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try', 
'become', 'man', 'value]', '[look', 'deep', 'into', 'nature', 'and', 'then', 
'you', 'will', 'understand', 'everything', 'better]', '[the', 'true', 
'sign', 'intelligence', 'not', 'knowledge', 'but', 'imagination]', '[we', 
'cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 
'used', 'when', 'created', 'them]', '[weakness', 'attitude', 'becomes', 
'weakness', 'character]', '["you', 'cant', 'blame', 'gravity', 'for', 
'falling', 'love"]', '[the', 'difference', 'between', 'stupidity', 'and',
'genius', 'that', 'genius', 'has', 'its', 'limits]]']
My desired output would look like this:

 [[['try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try',
 'become', 'man', 'value], [look', 'deep', 'into', 'nature', 'and', 'then',
 'you', 'will', 'understand', 'everything', 'better], [the', 'true', 'sign', 
 'intelligence', 'not', 'knowledge', 'but', 'imagination], [we', 'cannot', 
 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking', 'used', 
 'when', 'created', 'them], [weakness', 'attitude', 'becomes', 'weakness', 
 'character], ["you', 'cant', 'blame', 'gravity', 'for', 'falling', 'love"],
 [the', 'difference', 'between', 'stupidity', 'and', 'genius', 'that', 
 'genius', 'has', 'its', 'limits']]]
My output currently looks like this:

 [['[', '[', 't', 'r', 'y'], ['n', 'o', 't'], ['b', 'e', 'c', 'o', 'm', 
 'e'], ['m', 'a', 'n'], ['s', 'u', 'c', 'c', 'e', 's', 's'], ['b', 'u', 
 't'], ['r', 'a', 't', 'h', 'e', 'r'], ['t', 'r', 'y'], ['b', 'e', 'c', 'o', 
 'm', 'e'], ['m', 'a', 'n'], ['v', 'a', 'l', 'u', 'e', ']'], ['[', 'l', 'o', 
 'o', 'k'], ['d', 'e', 'e', 'p'], ['i', 'n', 't', 'o'], ['n', 'a', 't', 'u',
 'r', 'e'], ['a', 'n', 'd'], ['t', 'h', 'e', 'n'], ['y', 'o', 'u'], ['w', 
 'i', 'l', 'l'], ['u', 'n', 'd', 'e', 'r', 's', 't', 'a', 'n', 'd'], ['e', 
 'v', 'e', 'r', 'y', 't', 'h', 'i', 'n', 'g'], ['b', 'e', 't', 't', 'e', 
 'r', ']'], ['[', 't', 'h', 'e'], ['t', 'r', 'u', 'e'], ['s', 'i', 'g', 
 'n'], ['i', 'n', 't', 'e', 'l', 'l', 'i', 'g', 'e', 'n', 'c', 'e'], ['n', 
 'o', 't'], ['k', 'n', 'o', 'w', 'l', 'e', 'd', 'g', 'e'], ['b', 'u', 't'], 
 ['i', 'm', 'a', 'g', 'i', 'n', 'a', 't', 'i', 'o', 'n', ']'], ['[', 'w', 
 'e'], ['c', 'a', 'n', 'n', 'o', 't'], ['s', 'o', 'l', 'v', 'e'], ['o', 'u',
 'r'], ['p', 'r', 'o', 'b', 'l', 'e', 'm', 's'], ['w', 'i', 't', 'h'], ['t', 
 'h', 'e'], ['s', 'a', 'm', 'e'], ['t', 'h', 'i', 'n', 'k', 'i', 'n', 'g'], 
 ['u', 's', 'e', 'd'], ['w', 'h', 'e', 'n'], ['c', 'r', 'e', 'a', 't', 'e', 
 'd'], ['t', 'h', 'e', 'm', ']'], ['[', 'w', 'e', 'a', 'k', 'n', 'e', 's', 
 's'], ['a', 't', 't', 'i', 't', 'u', 'd', 'e'], ['b', 'e', 'c', 'o', 'm', 
 'e', 's'], ['w', 'e', 'a', 'k', 'n', 'e', 's', 's'], ['c', 'h', 'a', 'r', 
 'a', 'c', 't', 'e', 'r', ']'], ['[', '"', 'y', 'o', 'u'], ['c', 'a', 'n', 
 't'], ['b', 'l', 'a', 'm', 'e'], ['g', 'r', 'a', 'v', 'i', 't', 'y'], ['f', 
 'o', 'r'], ['f', 'a', 'l', 'l', 'i', 'n', 'g'], ['l', 'o', 'v', 'e', '"', 
 ']'], ['[', 't', 'h', 'e'], ['d', 'i', 'f', 'f', 'e', 'r', 'e', 'n', 'c', 
 'e'], ['b', 'e', 't', 'w', 'e', 'e', 'n'], ['s', 't', 'u', 'p', 'i', 'd', 
 'i', 't', 'y'], ['a', 'n', 'd'], ['g', 'e', 'n', 'i', 'u', 's'], ['t', 'h',
  'a', 't'], ['g', 'e', 'n', 'i', 'u', 's'], ['h', 'a', 's'], ['i', 't', 
  's'], ['l', 'i', 'm', 'i', 't', 's', ']', ']']]
Here is the content of the text file:

Try not to become a man of success, but rather try to become a man of value. 
Look deep into nature, and then you will understand everything better.
The true sign of intelligence is not knowledge but imagination. 
We cannot solve our problems with the same thinking we used when we created them. 
Weakness of attitude becomes weakness of character.
You can't blame gravity for falling in love. 
The difference between stupidity and genius is that genius has its limits.
Here is the code I have written so far:

Info = [[line.strip()] for line in Info] 
#Turns original list into lists of lists breaking at each new line
Info_Str = str(Info) #Converts list into string to manipulate easier
Info_Str = Info_Str.lower() #Converts all characters to lowercase
Info_Str = Info_Str.replace(".", "")
Info_Str = Info_Str.replace("!", "")
Info_Str = Info_Str.replace("?", "")
Info_Str = Info_Str.replace(":", "")
Info_Str = Info_Str.replace(",", "")
Info_Str = Info_Str.replace(";", "")
Info_Str = Info_Str.replace("'", "")
Info_Str = Info_Str.replace("-", "")
#The above calls remove all punctuation while leaving the '[]' for the lists
Info_Str = Info_Str.split()
Info_List = Info_Str
New_List = [item for item in Info_List if not item.isdigit()] #Removes all numbers
for word in New_List[:]: #Removes words if their length is less than 3 characters 
    if len(word) < 3:
        New_List.remove(word)
print(New_List) #List of Strings
List_Lists = [list(line) for line in New_List]
print(List_Lists)

I know it is not very elegant; I have not written code in a long time.

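If you want a list of all the words, without spaces and special characters, you can use the regular expression \w+ (at least one word character) together with findall():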
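The code that accompanied this answer is not preserved in this copy. A minimal sketch of the idea, assuming the file contents are already held in a string (the name `text` and the two sample lines are placeholders chosen for illustration), might look like this:

```python
import re

# Placeholder input: two of the lines from the text file in the question.
text = (
    "Try not to become a man of success, but rather try to become a man of value.\n"
    "Look deep into nature, and then you will understand everything better.\n"
)

# \w+ matches runs of word characters, so spaces and punctuation are skipped.
words = re.findall(r"\w+", text.lower())
print(words)
```

Note that this gives one flat list for the whole text and applies no length filter, and that an apostrophe splits a word, so "can't" would come out as "can" and "t".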

I think converting the list to a string makes things harder, not easier.

I would probably do it like this:

def remove_special_characters(s):
    for c in ".!?:,;'-0123456789":
        s = s.replace(c, "")
    return s

lines = []
with open("data.txt") as file:
    for line in file:
        words = []
        for word in line.split():
            word = word.lower()
            word = remove_special_characters(word)
            if len(word) >= 3:
                words.append(word)
        lines.append(words)
print(lines)
Result (line breaks added by me for readability):
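The output that followed here is missing from this copy. As a rough check, the same steps can be run on the text from the question held in a string instead of data.txt. The names `TEXT` and `clean_lines` are made up for this sketch and are not part of the answer:

```python
TEXT = """\
Try not to become a man of success, but rather try to become a man of value.
Look deep into nature, and then you will understand everything better.
The true sign of intelligence is not knowledge but imagination.
We cannot solve our problems with the same thinking we used when we created them.
Weakness of attitude becomes weakness of character.
You can't blame gravity for falling in love.
The difference between stupidity and genius is that genius has its limits.
"""


def remove_special_characters(s):
    # Same character set as in the answer above.
    for c in ".!?:,;'-0123456789":
        s = s.replace(c, "")
    return s


def clean_lines(text):
    # One inner list per line; words shorter than 3 characters are dropped.
    lines = []
    for line in text.splitlines():
        words = [remove_special_characters(w.lower()) for w in line.split()]
        lines.append([w for w in words if len(w) >= 3])
    return lines


result = clean_lines(TEXT)
for words in result:
    print(words)
```

Under these assumptions the first inner list is ['try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try', 'become', 'man', 'value'], and "we" disappears from the fourth line because it is shorter than three characters.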


I think this is what you are trying to do:

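all_lines = []
# Keep only ASCII letters and spaces; everything else is filtered out
keep = set('qazwsxedcrfvtgbyhnujmikolp QAZWSXEDCRFVTGBYHNUJMIKOLP')
for line in Info:
    line = str(line)
    line = ''.join(filter(keep.__contains__, line))
    # Build a new list rather than calling remove() while iterating,
    # which can skip the word that follows a removed one
    line = [word for word in line.split() if len(word) >= 3]
    all_lines.append(line)
print(all_lines)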
@AdamSmith pointed out the following change, which makes this simpler and more readable (credit to @AdamSmith):

import string
keep=set(string.ascii_lowercase + string.ascii_uppercase + " ")

A quick answer using regular expressions:

import ast
import re

messy_list = ['[[try', 'not', 'become', 'man', 'success', 'but', 'rather', 'try',
    'become', 'man', 'value]', '[look', 'deep', 'into', 'nature', 'and', 'then',
    'you', 'will', 'understand', 'everything', 'better]', '[the', 'true',
    'sign', 'intelligence', 'not', 'knowledge', 'but', 'imagination]', '[we',
    'cannot', 'solve', 'our', 'problems', 'with', 'the', 'same', 'thinking',
    'used', 'when', 'created', 'them]', '[weakness', 'attitude', 'becomes',
    'weakness', 'character]', '["you', 'cant', 'blame', 'gravity', 'for',
    'falling', 'love"]', '[the', 'difference', 'between', 'stupidity', 'and',
    'genius', 'that', 'genius', 'has', 'its', 'limits]]'
]
# clean up double quotes in items of list
messy_list = [item.replace("\"", "") for item in messy_list]
# find word pattern in a string
pattern = re.compile(r"(\w+)")
# replace word pattern by adding single quotes before and after each word
clean_string = pattern.sub(r"'\g<1>'", ",".join(messy_list))
# evaluate the string; ast.literal_eval is used here as a safer stand-in for eval
print(ast.literal_eval(clean_string))

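Where does this list of strings come from? It looks like you simply split the string representation of a nested list on ", " rather than actually parsing it. Show us the code and we can help…

Use ast.literal_eval, or parse it one step before whatever solution you have now, as Jon alludes to.

It looks like you are showing us the already mangled intermediate result of an attempt to parse something. Please post the original.

How did you get this? You basically just need to split up the items in the list, and then you get the above?

It looks like you created a list, which is what you actually need, but then converted it to its string representation and split it back apart.

Don't just give a code dump, but provide a …

This doesn't seem to be the output the OP wants. He seems to want [line.split() for line in text.splitlines()] or something similar. At least nested two deep, certainly.

I think the OP has not found the right way to say what exactly he wants. I hope you agree that a triply nested list containing a single item is not useful. So, thinking about what might be useful to him, I made up this answer so that it can at least serve as a basis for study.

I think it is useful, but it is not an answer to the question he asked. That said, since you can combine it with the other things he has to do (e.g. [word.lower() for word in re.findall(r"\w{3,}", text)]), it may well be better than the implementation he already came up with. Though [[word.lower() for word in sentence if len(word) >= 3] for sentence in text.splitlines()] is not terrible.

Thank you, I was able to get my desired output.

You should prefer keep = set(string.ascii_lowercase + string.ascii_uppercase + " "), which makes it clear to everyone that you have not missed a letter.

@AdamSmith This gives me a NameError when I try it: name 'string' is not defined.

(That is because you have to import string, which is part of the stdlib.)

@AdamSmith o ok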
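The nested variant discussed in the comments can be sketched by applying `re.findall` to each line, so that every sentence gets its own inner list. The sample `text` is a placeholder, and calling `findall` per line is an assumption about what was meant, not code taken from the comments:

```python
import re

# Placeholder input: two of the lines from the text file in the question.
text = (
    "The true sign of intelligence is not knowledge but imagination.\n"
    "Weakness of attitude becomes weakness of character.\n"
)

# \w{3,} only matches runs of at least three word characters,
# so short words such as "of" and "is" never appear in the result.
nested = [
    [word.lower() for word in re.findall(r"\w{3,}", sentence)]
    for sentence in text.splitlines()
]
print(nested)
```

One difference from stripping punctuation first: with this pattern "can't" becomes "can" rather than "cant", because the apostrophe ends the match and the lone "t" is too short.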
For reference, the regular-expression answer above builds the following clean_string before evaluating it:

"[['try','not','become','man','success','but','rather','try','become','man','value'],['look','deep','into','nature','and','then','you','will','understand','everything','better'],['the','true','sign','intelligence','not','knowledge','but','imagination'],['we','cannot','solve','our','problems','with','the','same','thinking','used','when','created','them'],['weakness','attitude','becomes','weakness','character'],['you','cant','blame','gravity','for','falling','love'],['the','difference','between','stupidity','and','genius','that','genius','has','its','limits']]"