Reading a file containing single-quoted data and storing it in a Python list

python list

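When I read a file and store its contents in a list, a string enclosed in single quotes is not stored as a single list item.

Sample file:

12 3 'dsf dsf'

The list should contain:

listname = [12, 3, 'dsf dsf']

So far I can only get this:

listname = [12, 3, 'dsf', 'dsf']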

Please help.

Use the module.

Demo:

input.txt is the file containing the data from the example.
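
A minimal sketch of what such a demo could look like, assuming the module meant here is Python's standard csv module (an assumption, suggested by the quotechar comments at the end of this page). It uses a space as the delimiter and a single quote as the quotechar. The helper name read_tokens is hypothetical, and an in-memory io.StringIO stands in for input.txt:

```python
import csv
import io

def read_tokens(fileobj):
    """Read space-separated fields, keeping 'quoted text' as one field."""
    # Assumption: fields are separated by exactly one space.
    reader = csv.reader(fileobj, delimiter=" ", quotechar="'")
    tokens = []
    for row in reader:
        for field in row:
            # Fields made only of digits become ints; the rest stay strings.
            tokens.append(int(field) if field.isdigit() else field)
    return tokens

# Stand-in for: open("input.txt", newline="")
sample = io.StringIO("12 3 'dsf dsf'\n")
print(read_tokens(sample))  # [12, 3, 'dsf dsf']
```

With a real file, the csv documentation recommends opening it with newline="" before handing it to csv.reader.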

You can use the shlex module to split the data in a simple way:

import shlex
with open("sample file") as data:
    print(shlex.split(data.read()))

Give it a try :)
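
Note that shlex.split returns every token as a string, while the question asks for 12 and 3 as integers. A minimal sketch of one way to add that conversion follows; the helper name parse_line is hypothetical, and the sketch assumes that any token that looks like an integer should become one (including a quoted '12'):

```python
import shlex

def parse_line(text):
    """Split text with shell-like quoting rules, then convert integer-looking tokens."""
    result = []
    for token in shlex.split(text):
        try:
            result.append(int(token))
        except ValueError:
            # Not an integer: keep the token as a string.
            result.append(token)
    return result

print(parse_line("12 3 'dsf dsf'"))  # [12, 3, 'dsf dsf']
```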

You can use a regular expression:

import re
my_regex = re.compile(r"(?<=')[\w\s]+(?=')|\w+")
with open("filename.txt") as my_file:
    my_list = my_regex.findall(my_file.read())
    print(my_list)

Explanation of the regular expression:

(?<=')     # matches if there's a single quote *before* the matched pattern
[\w\s]+    # matches one or more word characters (letters, digits, underscore) and whitespace
(?=')      # matches if there's a single quote *after* the matched pattern
|          # match either the pattern above or below
\w+        # matches one or more word characters
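
One caveat with the lookaround pattern: when two quoted strings sit next to each other, as in 'a b' 'c d', the space between them is itself preceded and followed by a quote, so it is returned as a token too. A minimal sketch of an alternative that consumes the quotes with capturing groups instead; TOKEN_RE and tokenize are hypothetical names, and the sketch assumes quoted strings never contain a single quote:

```python
import re

# Group 1: text between single quotes; group 2: a run of non-space characters.
TOKEN_RE = re.compile(r"'([^']*)'|(\S+)")

def tokenize(text):
    """Return quoted strings (without their quotes) and bare words, in order."""
    # findall yields (quoted, bare) tuples; the group that did not match is "".
    return [quoted if bare == "" else bare
            for quoted, bare in TOKEN_RE.findall(text)]

print(tokenize("12 3 'dsf dsf'"))  # ['12', '3', 'dsf dsf']
print(tokenize("'a b' 'c d'"))     # ['a b', 'c d']
```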

You can use:
>>> l = ['12', '3', 'dsf', 'dsf']
>>> l[2:] = [' '.join(l[2:])]
>>> l
['12', '3', 'dsf dsf']

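Basically, you need to parse the data, that is:

split it into tokens
interpret the resulting sequence

In your case, each token can be interpreted on its own.


For the first task:

Each token is:

a run of non-space characters, or
a quote, then everything up to the next quote

The separator is a single space (you didn't specify whether runs of spaces or other whitespace characters are valid).

Interpretation:

quoted: take the enclosed text and discard the quotes
unquoted: convert to an integer if possible (you didn't specify whether it always is, or should be, an integer)
(you also didn't specify whether the input is always 2 integers plus a quoted string, i.e. whether this combination should be enforced)

Since the syntax is very simple, both tasks can be done at the same time: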
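import re

line = "12 3 'dsf dsf'"   # input line; sample data from the question
i = 0
maxi = len(line)
tokens = []
re_sep = r"\s"
re_term = r"\S+"
re_quoted = r"'(?P<enclosed>[^']*)'"
# Try the quoted form first; otherwise \S+ would take 'dsf as a bare term.
re_chunk = re.compile("(?:(?P<quoted>%(re_quoted)s)"
                      "|(?P<term>%(re_term)s))"
                      "(?:%(re_sep)s|$)" % locals())
del re_sep, re_term, re_quoted
while i < maxi:
    # Pattern.match() takes a start position; re.match()'s third argument is flags.
    m = re_chunk.match(line, i)
    if not m:
        raise ValueError("invalid syntax at char %d" % i)
    gg = m.groupdict()
    token = gg['term']
    if token:
        try:
            token = int(token)
        except ValueError:
            pass
    elif gg['quoted']:
        token = gg['enclosed']
    else:
        assert False, "invalid match. locals=%r" % locals()
    tokens.append(token)
    i = m.end()   # m.end() is already an absolute index into line
    del m, gg, token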

listname = [12, 3, dsf dsf] is not a valid list. Did you mean listname = [12, 3, 'dsf dsf']?

Can I have multiple quotechars?

@ARK I don't think so; the docs say it must be a one-character string.