Python Pandas: Converting values with multiple ranges into rows

python pandas dataframe

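After some googling without finding a good match, I hope you can help me with the following transformation. I have some values containing ranges written in a {FROM-TO} style: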

df_current = pd.DataFrame.from_dict({'A': ['test{1-2}this{1-3}', 'or{2-3}'], 'B': ['yes', 'no']})

                    A    B
0  test{1-2}this{1-3}  yes
1             or{2-3}   no

For further processing, I would like to create the following:

df_wish = pd.DataFrame.from_dict({
    'A': [
        'test1this1', 'test1this2', 'test1this3',
        'test2this1', 'test2this2', 'test2this3',
        'or2', 'or3'],
    'B': [
        'yes', 'yes', 'yes', 'yes', 'yes', 'yes',
        'no', 'no']})

            A    B
0  test1this1  yes
1  test1this2  yes
2  test1this3  yes
3  test2this1  yes
4  test2this2  yes
5  test2this3  yes
6         or2   no
7         or3   no

Note that B is simply repeated for the new rows.

Thanks, René

Use:

import re
from itertools import product

def mapper(s):
    lst = re.findall(r'(\w+)\{(\d+)-(\d+)\}', s)
    prd = [['{}{}'.format(*p) for p in product([w], range(int(m), int(n) + 1))] for w, m, n in lst]
    return list(map(''.join, product(*prd)))

df['A'] = df['A'].map(mapper)
df = df.explode('A').reset_index(drop=True)
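
As a quick sanity check, the snippet above can be run end to end against the data from the question. The following is a minimal sketch, assuming pandas 0.25 or later (where explode is available); the mapper body is rewritten slightly for readability, and the final comparison via tolist() is an illustrative choice rather than part of the original answer.

```python
import re
from itertools import product

import pandas as pd


def mapper(s):
    # Find every (word, from, to) triple such as ('test', '1', '2').
    lst = re.findall(r'(\w+)\{(\d+)-(\d+)\}', s)
    # For each triple, build the list of word+number variants.
    prd = [[f'{w}{i}' for i in range(int(m), int(n) + 1)] for w, m, n in lst]
    # Combine one variant from each list, in order.
    return [''.join(p) for p in product(*prd)]


# Input frame from the question.
df = pd.DataFrame({'A': ['test{1-2}this{1-3}', 'or{2-3}'], 'B': ['yes', 'no']})

df['A'] = df['A'].map(mapper)
df = df.explode('A').reset_index(drop=True)

# Desired frame from the question.
df_wish = pd.DataFrame({
    'A': ['test1this1', 'test1this2', 'test1this3',
          'test2this1', 'test2this2', 'test2this3',
          'or2', 'or3'],
    'B': ['yes'] * 6 + ['no'] * 2,
})

# Compare the column contents as plain Python lists.
matches = (df['A'].tolist() == df_wish['A'].tolist()
           and df['B'].tolist() == df_wish['B'].tolist())
print(matches)
```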

Details:

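Step A: Define a function mapper which takes a string such as 'test{1-2}this{1-3}' as its argument and maps it to all the strings that can be obtained by taking the product of each range with its corresponding word. The working of mapper for the input string 'test{1-2}this{1-3}' can be further explained as: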

print(lst) # Use 're.findall' to parse all the words and their corresponding ranges
[('test', '1', '2'), ('this', '1', '3')]

print(prd) # Use 'itertools.product' to get all inner level products
[['test1', 'test2'], ['this1', 'this2', 'this3']]

# Again use 'itertools.product' to get all outer level products
['test1this1', 'test1this2', 'test1this3', 'test2this1', 'test2this2', 'test2this3']
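
One thing to be aware of: the regular expression in mapper only captures a word that is immediately followed by a range, so literal text with no range after it is dropped, and a string with no ranges at all maps to ['']. If the real data can contain such text, a variant along the following lines may help. This is only a sketch; expand_ranges and RANGE are hypothetical names that do not appear in the original answer.

```python
import re
from itertools import product

# Matches a single {FROM-TO} range and captures both bounds.
RANGE = re.compile(r'\{(\d+)-(\d+)\}')


def expand_ranges(s):
    """Expand every {FROM-TO} range in s while keeping all literal text."""
    # Splitting on a pattern with two groups yields
    # [text, from, to, text, from, to, ..., text].
    parts = RANGE.split(s)
    choices = []
    for i in range(0, len(parts), 3):
        choices.append([parts[i]])  # literal text: exactly one choice
        if i + 2 < len(parts):
            lo, hi = int(parts[i + 1]), int(parts[i + 2])
            choices.append([str(n) for n in range(lo, hi + 1)])
    # Pick one choice per slot and glue the pieces back together.
    return [''.join(combo) for combo in product(*choices)]


print(expand_ranges('test{1-2}this{1-3}'))
print(expand_ranges('v{1-2}_end'))
print(expand_ranges('plain'))
```

For the inputs in the question this produces the same lists as mapper, so it can be passed to map in the same way.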

Step B: Use Series.map on column A to map the function mapper to each value of column A.

# print(df)

                                                                          A    B
0  [test1this1, test1this2, test1this3, test2this1, test2this2, test2this3]  yes
1                                                                [or2, or3]   no

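Step C: Use DataFrame.explode on column A to transform each list-like value in column A into rows, replicating the index values. reset_index(drop=True) then replaces the repeated index with a fresh 0..n-1 index.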

# print(df)
            A    B
0  test1this1  yes
1  test1this2  yes
2  test1this3  yes
3  test2this1  yes
4  test2this2  yes
5  test2this3  yes
6         or2   no
7         or3   no
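
To see why the reset_index call is there, it can help to look at the index right after exploding. The sketch below uses a small made-up frame (the values 'a1', 'b1' and so on are illustrative only) and assumes pandas 1.1 or later for the ignore_index parameter of explode.

```python
import pandas as pd

# Toy frame whose column A already holds lists, as after Step B.
df = pd.DataFrame({'A': [['a1', 'a2', 'a3'], ['b1']], 'B': ['yes', 'no']})

exploded = df.explode('A')
# Each new row keeps the index label of the row it came from.
index_before = exploded.index.tolist()

tidy = exploded.reset_index(drop=True)
# After the reset the index is a plain running counter again.
index_after = tidy.index.tolist()

# Assumption: pandas >= 1.1, where explode can reset the index itself.
tidy_alt = df.explode('A', ignore_index=True)

print(index_before, index_after)
```

The values in column B are carried along with the replicated rows, which is what gives the repeated 'yes' and 'no' entries in the result.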