Any other suggestions for a string replacement method? (Python)


First, take a look at my code below.

import string

DNA=["Alpha", "Bravo", "Charlie", "Delta", "Echo", "CharlieChoo", "DeltaAir", "Alpha bet", "ChooChoo", "Airline"]

body = "{\"startDate\":\"2016-01-01\"\
,\"endDate\":\"2017-10-30\"\
,\"timeUnit\":\"date\"\
,\"keywordGroups\":[{\"groupName\":\"Alpha\",\"keywords\":[\"Alpha\"]}\
,{\"groupName\":\"Bravo\",\"keywords\":[\"Bravo\"]}\
,{\"groupName\":\"Charlie\",\"keywords\":[\"Charlie\"]}\
,{\"groupName\":\"Delta\",\"keywords\":[\"Delta\"]}\
,{\"groupName\":\"Echo\",\"keywords\":[\"Echo\"]}]\
,\"device\":\"\",\"ages\":[\"1\",\"11\"],\"gender\":\"\"}"

body = body.replace(DNA[0],DNA[5],2)
body = body.replace(DNA[1],DNA[6],2)
body = body.replace(DNA[2],DNA[7],2)
body = body.replace(DNA[3],DNA[8],2)
body = body.replace(DNA[4],DNA[9],2)

body
The output is as follows:

'{"startDate":"2016-01-01","endDate":"2017-10-30","timeUnit":"date","keywordGroups":
[{"groupName":"Alpha betChoo","keywords":["Alpha betChoo"]},
{"groupName":"ChooChooAir","keywords":["ChooChooAir"]},
{"groupName":"Charlie","keywords":["Charlie"]}, 
{"groupName":"Delta","keywords":["Delta"]},
{"groupName":"Airline","keywords":["Airline"]}],"device":"","ages":
["1","11"],"gender":""}'
My expected output is as follows:

#body = "{\"startDate\":\"2016-01-01\"\
#,\"endDate\":\"2017-10-30\"\
#,\"timeUnit\":\"date\"\
#,\"keywordGroups\":[{\"groupName\":\"CharlieChoo\",\"keywords\":[\"CharlieChoo\"]}\
#,{\"groupName\":\"DeltaAir\",\"keywords\":[\"DeltaAir\"]}\
#,{\"groupName\":\"Alpha bet\",\"keywords\":[\"Alpha bet\"]}\
#,{\"groupName\":\"ChooChoo\",\"keywords\":[\"ChooChoo\"]}\
#,{\"groupName\":\"Airline\",\"keywords\":[\"Airline\"]}]\
#,\"device\":\"\",\"ages\":[\"1\",\"11\"],\"gender\":\"\"}"
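So basically I am trying to replace the group names and keywords using the DNA list. In this example my DNA list has only 10 objects, but my real project contains thousands.

My own feeling is that replacing strings is not appropriate here, because the strings may overlap. Is there another way to accomplish this? One thing to keep in mind: I need the output to be the same type as the original string (only the words change). Thanks in advance.

--------------------------------------Edit---------------------------------------------------------------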

A new error appears with @AJAX1234's answer:

import pandas as pd
import json
#reading xlsx file
ex = pd.ExcelFile('mat_hierarchy.xlsx').parse('Sheet1')
DNA = ex.loc[:,'4Level']
DNA
Above is how I load my DNA data; below is the output.

0          Fruit
1          MixFruit
2          SuperFruit
3          PassionFruit
4          Orange
5          Lemon
6          Mango
................. it goes on forever :( 
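With this data I ran your code, and a "name 'a' is not defined" error keeps showing up. I am just a beginner, but my best guess is that my "DNA" is defined with an index (DNA.index[0] or something like that...). I replaced the "a" in your code with numbers, but it still does not work.

Any suggestions on this problem? Thanks for your input.

------------------------Edit 2-------------------------------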

body_intro = "{\"startDate\":\"2016-01-01\",\"endDate\":\"2017-10-30\",\"timeUnit\":\"date\",\"keywordGroups\":[{\"groupName\":\""
body_keywords = "\",\"keywords\":[\""
body_groupName = "\"]},{\"groupName\":\""
body_last = "\"]}],\"device\":\"\",\"ages\":[\"1\",\"2\",\"3\",\"4\",\"5\",\"6\",\"7\",\"8\",\"9\",\"10\",\"11\"],\"gender\":\"f\"}"


for i in range(0,len(DNA),5):
    if((len(DNA)%5==0) or (i < (len(DNA)-(len(DNA)%5)))):
        body = body_intro + DNA[i] + body_keywords + DNA[i] + body_groupName + DNA[i+1] + body_keywords + DNA[i+1] + body_groupName + DNA[i+2] + body_keywords + DNA[i+2] + body_groupName + DNA[i+3] + body_keywords + DNA[i+3] + body_groupName + DNA[i+4] + body_keywords + DNA[i+4] + body_last
    elif(len(DNA)%5==4):
        body = body_intro + DNA[i] + body_keywords + DNA[i] + body_groupName + DNA[i+1] + body_keywords + DNA[i+1] + body_groupName + DNA[i+2] + body_keywords + DNA[i+2] + body_groupName + DNA[i+3] + body_keywords + DNA[i+3] + body_last
    elif(len(DNA)%5==3):
        body = body_intro + DNA[i] + body_keywords + DNA[i] + body_groupName + DNA[i+1] + body_keywords + DNA[i+1] + body_groupName + DNA[i+2] + body_keywords + DNA[i+2] + body_last
    elif(len(DNA)%5==2):
        body = body_intro + DNA[i] + body_keywords + DNA[i] + body_groupName + DNA[i+1] + body_keywords + DNA[i+1] + body_last
    else:
        body = body_intro + DNA[i] + body_keywords + DNA[i] + body_last
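
The chunking loop above can probably be shortened by building each request as a Python dict and serializing it with json.dumps, so no branch per remainder is needed. The following is only a sketch under assumptions: the helper name build_bodies and the chunk_size parameter are hypothetical, the sample names are taken from the DNA output shown earlier, and every chunk's body is collected in a list (the loop above keeps only the last one).

```python
import json

def build_bodies(names, chunk_size=5):
    """Return one JSON body string per chunk of up to chunk_size names.

    build_bodies and chunk_size are hypothetical names for illustration.
    """
    bodies = []
    for start in range(0, len(names), chunk_size):
        # Slicing never raises at the end of the list, so a short final
        # chunk is handled without special cases.
        chunk = names[start:start + chunk_size]
        payload = {
            "startDate": "2016-01-01",
            "endDate": "2017-10-30",
            "timeUnit": "date",
            "keywordGroups": [{"groupName": n, "keywords": [n]} for n in chunk],
            "device": "",
            "ages": [str(a) for a in range(1, 12)],
            "gender": "f",
        }
        # Compact separators mimic the hand-written string (no spaces).
        bodies.append(json.dumps(payload, separators=(",", ":")))
    return bodies

names = ["Fruit", "MixFruit", "SuperFruit", "PassionFruit", "Orange", "Lemon", "Mango"]
bodies = build_bodies(names)
print(len(bodies))
```

If DNA is a pandas Series, passing list(DNA) should give the plain list of strings this sketch expects.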
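You can try this:

import json

new_body = json.loads(body)
DNA=["Alpha", "Bravo", "Charlie", "Delta", "Echo", "CharlieChoo", "DeltaAir", "Alpha bet", "ChooChoo", "Airline"]
# Each name in the first half of DNA maps to the name five positions later.
# For a list value (keywords) every element is mapped; for a plain string
# (groupName) the value d itself is mapped. Using d here avoids the
# "name 'a' is not defined" error on Python 3.
new_body['keywordGroups'] = [{c:[DNA[DNA.index(a)+5] for a in d] if isinstance(d, list) else DNA[DNA.index(d)+5] for c, d in i.items()} for i in new_body['keywordGroups']]
final_data = json.dumps(new_body)
Output: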

'{"startDate": "2016-01-01", "endDate": "2017-10-30", "gender": "", 
 "ages": ["1", "11"], "keywordGroups": 
  [{"keywords": ["CharlieChoo"], "groupName": "CharlieChoo"}, 
   {"keywords": ["DeltaAir"], "groupName":"DeltaAir"}, 
   {"keywords": ["Alpha bet"], "groupName": "Alpha bet"}, 
 {"keywords": ["ChooChoo"], "groupName": "ChooChoo"}, {"keywords":["Airline"], "groupName": "Airline"}], "device": "", "timeUnit": "date"}'
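
A variant of the same JSON idea that avoids DNA.index altogether, which may help when DNA is not a plain list: build a dict from old name to new name and look each value up in it. This is a sketch under assumptions. The function name rename_groups, the shortened body, and the two-entry mapping are hypothetical; in the question the mapping would come from pairing the first half of DNA with the second half, and a pandas Series can be turned into a plain list with DNA.tolist() before doing so.

```python
import json

# Hypothetical shortened body; the real one has more groups and fields.
body = ('{"startDate":"2016-01-01","keywordGroups":['
        '{"groupName":"Alpha","keywords":["Alpha"]},'
        '{"groupName":"Bravo","keywords":["Bravo"]}]}')

# Hypothetical mapping from old name to new name.
mapping = {"Alpha": "CharlieChoo", "Bravo": "DeltaAir"}

def rename_groups(body_text, mapping):
    """Parse the JSON, rename group names and keywords, serialize back."""
    data = json.loads(body_text)
    for group in data["keywordGroups"]:
        # dict.get leaves names that are not in the mapping unchanged.
        group["groupName"] = mapping.get(group["groupName"], group["groupName"])
        group["keywords"] = [mapping.get(k, k) for k in group["keywords"]]
    # Compact separators keep the result in the same style as the input.
    return json.dumps(data, separators=(",", ":"))

new_body = rename_groups(body, mapping)
print(new_body)
```

Because every value is looked up exactly once in the parsed structure, a replacement such as "CharlieChoo" can never be matched again by a later rule for "Charlie".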

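Just use a regular expression. I assume your DNA list is made up of pairs, each with a source name and a target name.

import re

# Integer division: "/" would give a float index on Python 3.
half = len(DNA) // 2
for i, t in enumerate(DNA[:half]):
    s = DNA[half + i]
    # Match the whole quoted name and put the quotes back around the replacement.
    body = re.sub('"' + re.escape(t) + '"', lambda m: '"' + s + '"', body, count=2)

Hope this helps.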
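
With thousands of names, one alternative to looping over re.sub is a single compiled pattern that matches any quoted source name, so the text is scanned once and replacements are never re-examined. This is a sketch, not a tuned implementation: the names mapping, pattern, and replace_all are hypothetical, and the sample string is a shortened stand-in for the real body.

```python
import re

DNA = ["Alpha", "Bravo", "Charlie", "Delta", "Echo",
       "CharlieChoo", "DeltaAir", "Alpha bet", "ChooChoo", "Airline"]

# Pair the first half (source names) with the second half (target names).
half = len(DNA) // 2
mapping = dict(zip(DNA[:half], DNA[half:]))

# One alternation of all source names, each escaped, anchored by the quotes.
pattern = re.compile('"(' + '|'.join(re.escape(k) for k in mapping) + ')"')

def replace_all(text):
    """Replace every quoted source name with its quoted target in one pass."""
    return pattern.sub(lambda m: '"' + mapping[m.group(1)] + '"', text)

# Hypothetical shortened sample of the body string.
sample = ('{"groupName":"Alpha","keywords":["Alpha"]},'
          '{"groupName":"Charlie","keywords":["Charlie"]}')
result = replace_all(sample)
print(result)
```

Since the quotes are part of the pattern, "Charlie" does not match inside "CharlieChoo", which is the overlap the question runs into with str.replace.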

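To be able to do a "bulk" replacement (and assuming you need to keep a count of the replaced elements), I would do the following:

import re

lookup = {"Alpha": "CharlieChoo",
          "Bravo": "DeltaAir",
          "Charlie": "Alpha bet",
          "Delta": "ChooChoo",
          "Echo": "Airline"}

# Remaining number of replacements allowed for each word.
lookup_count = {"Alpha": 2,
                "Bravo": 2,
                "Charlie": 2,
                "Delta": 2,
                "Echo": 2}

def replace_using_lookups(match):
    word = match.group(1)
    if word in lookup and lookup_count[word] > 0:
        lookup_count[word] -= 1
        return '"{}"'.format(lookup[word])
    return '"{}"'.format(word)


# Raw string so that \w is not treated as a string escape.
body = re.sub(r'"(\w+)"', replace_using_lookups, body)
If you don't need the lookup_count dict, you can perform the replacement with a simpler lambda.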
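
The simpler lambda mentioned above might look like the following sketch. The shortened body and the two-entry lookup are hypothetical stand-ins; dict.get returns the matched word itself when it is not a key, so JSON keys such as "groupName" pass through unchanged.

```python
import re

# Hypothetical shortened body; the real one is the JSON string from the question.
body = ('{"keywordGroups":[{"groupName":"Alpha","keywords":["Alpha"]},'
        '{"groupName":"Charlie","keywords":["Charlie"]}]}')

lookup = {"Alpha": "CharlieChoo", "Charlie": "Alpha bet"}

# Single pass over the text: each quoted word is looked up once, so the
# output of one replacement is never matched by another.
new_body = re.sub(
    r'"(\w+)"',
    lambda m: '"{}"'.format(lookup.get(m.group(1), m.group(1))),
    body,
)
print(new_body)
```

Note that \w+ only matches source names without spaces or punctuation; that holds for the keys in this example but is an assumption about the real data.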

Comments:

Personally, I would use a regular expression to perform the replacements simultaneously.

Does your DNA list contain keywords in this form: ['t1', 't2', 't3', 't4', 't5', 't6', 't7', 's1', 's2', 's3', 's4', 's5', 's6', 's7']? That is, the number of t keywords equals the number of s keywords. If so, please try my answer below.

Can you see my edit in response to your answer? I am getting some errors and hope you can take a look. Thanks again for your answer!

@kang DNA is a pandas object, so it probably does not support .index. In your most recent edit, does body stay the same?