Python: tokenizing and replacing words

Tags: python, dictionary, tokenize, string-parsing

I'm trying to build sentence-like strings with random words dropped into them. Specifically, I would have something like:

"The weather today is [weather_state]."
and I'd like to find all the tokens in [brackets] and replace each one with a random counterpart from a dictionary or list, leaving me with:

"The weather today is warm."
"The weather today is bad."

Keep in mind that the [bracket] tokens won't always be in the same position, and a string may contain several bracketed tokens, for example:

"[person] is feeling really [how] today, so he's not going [where]."
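I really don't know where to start, or whether the tokenize or token modules are even the best solution for this. Any hints pointing me in the right direction would be greatly appreciated!

Edit: just to clarify, I don't really need to use square brackets; any non-standard character will do.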

You can use the format method:

>>> a = 'The weather today is {weather_state}.'
>>> a.format(weather_state = 'awesome')
'The weather today is awesome.'
>>>
Also:


Of course, this approach only works if you can switch from square brackets to curly braces.
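To connect the format method back to the random replacement the question asks about, here is a minimal sketch. The fill helper and the word list are hypothetical names made up for illustration; the idea is simply to pick one random value per placeholder name and pass the result to format as keyword arguments.

```python
import random

# Hypothetical word lists, keyed by placeholder name.
words = {'weather_state': ['warm', 'bad', 'awesome']}

def fill(template, words):
    """Pick one random option per placeholder and substitute it with str.format."""
    choices = {name: random.choice(options) for name, options in words.items()}
    # str.format ignores keyword arguments that the template does not use.
    return template.format(**choices)

print(fill('The weather today is {weather_state}.', words))
```

Because the dictionary keys are passed as keyword arguments, every placeholder in the template must have an entry in the dictionary, otherwise format raises a KeyError.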

You're looking for re.sub with a callback function:

import random
import re

# Candidate replacements for each placeholder name.
words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out']
}

def random_str(m):
    # m.group(1) is the placeholder name captured between the brackets.
    return random.choice(words[m.group(1)])


text = "[person] is feeling really [how] today, so he's not going [where]."
print(re.sub(r'\[(.+?)\]', random_str, text))

# Example output (varies from run to run):
# me is feeling really stupid today, so he's not going away.
Note that, unlike the format method, this allows more sophisticated processing of placeholders, e.g.

[person:upper] got $[amount if amount else 0] etc

Basically, you can build your own "template engine" on top of this.
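As a rough sketch of what such a small "template engine" might look like, the following handles placeholders of the form [name] and [name:modifier]. The render function, the MODIFIERS table, and the word lists are all assumptions made up for this example; conditional expressions like the [amount if amount else 0] form are not covered here.

```python
import random
import re

# Hypothetical word lists and modifiers, chosen only for illustration.
WORDS = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
}
MODIFIERS = {
    'upper': str.upper,
    'title': str.title,
}

# Matches [name] or [name:modifier]; group 2 is None when no modifier is given.
PLACEHOLDER = re.compile(r'\[(\w+)(?::(\w+))?\]')

def render(template, words=WORDS):
    def substitute(match):
        name, modifier = match.group(1), match.group(2)
        if name not in words:
            return match.group(0)  # leave unknown placeholders untouched
        value = random.choice(words[name])
        if modifier in MODIFIERS:
            value = MODIFIERS[modifier](value)
        return value
    return PLACEHOLDER.sub(substitute, template)

print(render("[person:upper] is feeling really [how] today."))
```

Leaving unknown placeholders untouched is a design choice for this sketch; raising an error instead would make typos in templates easier to spot.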

If you use curly braces instead of square brackets, your string can be used as a string formatting template. You can fill it with lots of substitutions using itertools.product:

import itertools as IT

text = "{person} is feeling really {how} today, so he's not going {where}."
persons = ['Buster', 'Arthur']
hows = ['hungry', 'sleepy']
wheres = ['camping', 'biking']

for person, how, where in IT.product(persons, hows, wheres):
    print(text.format(person=person, how=how, where=where))

yields

Buster is feeling really hungry today, so he's not going camping.
Buster is feeling really hungry today, so he's not going biking.
Buster is feeling really sleepy today, so he's not going camping.
Buster is feeling really sleepy today, so he's not going biking.
Arthur is feeling really hungry today, so he's not going camping.
Arthur is feeling really hungry today, so he's not going biking.
Arthur is feeling really sleepy today, so he's not going camping.
Arthur is feeling really sleepy today, so he's not going biking.

To generate random sentences, you could use:

import random

for i in range(5):
    person = random.choice(persons)
    how = random.choice(hows)
    where = random.choice(wheres)
    print(text.format(person=person, how=how, where=where))

If you must use square brackets and there are no braces in your format, you could replace the brackets with braces and then proceed as above:

text = "[person] is feeling really [how] today, so he's not going [where]."
text = text.replace('[','{').replace(']','}')

Comments:

This may be a silly suggestion, but have you looked into string formatting with {}s?

This person=person, how=how, where=where stuff gets really silly if there are hundreds of them.

I decided to stay away from format(**locals()) because it doesn't make explicit how the substitutions are made. But if you have hundreds of variables, format(**locals()) would be the way to go.

That's great, I love how clean and efficient it is. It does the job, and as a Python beginner, understanding it gives me a head start. :) The smart thing would be to write a dictionary file, keep it on disk, and load it into the "words" dictionary... Any pointers on what the syntax of a dictionary file would look like? Many thanks!

@bitworks: the simplest and most convenient option is JSON.
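The JSON suggestion from the comments could look roughly like the sketch below. The file name words.json and its contents are assumptions for illustration; the example writes the file to a temporary directory first so that it runs on its own, whereas in practice the file would simply be edited by hand and kept on disk.

```python
import json
import os
import random
import re
import tempfile

# Hypothetical word lists; normally these would live only in the JSON file.
words = {
    "person": ["you", "me"],
    "how": ["fine", "stupid"],
    "where": ["away", "out"],
}

# Write the dictionary to disk as JSON (temporary directory used for the demo).
path = os.path.join(tempfile.mkdtemp(), "words.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(words, f, indent=2)

# Load it back into a plain Python dict of lists.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

def random_str(m):
    # Look up the bracketed name in the loaded dictionary.
    return random.choice(loaded[m.group(1)])

text = "[person] is feeling really [how] today, so he's not going [where]."
sentence = re.sub(r"\[(.+?)\]", random_str, text)
print(sentence)
```

The resulting file is ordinary JSON: an object whose keys are the placeholder names and whose values are arrays of strings, which closely mirrors the Python dictionary literal.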