Separating joined words in a string using Python


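What I want to do is read this string in Python and separate the joined words. I am looking for a regular expression that separates the joined words in the string.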

I want to read the following string from a file:

"10JAN2015AirMail standard envelope from HyderabadAddress details:John Cena Palm DriveAdelaide.Also Contained:NilAction Taken:Goods referred to HGI QLD for further action.Attachments:Nil34FEB2004"

and separate the joined words in it.

I need to write a regular expression that separates joined words such as:

'10JAN2015AirMail', 'HyderabadAddress', 'details:John', 'DriveAdelaide'

The regex should identify joined words like these and separate them with spaces within the same string, for example:

'10 JAN 2015 AirMail', 'Hyderabad Address', 'details : John'

The full expected output is:

"10 JAN 2015 AirMail standard envelope from Hyderabad Address details : John Cena Palm Drive Adelaide. Also Contained: Nil Action Taken: Goods referred to HGI QLD for further action. Attachments : Nil 34 FEB 2004."

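My attempt so far, which does not work:

import re

# Raw string so the backslash in the Windows path is taken literally
text = open(r'C:\sample.txt', 'r').read().replace("\n", "").replace("\t", "").replace("-", "").replace("/", " ")

newtext = re.sub('[a-zA-Z0-9_:]', '', text)  # This regex does not work. Please assist

print(text)
print(newtext)

What we need is evidence that you tried to solve the problem yourself. Have you tried anything? You can't just post a problem and ask for a solution without first showing your own attempt.

Thanks Rafael. As you mentioned, the logic is clear; I am looking into a more efficient way of doing it with regular expressions.

Do you want one magic regex that does all of this, or is a combination of several regexes acceptable? I can try to take it further, but the result may be neither pretty nor efficient.

Sometimes we just need to be pointed in the right direction.

I know this could be solved much more simply by classifying the characters (uppercase, lowercase, digits), but I prefer a more verbose solution: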
test_text = "10JAN2015AirMail standard envelope from HyderabadAddress details:John Cena Palm DriveAdelaide.Also Contained:NilAction Taken:Goods referred to HGI QLD for further action.Attachments:Nil34FEB2004"
splitted_text = test_text.split(' ')
# Flags recording the class of the previous character (digit, lowercase, uppercase)
num = False
low = False
upp = False
result = []

for word in splitted_text:
    new_word = ''
    # Only words that mix character classes need to be split
    if not word.isupper() and not word.islower():
        # Seed the flags from the first character so no leading space is added
        if word[0].isnumeric():
            num = True
            low = False
            upp = False
        elif word[0].islower():
            num = False
            low = True
            upp = False
        elif word[0].isupper():
            num = False
            low = False
            upp = True
        for letter in word:
            if letter.isnumeric():
                # A digit continues a run of digits; otherwise it starts a new token
                if num:
                    new_word += letter
                else:
                    new_word += ' ' + letter
                low = False
                upp = False
                num = True
            elif letter.islower():
                # Lowercase continues a word after a lowercase or uppercase letter
                if low or upp:
                    new_word += letter
                else:
                    new_word += ' ' + letter
                low = True
                upp = False
                num = False
            elif letter.isupper():
                # Uppercase after a lowercase letter or a digit starts a new token
                if low or num:
                    new_word += ' ' + letter
                else:
                    new_word += letter
                low = False
                upp = True
                num = False
            else:
                # Punctuation is set off with a space; the flags are left unchanged
                new_word += ' ' + letter
        result.append(new_word)
    else:
        result.append(word)
print(' '.join(result))
# 10 JAN 2015 Air Mail standard envelope from Hyderabad Address details : John Cena Palm Drive Adelaide . Also Contained : Nil Action Taken : Goods referred to HGI QLD for further action . Attachments : Nil 34 FEB 2004
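
For comparison, the character-class idea (uppercase, lowercase, digits) can also be sketched with regular expressions, which is what the question originally asked for. This is only one possible approach, not the answerer's method. The function name `split_joined` is made up for illustration, and the choice to set off ':' and '.' with spaces is an assumption made to mirror the output of the loop above. Note that calling `re.sub` with an empty replacement, as in the question's attempt, deletes the matched characters; here zero-width lookarounds are used so that a space is inserted at each boundary and nothing is removed.

```python
import re

def split_joined(text):
    """Insert spaces at character-class boundaries (hypothetical helper)."""
    # Lowercase followed by uppercase: "AirMail" -> "Air Mail"
    text = re.sub(r'(?<=[a-z])(?=[A-Z])', ' ', text)
    # Letter followed by digit, or digit followed by letter: "Nil34FEB" -> "Nil 34 FEB"
    text = re.sub(r'(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])', ' ', text)
    # Assumption: ':' and '.' become separate tokens, as in the loop-based answer
    text = re.sub(r'\s*([:.])\s*', r' \1 ', text)
    # Collapse any runs of whitespace created by the steps above
    return re.sub(r'\s+', ' ', text).strip()

sample = ("10JAN2015AirMail standard envelope from HyderabadAddress "
          "details:John Cena Palm DriveAdelaide.Also Contained:NilAction "
          "Taken:Goods referred to HGI QLD for further action."
          "Attachments:Nil34FEB2004")
print(split_joined(sample))
```

Like the loop, this sketch splits "AirMail" into "Air Mail" and leaves all-uppercase runs such as "HGI" untouched; an uppercase acronym glued directly to a capitalized word would not be separated by either version.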