Python regex for capturing Korean letters


The names in my DataFrame look like this:

'가락시장(340)',
'가락시장(8)',
'가산디지털단지(7)',
'강남(222)',
'강남구청',
'강동',
'강동구청',
'강변(214)',
'개롱',
'개화산',
'거여',
'건대입구(212)',
'건대입구(7)',
'경복궁(317)',
'경찰병원(341)',
'고덕',
'고려대',
'고속터미널(329)',
'고속터미널(7)',
'공덕(5)',
'공덕(6)',
'공릉',
'광나루',
...

The whole list is stored in this column.

I tried to produce the expected output, but

df['name']

did not change.

How can I fix this?

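We can capture your desired output with a simple expression: use a single

'

as the left boundary, then collect the letters, similar to:

'([\p{L}]+)

Note that Python's built-in re module does not support \p{L}. The pattern works with the third-party regex package.

If this expression isn't what you want, it can be modified or changed.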

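If installing a third-party package is not an option, a similar result can be sketched with the standard library alone. The example below is only an illustration and rests on one assumption: the names are made of precomposed Hangul syllables (U+AC00 to U+D7A3), so the character range 가-힣 covers them. The helper name station_name and the sample list are hypothetical.

```python
import re

# Assumption: names consist of precomposed Hangul syllables (U+AC00 to U+D7A3).
HANGUL_RUN = re.compile(r"[가-힣]+")


def station_name(text):
    """Return the first run of Hangul syllables in text, or None if there is none."""
    match = HANGUL_RUN.search(text)
    return match.group() if match else None


# Hypothetical sample taken from the names in the question.
samples = ["가락시장(340)", "강남구청", "건대입구(7)"]
names = [station_name(s) for s in samples]
print(names)
```

Unlike \p{L}, this range ignores Latin letters and Hangul jamo, so it would need widening if the names contain anything other than Hangul syllables.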

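You can remove the parentheses and the characters inside them with the following code:

import re

# Matches a parenthesized group such as "(340)".
pattern = re.compile(r'\(\w*\)')

# YOUR_DATA_LIST is a placeholder for your own list of names.
for text in YOUR_DATA_LIST:
    only_station_name = pattern.sub('', text)
    print(only_station_name)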
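The same substitution can also be applied directly to the DataFrame column. The question's own attempt is not shown, so this is only a guess at the cause: pandas string methods return a new Series, and the column stays unchanged unless the result is assigned back. A minimal sketch, assuming pandas is installed and using a small made-up DataFrame in place of the real one:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame in the question.
df = pd.DataFrame({"name": ["가락시장(340)", "강남구청", "공덕(5)"]})

# str.replace returns a new Series; assign it back so the column actually changes.
df["name"] = df["name"].str.replace(r"\(\w*\)", "", regex=True)

print(df["name"].tolist())
```

Passing regex=True explicitly matters here, because newer pandas versions treat the pattern as a literal string by default.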

Test script for '([\p{L}]+):

# Python's built-in re module does not support \p{L}.
# This script assumes the third-party regex package is installed (pip install regex).
import regex as re

regex = r"'([\p{L}]+)"

test_str = ("'가락시장(340)',\n"
    " '가락시장(8)',\n"
    " '가산디지털단지(7)',\n"
    " '강남(222)',\n"
    " '강남구청',\n"
    " '강동',\n"
    " '강동구청',\n"
    " '강변(214)',\n"
    " '개롱',\n"
    " '개화산',\n"
    " '거여',\n"
    " '건대입구(212)',\n"
    " '건대입구(7)',\n"
    " '경복궁(317)',\n"
    " '경찰병원(341)',\n"
    " '고덕',\n"
    " '고려대',\n"
    " '고속터미널(329)',\n"
    " '고속터미널(7)',\n"
    " '공덕(5)',\n"
    " '공덕(6)',\n"
    " '공릉',\n"
    " '광나루',")

matches = re.finditer(regex, test_str, re.MULTILINE)

for match_num, match in enumerate(matches, start=1):
    # Group 0 is the whole match, including the leading quote.
    print(f"Match {match_num} was found at {match.start()}-{match.end()}: {match.group()}")

    # Numbered groups start at 1; group 1 holds the letters only.
    for group_num in range(1, len(match.groups()) + 1):
        print(f"Group {group_num} found at {match.start(group_num)}-{match.end(group_num)}: {match.group(group_num)}")
