How to delete () using the re module in Python

Developer https://www.devze.com 2023-03-05 20:11 Source: Internet
I am having trouble processing XML text. I want to delete the outer () from my text, as follows:

from <b>(apa-bhari(n))</b> to <b>apa-bhari(n)</b>

I wrote the following code:

name= re.sub('<b>\((.+)\)</b>','<b>\1</b>',name)

But this only returns

<b></b>

I do not understand escape sequences and backreferences. Please tell me the solution.


You need to use raw strings, or escape the backslashes:

name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)
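
To see why the raw string matters, here is a minimal sketch using the sample text from the question. The variable names broken and fixed are only for illustration.

```python
import re

name = '<b>(apa-bhari(n))</b>'

# In a normal string literal, '\1' is the single control character chr(1),
# so re.sub never sees a backreference to group 1.
broken = re.sub(r'<b>\((.+)\)</b>', '<b>\1</b>', name)

# In a raw string, r'\1' is a backslash followed by the digit 1,
# which re.sub interprets as "insert group 1 here".
fixed = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)

print(repr(broken))  # the tags with chr(1) between them
print(repr(fixed))   # the tags with the captured text between them
```

The control character is invisible in most terminals, which is probably why the original output looked like an empty pair of tags.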


In a normal Python string literal, a backslash followed by a digit is an octal escape, so you need to escape the backslash itself; the following assertions all hold:

assert '\1' == '\x01'
assert len('\\1') == 2
assert '\)' == '\\)'

So, your code would be

name = re.sub('<b>\\((.+)\\)</b>','<b>\\1</b>',name)

Alternatively, use a raw string literal:

name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>',name)


Try:

name= re.sub('<b>\((.+)\)</b>','<b>\\1</b>',name)

or, if you do not want unreadable code with \\ everywhere you use backslashes, do not escape the backslashes manually; add an r before the string instead. For example, r"my\String" is the same as "my\\String". (Note that a raw string cannot end with a single backslash, so r"myString\" is a syntax error.)
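
As a quick check of that equivalence, the sketch below compares a raw string with its manually escaped form. The name replacement is hypothetical and used only for this example.

```python
# A raw string keeps every backslash as a literal character,
# so these two spellings produce the same string.
replacement = r'<b>\1</b>'
assert replacement == '<b>\\1</b>'

# The raw form contains two characters: a backslash and the digit 1.
assert len(r'\1') == 2

# Without the r prefix, '\1' collapses into one character (octal escape 1).
assert len('\1') == 1

print(list(replacement))
```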
