I am having trouble processing XML text. I want to remove the outer parentheses from my text, like this:
from <b>(apa-bhari(n))</b> to <b>apa-bhari(n)</b>
I wrote the following code:
name= re.sub('<b>\((.+)\)</b>','<b>\1</b>',name)
But it only returns
<b></b>
I do not understand escape sequences and backreferences. Please tell me the solution.
You need to use raw strings, or escape the backslashes:
name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)
You need to escape a backslash in an ordinary Python string when it is followed by a digit, because '\1' is read as an octal escape; the following assertions all hold:
assert '\1' == '\x01'
assert len('\\1') == 2
assert '\)' == '\\)'
So, your code would be
name = re.sub('<b>\\((.+)\\)</b>','<b>\\1</b>',name)
Alternatively, use raw string literals (the r prefix), which is the usual way to write regular expressions:
name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>',name)
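As a quick check, the snippet below runs the raw-string version on the sample from the question and, for comparison, a replacement written without the r prefix. It is only a minimal sketch; the variable names fixed and broken are made up for illustration.

```python
import re

# Sample input taken from the question.
name = '<b>(apa-bhari(n))</b>'

# Raw strings pass \( and \1 through to the regex engine unchanged.
fixed = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)

# Without the r prefix, '\1' is the single control character chr(1),
# so the replacement no longer contains a backreference.
broken = re.sub(r'<b>\((.+)\)</b>', '<b>\1</b>', name)

print(fixed)         # <b>apa-bhari(n)</b>
print(repr(broken))  # '<b>\x01</b>'
```

The control character is invisible in most terminals, which would explain why the output looks like an empty <b></b>.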
Try:
name= re.sub('<b>\((.+)\)</b>','<b>\\1</b>',name)
Or, if you do not want unreadable code with \\ everywhere you use a backslash, do not escape the backslashes manually; add an r before the string instead. For example, r"my\String" is the same as "my\\String". (Note that a raw string cannot end in a single backslash.)
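The equivalence can be checked directly. This is a small sketch; raw_repl and escaped_repl are illustrative names, not part of the original code.

```python
# Two spellings of the same replacement string: raw vs. manually escaped.
raw_repl = r'<b>\1</b>'
escaped_repl = '<b>\\1</b>'

print(raw_repl == escaped_repl)           # True
print(len(r'\1'), len('\\1'), len('\1'))  # 2 2 1
```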