Decoding a string in Python ... with _\x08__\x08_d\x08de\x08el\x08li\x08it\x08te\x08em in it

Developer https://www.devze.com 2023-03-23 11:18 Source: the web
I have some strings that contain characters like these mixed in with normal letters, and I want to convert all the "weird" characters to their normal representation. So my question is: is there a Pythonic way of doing this?

For example, I have this string:

Mymethods defined here:
 |  
 |  __add__(...)
 |      x.__add__(y) <==> x+y

Somehow it comes out like this:

Mymethods defined here:\n 
 |  \n 
 |  _\x08__\x08_a\x08ad\x08dd\x08d_\x08__\x08_(...)\n 
 |      x.__add__(y) <==> x+y


Some (very old) bits of software used to simulate bold text on printers (such as daisy wheel or golfball typewriters) by printing a character, then a backspace, then the same character again. It looks like your text is an example of this.

That means you need to remove not just the backspace but also the character following it:

>>> s = "_\x08__\x08_d\x08de\x08el\x08li\x08it\x08te\x08em in it"
>>> import re
>>> re.sub("\x08.", "", s)
'__delitem in it'
>>> 

Better of course would be to fix whatever is generating this text and get it to generate bold text in a more useful manner.
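
The substitution above can be wrapped in a small helper. This is only a minimal sketch: the name `strip_overstrike` is made up for illustration, and it assumes every backspace is followed by a repeat of the character before it (the bold-by-overstrike pattern described above).

```python
import re

# Matches a backspace (\x08) plus the single character that follows it.
# Note: r"\b" means "word boundary" in a regex, so \x08 is spelled out.
_BACKSPACE_AND_NEXT = re.compile(r"\x08.")


def strip_overstrike(text):
    """Remove each backspace and the re-struck character after it."""
    return _BACKSPACE_AND_NEXT.sub("", text)


# The overstruck line from the question.
line = " |  _\x08__\x08_a\x08ad\x08dd\x08d_\x08__\x08_(...)"
cleaned = strip_overstrike(line)
print(repr(cleaned))  # ' |  __add__(...)'
```

Because `.` does not match a newline by default, the `\n` characters in the text are left alone.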


\x08 is the character representation for backspace.

So you should do a regexp replace

s/.\x08//g

That will remove every \x08 together with the character just before it.

The \n is OK because it represents the end of the line.
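
For comparison, here is how the sed-style substitution might look in Python. It deletes the character before each backspace instead of the one after it. For bold overstrike (the same character struck twice) both directions give the same result. They differ only if the text also contains underline overstrike (an underscore, a backspace, then the letter), which is an assumption here, since the question only shows the bold form. The function names are hypothetical.

```python
import re


def drop_before(text):
    """Delete the character BEFORE each backspace (the s/.\\x08//g form)."""
    return re.sub(r".\x08", "", text)


def drop_after(text):
    """Delete the character AFTER each backspace (the "\\x08." form)."""
    return re.sub(r"\x08.", "", text)


bold = "a\x08ad\x08dd\x08d"        # each letter struck twice
underlined = "_\x08a_\x08d_\x08d"  # underscore, backspace, letter

print(drop_before(bold), drop_after(bold))              # add add
print(drop_before(underlined), drop_after(underlined))  # add ___
```

If underlined text can occur, deleting the character before the backspace keeps the letter that was struck last, which is usually the one you want.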
