splitting and escaped forward slashes in Python

Developer https://www.devze.com 2023-04-12 14:52 Source: Internet
I have a file containing Perl-style regexes of the form /pattern/replace/ that I'm attempting to read into Python as a list of compiled patterns and their associated replacement strings. Below is what I've done so far.

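import re

def get_regex(filename):
    regex = []
    # Read the file, skipping comment lines that start with '#'.
    with open(filename, 'r') as fi:
        text = [l for l in fi if not l.startswith("#")]
    for line in text:
        # Drop the leading '/', split on '/', and discard the trailing newline piece.
        ptn, repl = line[1:].split('/')[:-1]
        regex.append((re.compile(ptn), repl))
    return regex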

This works perfectly well until I get to lines with escaped forward slashes, like this:

/$/ <\/s>/

When I try to split this string, Python returns a list of three elements, ['$', ' <\\', 's>'], rather than the hoped-for two, ['$', ' <\\/s>']. Is there some way to make split interpret the escapes?
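To reproduce this in isolation, here is a minimal sketch. It assumes the line in the file contains a single backslash before the inner slash; the doubled backslash in the lists above is just how Python's repr displays it.

```python
# The line as it would appear in the file: one backslash before the inner slash.
line = r"/$/ <\/s>/" + "\n"

# str.split treats every '/' as a delimiter; it has no notion of escaping.
parts = line[1:].split('/')[:-1]
print(parts)
```

The slice [:-1] only drops the trailing newline piece, so the escaped slash still produces an extra element and the two-name unpacking fails.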


Not really, no; str.split() has no notion of escape characters. Your best bet would probably be to use re.split() instead, with a regex that uses a negative lookbehind to make sure the forward slash isn't preceded by a backslash, e.g.

UNESCAPED_SLASH_RE = re.compile(r'(?<!\\)/')
ptn, repl = UNESCAPED_SLASH_RE.split(line[1:])[:-1]
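
For completeness, here is a sketch of how the two lines above might slot into the loader. The helper name parse_rule is hypothetical, and this get_regex takes an iterable of lines rather than a filename. Turning \/ back into / in the replacement is an assumption about what the rules are meant to do, not something Python requires.

```python
import re

# Matches a '/' that is not immediately preceded by a backslash.
UNESCAPED_SLASH_RE = re.compile(r'(?<!\\)/')

def parse_rule(line):
    """Split one '/pattern/replace/' line into (compiled pattern, replacement)."""
    # Drop the newline and the leading '/'; the last piece follows the final '/'.
    ptn, repl = UNESCAPED_SLASH_RE.split(line.rstrip('\n')[1:])[:-1]
    # Assumption: an escaped '\/' in the replacement should become a plain '/'.
    return re.compile(ptn), repl.replace('\\/', '/')

def get_regex(lines):
    """Hypothetical variant of get_regex that takes an iterable of lines."""
    return [parse_rule(l) for l in lines if l.strip() and not l.startswith('#')]

rules = get_regex(["# comment\n", r"/$/ <\/s>/" + "\n"])
pattern, repl = rules[0]
print(repr(repl))
print(pattern.sub(repl, "hello"))
```

One caveat: the lookbehind only checks the single character before the slash, so a slash that follows an escaped backslash (\\/) would be wrongly treated as escaped. If the rule file can contain that, a small character-by-character parser would be safer.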