Python regex match letter sequences

开发者 https://www.devze.com 2023-03-12 06:06 Source: Internet
I want to make a regex to match "AGGH", "TIIK", "6^^?" or whatever, but not "AGGA" or "ABCD". Basically, it's the pattern of letters that matters. Is there a way to ask for a character you have or haven't previously had?


You could extract the pattern of your strings like this:

def pattern(s):
    d = {}
    return [d.setdefault(c, len(d)) for c in s]

Examples:

>>> pattern("AGGH")
[0, 1, 1, 2]
>>> pattern("TKKG")
[0, 1, 1, 2]
>>> pattern("AGGA")
[0, 1, 1, 0]
>>> pattern("ABCD")
[0, 1, 2, 3]

This function makes it trivial to compare the pattern of two strings.
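As a rough sketch of that comparison (same_pattern and the candidates list are hypothetical names, not part of the original answer), you could filter strings by whether they share the pattern of a reference string:

```python
def pattern(s):
    # Map each character to the index of its first appearance.
    d = {}
    return [d.setdefault(c, len(d)) for c in s]

def same_pattern(a, b):
    # Hypothetical helper: True when both strings repeat characters
    # in the same positions, e.g. "AGGH" and "TIIK".
    return pattern(a) == pattern(b)

candidates = ["AGGH", "TIIK", "6^^?", "AGGA", "ABCD"]
matches = [s for s in candidates if same_pattern(s, "AGGH")]
print(matches)  # ['AGGH', 'TIIK', '6^^?']
```

Because the comparison works on the index lists, it is not tied to four-character strings the way a hand-written regex would be.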


There is a way to do it with a regex:

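import re

strs = ("AGGH", "TIIK", "6^^?", "AGGA", "ABCD")
# one and two capture the first two characters; (?P=two) requires the third
# to repeat the second, and (?!(?P=one)) rejects a fourth equal to the first.
p = re.compile(r'^(?P<one>.)(?P<two>.)(?P=two)(?!(?P=one)).$')
for s in strs:
    print(s, p.match(s))

output:

AGGH <re.Match object; span=(0, 4), match='AGGH'>
TIIK <re.Match object; span=(0, 4), match='TIIK'>
6^^? <re.Match object; span=(0, 4), match='6^^?'>
AGGA None
ABCD None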

It's ugly, but it works. ;) The period before the dollar sign is needed if you want to match to the end of the string: (?!(?P=one)) is a "negative lookahead assertion", which only checks the next character without consuming it, so the period is what actually consumes that character.
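One caveat: with only that single lookahead, the regex also accepts strings such as "AGGG" or "AAAB", because nothing stops the second character from equaling the first, or the fourth from equaling the second. If the intent is that all three letters be distinct (an assumption, since the question does not say so explicitly), a stricter variant might look like this sketch, where strict is a hypothetical name:

```python
import re

# Assumed requirement: first, second/third, and fourth characters all differ.
strict = re.compile(
    r'^(?P<one>.)'              # first character
    r'(?!(?P=one))(?P<two>.)'   # second character, must differ from the first
    r'(?P=two)'                 # third character repeats the second
    r'(?!(?P=one)|(?P=two)).$'  # fourth differs from both earlier characters
)

for s in ("AGGH", "TIIK", "6^^?", "AGGA", "ABCD", "AGGG", "AAAB"):
    print(s, bool(strict.match(s)))
```

Only the first three strings should print True here; the extra lookaheads are what reject "AGGG" and "AAAB".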


Why not just use substring search?

if "AGGH" in myStr:
    print("Success!")


Yes, you can use conditional regex:

(?(id/name)yes-pattern|no-pattern)

See the details at http://docs.python.org/library/re.html
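For illustration, here is a small sketch of the conditional syntax, using the e-mail pattern from the re documentation (email is a hypothetical variable name). Note that the condition tests whether a group took part in the match, not whether a particular character was seen earlier, so it may not solve the question above on its own:

```python
import re

# (?(1)>|$): if group 1 (the opening "<") matched, require a closing ">";
# otherwise require the end of the string.
email = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')

for s in ("<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>"):
    print(s, bool(email.match(s)))
```

The first two strings should match and the last two should not, since their angle brackets are unbalanced.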
