I have this Python script, which is supposed to wrap everything that looks like a path in a tag to make a URL out of it.
import re

def wrap(text, regex):
    start, end = '<a href="/static', '">Link to the file</a>'
    # Process matches from last to first so earlier offsets stay valid
    matchs = sorted([(s.start(), s.end()) for s in re.finditer(regex, text)],
                    reverse=True)
    for match in matchs:
        text = text[:match[1]] + end + text[match[1]:]
        text = text[:match[0]] + start + text[match[0]:]
    return text
I tried many combinations, like this one:
>>> wrap('HA HA HA /services/nfs_qa/log.lol HO HO HO', '/services/nfs_qa/.* ??')
'HA HA HA <a href="/static/services/nfs_qa/log.lol HO HO HO">Link to the file</a>'
But I can't seem to get it right, so I could use a little help here!
Thanks in advance
It depends a bit on which characters you allow in path names, but this does the trick for your example:
wrap('HA HA HA /services/nfs_qa/log.lol HO HO HO', '/services/nfs_qa/[^ ]*')
'HA HA HA <a href="/static/services/nfs_qa/log.lol">Link to the file</a> HO HO HO'
The [^ ] means anything but a space (the opposite of [ ]).
If any character (including a space) is allowed in a path name, it's impossible to tell where the path ends.
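To see the difference, here is a small sketch comparing the two patterns on the sample string from the question. It only uses re.findall to show what each regex captures; the variable names are mine, not from the original code.

```python
import re

text = 'HA HA HA /services/nfs_qa/log.lol HO HO HO'

# Greedy ".*" runs to the end of the line, swallowing " HO HO HO".
greedy = re.findall(r'/services/nfs_qa/.*', text)

# "[^ ]*" stops at the first space, so only the path is matched.
no_space = re.findall(r'/services/nfs_qa/[^ ]*', text)

print(greedy)    # ['/services/nfs_qa/log.lol HO HO HO']
print(no_space)  # ['/services/nfs_qa/log.lol']
```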
"." matches every character (except a newline). You should instead match "everything except whitespace characters", which is \S (or, in this example, [^ ]):
wrap('HA HA HA /services/nfs_qa/log.lol HO HO HO', r'/services/nfs_qa/\S*')
Also, your wrap function could be written more simply using re.sub:
import re

def tag_it(match_obj):
    # group(0) is the whole matched path
    tags = "<a href =\"/static{0}\">Link to the File</a>"
    return tags.format(match_obj.group(0))

def wrap(text, regex):
    # re.sub calls tag_it for every match and substitutes its return value
    return re.sub(regex, tag_it, text)

a = wrap('HA HA HA /services/nfs_qa/log.lol HO HO HO', r'/services/nfs_qa/\S*')
print(a)
#Outputs:
#HA HA HA <a href ="/static/services/nfs_qa/log.lol">Link to the File</a> HO HO HO
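If you don't need a callback, the same substitution can probably be done with a replacement template, where \g<0> stands for the whole match. This is just a sketch of that variant; wrap_with_template is a name I made up for illustration.

```python
import re

def wrap_with_template(text, regex):
    # \g<0> in the replacement string refers to the entire matched text
    return re.sub(regex, r'<a href="/static\g<0>">Link to the file</a>', text)

result = wrap_with_template('HA HA HA /services/nfs_qa/log.lol HO HO HO',
                            r'/services/nfs_qa/\S*')
print(result)
# HA HA HA <a href="/static/services/nfs_qa/log.lol">Link to the file</a> HO HO HO
```

The callback version is still the better fit if the replacement needs any logic, such as escaping the path before putting it in the attribute.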
You are trying to match too much. You only want to match the URL, so an RE like r'/services/nfs_qa/\S+' is better suited. The \S+ matches one or more non-whitespace characters after /services/nfs_qa/.
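The practical difference between \S+ and \S* shows up when the bare prefix appears with nothing after it. A quick sketch, using a sample string I invented for the purpose:

```python
import re

# Hypothetical input: the prefix appears once on its own and once with a file name.
sample = 'see /services/nfs_qa/ and /services/nfs_qa/log.lol'

# \S* allows zero characters after the prefix, so the bare directory matches too.
star = re.findall(r'/services/nfs_qa/\S*', sample)

# \S+ requires at least one non-whitespace character after the prefix.
plus = re.findall(r'/services/nfs_qa/\S+', sample)

print(star)  # ['/services/nfs_qa/', '/services/nfs_qa/log.lol']
print(plus)  # ['/services/nfs_qa/log.lol']
```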