Regex for a URL that doesn't accept "\n"

Developer https://www.devze.com 2023-02-13 05:50 Source: web
I'm trying to replace a url in a string with regex. The problem is that the string can contain "\n". For example,

http://www.google.com\n

Here \n is a newline character; the string is collected from a textarea. Can anyone please help me find a regex that matches the URL and knows that \n isn't part of it?

Edit:

One of the regexes I've tried:

@"(?<!<\s*(?:a|img)\b[^<]*)(\b(?:(?:http|https|ftp|file)://|www\.)[^ |\\]+\b)"

r.Replace(text, "<a href=\"$1\" target=\"&#95;blank\">$1</a>")

Here r is my Regex object, and text is the input in which I want to replace the URL with a hyperlink.


What about just adding \n to the excluded character class in your existing regex?

@"(?<!<\s*(?:a|img)\b[^<]*)(\b(?:(?:http|https|ftp|file)://|www\.)[^ |\\\n]+\b)"
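A minimal sketch of how that could be used, assuming the text comes straight from a textarea and may therefore contain "\r\n" as well as "\n". The class and method names (UrlLinker, Linkify) are hypothetical, and adding \r to the excluded set is an assumption that goes slightly beyond the suggestion above. The replacement string is copied from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlLinker
{
    // The question's pattern with \r and \n added to the excluded characters.
    // Inside a verbatim string, \\ is a regex-escaped backslash and
    // \r / \n are regex escapes for carriage return and line feed.
    private static readonly Regex UrlPattern = new Regex(
        @"(?<!<\s*(?:a|img)\b[^<]*)(\b(?:(?:http|https|ftp|file)://|www\.)[^ |\\\r\n]+\b)");

    // Wraps every URL found in the text in an anchor tag.
    public static string Linkify(string text)
    {
        return UrlPattern.Replace(text, "<a href=\"$1\" target=\"&#95;blank\">$1</a>");
    }

    public static void Main()
    {
        // The newline after the URL stays outside the generated link.
        Console.WriteLine(Linkify("http://www.google.com\nnext line"));
    }
}
```

Note that this only stops the match at the line break; it does not HTML-encode the matched text before writing it into the href attribute.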


You could try something like...

(https?://)?(www\.)(\w)+(\.)(\w)+

Since a newline isn't a word character, \w+ stops matching at the \n. Note that the dots have to be escaped; an unescaped . matches any character.


I found a suggestion by smazy (https://stackoverflow.com/users/53104/smazy):

If you want to match up to the very end of the string, rather than stopping at a line break, use \z

Regex regex = new Regex(@"^[a-z0-9]+\z", RegexOptions.Multiline);

This holds for both RegexOptions.Multiline and RegexOptions.Singleline; \z matches only at the very end of the input either way.
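To make the difference concrete, here is a small sketch comparing $ with \z on an input that ends in a newline. The class and method names (EndAnchorDemo, MatchesWithDollar, MatchesWithZ) are made up for the illustration; the pattern is the one from the answer, used without any options.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndAnchorDemo
{
    // $ matches at the end of the input and also just before a final "\n".
    public static bool MatchesWithDollar(string input)
    {
        return Regex.IsMatch(input, @"^[a-z0-9]+$");
    }

    // \z matches only at the very end of the input.
    public static bool MatchesWithZ(string input)
    {
        return Regex.IsMatch(input, @"^[a-z0-9]+\z");
    }

    public static void Main()
    {
        Console.WriteLine(MatchesWithDollar("abc123\n")); // True
        Console.WriteLine(MatchesWithZ("abc123\n"));      // False
        Console.WriteLine(MatchesWithZ("abc123"));        // True
    }
}
```

So \z is useful when a whole value must be validated and a trailing line break should make it fail; for finding URLs inside longer text, excluding \n from the character class is the more direct fix.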


Why not write a proper regex, built up from the specs? Grab the RFC and build the regex up the same way the RFC builds up its definition:

http://www.ietf.org/rfc/rfc1738.txt

So, as a start:

scheme = @"http|https"
...
scheme-specific = "//" + user + ":" + password + "@" + host + ":" + port + "/" + url-path
url = scheme + ":" + scheme-specific

Sure, it is a lot of work, but that way you can be sure you're not going to miss any cases. It's also really important to think very carefully about what data to accept, as your current version seems to be XSS-prone as well (http://jehiah.cz/a/xss-stealing-cookies-101).

Anything short of this, and you'll keep coming back time and time again because of some other small thing you discovered later.
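As a rough illustration of that build-up, here is a sketch that assembles a validator from small named pieces, loosely following the http grammar in RFC 1738, section 5. It is a simplified subset and not a complete implementation: it covers only http, plus https as an assumption taken from the outline above, and it leaves out the user and password parts. All names (Rfc1738Url, IsValid, and the constants) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Rfc1738Url
{
    // Building blocks, named after the productions they approximate.
    const string Scheme      = @"https?";
    const string DomainLabel = @"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";
    const string TopLabel    = @"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?";
    const string HostName    = @"(?:" + DomainLabel + @"\.)*" + TopLabel;
    const string HostNumber  = @"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+";
    const string Host        = @"(?:" + HostName + "|" + HostNumber + ")";
    const string Port        = @"[0-9]+";
    const string Escape      = @"%[0-9A-Fa-f]{2}";
    const string UChar       = @"(?:[A-Za-z0-9$\-_.+!*'(),]|" + Escape + ")";
    const string HSegment    = @"(?:" + UChar + @"|[;:@&=])*";
    const string HPath       = HSegment + @"(?:/" + HSegment + ")*";
    const string Search      = HSegment;

    // url = scheme "://" host [ ":" port ] [ "/" hpath [ "?" search ] ]
    const string Url =
        Scheme + "://" + Host + "(?::" + Port + ")?" +
        "(?:/" + HPath + @"(?:\?" + Search + ")?)?";

    // \A and \z anchor the match to the whole input, so a trailing "\n" is rejected.
    private static readonly Regex UrlPattern = new Regex(@"\A(?:" + Url + @")\z");

    public static bool IsValid(string candidate)
    {
        return UrlPattern.IsMatch(candidate);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("http://www.google.com"));          // True
        Console.WriteLine(IsValid("http://www.google.com\n"));        // False
        Console.WriteLine(IsValid("http://www.google.com/<script>")); // False
    }
}
```

Because every piece is an explicit whitelist, characters such as <, > and " never make it into an accepted URL, which is the part that matters for the XSS concern.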
