
Java regex, need help with escape characters

开发者 (Developer) https://www.devze.com 2022-12-27 01:26 Source: web

My HTML looks like:

<td class="price" valign="top"><font color= "blue">&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>

I tried:

String result = "";
Pattern p = Pattern.compile("\"blue\">&nbsp;&nbsp;$&nbsp;(.*)&nbsp;</font></td>");
Matcher m = p.matcher(text);

if (m.find())
    result = m.group(1).trim();

Doesn't seem to be matching.

Am I missing an escape character?


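Unless it is escaped at the regex level, $ is an anchor: it matches at the end of the input (or at the end of each line in MULTILINE mode), not a literal dollar sign. To get the single \ that escapes the $ in the regex, the backslash itself has to be escaped in the Java string literal, i.e. written as two \ characters. So ...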

... Pattern.compile("\"blue\">&nbsp;&nbsp;\\$&nbsp;(.*)&nbsp;</font></td>");
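To see the fix in context, here is a minimal sketch that applies the escaped pattern to the HTML from the question. The class name PriceRegexDemo and the helper extractPrice are made up for illustration; only the pattern string comes from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PriceRegexDemo {
    // "\\$" in Java source is the two-character regex \$, which matches a literal dollar sign.
    private static final Pattern PRICE = Pattern.compile(
            "\"blue\">&nbsp;&nbsp;\\$&nbsp;(.*)&nbsp;</font></td>");

    /** Returns the trimmed price text, or an empty string if the pattern does not match. */
    static String extractPrice(String text) {
        Matcher m = PRICE.matcher(text);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // The HTML fragment from the question, including the run of spaces before the number.
        String html = "<td class=\"price\" valign=\"top\"><font color= \"blue\">"
                + "&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>";
        System.out.println(extractPrice(html)); // prints 5.93
    }
}
```

The call to trim() is what removes the spaces between the last &nbsp; and the number; the capture group itself still contains them.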

But the folks who commented that you shouldn't use regexes to parse HTML are absolutely right! Unless you want chronically fragile code, you should use a strict or non-strict HTML parser instead.


Maybe you need to escape the $ (with two backslashes, I think)?
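If counting backslashes feels error-prone, one alternative (not mentioned in the answers, so treat it as a suggestion) is to let Pattern.quote escape the literal parts of the pattern. The sketch below assumes the same HTML as in the question; the class name QuotedPriceRegexDemo and the PREFIX and SUFFIX constants are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedPriceRegexDemo {
    // Literal text expected before and after the price. The $ here is a plain character
    // in a Java string; Pattern.quote below makes the regex engine treat it literally too.
    private static final String PREFIX = "\"blue\">&nbsp;&nbsp;$&nbsp;";
    private static final String SUFFIX = "&nbsp;</font></td>";

    // Only the capture group in the middle is interpreted as regex syntax.
    private static final Pattern PRICE = Pattern.compile(
            Pattern.quote(PREFIX) + "(.*)" + Pattern.quote(SUFFIX));

    /** Returns the trimmed price text, or an empty string if the pattern does not match. */
    static String extractPrice(String text) {
        Matcher m = PRICE.matcher(text);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        String html = "<td class=\"price\" valign=\"top\"><font color= \"blue\">"
                + "&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>";
        System.out.println(extractPrice(html)); // prints 5.93
    }
}
```

This matches the same text as the hand-escaped version; it only changes who does the escaping.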
