
Eight backslashes required to replace single backslash with double backslashes?

Developer https://www.devze.com 2023-04-13 04:34 Source: the web

This is a "what the heck is going on here" question. I don't actually need a solution.

I had to replace every single backslash in a String with a double backslash. This is what I ended up doing...

strRootDirectory = strRootDirectory.replaceAll("\\\\", "\\\\\\\\");

...where strRootDirectory is a java.lang.String above.

Now, I understand the four backslashes for the first argument: the regex needs two backslashes to indicate a single literal backslash, and a Java string literal needs each of those doubled. That's fine.

BUT, what the heck is going on with the eight backslashes for the second argument? Isn't the replacement string supposed to be a literal (non-regex, I mean) string? I expected to need four backslashes in the second argument, in order to represent two backslashes.


The second argument isn't a regex-string, but a regex-replacement-string, in which the backslash also has a special meaning (it is used to escape the special character $ used for variable interpolation and is also used to escape itself).
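A minimal sketch of those two special characters in a replacement string. The class and method names here are made up for illustration; only String.replaceAll is from the question.

```java
public class ReplacementSyntaxDemo {
    // "$1" and "$2" in the replacement refer to capture groups 1 and 2.
    static String swap(String s) {
        return s.replaceAll("(\\w+)=(\\w+)", "$2=$1");
    }

    // The literal "\\$" is \$ in memory; the backslash escapes the dollar
    // sign, so a plain "$" ends up in the result.
    static String price(String s) {
        return s.replaceAll("USD", "\\$");
    }

    public static void main(String[] args) {
        System.out.println(swap("key=value")); // value=key
        System.out.println(price("USD 5"));    // $ 5
    }
}
```

Because a backslash escapes whatever follows it, a literal backslash in the output has to be written as an escaped backslash, which is where the extra doubling comes from.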

From The API:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.

-- http://download.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll(...)


It's easier if you use replace("\\","\\\\") (String.replace takes literal strings and is more efficient when both arguments are literal).

Or you can ensure correctness through the Pattern.quote and Matcher.quoteReplacement functions.
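Here is a rough side-by-side of the three approaches, assuming a sample Windows-style path as input. The class and method names are hypothetical; the library calls are the ones named above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DoubleBackslashes {
    // Regex-based: 4 backslashes in the pattern literal, 8 in the replacement literal.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    // Literal-based: String.replace interprets neither argument.
    static String viaReplace(String s) {
        return s.replace("\\", "\\\\");
    }

    // Regex-based, with both arguments quoted so they behave as literals.
    static String viaQuoting(String s) {
        return s.replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"));
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\logs"; // in memory: C:\temp\logs
        System.out.println(viaReplaceAll(path)); // C:\\temp\\logs
        System.out.println(viaReplace(path));    // C:\\temp\\logs
        System.out.println(viaQuoting(path));    // C:\\temp\\logs
    }
}
```

All three produce the same result; the literal-based version is the one where the backslash count in source matches what you would naively expect.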


"\\\\\\\\" leads to an in memory representation of a string with 4 backslashes: \\\\. Although the second string isn't a regex string, backslashes and dollar signs are still special characters in it, so they need to be escaped.


According to the Java reference material, the replaceAll method interprets backslashes in the replacement string as escape characters too. They can be used to escape the dollar sign, which otherwise refers to a matched group to re-use in the replacement string. So naturally, if you want to double the number of backslashes, and both parameters treat backslash as an escape character, you need twice as many backslashes in the replacement string.


Yep, it gets hairy when you need to do this sort of thing, doesn't it?

The reason you need so many backslashes is that you need to take into account that backslash is used for both escaping a string and for escaping a regex.

  • Take 1 backslash.
  • Double it for string escaping.
  • Double it again for escaping in the replacement string, where backslash is an escape character too.
  • Double it again because the replacement has to produce two consecutive backslashes for each one matched.

That makes 8.
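The doubling steps above can be checked by looking at how long each literal is in memory and at what each replacement actually emits. This is only a sketch; the class, constant, and method names are invented for the example.

```java
public class EscapingLayers {
    // 4 backslashes in source -> 2 characters in memory.
    static final String FOUR = "\\\\";
    // 8 backslashes in source -> 4 characters in memory.
    static final String EIGHT = "\\\\\\\\";

    // Replacement holds 2 backslashes in memory, which the replacement
    // syntax reads as one escaped backslash: each match is replaced by
    // a single backslash, so the string is unchanged.
    static String fourInReplacement(String s) {
        return s.replaceAll(FOUR, FOUR);
    }

    // Replacement holds 4 backslashes in memory, read as two escaped
    // backslashes: each match is replaced by two backslashes.
    static String eightInReplacement(String s) {
        return s.replaceAll(FOUR, EIGHT);
    }

    public static void main(String[] args) {
        System.out.println(FOUR.length());  // 2
        System.out.println(EIGHT.length()); // 4

        String s = "a\\b"; // in memory: a\b
        System.out.println(fourInReplacement(s));  // a\b
        System.out.println(eightInReplacement(s)); // a\\b
    }
}
```

The four-backslash replacement is what the question expected to work; it compiles and runs, but it turns each backslash back into one backslash.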


As a fan of not getting into super detailed explanations of regex... I figured out from the major answer post by Bart Kiers above:

System.out.println( "line1: "+"hello\\\\world" );
System.out.println( "line2: "+"hello\\\\world".replaceAll("\\\\\\\\", Matcher.quoteReplacement("\\") ) );

prints out

line1: hello\\world
line2: hello\world

I hope it helps...
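For what it's worth, here is a small sketch of what Matcher.quoteReplacement does to its argument, which is why the snippet above gets away with a plain "\\". The class and method names are hypothetical.

```java
import java.util.regex.Matcher;

public class QuoteReplacementDemo {
    // Returns the replacement-safe form of a literal string.
    static String quoted(String literal) {
        return Matcher.quoteReplacement(literal);
    }

    // Replaces every '-' with the given text, taken literally.
    static String replaceDashes(String s, String literal) {
        return s.replaceAll("-", Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(quoted("\\"));  // \\   (one backslash becomes two)
        System.out.println(quoted("$1"));  // \$1  (dollar sign gets escaped)
        System.out.println(quoted("abc")); // abc  (nothing special, unchanged)
        System.out.println(replaceDashes("a-b", "$1")); // a$1b
    }
}
```

Note that the snippet above also needs an import of java.util.regex.Matcher to compile.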
