Regex: mask all but the last 5 digits, ignoring non-digits

Posted on 开发者 (https://www.devze.com), 2023-02-20 14:49. Source: the web.
I want to match a number containing 17-23 digits interspersed with spaces or hyphens, then replace all but the last five digits with asterisks. I can match with the following regex:

((?:(?:\d)([\s-]*)){12,18})(\d[\s-]*){5}

My problem is that I can't get the regex to group all instances of [\s-] in the first section, and I have no idea how to get it to replace the initial 12-18 digits with asterisks (*).


How about this:

s/\d(?=(?:[ -]*\d){5,22}(?![ -]*\d))/*/g

The positive lookahead ensures that at least 5 digits follow the just-matched digit, while the embedded negative lookahead ensures that there are no more than 22.

However, there could still be more digits before the first-matched digit. That is, if there are 24 or more digits, this regex only operates on the last 23 of them. I don't know if that's a problem for you.
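
To make the behaviour described above easier to check, here is a minimal Java sketch of the same substitution. The class name `LookaheadMask` and the method name `mask` are made up for illustration; only the pattern comes from the answer.

```java
import java.util.regex.Pattern;

class LookaheadMask {
    // Same pattern as the substitution above: a digit followed by 5 to 22 more
    // digits (spaces or hyphens allowed in between), after which the run of digits ends.
    private static final Pattern MASKABLE =
            Pattern.compile("\\d(?=(?:[ -]*\\d){5,22}(?![ -]*\\d))");

    // Hypothetical helper: replaces every matching digit with an asterisk.
    static String mask(String input) {
        return MASKABLE.matcher(input).replaceAll("*");
    }

    public static void main(String[] args) {
        // 19 digits: the first 14 are masked, the last 5 are kept.
        System.out.println(mask("0123-4567-8901-2345-678"));
        // 24 digits: the leading digit has 23 digits after it, so it is left alone.
        System.out.println(mask("1234 5678 9012 3456 7890 1234"));
    }
}
```

The first call should print `****-****-****-**45-678` and the second `1*** **** **** **** ***0 1234`, which shows the caveat about 24 or more digits. Note also that the pattern by itself does not enforce the 17-digit minimum: any run of six or more digits gets masked.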


Even assuming this is feasible with a regex alone, I'd bet it would be much slower than matching with the non-capturing version of your regex and then iterating over the match in reverse, leaving the first 5 digits encountered (that is, the last five of the number) alone and replacing the rest with '*'.
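
The reverse-iteration idea could look roughly like this in Java. This is a sketch under assumptions, not a benchmarked implementation: the class name `ReverseMask`, the method `mask`, and the constant `KEEP` are all hypothetical, and the pattern is the question's regex with the capture groups removed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReverseMask {
    // Non-capturing form of the question's regex: 17 to 23 digits, each
    // optionally followed by whitespace or hyphens.
    private static final Pattern NUMBER = Pattern.compile("(?:\\d[\\s-]*){17,23}");
    private static final int KEEP = 5; // digits left visible at the end (assumed name)

    static String mask(String input) {
        // Masking never changes the length, so match offsets stay valid.
        StringBuilder out = new StringBuilder(input);
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            int seen = 0;
            // Walk the match from its last character back to its first.
            for (int i = m.end() - 1; i >= m.start(); i--) {
                char c = out.charAt(i);
                if (c >= '0' && c <= '9') {
                    seen++;
                    if (seen > KEEP) {
                        out.setCharAt(i, '*');
                    }
                }
            }
        }
        return out.toString();
    }
}
```

For example, `ReverseMask.mask("0123-4567-8901-2345-678")` should return `****-****-****-**45-678`, while a string with fewer than 17 digits is returned unchanged.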


I think your regex is OK, but you may need a callback in which you insert the asterisks with a second, inline regex. Below is a Perl example.

s/((?:\d[\s-]*){12,18})((?:\d[\s-]*){4}\d)/ add_asterisks($1,$2) /xeg

use strict;
use warnings;

my $str = 'sequence of digits 01-2  3-456-7-190 123-416 78 ';

if ($str =~ s/((?:\d[\s-]*){12,18})((?:\d[\s-]*){4}\d)/ add_asterisks($1,$2) /xeg )
{
   print "New string: '$str'\n";
}

sub add_asterisks {
   my ($pre,$post) = @_;
   $pre =~ s/\d/*/g;
   return $pre . $post;
}

__END__

Output

New string: 'sequence of digits **-*  *-***-*-*** ***-416 78 '
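
For readers who do not use Perl, the same callback approach can be sketched in Java with `Matcher.appendReplacement`. The class name `CallbackMask` and the method names are assumptions chosen to mirror the Perl code; the pattern is the one from the substitution above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CallbackMask {
    // Group 1: the leading 12 to 18 digits with their separators.
    // Group 2: the final five digits, separators allowed between them.
    private static final Pattern NUMBER =
            Pattern.compile("((?:\\d[\\s-]*){12,18})((?:\\d[\\s-]*){4}\\d)");

    // Counterpart of add_asterisks in the Perl version.
    static String addAsterisks(String pre, String post) {
        return pre.replaceAll("\\d", "*") + post;
    }

    static String mask(String input) {
        Matcher m = NUMBER.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = addAsterisks(m.group(1), m.group(2));
            // quoteReplacement keeps '$' and '\' from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `CallbackMask.mask` on the sample string from the Perl script should give the same masked string as the Perl output, with separators preserved exactly as typed.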


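Here is a Java variant of Alan Moore's answer that treats all word characters (\w, i.e. [a-zA-Z0-9_]) as maskable instead of just digits (\d). It works with strings of any length. Note that, as written, it keeps the last four word characters rather than five; change {4,} to {5,} to keep five.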

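import java.util.regex.Pattern;

// A word character that still has at least four word characters after it.
private static final Pattern MASKABLE =
        Pattern.compile("\\w(?=(?:\\W*\\w){4,}(?!\\W*\\w))");

public String maskNumber(String number) {
    // Replace every match in place; the last four word characters never match.
    return MASKABLE.matcher(number).replaceAll("*");
}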

This example

String[] numbers = {
        "F4546-6565-55654-5457",
        "F4546-6565-55654-54-D57",
        "F4546-6565-55654-54-D;5.7",
        "F4546-6565-55654-54-g5.37",
        "hd6g83g.duj7*ndjd.(njdhg75){7dh i8}",
        "####.####.####.675D-45",
        "****.****.****.675D-45",
        "**",
        "12"
};

for (String number : numbers){
    System.out.println(maskNumber(number));
}

Gives:

*****-****-*****-5457
*****-****-*****-*4-D57
*****-****-*****-*4-D;5.7
*****-****-*****-**-g5.37
*******.*********.(*******){*dh i8}
####.####.####.**5D-45
****.****.****.**5D-45
**
12