pulling dates out of a string

Developer (开发者) https://www.devze.com 2023-03-27 19:11 Source: web
My problem is as follows:

I have an array of strings that contain dates and other data. My date will have one of several formats:

  1. dd/mm/yyyy
  2. dd/mm/yy
  3. mm/yy
  4. d/m/yy
  5. yyyy
  6. yy

Is there a way to search a string for numbers that fit one of these patterns?

In addition, it would be nice if I could check if the dd is between 1 and 31 inclusive etc, but it would not be so bad if I had to do that afterwards.


Each of these formats corresponds to a regex:

  • dd/mm/yyyy ==> \b(?:0[1-9]|[12]\d|3[01])/(?:0[1-9]|1[012])/\d{4}\b
  • dd/mm/yy ==> \b(?:0[1-9]|[12]\d|3[01])/(?:0[1-9]|1[012])/\d{2}\b
  • mm/yy ==> \b(?:0[1-9]|1[012])/\d\d\b
  • d/m/yy ==> \b[1-9]/[1-9]/\d\d\b
  • yyyy ==> \b\d{4}\b
  • yy ==> \b\d\d\b
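
As a rough illustration of how these patterns might be applied one at a time, here is a minimal sketch in JavaScript (the language of the demo linked further down; in C# the same pattern strings would go to `Regex.Match`). The names `PATTERNS` and `findFirst` are made up for this example. The day part is written as `(?:0[1-9]|[12]\d|3[01])` so that 10 and 20 are accepted as days.

```javascript
// Shared fragments: day 01-31 and month 01-12 (two digits each).
const DAY = String.raw`(?:0[1-9]|[12]\d|3[01])`;
const MONTH = String.raw`(?:0[1-9]|1[012])`;

// One pattern per format. Building them with new RegExp avoids having to
// escape the forward slashes, which a /.../ regex literal would require.
const PATTERNS = {
  "dd/mm/yyyy": new RegExp(String.raw`\b${DAY}/${MONTH}/\d{4}\b`),
  "dd/mm/yy": new RegExp(String.raw`\b${DAY}/${MONTH}/\d{2}\b`),
  "mm/yy": new RegExp(String.raw`\b${MONTH}/\d\d\b`),
  "d/m/yy": new RegExp(String.raw`\b[1-9]/[1-9]/\d\d\b`),
  "yyyy": /\b\d{4}\b/,
  "yy": /\b\d\d\b/,
};

// Hypothetical helper: first substring of `text` that fits `format`, or null.
function findFirst(format, text) {
  const m = PATTERNS[format].exec(text);
  return m ? m[0] : null;
}

console.log(findFirst("dd/mm/yyyy", "Invoice dated 10/03/2021, paid")); // 10/03/2021
console.log(findFirst("d/m/yy", "due 5/7/19"));                         // 5/7/19
console.log(findFirst("mm/yy", "10/03/21"));                            // 10/03
```

The last call shows that the patterns overlap: the mm/yy pattern on its own happily matches the first part of a dd/mm/yy date, because `\b` also sits between a digit and a slash.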

Of course, you can combine these together in different ways. You can even make one super regex.
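
One possible shape for such a combined regex, sketched in JavaScript: the alternatives are ordered from the longest format to the shortest, and each one is wrapped in a named group so the caller can tell which format matched. The group names, the `findDates` helper, and the extra `(?<![\d/])` / `(?![\d/])` guards are assumptions of this sketch, not part of the patterns above; the guards are there so that a short alternative cannot match a fragment of a longer slash-separated number.

```javascript
const DAY = String.raw`(?:0[1-9]|[12]\d|3[01])`;
const MONTH = String.raw`(?:0[1-9]|1[012])`;

// Alternatives are tried left to right, so longer formats come first.
const ALTERNATIVES = [
  ["ddmmyyyy", String.raw`${DAY}/${MONTH}/\d{4}`],
  ["ddmmyy", String.raw`${DAY}/${MONTH}/\d{2}`],
  ["dmyy", String.raw`[1-9]/[1-9]/\d\d`],
  ["mmyy", String.raw`${MONTH}/\d\d`],
  ["yyyy", String.raw`\d{4}`],
  ["yy", String.raw`\d\d`],
];

// Wrap each alternative in a named group and join them with "|".
const body = ALTERNATIVES.map(([name, src]) => `(?<${name}>${src})`).join("|");

// Guards (an addition of this sketch): no digit or slash may touch the match.
const SUPER_REGEX = new RegExp(
  String.raw`\b(?<![\d/])(?:${body})(?![\d/])\b`,
  "g"
);

// Hypothetical helper: every date-like token with the format that matched it.
function findDates(text) {
  const found = [];
  for (const m of text.matchAll(SUPER_REGEX)) {
    const format = Object.keys(m.groups).find((k) => m.groups[k] !== undefined);
    found.push({ format, text: m[0], index: m.index });
  }
  return found;
}

const sample = "Born 07/11/1985, card expires 08/24, founded 1999, model 5/7/19";
console.log(findDates(sample).map((d) => `${d.format}:${d.text}`).join(" "));
// ddmmyyyy:07/11/1985 mmyy:08/24 yyyy:1999 dmyy:5/7/19
```

With the guards in place, an input such as "13/2021" produces no match at all instead of a stray "13" and "2021". Whether that is the desired behaviour depends on the data.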

The last one is rather interesting, though. A plain old number in your text, like 42, might not actually correspond to a year. Still, you can filter such cases out in a postprocessing step.
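
Postprocessing of this kind could look roughly like the following JavaScript sketch. It covers two checks: whether a matched dd/mm/yyyy string is a real calendar day (the regex alone accepts 31/04/2021), and whether a bare four-digit number falls in a plausible year window. The function names and the 1900 to 2099 window are assumptions made for illustration. A bare two-digit number cannot be judged this way, since every value from 00 to 99 is a valid yy.

```javascript
// True if the dd/mm/yyyy string names a day that exists on the calendar.
function isRealDate(ddmmyyyy) {
  const m = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(ddmmyyyy);
  if (!m) return false;
  const [day, month, year] = m.slice(1).map(Number);
  // Date rolls invalid days over (31 April becomes 1 May), so a round trip
  // through Date that changes any component means the input was not real.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

// True if `token` is four digits inside an assumed window of plausible years.
function isPlausibleYear(token, min = 1900, max = 2099) {
  if (!/^\d{4}$/.test(token)) return false;
  const y = Number(token);
  return y >= min && y <= max;
}

console.log(isRealDate("29/02/2024"));  // true
console.log(isRealDate("31/04/2021"));  // false
console.log(isPlausibleYear("4242"));   // false
```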

Happy regexing.

ADDENDUM

To answer some questions in the comments:

  1. Yes, it works at the beginning and the end of the string. \b is a word boundary: it matches at every transition between word characters (letters, digits, and underscores) and non-word characters, and also at the beginning or end of the string when the adjacent character is a word character.

  2. To see tests, see here: http://jsfiddle.net/wRufK/. Yes, I know this is in JavaScript and not C#, but jsfiddle is a very convenient way to show code in action. There are differences, though: in C# we use Regex.Match, and the JavaScript regex literals need extra backslashes to escape the inner forward slashes.

  3. indexOf might be overkill depending on the application. If you want to find all matches, see http://msdn.microsoft.com/en-us/library/twcw2f1c.aspx for info on repeated matching. You can also modify the regexes for capturing.

  4. Since your dates can be in any of the forms above, and probably others, a single regex might be preferable. A very flexible date finder is here: http://www.regular-expressions.info/dates.html. You might want to consider it instead of fixing an exact set.
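
Point 3 above mentions repeated matching and capturing. A minimal JavaScript sketch of both, for the dd/mm/yyyy format only, might look like this; `extractDates` is a hypothetical helper, and in C# `Regex.Matches` together with `Match.Groups` would play the comparable role.

```javascript
// Capturing groups: 1 = day, 2 = month, 3 = year. The "g" flag is required
// by matchAll, and the slashes are escaped because this is a regex literal.
const DATE_WITH_GROUPS = /\b(0[1-9]|[12]\d|3[01])\/(0[1-9]|1[012])\/(\d{4})\b/g;

// Hypothetical helper: all dd/mm/yyyy dates in `text`, split into numbers.
function extractDates(text) {
  return Array.from(text.matchAll(DATE_WITH_GROUPS), (m) => ({
    day: Number(m[1]),
    month: Number(m[2]),
    year: Number(m[3]),
    index: m.index, // position of the match, so no separate indexOf is needed
  }));
}

console.log(extractDates("from 01/02/2020 to 15/03/2021"));
// [ { day: 1, month: 2, year: 2020, index: 5 },
//   { day: 15, month: 3, year: 2021, index: 19 } ]
```

Once the parts are numbers, range checks such as "dd between 1 and 31" become ordinary comparisons.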
