Our application receives log files via email, so the lines are often broken up by the email client. Once I've read in the body of the email, I have a string variable $log in the following format:
Fri Aug 26 11:52:30 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
PKCS11] built Fri Aug 26 11:52:30 2011 NOTE: OpenVPN 2.1 requires '--script-security 2'
or higher to call user-defined scripts or executables Fri Aug 26 11:52:30 2011
Control Channel Authentication: using 'ta.key' as a OpenVPN static key file
Fri Aug 26 11:52:30 2011 Outgoing Control Channel Authentication: Using 160
bit message hash 'SHA1' for HMAC authentication Fri Aug 26 11:52:30
2011 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1'
for HMAC authentication Fri Aug 26 11:52:30 2011 LZO compression initialized
Fri Aug 26 11:52:30 2011 Control Channel MTU parms [ L:1558 D:166 EF:66 EB:0
ET:0 EL:0 ] Fri Aug 26 11:52:30 2011 Socket Buffers: R=[8192->8192] S=[8192->8192]
As shown above, the date does not always start on a new line. I'd like to generate an array containing the dates and log messages so that I can output a table with these fields in their own columns. I understand that I would need a regex to match the date field, but how do I go about building the array?
I'm replacing my answer with an entirely new version, since the example log file has changed a lot. Because the log seems to be line-broken just about anywhere, this approach, which now includes a bit of regexp, works:
$log="Fri Aug 26 11:52:30 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
PKCS11] built Fri Aug 26 11:52:30 2011 NOTE: OpenVPN 2.1 requires '--script-security 2'
or higher to call user-defined scripts or executables Fri Aug 26 11:52:30 2011
Control Channel Authentication: using 'ta.key' as a OpenVPN static key file
Fri Aug 26 11:52:30 2011 Outgoing Control Channel Authentication: Using 160
bit message hash 'SHA1' for HMAC authentication Fri Aug 26 11:52:30
2011 Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1'
for HMAC authentication Fri Aug 26 11:52:30 2011 LZO compression initialized
Fri Aug 26 11:52:30 2011 Control Channel MTU parms [ L:1558 D:166 EF:66 EB:0
ET:0 EL:0 ] Fri Aug 26 11:52:30 2011 Socket Buffers: R=[8192->8192] S=[8192->8192]
";
$str = implode(' ',preg_split("/[ ]*[\r\n]+/", $log));
$arrLogLines=preg_split('/[ ]*([\w]{3} [\w]{3} [0-9]{2} [\d:]+ \d{4}) /',$str,-1,PREG_SPLIT_DELIM_CAPTURE); // Cred to Herbert for the regexp, seems to work fine..
array_shift($arrLogLines);
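$arrLogMessages = array();
// After array_shift(), even indexes hold dates and odd indexes hold messages.
for ($i = 0; $i < count($arrLogLines); $i++) {
    $strArrIdx = ($i % 2 === 0) ? 'date' : 'message';
    // (int) ($i / 2) puts each date and the message that follows it in the same row.
    $arrLogMessages[(int) ($i / 2)][$strArrIdx] = $arrLogLines[$i];
}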
var_dump($arrLogMessages);
It produces the expected output:
array(8) {
[0]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(56) "OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2] PKCS11] built"
}
[1]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(102) "NOTE: OpenVPN 2.1 requires '--script-security 2' or higher to call user-defined scripts or executables"
}
[2]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(75) "Control Channel Authentication: using 'ta.key' as a OpenVPN static key file"
}
[3]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(98) "Outgoing Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication"
}
[4]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(98) "Incoming Control Channel Authentication: Using 160 bit message hash 'SHA1' for HMAC authentication"
}
[5]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(27) "LZO compression initialized"
}
[6]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(63) "Control Channel MTU parms [ L:1558 D:166 EF:66 EB:0 ET:0 EL:0 ]"
}
[7]=>
array(2) {
["date"]=>
string(24) "Fri Aug 26 11:52:30 2011"
["message"]=>
string(46) "Socket Buffers: R=[8192->8192] S=[8192->8192] "
}
}
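Since the goal in the question is a table with the date and the message in their own columns, the resulting array could be rendered roughly like this. It is only a sketch: renderLogTable is a hypothetical helper name, not something from the answer above, and the two sample rows are copied from the var_dump output.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// turns an array of date/message pairs into a simple HTML table.
function renderLogTable(array $entries): string
{
    $rows = '';
    foreach ($entries as $entry) {
        // Escape both fields, since log text can contain characters such as ">".
        $rows .= '<tr><td>' . htmlspecialchars($entry['date']) . '</td><td>'
               . htmlspecialchars(trim($entry['message'])) . "</td></tr>\n";
    }
    return "<table>\n<tr><th>Date</th><th>Message</th></tr>\n" . $rows . "</table>\n";
}

// Two rows taken from the output shown above.
$sample = array(
    array('date' => 'Fri Aug 26 11:52:30 2011', 'message' => 'LZO compression initialized'),
    array('date' => 'Fri Aug 26 11:52:30 2011', 'message' => 'Socket Buffers: R=[8192->8192] S=[8192->8192] '),
);
echo renderLogTable($sample);
```

The trim() call is there because the last message in the output above ends with a trailing space.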
I'm not a regex pro, and I'm sure there is an easier way, but this works:
$input = "Wed Aug 03 13:56:31 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
[PKCS11] built on Mar 12 2011
Wed Aug 03 13:56:31 2011 NOTE: OpenVPN 2.1 requires '--script-security
2' or higher to call user-defined scripts or executables
Wed Aug 03 13:56:31 2011 Control Channel Authentication: using 'ta.key'
as a OpenVPN static key file";
preg_match_all('/([\w]{3} [\w]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}) (.*)/', $input, $matches, PREG_SET_ORDER);
var_dump($matches);
This results in:
array(3) {
[0] =>
array(3) {
[0] =>
string(67) "Wed Aug 03 13:56:31 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]"
[1] =>
string(24) "Wed Aug 03 13:56:31 2011"
[2] =>
string(42) "OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]"
}
[1] =>
array(3) {
[0] =>
string(70) "Wed Aug 03 13:56:31 2011 NOTE: OpenVPN 2.1 requires '--script-security"
[1] =>
string(24) "Wed Aug 03 13:56:31 2011"
[2] =>
string(45) "NOTE: OpenVPN 2.1 requires '--script-security"
}
[2] =>
array(3) {
[0] =>
string(71) "Wed Aug 03 13:56:31 2011 Control Channel Authentication: using 'ta.key'"
[1] =>
string(24) "Wed Aug 03 13:56:31 2011"
[2] =>
string(46) "Control Channel Authentication: using 'ta.key'"
}
}
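Note that in the output above each message stops at the first line break, because "." does not match a newline by default. One possible variation, sketched here under the assumption that the date itself is never broken across two lines, uses the "s" modifier and a lookahead so that each message runs up to the next date. The $datePattern and $entries names are only illustrative.

```php
<?php
$input = "Wed Aug 03 13:56:31 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
[PKCS11] built on Mar 12 2011
Wed Aug 03 13:56:31 2011 NOTE: OpenVPN 2.1 requires '--script-security
2' or higher to call user-defined scripts or executables
Wed Aug 03 13:56:31 2011 Control Channel Authentication: using 'ta.key'
as a OpenVPN static key file";

// Assumption: a date is never split across lines in this input.
$datePattern = '\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}';
// "s" lets "." match newlines; the lazy (.*?) stops at the next date or at end of input.
$regex = '/(' . $datePattern . ') (.*?)(?=' . $datePattern . '|\z)/s';
preg_match_all($regex, $input, $matches, PREG_SET_ORDER);

$entries = array();
foreach ($matches as $m) {
    $entries[] = array(
        'date'    => $m[1],
        // Collapse each line break (and any spaces around it) into a single space.
        'message' => trim(preg_replace('/\s*\R\s*/', ' ', $m[2])),
    );
}
var_dump($entries);
```

The short date "Mar 12 2011" inside the first message has no time part, so the lookahead does not treat it as the start of a new entry.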
I believe this is what you're looking for:
<?php
$log = <<<LOG
Wed Aug 03 13:56:31 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
[PKCS11] built on Mar 12 2011
Wed Aug 03 13:56:31 2011 NOTE: OpenVPN 2.1 requires '--script-security
2' or higher to call user-defined scripts or executables
Wed Aug 03 13:56:31 2011 Control Channel Authentication: using 'ta.key'
as a OpenVPN static key file
LOG;
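function splitLog($log)
{
    // Mark each line break with '~' so the whole log can be matched as one line
    $log = str_replace("\n",'~',$log);
    $log = str_replace("\r",'',$log);
    $log .= '~';
    // A date followed by exactly two '~'-terminated chunks: every entry is assumed to span two lines
    preg_match_all('/([\w]{3} [\w]{3} [0-9]{2} [\d:]+ \d{4})((?:.*?~){2})/', $log, $m);
    $logArray = array();
    foreach($m[0] as $k=>$v)
    {
        $a['date'] = $m[1][$k];
        // Turn the '~' markers back into spaces so the joined lines stay separated
        $a['message'] = trim(str_replace('~', ' ', $m[2][$k]));
        array_push($logArray, $a);
    }
    return $logArray;
}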
$logArray = splitLog($log);
var_dump($logArray);
?>
Output
array
0 =>
array
'date' => string 'Wed Aug 03 13:56:31 2011' (length=24)
'message' => string 'OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2] [PKCS11] built on Mar 12 2011' (length=72)
1 =>
array
'date' => string 'Wed Aug 03 13:56:31 2011' (length=24)
'message' => string 'NOTE: OpenVPN 2.1 requires '--script-security 2' or higher to call user-defined scripts or executables' (length=102)
2 =>
array
'date' => string 'Wed Aug 03 13:56:31 2011' (length=24)
'message' => string 'Control Channel Authentication: using 'ta.key' as a OpenVPN static key file' (length=75)
If every line starts with a date like this, you can just use substr.
The date exists on every line and always has the same length. Admittedly, the first line ends with a date too, but that one has a different meaning and a different notation. Regex isn't going to help you with that either.
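To illustrate the substr idea, here is a minimal sketch. It assumes what this answer assumes, namely that every line begins with a 24-character date, so the input below keeps only the date-led lines from the sample and leaves the continuation lines out. The variable names are only illustrative.

```php
<?php
// Assumption: every line starts with a date; continuation lines are omitted here.
$log = "Wed Aug 03 13:56:31 2011 OpenVPN 2.1.4 i686-pc-mingw32 [SSL] [LZO2]
Wed Aug 03 13:56:31 2011 NOTE: OpenVPN 2.1 requires '--script-security
Wed Aug 03 13:56:31 2011 Control Channel Authentication: using 'ta.key'";

$dateLength = 24; // strlen('Wed Aug 03 13:56:31 2011')
$entries = array();
// Normalise Windows line endings, then walk the log line by line.
foreach (explode("\n", str_replace("\r", '', $log)) as $line) {
    if ($line === '') {
        continue; // skip blank lines
    }
    $entries[] = array(
        'date'    => substr($line, 0, $dateLength),
        'message' => trim(substr($line, $dateLength)),
    );
}
var_dump($entries);
```

A line that does not begin with a date would be cut in the wrong place, so wrapped logs like the ones in the question still need one of the regex approaches.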