Delineate and Extract Data from Large Text Files Using Java

Developer https://www.devze.com 2023-01-19 06:42 Source: Internet
I have an ASCII-formatted file with 250k+ lines of text on which I need to perform 2 steps.

1) Scan through the entire file and delineate sections by matching a given regular expression pattern.

2) Read each section of data and parse subsections from it.

One option is a line-oriented scan of the file using a BufferedReader: test each line for a match and store the line number of each match.
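To make the BufferedReader option concrete, here is a minimal sketch. The class name SectionScanner, the method findSectionStarts and the "== NAME ==" delimiter format are all made up for illustration; the real pattern would be whatever marks a section in the actual file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
final class SectionScanner {

    /** Returns the 1-based line number of every line that matches the delimiter pattern. */
    static List<Integer> findSectionStarts(BufferedReader reader, Pattern delimiter)
            throws IOException {
        List<Integer> starts = new ArrayList<>();
        String line;
        int lineNo = 0;
        while ((line = reader.readLine()) != null) {
            lineNo++;
            if (delimiter.matcher(line).find()) {
                starts.add(lineNo);
            }
        }
        return starts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data; a real run would wrap a FileReader instead.
        String sample = "== A ==\nfoo\nbar\n== B ==\nbaz\n";
        try (BufferedReader r = new BufferedReader(new StringReader(sample))) {
            System.out.println(findSectionStarts(r, Pattern.compile("^== .* ==$")));
        }
    }
}
```

With the sample input this prints [1, 4]. The drawback is that step 2 then needs a second pass over the file (or a seek) to read the sections back.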

Are there more efficient options, perhaps using the java.nio package?
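For reference, one java.nio possibility is to memory-map the file, decode it as ASCII and run a MULTILINE regex over the resulting CharBuffer. This is only a sketch under assumptions: MappedSectionScanner, findSectionOffsets and the "== NAME ==" delimiter are invented for illustration, and whether this actually beats a BufferedReader on a file of this size is something to measure rather than assume.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
final class MappedSectionScanner {

    /**
     * Returns the character offset of every delimiter match.
     * For ASCII input the character offset equals the byte offset in the file.
     */
    static List<Integer> findSectionOffsets(Path file, Pattern delimiter) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer bytes =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Decoding copies the content into a heap CharBuffer (2 bytes per char).
            CharBuffer chars = StandardCharsets.US_ASCII.decode(bytes);
            List<Integer> offsets = new ArrayList<>();
            Matcher m = delimiter.matcher(chars);
            while (m.find()) {
                offsets.add(m.start());
            }
            return offsets;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data written to a temporary file.
        Path tmp = Files.createTempFile("sections", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "== A ==\nfoo\n== B ==\nbar\n".getBytes(StandardCharsets.US_ASCII));
        Pattern p = Pattern.compile("^== .* ==$", Pattern.MULTILINE);
        System.out.println(findSectionOffsets(tmp, p));
    }
}
```

With the sample file this prints [0, 12]. Note that the pattern needs Pattern.MULTILINE here so that ^ and $ match at line boundaries inside the buffer.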


Perhaps pump the file through a chain of streams: one stream that passes on only the sections matching your regular expression, followed by a stream that performs the parsing step.

e.g.

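import java.io.InputStream;
import java.io.OutputStream;
// RegexFilterOutputStream and ParsingStuffOutputStream are hypothetical
// classes you would write yourself; "input" is an already-open InputStream.
OutputStream os = new RegexFilterOutputStream(
                  new ParsingStuffOutputStream());
byte[] buffer = new byte[8192];
int n;
while ((n = input.read(buffer)) != -1) {
    os.write(buffer, 0, n);  // copy input into the head of the chain
}
os.close();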
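
The same two-stage idea can also be sketched at the line level instead of with custom OutputStream subclasses, which avoids having to match a regex across arbitrary byte-buffer boundaries. This is a swapped-in simplification, not the stream chain itself: SectionPipeline, process and the "== NAME ==" delimiter are hypothetical names chosen for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
final class SectionPipeline {

    /**
     * Stage 1: starts a new section at each line that matches the delimiter.
     * Stage 2: hands each completed section to sectionParser.
     * Lines that appear before the first delimiter are delivered as their own section.
     */
    static void process(BufferedReader reader, Pattern delimiter,
                        Consumer<List<String>> sectionParser) throws IOException {
        List<String> current = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (delimiter.matcher(line).find() && !current.isEmpty()) {
                sectionParser.accept(current);
                current = new ArrayList<>();
            }
            current.add(line);
        }
        if (!current.isEmpty()) {
            sectionParser.accept(current);  // flush the last section
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data; a real run would wrap a FileReader instead.
        String sample = "== A ==\nfoo\nbar\n== B ==\nbaz\n";
        try (BufferedReader r = new BufferedReader(new StringReader(sample))) {
            process(r, Pattern.compile("^== .* ==$"), section ->
                    System.out.println(section.get(0) + " has "
                            + (section.size() - 1) + " body line(s)"));
        }
    }
}
```

With the sample input this prints "== A == has 2 body line(s)" and then "== B == has 1 body line(s)". Only one section is held in memory at a time, and the file is read once.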