Would it make sense to use MemoryMappedFile to perform a search on large text files?

Developer (https://www.devze.com), 2023-03-21 16:03, source: web
I'm tasked with implementing a search function that will search through several large log files (a couple of MB each) and return the lines that contain the keywords. Log files are constantly being added to the pool, so the search has to run against the current set of files every time.

Would it make sense to create a MemoryMappedFile for each file and then iterate through each line, matching the keywords? If not, what would be a better way to go about it?

Any links to example code would be much appreciated.


Yes. A "couple of MB" is not very much; it easily fits in the 2 GB of address space that a 32-bit process gets.

You'll want to use the MemoryMappedFile.CreateFromFile overload that takes a capacity (the mapping size) because the file will grow over time. Also, I think you'll need to recreate the view accessor or view stream on each search, but I find MSDN a bit unclear here.

With a view stream, it's trivial to create a StreamReader and read every line. The whole process is very likely I/O bound on reasonable hardware, so don't bother with CPU optimizations initially.
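To make that concrete, here is a minimal sketch of the stream-based approach, assuming .NET 6 or later. The names LogSearch and FindLines are made up for illustration, and the case-insensitive substring match is an assumption about what "contains the keywords" means. The sketch maps the file afresh on every search with capacity 0 (meaning "current file size") rather than reserving extra room, because a capacity larger than the file would make the file on disk grow. It also passes the file length to CreateViewStream explicitly, since views are rounded up to whole pages and the padding would otherwise be read as extra characters.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class LogSearch
{
    // Hypothetical helper: returns every line of the file that contains
    // at least one of the keywords (case-insensitive substring match).
    public static List<string> FindLines(string path, IReadOnlyCollection<string> keywords)
    {
        var matches = new List<string>();

        // FileShare.ReadWrite lets another process keep appending to the log.
        using var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        long length = file.Length;
        if (length == 0)
        {
            return matches; // an empty file cannot be mapped with capacity 0
        }

        // Capacity 0 maps the file at its current size; the map is rebuilt per search.
        using var map = MemoryMappedFile.CreateFromFile(
            file, null, 0, MemoryMappedFileAccess.Read, HandleInheritability.None, leaveOpen: true);

        // Limit the view to the real file length to avoid page-size padding.
        using var view = map.CreateViewStream(0, length, MemoryMappedFileAccess.Read);
        using var reader = new StreamReader(view);

        while (reader.ReadLine() is string line)
        {
            foreach (var keyword in keywords)
            {
                if (line.Contains(keyword, StringComparison.OrdinalIgnoreCase))
                {
                    matches.Add(line);
                    break; // one hit is enough; do not add the line twice
                }
            }
        }
        return matches;
    }
}
```

A call such as LogSearch.FindLines(path, new[] { "error", "timeout" }) would be repeated for each file in the pool. For files of only a few MB, a plain File.ReadLines loop will probably perform about the same, so it is worth measuring before committing to the mapped version.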


Why not just create a properly structured index object tree in memory, optimized for searching?

EDIT: Added after some comments...

Could be something like this:

class Index
{
    public Dictionary<string, List<SourceFile>> FilesThatContainThisWord {get; set;}
    ...
}


class SourceFile
{
    public string Path {get; set;}
    ...
}


// Code to look up a term
var filesThatContainMonday = myIndex.FilesThatContainThisWord["Monday"];
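As a rough sketch of how such an index might be filled and queried: the WordIndex class, its AddFile and FilesContaining methods, and the separator list below are all assumptions made for illustration, not part of the classes above. For brevity it stores file paths as plain strings instead of SourceFile objects, and it uses TryGetValue so that looking up a word that was never indexed returns an empty result instead of throwing KeyNotFoundException, as the dictionary indexer would.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class WordIndex
{
    // Assumed token separators; real log formats may need a different set.
    private static readonly char[] Separators =
        { ' ', '\t', ',', ';', ':', '.', '[', ']', '(', ')', '"', '=' };

    // word -> set of file paths that contain it (case-insensitive keys)
    private readonly Dictionary<string, HashSet<string>> _filesByWord =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Call once for every log file that joins the pool.
    public void AddFile(string path)
    {
        foreach (var line in File.ReadLines(path))
        {
            foreach (var word in line.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            {
                if (!_filesByWord.TryGetValue(word, out var files))
                {
                    files = new HashSet<string>();
                    _filesByWord[word] = files;
                }
                files.Add(path);
            }
        }
    }

    // Returns the files containing the word, or an empty collection.
    public IReadOnlyCollection<string> FilesContaining(string word) =>
        _filesByWord.TryGetValue(word, out var files)
            ? files
            : (IReadOnlyCollection<string>)Array.Empty<string>();
}
```

An index like this only narrows the search to candidate files; the matching lines still have to be read from those files afterwards. It also matches whole words only, and a file that is still being appended to would need to be re-indexed.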