Large text file dictionary of random words for benchmarking purposes?

Developer (开发者) https://www.devze.com 2023-01-21 07:29 Source: Internet
I was wondering if anyone could point me to a very large dictionary of random words that could be used to test some high-performance string data structures. The ones I'm finding are in the ~2 MB range, but I'd like something larger if possible. I'm guessing there has to be a large standard string dataset somewhere that could be used. Thanks!


http://norvig.com/big.txt

The above link was mentioned in Norvig's spell checker article - http://norvig.com/spell-correct.html
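If it helps, here is a minimal sketch of turning a plain-text corpus like that one into a word list for benchmarking. It assumes the file has already been downloaded locally; the function names and the `[a-z]+` tokenizer are my own simplifications for illustration, not something taken from the article.

```python
import re
from collections import Counter
from pathlib import Path


def extract_words(text: str) -> Counter:
    """Count lowercase ASCII-letter tokens in `text`.

    The tokenizer is a deliberate simplification (an assumption):
    digits, apostrophes, and non-ASCII letters act as separators.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def load_word_counts(path: str) -> Counter:
    """Read a local text file and count its words.

    `path` is a hypothetical local file name; undecodable bytes are
    replaced rather than raising an error.
    """
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return extract_words(text)
```

For example, `sorted(load_word_counts("big.txt"))` would give the distinct words in sorted order, which can then be inserted into the data structure under test. The counts are kept in case a frequency-weighted lookup workload is wanted.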


I'd recommend taking a look through the material available from TREC (the Text REtrieval Conference). They have some good datasets that might meet your requirements.
