Say I only need to find out whether a line read from a file contains a word from a finite set of words.
One way of doing this is to use a regex like this:
.*\y(good|better|best)\y.*
Another way of accomplishing this is using a pseudo code like this:
 if ( (readLine.find("good")   != string::npos) ||
      (readLine.find("better") != string::npos) ||
      (readLine.find("best")   != string::npos) )
 {
   // line contains a word from a finite set of words.
 }
Which way will have better performance? (i.e. 开发者_如何学Pythonspeed and CPU utilization)
The regexp will perform better, but get rid of those '.*' parts. They complicate the code and don't serve any purpose. A regexp like this:
\y(good|better|best)\y
will search through the string in a single pass. The algorithm it builds from this regexp will look first for \y, then character 1 (g|b), then character 2 (g => go or b => be), character 3 (go => goo or be => bes|bet), character 4 (go => good or bes => best or bet => bett), etc. Without building your own state machine, this is as fast as it gets.
You won't know which is faster until you've measured, but the issues at stake are:
- The regex implementation, esp. whether it needs to precompile (like Google RE2, POSIX regexes).
- The implementation of string::find.
- The length of the string you're searching in.
- How many strings you're searching in.
My bets are on the regex, but again: you've got to measure to be sure.
Obviously not the second one (using 'find'), since you're running three comparisons (need to traverse the string at least 3 times) instead of one hopefully smart one. If the regex engine works at all like it should (and I suppose it does) then it will probably be at least three times faster.
 
         
                                         
                                         
                                         
                                        ![Interactive visualization of a graph in python [closed]](https://www.devze.com/res/2023/04-10/09/92d32fe8c0d22fb96bd6f6e8b7d1f457.gif) 
                                         
                                         
                                         
                                         加载中,请稍侯......
 加载中,请稍侯......
      
精彩评论