开发者

How to make efficient code emerge through unit testing?

开发者 https://www.devze.com 2022-12-26 19:43 出处:网络
I participate in a TDD Coding Dojo, where we try to practice pure TDD on simple problems. It occured to me however that the code which emerges from the unit tests isn\'t the most efficient.Now this is

I participate in a TDD Coding Dojo, where we try to practice pure TDD on simple problems. It occured to me however that the code which emerges from the unit tests isn't the most efficient. Now this is fine most of the time, but what if the code usage grows so that efficiency becomes a problem.

I love the way the code emerges from unit testing, but is it possible to make the efficiency property emerge through further tests ?

Here is a trivial example in ruby: prime factorization. I followed a pure TDD approach making the tests pass one after the other validating my original acceptance test (commented at the bottom). What further steps could I take, if I wanted to make one of the generic prime factorization algorithms emerge ? To reduce the problem domain, let's say I want to get a quadratic sieve implementation ... Now in this precise case I know the "optimal algorithm, but in most cases, the client will simply add a requirement that the feature runs in less than "x" time for a given environment.

require 'shoulda'
require 'lib/prime'

class MathTest < Test::Unit::TestCase
  context "The math module" do
    should "have a method to get primes" do 
      assert Math.respond_to? 'primes'
    end
  end
  context "The primes method of Math" do
    should "return [] for 0" do
      assert_equal [], Math.primes(0)
    end
    should "return [1] for 1 " do
      assert_equal [1], Math.primes(1)
    end
    should "return [1,2] for 2" do 
      assert_equal [1,2], Math.primes(2)
    end
    should "return [1,3] for 3" do 
      assert_equal [1,3], Math.primes(3)
    end
    should "return [1,2] for 4" do 
      assert_equal [1,2,2], Math.primes(4)
    end 
    should "return [1,5] for 5" do 
      assert_equal [1,5], Math.primes(5)
    end   
    should "return [1,2,3] for 6" do 
      assert_equal [1,2,3], Math.primes(6)
    end       
    should "return [1,3] for 9" do 
      assert_equal [1,3,3], Math.primes(9)
    end        
    should "return [1,2,5] for 10" do 
      assert_equal [1,2,5], Math.primes(10)
    end                  
  end
#  context "Functionnal Acceptance test 1" do
#    context "the prime factors of 14101980 are 1,2,2,3,5,61,3853"do      
#      should "return  [1,2,3,5,61,3853] for ${14101980*14101980}" do
#        assert_equal [1,2,2,3,5,61,3853], Math.primes(14101980*14101980)
#      end
#    end
#  end
end

and the naive algorithm I created by this approach

module Math
  def self.primes(n)
    if n==0
      return []
    else
      primes=[1]  
      for i in 2..n do
        if n%i==0          
          while(n%i==0)
            primes<<i
            n=n/i
          end
        end
      end      
      primes
    end
  end
end

edit 1 Judging from the first answers, I guess I wasn't clear in my initial description: the performance test is not a standard part of my unit test, it is a new acceptance test written to answer a specific requirement from the client.

edit 2 I know how to test the execution time,but it seems like moving from th开发者_StackOverflow社区e trivial algorithm to the optimized one is a huge step. my question is how to make the optimal code emerge, in other terms : how do you decompose the migration from the trivial code to the optimal one ? Some mentioned it is a problem specific approach : I provided a sample problem for which I don't know how to continue.


  • TDD is for correctness and non regression and focus on unit testing
  • Profiling is for performance and it's a functional testing problem.

I also used to participate in a weekly TDD coding Dojo and we tried some experiments to see if it was possible to use it for algorithmic purpose (find better algorithm, find an algorithm where there is no obvious one) or built-in performance constraints.

When using TDD in Dojo we try to follow the rules below

  • write the simplest test that break the existing code (or pay a beer if it doesn't break code)
  • write the simplest code that make the test pass
  • refactor the code (using code smells) before adding a test
  • also refactor tests before adding new tests

Given these rules we have more room to experiment than what is obvious at first sight. We can tweak the definition of simplest and add code smells to take efficiency into account (basically: if we think of several easy ways of implementing something prefer the most efficient and if we know of some more efficient - but still simple of well known - algorithm than the one used in our code it's a smell).

Summarily the results was that TDD itself is not well fitted to predict overall code performance and achieve efficient code from start, even if using TDD and refactoring we succeeded achieving a better insight on our code and could enhance it to achieve better readability and avoid some obvious performance bottlenecks. Trying to insert performances constraints in code at that test level was usually disastrous (we got code and test much too complex and often broken code or too complex to change).

One reason is that TDD we usually work with very small tests set (the simplest test that fail). On the other hand more performance problems occurs with real data set and it fits very poorly with the above rules. Performance tests, even if formally still unit testing, are more alike functional testing. Common optimization strategy involve adding caches, or taking into account some property of real data distribution, or negociate changes in user stories when some small benefit feature has great negative impact on performance. All of these can't really be built-in in TDD but are more likely found while profiling code.

I believe performances goal are basically a functional testing problem.


No, and you shouldn't try. Unit tests test correctness, not efficiency - forcing them to test efficiency is a form of premature optimization.


TDD cannot help you derive algorithms - if that is your question. It's one of those niche areas that doesn't lend itself to the strengths of TDD (niche: as compared to the hordes churning out enterprise software that calls into a zillion frameworks/libraries).

Nothing to stop you from still applying TDD - you may write up a performance test but what would be your target spec ? What if there is a possibility to halve your spec ? These questions cant be answered without profiling your code and chipping away at it in the right manner. Going test-driven wont answer these questions ; at the most it'd give you a safety net to detect if you just broke existing code.

e.g. You could TDD your way to implement sorting, but the chances of you finding a new algorithm or reaching an existing efficient one like quicksort are bleak. Unless you know the principles of algorithm design and consciously work your way towards it.

Update: Supporting evidence. http://www.markhneedham.com/blog/2009/12/10/tdd-big-leaps-and-small-steps/ There are some more - however they are like discussions on reddit. opinionated unsupported free-for-alls. Not posting those.


Unit testing generally checks for piecemeal correctness of functionality.

You can certainly add timing constraints to a unit test, but you may have a hard time expressing them in technology-independent ways (e.g., O(N ln N).

And, just because you can write a unit test that insists that a result be delivered in constant time, does not mean the coder for the unit can necessarily find an algorithm that achieves that effect. (Of course, he might not be able to think of how to correctly implement the functionality, either.

If you do this, I'd suggesting isolating the functionality tests and the performance tests. Then at least you can tell if the code works before you decide the performance is truly bad.


In your unit test code, you can add code that measures the elapsed time of target code. Pseudocode would be something like:

start_time = date_time_now();
Math.primes(1000);
stop_time = date_time_now();
assert stop_time-start_time < target_execution_time;

Some unit text frameworks may already have the elapsed time which you can refer to. This makes extra boilerplate code of measuring the time unnecessary.

Also, elapsed_time is just one example of a "efficiency" metric to use. Other metrics to test include cpu_time, throughput, input/output bytes transferred, etc


Make a efficiency test. You give the requirement :"that the feature runs in less than "x" time for a given environment." write a test that tests for execution time. If you pass the test then there is no need for further code enhancements if it fails make it faster, either through profiling and micro opts or algorithmic enhancements.

I have to agree with BlueRaja performance tests shouldn't be a standard part of your unit tests, though If there is a heavy emphasis on performance it can help keep it on the table.


I initially though this wouldn't work and you needed a bigger leap than TDD might afford, and I was going to say at least your tests are going to help you as you rewrite your code.

But you should try and let us know. Although I haven't done it, I think your next test should be a performance test. This is clearly a requirement at this time, so why not continue in the same vein.

I think you can write one that will run reliably on any platform by establishing a baseline. You'll need some infrastructure to help you out, but can it look like:

TEST: should be faster than O(n^2)
setup: baseline_time_for_10 = time_of( f(10) )
100: assert time_of(f(100)) < baseline_time_for_10 ^ 2    
etc.

I have always wanted to do this, but haven't had the right opportunity on a project. Let us know how it goes.

0

精彩评论

暂无评论...
验证码 换一张
取 消