Sunday, October 19, 2014

Clean Code Approach For Coffee Machine Example

"Hi, I'm Uncle Bob and THIS IS Clean Code". I think that some of us who are initiated already know what this post will be about :)
Recently during work when I was watching some Clean Code episodes (first sentence from this post is beginning sentence from every episode) I realized something.
I have noticed there is so little free information about Clean Code methodologies especially for C++ in the Internet. Lack of complete non-trivial examples about Google Test and Google Mock in web is issue as well. That's why I decided to show demonstration based on popular coffee machine example. The purpose of this post is to present basic and more advanced Google Framework features. I also want to focus on some deeper aspects of Clean Code like clean architecture and SOLID. Notice I'm not going to explain some details to minimize post size so be warned that this post is NOT Clean Code tutorial and NOT Gtest/Gmock tutorial.

1. Testing and mocking in a nutshell

Let's talk about some basic concepts associated with unit testing and mocking. Unit testing is testing on methods/functions level. Every single unit test usually contains part with preparing input data for method (Given), actual testing scenario (When) and checking results with assertion sequence (Then). Suppose we have some function foo with this prototype:
bool foo(Nam nam);
Test cases for foo may look like:
void test_foo_return_true()
{
   // Given:
   Nam nam(...); //prepare nam 

   // When:
   bool foo_result = foo(nam);

   // Then:
   ASSERT_TRUE(foo_result);
}
void test_foo_return_false()
{
   // Given:
   Nam nam(...); //prepare nam

   // When:
   bool foo_result = foo(nam);

   // Then:
   ASSERT_FALSE(foo_result);
}
Now we should explain why unit testing is important and extra effort put in creating tests will repay. Let's see advantages of unit testing:
  • set of well written unit tests is core concept of Clean Code
  • with set of working tests we can get rid of fear of changing/refactoring our project. Every refactoring iteration ends with production code state in which all tests pass
  • with set of working tests we can apply TDD (Test-driven development) and many short iteration cycles test-refactor-run
  • tests are form of documentation[1]
Now it's time on mocking. Mocking is replacing real objects by objects with the same interface but another functionality. Take a look on previous example with some details revealed and tests rewritten:
bool foo(Nam nam)
{
   if (nam.bar())
      return quk()
}
void test_foo_bar_return_true()
{
   EXPECT_CALL(mock_nam, bar)
      .Return(true);

   foo(mock_nam);
   ... 
}
void test_foo_bar_return_false()
{
   EXPECT_CALL(mock_nam, bar)
      .Return(false);

   foo(mock_nam);
   ... 
}
What's going on here? We have function foo again and we want to cover foo in 100% so we must force nam.bar() to return true in one test and false in another. It can be challenging in some cases. For example preparing appropriate nam state may be time consuming. Instead we will inject mock_nam to foo and will call mocked bar returning value explicitly set in EXPECT_CALL.

Why mock?  
  • to simplify testing. As I said before sometimes creating real resource representing real operation like database/network connection or reading file may be painful and expensive. Replacing those objects by mocks may simplify testing and decrease its execution time
  • to speed up prototyping. If we are stuck with sub-optimal design changing complex class hierarchy in general is not easy. Adding mocked class instead full-featured is way easier task
  • to decrease dependencies so instead relying on external libraries/frameworks using hand-written mocked calls may better choice
  • to help designing system
Of course there is some disadvantage of mocking. It's not free. Mocking may be really time-consuming if tested logic is non-trivial. Mocking (in gmock) may force us to reimplementing/refactoring some parts of code only for introducing interfaces and allowing DI. You will see those issues in next paragraphs.

2. Clean Code concepts

I used "Clean Code" term many times without explaining what it is. Now it's the moment.
Definition of Clean Code is not easy. It is quite broad mix of philosophy and methodologies created by Robert C. Martin known as Uncle Bob and customized for object oriented design. Concepts related with Clean Code:
  • good names in code, short functions/methods, avoiding comments, simple classes/structures, good formatting. These are probably the most basic rules, according Uncle Bob: "Clean code is simple and direct. Clean code reads like well-written prose".
  • SOLID and design pattern. These are important tools preventing code smells and fragility in object oriented code
  • suit of tests and Test-driven development
  • clean architecture[2]. The most high level part of Clean Code says that every object oriented system should contain few layers. It' s recommended to have the top layers based on user use cases and domain concepts. Those layers should be independent of frameworks/libraries which are only pluggable tools

3. Google Test and Google Mock

Not much to say. Gtest and Gmock are probably the best testing frameworks for C++ because:
  • they have many features: from basic to very advanced[3]
  • they are portable and platform independent. Support for Linux, Mac OS X, Windows, and more. Moving coffee machine project from Windows 7 to Ubuntu 14.04 was immediate and painless
  • have good failures reporting with providing much information about problems in runtime
  • easy to start  
  • good support and some community (http://googletesting.blogspot.com/)

4. Coffee machine design

In previous paragraphs we presented basic tools which will be used to solve our task. I believe this is the right moment when we can define it. We would like to create an application to simulate real coffee machine. It should provide functionalities like:
  • ingredients and menu settings
  • coffee choosing
  • coin insertion
  • coffee receiving
  • taking change
For simplicity I assumed machine was synchronous and one-threaded. Every object oriented system is use case driven in his nature, so good candidate for first step in design would be prepared with use case diagram[4]. In the name of time saving I will show only one use case. Use case diagram is great tool to show "big picture" of system without any details like modules or classes.


From coffee machine's point of view we have 2 actors (implementation doesn't reflect this separation actually) and 6 possible use cases which correspond to features posted above.With set of well defined use cases designing system class hierarchy is straightforward. In first iteration one use case may map on one method which produces following simple hierarchy expressed in UML:




Lets summarize our results.  We have in this moment one big class called coffee_machine with well exposed interface. Unfortunately every single change in this class methods or inside some field will cause recompilation of entire class. Extending functionality and maintenance of such coffee machine will be a nightmare. It' s obvious that there are too many responsibilities for one class, which clearly breaks SRP[5]. The best what we can do now is splitting functionality into 2 categories: user interface functions and resource management functions, which leads to coffee_machine and coffee_machine_manager classes respectively.


Take a look again on current hierarchy on the left. Coffee making proccess takes place in coffee machine manager which have access to all ingredients and recipes.
Coffee_machine itself expose user interface and is responsible for cash managing. No inheritance and one aggregation relation (by manager field[10]). Not bad. Although this hierarchy is much better than previous but still have many design problems. I wouldn't be myself if I wouldn't mention about it.
  • lack of base class problem. What if we want to have many types of coffee machines with different work characteristics but common methods? To achieve that every separate coffee machine class will be copy of another with small changes. DRY[6] will be broken.
  • lack of "degrees of freedom" problem. What if we want to exchange coffee machine manager or switch to another ingredients storage data structure? Or hold user_credit data structure on another machine and transfer it by network connection? In those situations we can't rely on simple std::map. To achieve that we must rewrite our coffee machine with many switch statements with spaghetti fashion or add some kind of enum field with type. DRY will be broken. OCP will be as well. It's even worse because if client decide to use only one type of ingredients storage with one type of user credit then lots of code will be redundant and YAGNI will be broken. Life is hard.
All above issues originate from one important reason. It is fragility problem. Fragility is one of design smells. It occurs when tiny change of a system code in some place gives big changes in many other places. Everything is too rigid in this moment. As in first approach with one class hierarchy as here maintenance will be challenging.
Just to make it clear. In real life this design is not so bad but for complex system those violations would cause code bloat and in result fear of change because of fragility.


OK. There are many issues, is it possible to solve them all? Yes, it is. Clean Code is remedy.
  • as I said in second paragraph good object oriented software should have one or more abstract layers which exposes plugin system for concrete classes. 
  • to overcome problem with rigidity of dependencies we introduce layer of interfaces (abstract_coffee_machine, abstract_coffee_machine_manager and abstract_storage. I know, I know. I should use word interface not abstract, language purists may complain). Thanks to dynamic polymorphism we will be able to inject concrete class with custom implementation. 
  • notice that we have dependency inversion (DIP is applied) in this moment. Source code dependency is opposite to runtime dependency.
  • now our main classes (coffee_machine and coffee_machine_manager) derive from interfaces. There is an extra generic (template) interface - abstract_storage<Key, Value> for replacing concrete std::map.
  • it is worth to mention that introducing extra abstraction level by adding coffee_machine_manager helps to apply SLA[7]. Now coffee_machine may handle only use cases without operating on concrete data structures like map, set, vector etc.

Current hierarchy is very close to final. Anyway, if we want to test our implementation carefully we need mocking. Therefore we add one mock class per every class. In Google Mock mocks should derive from class we want to mock. Without abstract bases mocking wouldn't be possible because call may be mocked only if it is virtual[8]. 



5. Implementation

I haven't much to say about implementation because there isn't anything special:) Coffee_machine_manager code is trivial. The most interesting part is probably coffee_machine::compute_change_with_greedy_approach where I used this approach http://en.wikipedia.org/wiki/Change-making_problem. Notice how C++11 features like auto, std::initializer_list, nullptr and std::shared_ptr affect on code readability. 
Take a look on some method implementation, for example get_available_coffees:
std::map<std::string,unsigned> coffee_machine::get_available_coffees() const
{
   const auto& ingredients = manager->get_ingredients();
   const auto& coffees = manager->get_coffees();
   std::map<std::string, unsigned> result;

   for (auto coffee : coffees)
   {
      auto& recipe_for_coffee = manager->get_recipe(coffee);
      bool available = check_ingredients_availability(recipe_for_coffee, ingredients);
      if (available)
         result.insert( { coffee, manager->get_price(coffee) } );
   }
   return result;
}
We can see here clean code techniques in action:
  • code is self-explaining. We have good, descriptive names of methods and variables like ingredients or get_recipe. With those names additional comments in code are unnecessary. With those names I haven't much to say about how does this code works ;)
  • short methods. Uncle Bob says that method longer than 3-4 lines is too long. I believe methods not bigger than my hand with no more than double nested statement (like if/for/while) are good enough. Every nested loop should be delegated to separated method (in this case I extracted for loop to check_ingredients_availability)
  • order of methods is important (like order public/private in class header). There are public methods on top and lower level methods with details on bottom. After opening source code file developer see right method implementation. If he want he can scroll down to take a look on details 
  • const-correctness
Let's consider clean code influence on performance. Some people may think that organising code as set of many tiny methods may have bad impact on performance. Notice that today the most important thing about x86/x64 speed code is memory access politics not level of code indirection. Nowadays compilers perform aggressive inlining so many small functions are actually joined into few big one. Even performance gurus like Martin Thompson[9] confirms that clean code means better speed because we actually can see what happen. Anyway, benchmarking and generated code observation is always the best what we can do and if clean code really degrade performance in some parts of system we can rewrite those parts and violate clean code rules.

6. Tests

This paragraph is the most important part of whole post. Programming with Gmock is not piece of cake and lots of details are getting involved so I decided to split this part into 3 sections starting with simple unit test and ending with more advanced techniques. Also this is the right moment to opening coffee machine sources (link on very end of post) and following pieces of code which I demonstrate below.

6.1 Unit testing without Gmock
TEST(ut_coffee_machine_manager, add_recipe)
{
    // Given:
    coffee_machine_manager manager;
    std::set<std::string> coffees = {"cappucino"};

    // When:
    manager.add_recipe("cappucino", 250u, cappucino);

    // Then:
    EXPECT_EQ(coffees, manager.get_coffees());
    EXPECT_EQ(250u, manager.get_price("cappucino"));
    EXPECT_EQ(cappucino, manager.get_recipe("cappucino"));
}
Take a look on test case above. This test is quite simple that's why I will limit myself to short comments:
  • want to test coffee_machine_manager::add_recipe. We use approach Given/When/Then presented in paragraph 1. It is black box test, we set input and check output. Coffee_machine_manager::add_recipe implementation is very simple so mocking would be useless
  • EXPECT_EQ behaves like normal assertion but doesn't abort tests and therefore allow more than one failures to be reported
  • EXPECT_EQ is not symmetric what surprise many people. Google Test failures reporting works better if we put expected value first and actual value as second argument
  • in this case we just call add_recipe and check if all values were actually added to manager

6.2 Unit testing with Gmock

Until we don't use mocks everything is easy and boring. Coffee machine implementation is more complex than coffee machine manager so mocks are needed. Notice mocking require knowledge about tested class implementation so all coffee machine tests are actually white box tests.
Take a look on test case below:
TEST_F(coffee_machine_fixture, get_available_coffees_empty)
{
    // Given:
    reset_mocks();
    expect_index_operators_from_coffee_machine_constructor();

    const coffee_machine lcoffee_machine(mock_manager, user_credit_);
    const std::set<std::string> expected_coffees;
    const std::map<std::string unsigned> expected_coffees_and_price;

    // When:
    expect_getter_calls_from_get_available_coffees(expected_coffees);
    auto actual_coffees_and_price = lcoffee_machine.get_available_coffees();

    // Then:
    EXPECT_EQ(expected_coffees_and_price, actual_coffees_and_price);
}
Above test case tests coffee_machine::get_available_coffees. There is get_available_coffees implementation in previous chapter but I will put it again to see what actually is going on:
std::map<std::string,unsigned> coffee_machine::get_available_coffees() const
{
   const auto& ingredients = manager->get_ingredients();
   const auto& coffees = manager->get_coffees();
   std::map<std::string, unsigned> result;

   for (auto coffee : coffees)
   {
      auto& recipe_for_coffee = manager->get_recipe(coffee);
      bool available = check_ingredients_availability(recipe_for_coffee, ingredients);
      if (available)
         result.insert( { coffee, manager->get_price(coffee) } );
   }
   return result;
}

As get_available_coffees_empty says this ut tests case when there is no available coffees. Let's look on get_available_coffees code. That situation may occur e.g when every coffee recipe demands more ingredients than there are actually leave in storage inside coffee machine manager. Another case is when there is no coffees registered in coffee machine manager at all.
As I said before one of testing goal is covering implementation code in 100% (if possible). It means that we have to force program flow in get_available_coffees to get inside loop in one test case:
for (auto coffee : coffees)
To get full coverege we must achieve this statement as well:
if (available)
Now we have get_available_coffees_empty case. How to force get_available_coffees to return empty result? Forcing manager to get empty coffees from get_coffees is one approach. Easy.

Let's look on coffee_machine_fixture.get_available_coffees_empty again. We are going to force mock_manager to return empty expected_coffees. We will use EXPECT_CALL macros as in paragraph 1. Putting our expectations and mock calls behavior takes place inside expect_getter_calls_from_get_available_coffees which looks like:
void expect_getter_calls_from_get_available_coffees(
   const std::set<std::string> &expected_coffees)
{
   EXPECT_CALL(*mock_manager, get_ingredients())
      .WillOnce(ReturnRef(*mock_storage_));

   EXPECT_CALL(Const(*mock_manager), get_coffees())
      .WillOnce(ReturnRef(expected_coffees));
} 
How does it works in short? We want to change behavior of this line:
 const auto& coffees = manager->get_coffees();
Take a look on class hierarchy in previous paragraph. Type of manager is abstract_coffee_machine_manager and abstract_coffee_machine_manager::get_coffees is (pure) virtual. Classes coffee_machine_manager and mock_coffee_machine_manager are derived from this abstract class. So behavior of get_coffees method may be changed in run-time by dynamic polymorphism. Indeed method get_coffees was overrided inside mock_coffee_machine_manager with this magic macro:
 MOCK_CONST_METHOD0(get_coffees, const std::set<std::string>&);
So now mock haves own get_coffees method which may be "programmed" by EXPECT_CALL-s from ut level. We did that by
EXPECT_CALL(Const(*mock_manager), get_coffees())
   .WillOnce(ReturnRef(expected_coffees)); 
We told our mocked get_coffees to return reference to expected_coffees which is empty set of strings. That's not end. We must dispatch in someway our mock to get_available_coffees method in run-time. We used dependency injection mechanism to inject mock_manager object to get_available_coffees method by constructor. Apart setting EXPECT_CALL-s we must check results by some kind of asserts. After calling tested code in When section by
lcoffee_machine.get_available_coffees();
we checked results by
EXPECT_EQ(expected_coffees_and_price, actual_coffees_and_price);
in Then section.
There is some extra code which may seems to be redundant. I'm talking about expect_index_operators_from_coffee_machine_constructor call. Before I will explain what does this call do I would like to ask you question. What if we comment all EXPECT_CALL-s from get_available_coffees_empty by commenting expect_index_operators_from_coffee_machine_constructor and expect_getter_calls_from_get_available_coffees ? Answers later.

6.3 More advanced unit testing

Over a dozen seconds ago you read a lot of information about how gmock handle mocking and how dependency injection and dynamic polymorphism help in that. It's great but what if we forget about mocking some call? In other words how gmock report lack of EXPECT_CALL on some method. Let's comment only one line with expect_index_operators_from_coffee_machine_constructor in coffee_machine_fixture.get_available_coffees_empty test case ad see what happen.

Gmock reports lack of EXPECT_CALL by "Uninteresting mock function call" message with name of call with arguments. In this case indexOperator method is not expected. Opposite situation take place when we add redundant EXPECT_CALL, e. g. we forget inject mock object but set EXPECT_CALL in test. Let's do that by duplicating following code in expect_getter_calls_from_get_available_coffees:
EXPECT_CALL(*mock_manager, get_ingredients())
   .WillOnce(ReturnRef(*mock_storage_));
Now we get something like that:

Gmock reports redundant EXPECT_CALL by "Unsatisfied and active" message
with name of call with arguments. In this case added before EXPECT_CALL is redundant. Nice. I persuade you to remember those 2 types of reports because there are very common and occur often.

Let's look at expect_index_operators_from_coffee_machine_constructor code:
void expect_index_operators_from_coffee_machine_constructor()
{
     static std::vector<std::pair<ECoin, unsigned>> coins_and_quantities =
         { { ECoin_10gr, 0 }, { ECoin_20gr, 0 }, { ECoin_50gr, 0 },
         { ECoin_1zl, 0 }, { ECoin_2zl, 0 }, { ECoin_5zl, 0 } };

     for (size_t i = 0;i < coins_and_quantities.size(); i++)
        EXPECT_CALL(*user_credit_, indexOperator(coins_and_quantities[i].first))
           .WillRepeatedly(ReturnRef(coins_and_quantities[i].second));
}
You may wonder why coins_and_quantities has static qualifier. There is some subtle issue associated with ReturnRef. This macro is required here because indexOperator returns reference not value[14]. Unfortunately indexOperator will be called outside expect_index_operators_from_coffee_machine_constructor (in lcoffee_machine constructor) so reference to coins_and_quantities[i].second will be used outside this function. So effect will be the same as returning by function reference to local object. Reference to destroyed object and crash...
Static is remedium because it makes lifetime of coins_and_quantities as long as lifetime of whole program.

Another interesting issue is class abstract_storage which was introduced in chapter 4 during design process. It's generic interface declared as
template<class Key, class Value>
class abstract_storage
{
public:
    virtual Value & operator[](const Key &key) = 0;
    virtual const std::shared_ptr<Value> find(const Key &key) const = 0;
    virtual std::map<Key, Value> convertToMap() const = 0;
    virtual ~abstract_storage() {}
};
and replacing concrete std::map. Motivation to writing this class was mocking operator[]. Unfortunately mocking C++ operator is not possible in gmock[11] but workaround exists[12]. I applied this solution in mock_storage.h file by adding indexOperator helper method:
MOCK_METHOD1_T(indexOperator, Value & (const Key &key));
Notice using MOCK_METHOD1_T macro for all methods because of templated type. Also take a look on public keyword in declaration line
class mock_storage : public abstract_storage<Key, Value>
Without this keyword strange things during compilation happen[13].

At the end of post I would like to put some guidelines regarding environment preparing (gtest, gmock and gcov).
  • gmock/gtest building under linux is trivial (configure + make) but be careful with with *.la objects located in lib directory.  That's not static linked libraries (*.a)! That's libtool files and do not feed you linker those files as I did. 
  • it may be surprising that you can control gtest execution behaviour by passing arguments to your binary e.g:
    ./coffee_machine --gtest_repeat=2 --gtest_shuffle --gtest_break_on_failure     
  • the most straightforward way to measure source code coverage is probably  using tool called gcov. In first step you must pass following flags to gcc: -fprofile-arcs, -ftest-coverage. Also -fprofile-arcs for linker is required. Second step is running gcov and gathering reports with detailed code coverage informations per compilation unit. For example gcov coffee_machine.cpp will create detailed report in coffee_machine.cpp.gcov.

Link to coffee machine source: https://github.com/yurai007/coffee_machine

[1] consider situation with that you have project backup containing 2 CD - one with your source code of project and the second one with unit tests sources. Which CD do you prefer to lose?
[2] look at: http://blog.8thlight.com/uncle-bob/2012/08/13/the-clean-architecture.html
[3] https://code.google.com/p/googlemock/wiki/CookBook
[4] http://www.ivarjacobson.com/Use_Case_Driven_Development/
[5] SRP is first of SOLID principle. More information: http://en.wikipedia.org/wiki/SOLID_(object-oriented_design)
[6] DRY - Don't Repeat Yourself; many people understand that only in terms of avoiding duplication of code but DRY means also avoiding duplication of effort (e.g if every functionality in your system is only in one place but in case of bug you must fix this bug in many places you clearly break DRY). Notice that if you must change something in many places (production code, tests, scripts, configuration files, database schema etc) from the same reason you violate DRY.
[7] SLA - Single Level Of Abstraction
[8] That's not actually true. There are ugly hacks for mocking non-virtual methods:
https://code.google.com/p/googlemock/wiki/CookBook#Mocking_Nonvirtual_Methods
[9] http://www.infoq.com/presentations/top-10-performance-myths
[10] It's quite interesting how relations like aggregation or composition are handled by C++. If A compose B then it means that B is part of A and A is single owner of B. Both object are created and destroyed in the same time. In C++ it is expressed that B is field of A or B is nested class in A. Aggregation is handled in different way. When A aggregate B then B is some kind of pointer field or reference field. Lifetime of B is not limited by A lifetime and B may be used outside A scope as well by many other classes. 
[11] see: http://stackoverflow.com/questions/6492664/how-to-create-a-mock-class-with-operator
[12] look at: https://code.google.com/p/googlemock/issues/detail?id=53
[13] I guess that without hint from SO about gcc message - "conversion from derived* to base* exists but is inaccessible" discovering source of real problem would be hard. Hint comes from http://stackoverflow.com/questions/10472848/conversion-from-derived-to-base-exists-but-is-inaccessible
[14] Using normal Return macro instead ReturnRef will cause compilation error with terrible long compiler message from gmock internals. BTW despite that gmock is really cool framework I must confirm that his source code is really nightmare. If you have ever heard statement that C++ is like seven legs cat ( http://yosefk.com/c++fqa/images/cat.png) but don't understand/believe take a look on gmock internals:)

No comments:

Post a Comment