I've been thinking a lot about software achitecure lately. Not just thinking, because I wouldn't come up with these ideas on my own, but consuming a lot about it -- books, talks, slide decks, blog posts. And while thinking about all this, I've been hacking away at some projects in my spare time. And I noticed something, there's a lot of things in these projects that look a lot like this:


In [ ]:
@app.route('/register', methods=['GET', 'POST'])
def register():
    form = RegisterUserForm()
    
    if form.validate_on_submit():
        user = User()
        form.populate_obj(user)
        db.session.add(user)
        db.session.commit()
        return redirect('homepage')
    
    return render_template('register.html', form=form)

This is a bog standard user registration endpoint. We create a form, check if it's valid, shove that information on a user model and then into the database and redirect off. If it's not valid or if it wasn't submitted (the user just navigated to the page), we render out some HTML.

It's all very basic, well trodden code. Besides, who wants to do registration again? It's boring. We want to do the interesting stuff. But there's some very real consequences to this code:

It's not testable

Everything is wrapped up together, form validation, database stuff, rendering. Honestly, I'm not interested in testing if SQLAlchemy, WTForms of Jinja2 work -- they have their own tests. So testing this ends up looking like this:


In [ ]:
@mock.patch('myapp.views.RegisterUserForm')
@mock.patch('myapp.views.db')
@mock.patch('myapp.views.redirect')
@mock.patch('myapp.views.url_for')
@mock.patch('myapp.views.render_template')
def test_register_new_user(render, url_for, redirect, db, form):
    # TODO: Write test
    assert True

What's even the point of this? We're just testing if Mock works at this point. There's actual things we can do to make it more testable, but before delving into that,

It hides logic

If registering a user was solely about, "Fill this form out and we'll shove it into a database" there wouldn't be a blog post here. However, there is some logic hiding out here in the form:


In [ ]:
class RegisterUserForm(Form):
    def validate_username(self, field):
        if User.query.filter(User.username == field.data).count():
            raise ValidationError("Username in use already")
    
    def validate_email(self, field):
        if User.query.filter(User.email == field.data).count():
            raise ValidationError("Email in use already")

When we call RegisterUserForm.validate_on_submit it also runs these two methods. However, I'm not of the opinion that the form should talk to the database at all, let alone run validation against database contents. So, let's write a little test harness that can prove that an existing user with a given username and email causes us to not register:


In [ ]:
from myapp.forms import RegisterUserForm
from myapp.models import User

from collections import namedtuple

from unittest import mock

FakeData = namedtuple('User', ['username', 'email', 'password', 'confirm_password'])

def test_existing_username_fails_validation():
    test_data = FakeData('fred', 'fred@fred.com', 'a', 'a')
    UserModel = mock.Mock()
    UserModel.query.filter.count.return_value = 1
    form = RegisterUserForm(obj=test_data)
    
    with mock.patch('myapp.forms.User', UserModel):
        form.validate()
    
    assert form.errors['username'] == "Username in use already"
    
def test_existing_email_fails_validation():
    test_user = FakeUser('fred', 'fred@fred.com', 'a', 'a')
    UserModel = mock.Mock()
    UserModel.query.filter.first.return_value = True
    form = RegisterUserForm(obj=test_user)
    
    with mock.patch('myapp.forms.User', UserModel):
        form.validate()
    
    assert form.errors['username'] == "Email in use already"

If these pass -- which they should, but you may have to install mock if you're not on Python 3 -- I think we should move the username and email validation into their own callables that are independently testable:


In [1]:
def is_username_free(username):
    return User.query.filter(User.username == username).count() == 0

def is_email_free(email):
    return User.query.filter(User.email == email).count() == 0

And then use these in the endpoint itself:


In [ ]:
@app.route('/register', methods=['GET', 'POST'])
def register():
    form = RegisterUserForm()
    
    if form.validate_on_submit():
        if not is_username_free(form.username.data):
            form.errors['username'] = ['Username in use already']
            return render_template('register.html', form=form)
        
        if not is_email_free(form.email.data):
            form.errors['email'] = ['Email in use already']
            return render_template('register.html', form=form)
        
        user = User()
        form.populate_obj(user)
        db.session.add(user)
        db.session.commit()
        return redirect('homepage')
    
    return render_template('register.html', form=form)

This is really hard to test, so instead of even attempting that -- being honest, I spent the better part of an hour attempting to test the actual endpoint and it was just a complete mess -- let's extract out the actual logic and place it into it's own callable:


In [ ]:
class OurValidationError(Exception):
    def __init__(self, msg, field):
        self.msg = msg
        self.field = field

def register_user(username, email, password):
    if not is_username_free(username):
        raise OurValidationError('Username in use already', 'username')
    
    if not is_email_free(email):
        raise OurValidationError('Email in use already', 'email')
    
    user = User(username=username, email=email, password=password)
    db.session.add(user)
    db.session.commit()
    
    
@app.route('/register', methods=['GET', 'POST'])
def register_user_view():
    form = RegisterUserForm()
    
    if form.validate_on_submit():
        try:
            register_user(form.username.data, form.email.data, form.password.data)
        except OurValidationError as e:
            form.errors[e.field] = [e.msg]
            return render_template('register.html', form=form)
        else:
            return redirect('homepage')
        
    return render_template('register.html', form=form)

Now we're beginning to see the fruits of our labors. These aren't the easiest functions to test, but there's less we need to mock out in order to test the actual logic we're after.


In [ ]:
def test_duplicated_user_raises_error():
    ChasteValidator = mock.Mock(return_value=False)
    
    with mock.patch('myapp.logic.is_username_free', ChasteValidator):
        with pytest.raises(OurValidationError) as excinfo:
            register_user('fred', 'fred@fred.com', 'fredpassword')
    
    assert excinfo.value.msg == 'Username in use already'
    assert excinfo.value.field == 'username'

def test_duplicated_user_raises_error():
    ChasteValidator = mock.Mock(return_value=False)
    PromisciousValidator = mock.Mock(return_value=True)
    
    with mock.patch('myapp.logic.is_username_free', PromisciousValidator),
         mock.patch('myapp.logic.is_email_free', ChasteValidator):
        with pytest.raises(OurValidationError) as excinfo:
            register_user('fred', 'fred@fred.com', 'fredpassword')
    
    assert excinfo.value.msg == 'Email in use already'
    assert excinfo.value.field == 'email'

def test_register_user_happy_path():
    PromisciousValidator = mock.Mock(return_value=True)
    MockDB = mock.Mock()
    
    with mock.patch('myapp.logic.is_username_free', PromisciousValidator),
         mock.patch('myapp.logic.is_email_free', ChasteValidator), 
         mock.patch('myapp.logic.db', MockDB):
    
        register_user('fred', 'fred@fred.com', 'freddpassword')
    
    assert MockDB.commit.call_count

Of course, we should also write tests for the controller. I'll leave that as an exercise. However, there's something very important we're learning from these tests. We have to mock.patch everything still. Our validators lean directly on the database, our user creation leans directly on the database, everything leans directly on the database. And I don't want to do that, we've found that it makes testing hard. We're also seeing if we need to add another registration restriction -- say we don't like people named Fred so we won't let anyone register with a username or email containing Fred in it -- we need to crack open the register_user function and add it directly. We can solve both of these problems.

The Database Problem

To address the database problem we need to realize something. We're not actually interested in the database, we're interested in the data it stores. And since we're interested in finding data rather than where it's stored at, why not stuff an interface in the way?


In [3]:
from abc import ABC, abstractmethod

class AbstractUserRepository(ABC):
    
    @abstractmethod
    def find_by_username(self, username):
        pass
    
    @abstractmethod
    def find_by_email(self, email):
        pass
    
    @abstractmethod
    def persist(self, user):
        pass

Hmm...that's interesting. Since we'll end up depending on this instead of a concrete implementation, we can run our tests completely in memory and production on top of SQLAlchemy, Mongo, a foreign API, whatever.

But we need to inject it into our validators instead of reaching out into the global namespace like we currently are.


In [10]:
def is_username_free(user_repository):
    def is_username_free(username):
        return not user_repository.find_by_username(username)
    return is_username_free

def is_email_free(user_repository):
    def is_email_free(email):
        return not user_repository.find_by_email(email)
    return is_email_free

These validators are simple enough that closures work instead of full-fledged objects. The important part here is to maintain a consistent interface -- if we need to use classes all of a sudden, we need to define a __call__ on them to maintain this interface.

We can also change our register callable to accept the repository as well:


In [12]:
def register_user(user_repository):
    email_checker = is_email_free(user_repository)
    username_checker = is_username_free(user_repository)
    
    def register_user(username, email, password):
        
        if not username_checker(username):
            raise OurValidationError('Username in use already', 'username')

        if not email_checker(email):
            raise OurValidationError('Email in use already', 'email')

        user = User(username=username, email=email, password=password)
        user_repository.persist(user)
        
    return register_user

Of course the tests break now, and that's okay. We made a very sweeping change to the architecture here. We need to go back through and alter the tests one by one, but instead of patching everything out we can do something better: Dependency Injection.


In [15]:
def test_duplicated_email_causes_false():
    fake_user_repository = mock.create_autospec(AbstractUserRepository)
    fake_user_repository.find_by_email.return_value = True
    checker = is_email_free(fake_user_repository)
    
    assert not checker('fred@fred.com')
    
def test_duplicated_username_causes_false():
    fake_user_repository = mock.create_autospec(AbstractUserRepository)
    fake_user_repository.find_by_username.return_value = True
    checker = is_username_free(fake_user_repository)
    
    assert not checker('fred')


def test_register_user_happy_path():
    fake_user_repository = mock.create_autospec(AbstractUserRepository)
    fake_user_repository.find_by_email.return_value = False
    fake_user_repository.find_by_username.return_value = False
    registrar = register_user(fake_user_repository)
    
    registrar('fred', 'fred@fred.com', 'fredpassword')
    
    assert fake_user_repository.persist.call_count

But to test that our validators function correctly in this context, we need to fake out find_by_email and find_by_username indpendently. This is a symptom of our code not being Open-Closed.

The Open-Closed Problem

Revisiting the other major issue from how the code is laid out right now is that it's not Open-Closed. If you're not familiar with the principle, Wikipedia says this:

"software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification"

Or in a different way, "You should be able to change functionality without editing existing code." -- I believe I need to credit Sandi Metz with this, but I'm not sure. We've actually already used this idea by injecting the User Repository. In tests, we inject a fake or in memory repository, but in production it can be a SQLAlchemy implementation, or maybe wrap that up into a caching repository. We can do the same thing with the validators.


In [16]:
def register_user(user_repository, validator):
    def registrar(username, email, password):
        user = User(username, email, password)
        validator(user)
        user_repository.persist(user)
    return registrar

Of course, our tests break again, so let's revisit the currently breaking one first:


In [21]:
def test_register_user_happy_path():
    fake_user_repository = mock.create_autospec(AbstractUserRepository)
    registrar = register_user(fake_user_repository, lambda user: None)
    
    registrar('fred', 'fred@fred.com', 'fredpassword')
    
    assert fake_user_repository.persist.call_count
    
def test_register_user_fails_validation():
    fake_user_repository = mock.create_autospec(AbstractUserRepository)
    fake_validator = mock.Mock(side_effect=OurValidationError('username in use already', 'username'))
    registrar = register_user(fake_user_repository, fake_validator)
    
    try:
        registrar('fred', 'fred@fred.com', 'fredpassword')
    except OurValidationError as e:
        assert e.msg == 'username in use already'
        assert e.field == 'username'
    else:
        assert False, "Did not Raise"

We'll need to tweak the validation logic some to make up for the fact that we're passing the whole user object now:


In [4]:
def validate_username(user_repoistory):
    def validator(user):
        if not user_repoistory.find_by_username(user.username):
            raise OurValidationError('Username in use already', 'username')
        return True
    return validator

def validate_email(user_repoistory):
    def validator(user):
        if not user_repoistory.find_by_email(user.email):
            raise OurValidationError("Email in use already", 'email')
        return True
    return validator

The tests for these are pretty straight forward as well, so I'll omit them. But we need a way to stitch them together...


In [3]:
def validate_many(*validators):
    def checker(input):
        return all(validator(input) for validator in validators)
    return checker

And then hook it all up like this:


In [ ]:
validator = validate_username(validate_email(user_repository), validate_username(user_repository))
registrar = register_user(user_repository, validator)

Our neglected Controller

We've spent a lot of time looking at how to compartmentalize the registration logic and portion out its concerns. However, the controller itself needs some attention as well. When we last left, it looked like this:


In [ ]:
@app.route('/register', methods=['GET', 'POST'])
def register_user_view():
    form = RegisterUserForm()
    
    if form.validate_on_submit():
        try:
            register_user(form.username.data, form.email.data, form.password.data)
        except OurValidationError as e:
            form.errors[e.field] = [e.msg]
            return render_template('register.html', form=form)
        else:
            return redirect('homepage')
        
    return render_template('register.html', form=form)

But we can do beter than that. The problem here is that the logic is set in stone, nested flows of control. But mostly, I really like any excuse to use class based views.


In [ ]:
class RegisterUser(MethodView):
    def __init__(self, form, registrar, template, redirect):
        self.form = form
        self.registrar = registrar
        self.template = template
        self.redirect = redirect
    
    def get(self):
        return self._render()
    
    def post(self):
        if self.form.validate_on_submit():
            return self._register()
        else:
            return self._render()
    
    def _register(self):
        try:
            self.registrar(self.form.username.data, self.form.email.data, self.form.password.data)
        except OurValidationError as e:
            self._handle_error(e)
            self._render()
        else:
            return self._redirect()
    
    def _render(self):
        return render_template(self.template, self.form=form)

    def _redirect(self):
        return redirect(url_for(self.redirect))
    
    def _handle_error(self, e):
        self.form.error[e.field] = [e.msg]

Now that looks like a lot of code. However, each piece is much simpler than the original function. This speaks to handing out actions, as if it were controlling things. We can test the main logic of it as well. Even though we should test it, I might just leave it alone. Maybe run it through an acceptance test.

What did we gain?

Everything is much more high level. The controller, validation, registration, even the form. Nothing's concerned with more than it needs to be. Sure, there's still some plumbing to do with the SQLAlchemy implementation of the UserRepository