On observing an execution tree of interdependent models in MVC

Question

I've developed on the Yii Framework for a while now (4 months), and so far I have encountered some issues with MVC that I want to share with experienced developers out there. I'll present these issues by listing their levels of complexity.

[Level 1] CR(create update) form. First off, we have a lot of forms. Each form itself is a model, so each has some validation rules, some attributes, and some operations to perform on the attributes. In a lot of cases, each of these forms does both updating and creating records in the db using a single active record object.
-> So at this level of complexity, a form has to

when opened,
- be able to display the db-friendly data from the db in a human-friendly way
- be able to display all the form fields with the attributes of the active record object. Adding, removing, altering columns from the db table has to affect the display of the form.
when saves, be able to format the human-friendly data to db-friendly data before getting the data
when validates, be able to perform basic validations enforced by the active record object, it also has to perform other validations to fulfill some business rules.
when validating fails, be able to roll back changes made to the attribute as well as changes made to the db, and present the user with their originally entered data.

[Level 2] Extended CR form. A form that can perform creation/update of records from different tables at once. Not just that, whether a form would create/update of one of its records can sometimes depend on other conditions (more business rules), so a form can sometimes update records at table A,B but not D, and sometimes update records at A,D but not B -> So at this level of complexity, we see a form has to:

be able to satisfy [Level 1]
be able to conditionally create/update of certain records, conditionally create/update of certain columns of certain records.

[Level 3] The Tree of Models. The role of a form in an application is, in many ways, a port that let user's interact with your application. To satisfy requests, this port will interact with many other objects which, in turn, interact with many more objects. Some of these objects can be seen as models. Active Record is a model, but a Mailer can also be a model, so is a RobotArm. These models use one another to satisfy a user's request. Each model can perform their own operation and the whole tree has to be able to roll back any changes made in the case of error/failure.

Has anyone out there come across or been able to solve these problems?

I've come up with many stuffs like encapsulating model attributes in ModelAttribute objects to tackle their existence throughout tiers of client, server, and db.

I've also thought we should give the tree of models an Observer to observe and notify the observed models to rollback changes when errors occur. But what if multiple observers can exist, what if a node use its parent's observer but give its children another observers.

Engineers, developers, Rails, Yii, Zend, ASP, JavaEE, any MVC guys, please join this discussion for the sake of science.

--Update to teresko's response:---
@teresko I actually intended to incorporate the services into the execution inside a unit of work and have the Unit of work not worry about new/updated/deleted. Each object inside the unit of work will be responsible for its state and be required to implement their own commit() and rollback(). Once an error occur, the unit of work will rollback all changes from the newest registered object to the oldest registered object, since we're not only dealing with database, we can have mailers, publishers, etc. If otherwise, the tree executes successfully, we call commit() from the oldest registered object to the newest registered object. This way the mailer can save the mail and send it on commit.

Using data mapper is a great idea, but We still have to make sure columns in the db matches data mapper and domain object. Moreover, an extended CR form or a model that has its attributes depending on other models has to match their attributes in terms of validation and datatype. So maybe an attribute can be an object and shipped from model to model? An attribute can also tell if it's been modified, what validation should be performed on it, and how it can be human-friendly, application-friendly, and db-friendly. Any update to the db schema will affect this attribute, and, thereby throwing exceptions that requires developers to make changes to the system to satisfy this change.

score 16 · Accepted Answer · edited May 23 '17 at 12:23

The cause

The root of your problem is misuse of active record pattern. AR is meant for simple domain entities with only basic CRUD operations. When you start adding large amount of validation logic and relations between multiple tables, the pattern starts to break apart.

Active record, at its best, is a minor SRP violation, for the sake of simplicity. When you start piling on responsibilities, you start to incur severe penalties.

Solution(s)

Level 1:

The best option is the separate the business and storage logic. Most often it is done by using domain object and data mappers:

Domain objects (in other materials also known as business object or domain model objects) deal with validation and specific business rules and are completely unaware of, how (or even "if") data in them was stored and retrieved. They also let you have object that are not directly bound to a storage structures (like DB tables).

For example: you might have a LiveReport domain object, which represents current sales data. But it might have no specific table in DB. Instead it can be serviced by several mappers, that pool data from Memcache, SQL database and some external SOAP. And the LiveReport instance's logic is completely unrelated to storage.
Data mappers know where to put the information from domain objects, but they do not any validation or data integrity checks. Thought they can be able to handle exceptions that cone from low level storage abstractions, like violation of UNIQUE constraint.

Data mappers can also perform transaction, but, if a single transaction needs to be performed for multiple domain object, you should be looking to add Unit of Work (more about it lower).

In more advanced/complicated cases data mappers can interact and utilize DAOs and query builders. But this more for situation, when you aim to create an ORM-like functionality.

Each domain object can have multiple mappers, but each mapper should work only with specific class of domain objects (or a subclass of one, if your code adheres to LSP). You also should recognize that domain object and a collection of domain object are two separate things and should have separate mappers.

Also, each domain object can contain other domain objects, just like each data mapper can contain other mappers. But in case of mappers it is much more a matter of preference (I dislike it vehemently).

Another improvement, that could alleviate your current mess, would be to prevent application logic from leaking in the presentation layer (most often - controller). Instead you would largely benefit from using services, that contain the interaction between mappers and domain objects, thus creating a public-ish API for your model layer.

Basically, services you encapsulate complete segments of your model, that can (in real world - with minor effort and adjustments) be reused in different applications. For example: Recognition, Mailer or DocumentLibrary would all services.

Also, I think I should not, that not all services have to contain domain object and mappers. A quite good example would be the previously mentioned Mailer, which could be used either directly by controller, or (what's more likely) by another service.

Level 2:

If you stop using the active record pattern, this become quite simple problem: you need to make sure, that you save only data from those domain objects, which have actually changed since last save.

As I see it, there are two way to approach this:

Quick'n'Dirty

If something changed, just update it all ...

The way, that I prefer is to introduce a checksum variable in the domain object, which holds a hash from all the domain object's variables (of course, with the exception of checksum it self).

Each time the mapper is asked to save a domain object, it calls a method isDirty() on this domain object, which checks, if data has changed. Then mapper can act accordingly. This also, with some adjustments, can be used for object graphs (if they are not to extensive, in which case you might need to refactor anyway).

Also, if your domain object actually gets mapped to several tables (or even different forms of storage), it might be reasonable to have several checksums, for each set of variables. Since mapper are already written for specific classes of domain object, it would not strengthen the existing coupling.

For PHP you will find some code examples in this ansewer.

Note: if your implementation is using DAOs to isolate domain objects from data mappers, then the logic of checksum based verification, would be moved to the DAO.
Unit of Work

This is the "industry standard" for your problem and there is a whole chapter (11th) dealing with it in PoEAA book.

The basic idea is this, you create an instance, that acts like controller (in classical, not in MVC sense of the word) between you domain objects and data mappers.

Each time you alter or remove a domain object, you inform the Unit of Work about it. Each time you load data in a domain object, you ask Unit of Work to perform that task.

There are two ways to tell Unit of Work about the changes:
- caller registration: object that performs the change also informs the Unit of Work
- object registration: the changed object (usually from setter) informs the Unit of Work, that it was altered
When all the interaction with domain object has been completed, you call commit() method on the Unit of Work. It then finds the necessary mappers and store stores all the altered domain objects.

Level 3:

At this stage of complexity the only viable implementation is to use Unit of Work. It also would be responsible for initiating and committing the SQL transactions (if you are using SQL database), with the appropriate rollback clauses.

P.S.

Read the "Patterns of Enterprise Application Architecture" book. It's what you desperately need. It also would correct the misconception about MVC and MVC-inspired design patters, that you have acquired by using Rails-like frameworks.

Thousands thanks to your detailed reply. I hope we can keep this discussion going. I have posted my reponse to you as an update of the original post. — Thomas Vo, Nov 22 '12 at 07:42
StackOverflow is a Q&A site, not a discussion forum. Also, you completely missed the point of what I wrote. — tereško, Nov 22 '12 at 08:56
Sorry, i may need more experience to better understand your opinion. I'll read the book as you have said then. Maybe after that i'll have some more basics to this. — Thomas Vo, Nov 22 '12 at 15:54
The PoEAA books does not really contain "basics" for OOP. For that I would recommend to start by watching all the lectures listed at the bottom of [this answer](http://stackoverflow.com/a/9855170/727208). — tereško, Nov 22 '12 at 18:18
Thanks!! exactly what I neeed. I've been search all over for these kinds of lectures. — Thomas Vo, Nov 23 '12 at 02:02
Ok assuming that we have separated storage logic and domain logic, domain object and object graph, some problems still haven't been solved: when a another attribute gets added into a domain object, we need to update both the data mapper and the db with that field, and update the form with that field. Not only that, we have to update the validation rules for that field, and display the errors to the user when validation fails. — Thomas Vo, Nov 23 '12 at 06:26
Great answer as always, @tereško. I'm not completely sure the Domain Object should know about its "dirtiness", though (re: the `isDirty()` method you mentioned). I think it's not its responsibility, but rather the Data Mapper or UoW's. Any thoughts? — ishegg, Nov 20 '17 at 21:50
@ishegg that's why I called it "quick and dirty" approach. It's the pragmatic solution, if you have one of maybe two such object trees, that you need to persist, When your code grows beyond that, the UoW approach becomes a lot more sustainable. Then again, these days I have a lot simpler solution for this problem - just don't make huge object trees. They are not worth it. — tereško, Nov 20 '17 at 22:58