The Reification Abstraction

One of the areas I enjoy is problem domain modeling. Developers have a tendency to skip this and immediately plunge into solutions. This works for them when the problem is understood…which is true far less often than they think.

Reification is a term used in many domains, but it’s easy to show it in problem modeling — and it’s often known (though not named) so if it seems familiar, excellent. (If you have experience in databases, it’s a part of normalization as well.)

To show an example, I’d like to model something that is not nearly as simple as it sounds, and use conceptual reification to address it.

What is reification?

Reification means “to make an abstraction concrete.” The definition in Wikipedia is fine, but I’m going to show its use instead of belaboring its formality.

What’s the simple problem?

Here’s a simple problem: two people wish to get married (I’m not going to worry about the social aspects, only the modeling).

If person A and person B want to get married, what do we need to know?

A key aspect of modeling is “lifecycle.” In design, this is the CRUD step (Create, Read, Update, Delete) with the definition of the constructors and such. In this level of model, let’s just focus on the conceptual bits. What is the lifecycle of marriage?

We need to know WHO gets married to WHOM and WHEN. We could ask where, too, if we needed to track that. But, for now, just the participants and the date are enough to start this example.

Let’s be concrete! Two people, A and B. A and B get married on June 5, 2021.

So, we have two persons. A Person represents something concrete, so we can use it immediately. Let’s capture the above:

A person who can be married

That’s a class diagram; it defines any number of Persons (I’ve labeled it v1 so I don’t mix it up with later versions in this document). A class diagram lets us have any number of particular concrete instances, or objects, so here are the example’s objects:

Two persons married

This … well, it works. But it’s clunky. Why?

The problem with v1

There is actually no marriage here at all. There’s a relationship between two objects (named MarriedTo) but it’s not something that can be addressed directly. It has no identity of its own at all. Instead, it’s a bit of duplicated information and a convention.

The duplicated information (the MarriedOn date) is really awful. If it were entered wrong on one of them, the data wouldn’t even agree. Even if we did this on a whiteboard, or by using simple index cards, having to write the date on two cards would bring this point up.
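To make the duplication concrete, here is a minimal Python sketch of v1. The names (`PersonV1`, `married_to`, `married_on`) are my stand-ins for the diagram, and the mistyped date is invented purely to show the failure; treat it as an illustration of the problem, not a design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PersonV1:
    """v1: each person carries their own copy of the marriage facts."""
    name: str
    married_to: Optional["PersonV1"] = None
    married_on: Optional[date] = None


a = PersonV1("A")
b = PersonV1("B")

# The same fact has to be written twice, once on each person.
a.married_to, a.married_on = b, date(2021, 6, 5)
b.married_to, b.married_on = a, date(2021, 6, 15)  # hypothetical typo: 15, not 5

# Nothing in the model can say which of the two dates is right.
dates_agree = a.married_on == b.married_on
print(dates_agree)  # False
```

The two "cards" now disagree, and there is no third place to look for the truth.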

This is a typical starting point. It’s not that it’s a bad starting point, but it’s a poor ending point.

What does it take to make it better?

Addressing v1 problems

The duplication of the date is bad, but it’s really a symptom. The problem is “There is actually no marriage here at all.” That’s what is really missing. We have a relationship that doesn’t have its own entry as a box.

Without that entry, it’s difficult to capture marriage information. I used only MarriedOn, but if even one more attribute is sought, it would have to be duplicated as well. Even in normal use, we talk about the word “marriage.”

But what does this actually mean? I can’t “hold” a marriage. It doesn’t have mass, or volume. It’s a concept. Being married is a defined relationship (that is the only part this software worries about, to keep it simple; not the legal, religious, or social aspects). It’s not physical. It’s … abstract.

In order to address the v1 problem, let’s make the conceptual “real.” Let’s make it into its very own thing. Let’s reify it.

The v2 start where Marriage is a thing

Here’s what happens when a Marriage box is added to the diagram:

v2 class diagram

Some initial notes, before we put it to the test by examples.

The Person v2 (just “Person”) now has information related only to itself — the name.

The Marriage v2 (just “Marriage”) has the date MarriedOn. It’s also responsible for the two Persons who are married. Let’s make this real by extending our sample to show this …

v2 object example

The Marriage instance carries what is known to the Marriage. Yes, it improves DRY (Don’t Repeat Yourself) but that’s not all that it does. It provides an explicit concrete instance of the concept for this particular case. This object represents the marriage of A and B on June 5, 2021. It’s explicit. It can be talked about precisely. It can also be used in other relationships.
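As a rough sketch of the same idea in Python (the field names `partners` and `married_on` are my assumptions about how the diagram would translate, nothing more):

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Person:
    """v2: a person knows only about itself."""
    name: str


@dataclass
class Marriage:
    """v2: the reified relationship, with its own identity and its own data."""
    partners: Tuple[Person, Person]
    married_on: date


a = Person("A")
b = Person("B")

# One object, one date: the marriage of A and B on June 5, 2021.
marriage_ab = Marriage(partners=(a, b), married_on=date(2021, 6, 5))
print(marriage_ab.married_on.isoformat())  # 2021-06-05
```

The date now lives in exactly one place, and `marriage_ab` is something other parts of the model can point at.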

For instance, we can add “DivorcedOn” and end the marriage.

It also has immediate negative consequences. It allows bigamy (person C is also married to person B with a different Marriage card):

Bigamy

Isn’t it nice we can find problems really early?

The bigamy problem is interesting because it’s not necessarily a negative. What the drawn box model does is enable us to raise a few questions with our client because certain weird things can be manifested into view.

In the United States, bigamy is illegal. And yet, if someone breaks the law, our system could end up detecting it.

We could have a rule that we record a LegalViolation if there’s an overlap in Marriage relationships for a given person between MarriedOn and DivorcedOn (oh, now we really also need a DivorcedOn — we’re learning).

In this way, instead of our system rejecting an “illegal” situation, it accepts it (because it’s data we have) but it indicates a problem.
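Purely to illustrate the rule, here is a small Python sketch of "accept the data, but flag the overlap." Everything in it is an assumption for the example: the names `LegalViolation`, `overlaps`, and `find_violations`, the choice to treat a missing `divorced_on` as "still married," the choice to treat the divorce date as the first day the marriage no longer holds, and the date of the C and B marriage.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Marriage:
    partners: Tuple[Person, Person]
    married_on: date
    divorced_on: Optional[date] = None  # assumption: None means still married


@dataclass(frozen=True)
class LegalViolation:
    person: Person
    first: Marriage
    second: Marriage


def overlaps(m1: Marriage, m2: Marriage) -> bool:
    """True if both marriages are in force on at least one common day."""
    end1 = m1.divorced_on or date.max
    end2 = m2.divorced_on or date.max
    return m1.married_on < end2 and m2.married_on < end1


def find_violations(marriages: List[Marriage]) -> List[LegalViolation]:
    """Record a violation for each person who appears in two overlapping marriages."""
    violations = []
    for m1, m2 in combinations(marriages, 2):
        if not overlaps(m1, m2):
            continue
        for person in set(m1.partners) & set(m2.partners):
            violations.append(LegalViolation(person, m1, m2))
    return violations


a, b, c = Person("A"), Person("B"), Person("C")
ab = Marriage((a, b), date(2021, 6, 5))
cb = Marriage((c, b), date(2022, 1, 10))  # illustrative date, not from the example

violations = find_violations([ab, cb])
print([v.person.name for v in violations])  # ['B']
```

Both Marriage records are kept; the system simply reports that B is in two of them at once. Whether that is the right behavior is still a question for the client.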

Again, this is purely at the problem level, not the design level. We’re still purely in the realm of Person and Marriage. We’re exploring the problem space with this example; we’re not programming yet.

What do we do with this knowledge?

I have no idea — and neither do YOU! What we need to do is raise it with the customer/client/Subject Matter Expert (SME) etc. In other words, exploration of the problem leads us to ask questions.

Often our questions will be met with a bewildered stare and “Huh. I don’t know.” That’s fine…it’s not our job during analysis to solve their domain. It’s our job to understand their domain! That understanding has to reflect their understanding. If we substitute our own, we’re adding assumptions instead of capturing their expertise. We will have made a catastrophic mistake.

Additional things we can discover

What about plural marriages, where there are more than two people?

Again, this might be illegal (in the United States, it is, though some groups seek to do it and even use non-marriage structures to arrange it).

What would it take to enable the change, to allow plural marriages?

Changing the multiplicity from 2 to N would be simple … it would also be limiting enough to almost certainly be wrong. Why?

Bad solution v3 for plural marriage

Without drawing out the objects, here’s a simple narrative scenario:

  1. A and B get married on June 5, 2021
  2. C joins their marriage on June 12, 2021

Oh, wait … now the MarriedOn date isn’t at all correct unless we’re restricting a plural marriage to “all at once.”

We would first have to check with our experts to see if they allow a plural marriage to happen with partners coming and going over time. If not, if they only allow a single event for a group, then the above is correct. However, any time there’s a temporal relationship between multiple parties it is rare that the parties can’t come and go (though it does happen).
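The A, B, C scenario can be sketched in a few lines of Python to show where the "just change 2 to N" model runs out of room. `MarriageV3` and its fields are hypothetical names for the bad solution, not a recommendation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class Person:
    name: str


@dataclass
class MarriageV3:
    """Bad v3: N partners, but still only one date for the whole marriage."""
    married_on: date
    partners: List[Person] = field(default_factory=list)


# 1. A and B get married on June 5, 2021.
m = MarriageV3(married_on=date(2021, 6, 5), partners=[Person("A"), Person("B")])

# 2. C joins on June 12, 2021 ... but there is nowhere to write that date down.
m.partners.append(Person("C"))

print(len(m.partners), m.married_on.isoformat())  # 3 2021-06-05
```

The model now claims that C has been married since June 5, which is only true if the client restricts plural marriages to "all at once."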

Allow plural marriages with partners changing over time

We need a way to represent the period in which a person shares in the marriage. That almost sounds like a case for reification.

Actually, that’s precisely the case for reification. It creates a notion of a Marriage Member, the person who belongs to a Marriage for a particular duration.

Drawn, it looks like this:

Plural marriage with partners entering and leaving at different times

What’s funny is that we have duplication of data again, because MarriageMembers are not required to share the same anniversary. This is a consequence of dynamic membership: two members who join on the same day each carry their own copy of that date.
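Here is one way the MarriageMember idea might look as a Python sketch. The field names `joined_on` and `left_on`, the helper `members_on`, and the rule that someone no longer counts as a member on the day they leave are all my assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    name: str


@dataclass
class MarriageMember:
    """The reified membership: one person, in one marriage, for one period."""
    person: Person
    joined_on: date
    left_on: Optional[date] = None  # assumption: None means still a member


@dataclass
class Marriage:
    members: List[MarriageMember] = field(default_factory=list)

    def members_on(self, day: date) -> List[Person]:
        """People who belong to this marriage on the given day."""
        return [
            m.person
            for m in self.members
            if m.joined_on <= day and (m.left_on is None or day < m.left_on)
        ]


marriage = Marriage()
marriage.members.append(MarriageMember(Person("A"), date(2021, 6, 5)))
marriage.members.append(MarriageMember(Person("B"), date(2021, 6, 5)))
marriage.members.append(MarriageMember(Person("C"), date(2021, 6, 12)))

print([p.name for p in marriage.members_on(date(2021, 6, 8))])   # ['A', 'B']
print([p.name for p in marriage.members_on(date(2021, 6, 12))])  # ['A', 'B', 'C']
```

Notice that A and B each carry June 5 separately. This sketch also deliberately enforces no minimum number of members; how many people a Marriage needs in order to exist is a question for the client.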

On the other hand, person A might form the Marriage (perhaps paying some fee to register it), and then A and B each pay the registry for their MarriageMember entry, on the same or different days. Of course, if that’s how it works, the multiplicity is wrong (the multiplicity actually requires that the Marriage have at least two MarriageMembers to be created).

Is this legal? I don’t know! Again, the issue is that I don’t know what rules the client needs. If I say yes or no, it’s my view and not the client’s, and that’s me making the same mistake I mentioned earlier: adding my own assumptions.

Addressing what the client decides

Once we make the examples (or scenarios, or stories, or whatever term is desired for your methodology of choice) we need to capture our questions. Then it becomes necessary to present them to the client and capture what they decide.

The client might decide that marriages are only formed of two people, and that bigamy is illegal and must be caught and reported. The client might instead decide that plural marriages are crucial and make the two-person contract a “special case.” They could decide something totally different.

We capture these decisions by adding constraints — which become non-functional requirements. They might also be called rules, or conditions. They are difficult to draw in UML diagrams such as I’ve been using (they are in braces and likely to be misunderstood if written in formal language because these are meant for sharing with customers, not logicians!) so they are normally captured in “notes.”
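If it helps to see how such a note could eventually be carried forward, here is a hedged Python sketch that keeps each decision as a named constraint with its plain-language wording beside a check. The `Constraint` type, the `broken` helper, and both example rules are hypothetical; which rules actually apply is the client’s call, not the sketch’s.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Constraint:
    name: str
    note: str                                # wording agreed with the client
    holds: Callable[[Sequence[str]], bool]   # check over one marriage's member names


# Two alternative decisions a client might make (hypothetical examples).
exactly_two = Constraint(
    name="two-person-marriage",
    note="A marriage is formed of exactly two people.",
    holds=lambda members: len(members) == 2,
)
at_least_two = Constraint(
    name="plural-marriage-allowed",
    note="A marriage has two or more people.",
    holds=lambda members: len(members) >= 2,
)


def broken(constraints: List[Constraint], members: Sequence[str]) -> List[str]:
    """Names of the constraints that the given membership does not satisfy."""
    return [c.name for c in constraints if not c.holds(members)]


print(broken([exactly_two], ["A", "B", "C"]))   # ['two-person-marriage']
print(broken([at_least_two], ["A", "B", "C"]))  # []
```

The same scenario passes or fails depending on which decision was recorded, which is exactly why the decision and its wording need to be written down.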

Be sure to list all the examples and the decisions (keep listed what was rejected and why too!) to ensure that later stakeholders can see the decisions that were made as well as rejected. This provides a much better basis for discussion and ensures that the developers don’t “help” the later steps by reinterpretation.

Diagrams on their own are good, but without the various supporting examples and narratives the diagrams are never enough. The narratives are crucial. The model is the set of diagrams and narratives (and perhaps other artifacts).

Reification benefits during domain modeling

As this simple set of examples shows, ensuring that all concepts get their “own box” (or “making them first class citizens”) dramatically improves the ability to both explore the problem and represent the decisions made.

The benefit of listing all the scenarios/examples/uses extends even further. In many ways, the narrative text for all the scenarios, without the UML diagrams, would give more understanding than a quick sketch without the text. I love whiteboards, but a photo of a whiteboard is an artifact; it is not a complete model in its own right.

Also, I’ve found that people not accustomed to making and interpreting UML diagrams (or table diagrams in RDBMS) can’t make sense of abstract boxes. But they can easily and readily understand if you use physical index cards and do object (instead of class) modeling.

Object modeling on paper cards avoids all the abstraction. It’s not as easy to share (perhaps take pictures at each step) but it’s great for a discussion. I’ve gone into meetings with nothing more than a stack of blank index cards and a sharpie more than once to avoid all the computer issues. People are surprised, but without any formal training the domain experts can start making cards and adding them with the others on the table. Anything they can’t add, or that seems “wonky,” is a real problem to address.

You don’t need to tell them you are reifying, of course. You can just say you’re trying to understand their needs.

After all, “seek first to understand, then to be understood” is one of Stephen Covey’s habits of highly effective people. And in ensuring the right software is built, it’s a highly effective habit of analysts too. Of course, analysts are people, so that’s not really surprising.

Keep the Light!
Brian Jones
