
I have a problem with a legacy code base at my new company.

Basically, they have 2 big code bases (projects, or assemblies), which I will call assembly A and assembly B. We also have the test assemblies Test A and Test B.

The problem is that assembly Test A uses some classes from assembly Test B, and assembly Test B uses some classes from assembly Test A. On top of that, assembly Test A uses some classes from assembly B. It is really nasty, I know. That is why, although technically we have 2 separate projects, we actually put them in one package. So you can imagine that every time we deploy A, we have to deploy B as well.

My task is to decouple those assemblies, splitting them into 2 separate assemblies/projects, so that future deployment and testing become easier.

I came up with a strategy that is slow but should let us keep the functionality intact. My initial strategy is to slowly remove the dependencies on assembly B from test assembly A and make the tests in test assembly A pass. It means I am removing the dependencies on assembly B from assembly A. The next step is to do the same with assembly B, and finally we will have 2 completely separate assemblies.

I believe it is not an easy task, but I don't mind working on it. However, I need your opinion or advice on my initial strategy. I don't want to risk having to start over in the middle of this kind of project.

So could you give me your honest opinion or any advice on this task?

I highly appreciate that.

Thanks

NinjaCoder
  • Sounds messed up. I would prefer to see a dependency diagram, yet that's beside the point. Have you heard of technical debt? That code has a lot of technical debt. Have you heard of people getting money from one credit card to pay another credit card? That is the only idea I have. You can separate the projects by duplicating code, and then you would have to pay by refactoring out any unnecessary code or any internal code duplication in the already separated projects. – Theraot May 26 '17 at 14:39
  • Sounds like you need a 3rd project, "TestCommon". – Bradley Uffner May 26 '17 at 14:40
  • @Theraot You are absolutely correct to mention technical debt. I really like the credit card example as well (Y) :D. To be honest with you, I know that I will have duplicated code in both assemblies, but they live in two different domains, so I hope that it will be fine. It is a really messy code base. However, it is what it is and we should make it better. I will try to find the dependency diagram and let you know, but I think you can imagine what the problem is now. Do you have any other ideas? Cheers. – NinjaCoder May 26 '17 at 14:44
  • @Bradley: This is what I am thinking now, but it might be for a later step. The simplest and most important step, I thought (correct me if I am wrong), is to split them first; once I get familiar with the code bases, I can find the common parts and put them into a Test Common. What do you think? Cheers – NinjaCoder May 26 '17 at 14:46
  • I can see some merit in both approaches, but if it were me doing it, I would go straight to the common code. This is really a personal choice though. I really dislike the idea of having duplicate code, even if it is only temporary. – Bradley Uffner May 26 '17 at 14:49
  • @Bradley: Thanks for that idea. It was my first thought when I had a look at the code. However, one of my colleagues told me that it is not a good idea in this circumstance: he explained that having both assemblies access the database through common entities can cause problems for the system, which I have not understood yet. You know, I just joined the company and was confronted with a big task which I guess no one wants to work on, hehe... but let me discuss it with them again. – NinjaCoder May 26 '17 at 14:57
  • @Theraot: Do you have any other ideas? Many thanks. – NinjaCoder May 26 '17 at 15:23
  • https://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/product-reviews/0131177052 has some initial guidance... Overall the question is probably too broad for SO. – Alexei Levenkov May 26 '17 at 15:24
  • Side note: check out https://meta.stackoverflow.com/questions/260776/should-i-remove-fluff-when-editing-questions for guidance on "thank you" notes in posts. – Alexei Levenkov May 26 '17 at 15:25
  • @AlexeiLevenkov: Thanks for the book. I have read it already but will review it again. Sorry for the very vague question. The reason is that I cannot upload the code or anything like that. But I believe that you will understand the situation here. Cheers. – NinjaCoder May 26 '17 at 15:26
  • Yes, it is clear you need advice and guidance, but the situation here is that you asked a very broad, opinion-based question on SO. It may be more on-topic on https://softwareengineering.stackexchange.com/ (or it has likely already been discussed many times there; make sure to search and check their help->tour before moving the question there) – Alexei Levenkov May 26 '17 at 15:31
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/145218/discussion-between-ninjacoder-and-alexei-levenkov). – NinjaCoder May 26 '17 at 15:35
  • @AlexeiLevenkov when referring to other sites, it is often helpful to point out that [cross-posting is frowned upon](https://meta.stackexchange.com/tags/cross-posting/info) – gnat May 26 '17 at 15:49

1 Answer


Honestly, I find the description a bit confusing. Thus, I will take it bit by bit.

Basically, they have 2 big code bases/projects or assemblies which I would say assembly A and assembly B.

<code>A</code>, <code>B</code>

Also, we are having assembly Test A and assembly Test B.

I'll assume that Test A depends on A and Test B depends on B:

<code>Test A</code> depends on <code>A</code>, <code>Test B</code> depends on <code>B</code>

Test A is using some classes from assembly Test B

<code>Test A</code> depends on <code>A</code>, <code>Test B</code> depends on <code>B</code>, <code>Test A</code> depends on <code>Test B</code>

Test B is using some classes from assembly Test A

<code>Test A</code> depends on <code>A</code>, <code>Test B</code> depends on <code>B</code>, <code>Test A</code> depends on <code>Test B</code>, <code>Test B</code> depends on <code>Test A</code>

Test A is using some classes from assembly B

<code>Test A</code> depends on <code>A</code>, <code>Test B</code> depends on <code>B</code>, <code>Test A</code> depends on <code>Test B</code>, <code>Test B</code> depends on <code>Test A</code>, <code>Test A</code> depends on <code>B</code>


Ok, I can reason about that graph:

  • A and B are sinks (leaves if you prefer). In theory, you can deploy them separately. I'm guessing you deploy Test A with A and Test B with B and that is causing problems. Quick fix: Do not deploy tests.

  • There is a cycle: Test A -> Test B -> Test A
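
Both observations can be checked mechanically once the graph is written down. The following is only a minimal Python sketch: the edges are the ones listed in the question, while the `deps` dictionary and the `find_cycle` helper are hypothetical names made up for this illustration, not part of any project in the post.

```python
# Edge X -> Y means "X depends on Y". Assembly names come from the question;
# the data structure and helper below are illustrative assumptions.
deps = {
    "Test A": ["A", "Test B", "B"],
    "Test B": ["B", "Test A"],
    "A": [],
    "B": [],
}


def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Back edge found: slice out the loop and close it.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Sinks are the nodes with no outgoing dependencies.
sinks = sorted(name for name, out in deps.items() if not out)
cycle = find_cycle(deps)
print("sinks:", sinks)
print("cycle:", cycle)
```

With the real project references fed in instead of this hand-written dictionary, the same check could be rerun after every refactoring step to confirm the cycle is gone.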

Originally I made some remarks on Test A -> Test B -> B and Test A -> B. I'm scrapping those, as they are not relevant for this situation.


The cyclic dependency Test A -> Test B -> Test A is the main concern. How to fix it is an economics question.

I was saying in the comments that you may allow code duplication, which you would have to refactor later. That was just the first thing that came to mind. Doing that would increase technical debt, but it allows you to achieve the separation in less time. If you need to separate the projects right now, I would say you do this. If there is time, you can be more pragmatic.

To fix the cyclic dependency you need to understand that Test A does not need Test B, instead Test A only needs a subset of Test B. Similarly Test B only needs a subset of Test A.

Furthermore, we can say that both Test A and Test B need this subset. Borrowing from Bradley Uffner's comment, I'll call it TestCommon.

You may start by creating an empty TestCommon assembly, adding it as a dependency of both Test A and Test B, and then moving code there. Eventually you will be able to remove the dependencies from Test A to Test B and from Test B to Test A. There is a chance that a cyclic dependency would continue to exist inside of TestCommon (without a detailed dependency diagram, I cannot tell). Regardless, extracting TestCommon will result in code that is easier to maintain and reuse.


Extracting TestCommon will take you from this:

<code>Test A</code> depends on <code>Test B</code>, <code>Test B</code> depends on <code>Test A</code>

To this:

<code>Test A</code> depends on <code>TestCommon</code>, <code>Test B</code> depends on <code>TestCommon</code>

Full diagram:

<code>Test A</code> depends on <code>A</code>, <code>Test B</code> depends on <code>B</code>, <code>Test A</code> depends on <code>B</code>, <code>Test A</code> depends on <code>TestCommon</code>, <code>Test B</code> depends on <code>TestCommon</code>

This is better. Now you can deploy Test B and B (and TestCommon) without major problems. Yet, you still need to deploy B with Test A because it references it directly.

There is a link that goes from Test A to B. I will assume the more nuanced case, which is where Test A uses B (I'll explore the other cases later). Following the same approach that we used to extract TestCommon, we need to introduce another assembly. I will call it Utility (I do not know what is in there, just that Test A needs it). It will allow you to go from this:

<code>Test A</code> depends on <code>B</code>, <code>Test B</code> depends on <code>B</code>

To this:

<code>Test B</code> depends on <code>B</code>, <code>Test B</code> depends on <code>Utility</code>, <code>Test A</code> depends on <code>Utility</code>, <code>B</code> depends on <code>Utility</code>


At this point, we have separated the project successfully. We can see in the full diagram that it is possible to deploy Test A and A without Test B and B, and vice versa.

Full diagram:

<code>Test A</code> depends on <code>A</code>, <code>Test B</code> depends on <code>B</code>, <code>Test B</code> depends on <code>Utility</code>, <code>Test A</code> depends on <code>Utility</code>, <code>B</code> depends on <code>Utility</code>, <code>Test A</code> depends on <code>TestCommon</code>, <code>Test B</code> depends on <code>TestCommon</code>

To deploy A:

<code>Test A</code> depends on <code>A</code>, <code>Test A</code> depends on <code>Utility</code>, <code>Test A</code> depends on <code>TestCommon</code>

To deploy B:

<code>Test B</code> depends on <code>B</code>, <code>Test B</code> depends on <code>Utility</code>, <code>B</code> depends on <code>Utility</code>, <code>Test B</code> depends on <code>TestCommon</code>
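
The two deployment sets are simply the transitive dependencies of each test assembly in the full diagram. As a rough sketch (Python, with a hypothetical `closure` helper and a hand-written `final_graph` dictionary that mirrors the diagram; neither exists in the projects themselves), they can be computed like this:

```python
# Full diagram after extracting TestCommon and Utility.
# Edge X -> Y means "X depends on Y". The dictionary is an assumption
# that mirrors the diagram in this answer.
final_graph = {
    "Test A": ["A", "Utility", "TestCommon"],
    "Test B": ["B", "Utility", "TestCommon"],
    "B": ["Utility"],
    "A": [],
    "Utility": [],
    "TestCommon": [],
}


def closure(graph, root):
    """Return root plus everything it depends on, directly or indirectly."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


deploy_a = closure(final_graph, "Test A")
deploy_b = closure(final_graph, "Test B")
shared = deploy_a & deploy_b  # assemblies that ship with both sides

print("deploy A:", sorted(deploy_a))
print("deploy B:", sorted(deploy_b))
print("shared:", sorted(shared))
```

Under these assumptions the only overlap between the two sets is Utility and TestCommon, which is what makes the independent deployments possible.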


What is Utility?

Remember that Utility was originally part of B. It is the part of B that we need to make Test A work. I do not know much about Utility, yet I know what I do not know...

A. I do not know if Test A uses or tests Utility, I assume it uses it.

B. I do not know if Test B uses or tests Utility

C. I do not know whether B needs Utility or not

Let us explore the possible cases:

  1. Test A uses Utility, Test B uses Utility, B needs Utility

    Utility must be some utility code that everybody uses to make things easier. A defect in Utility would be a defect in both projects. Make tests for it in a new Test Utility.

  2. Test A uses Utility, Test B uses Utility, B does not need Utility

    Utility is only needed for testing. Merge it into TestCommon.

  3. Test A uses Utility, Test B tests Utility, B needs Utility

    This is the most nuanced of all cases...

    B and Test A both still need this code... this code serves two masters, Test A and B. This is not wrong. You can leave it as it is.

    There is a risk that in the future you may want to introduce a change in Utility for one of these projects, but that change introduces a bug in the other.

    Right now the tests for Utility are in Test B, which the people working on A may not be paying attention to. Therefore separating Test Utility from Test B is a good idea.

    If, in the future, the requirements of Test A and B differ too much, there is a chance that they will need two different solutions (at which point you can merge those into their respective projects).

  4. Test A uses Utility, Test B tests Utility, B does not need Utility

    Remove the dependency B has on Utility and separate the tests for Utility from Test B into a new Test Utility.

  5. Test A tests Utility, Test B uses Utility, B needs Utility

    Test A should not be testing code from B. Migrate that code to Test B and remove the dependency from Test A to Utility. You can now merge Utility back into B.

  6. Test A tests Utility, Test B uses Utility, B does not need Utility

    Remove the dependency B has on Utility and separate the tests for Utility from Test A into a new Test Utility.

  7. Test A tests Utility, Test B tests Utility, B needs Utility

    The tests for Utility are split in two. Extract them and put them in a new Test Utility.

  8. Test A tests Utility, Test B tests Utility, B does not need Utility

    Utility is not being used. You may remove it along with its tests.
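
The eight cases form a small decision table over the three unknowns A, B and C. Purely as a summary aid, here is a Python sketch of that table; the `recommend` function and its parameter names are made up for this illustration, and the strings are shortened paraphrases of the advice given for each case.

```python
def recommend(test_a_tests: bool, test_b_tests: bool, b_needs: bool) -> str:
    """Look up the suggested action for one of the eight cases.

    test_a_tests / test_b_tests: True if that test assembly *tests* Utility,
    False if it merely *uses* it. b_needs: True if B itself needs Utility.
    """
    table = {
        (False, False, True): "1: keep Utility shared; add a new Test Utility",
        (False, False, False): "2: merge Utility into TestCommon",
        (False, True, True): "3: keep Utility shared; split Test Utility out of Test B",
        (False, True, False): "4: drop B -> Utility; split Test Utility out of Test B",
        (True, False, True): "5: move the tests to Test B; merge Utility back into B",
        (True, False, False): "6: drop B -> Utility; split Test Utility out of Test A",
        (True, True, True): "7: gather the split tests into a new Test Utility",
        (True, True, False): "8: remove Utility along with its tests",
    }
    return table[(test_a_tests, test_b_tests, b_needs)]


# The case this answer assumed: Test A uses Utility, Test B tests it, B needs it.
print(recommend(test_a_tests=False, test_b_tests=True, b_needs=True))
```

Answering the three questions for the real code base selects exactly one row, so it is worth settling them before moving any code.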


Final note: I made the diagrams using yUML.

Theraot
  • I believe that will be the correct solution. I will discuss it with my colleagues to see how it works out, because I have little knowledge of that system, but I believe it is the right way. Many thanks for your answer, which also helps me understand how to write a good answer and question on SO. I highly appreciate that – NinjaCoder May 27 '17 at 04:42
  • Also, we run the risk of big conflicts while refactoring that messy code. As you know, it is working code, so it gets updated every day; therefore merging will cause big problems even if I work on 2 different git repositories – NinjaCoder May 27 '17 at 04:47
  • @NinjaCoder Glad to be of help. As long as you commit what you do, it can be reverted using the repositories, so do not be afraid of writing code. You may also want to consider increasing test coverage. Also, if your team does not have it, see if they can get some form of continuous integration that runs the tests on each commit. That way, if your commit results in breaking something unexpected, you will know as soon as possible. The longer it takes to fix a defect, the more expensive it is; therefore an affordable continuous integration solution is a good investment. – Theraot May 27 '17 at 10:28
  • I have investigated the code further and I might have found the reason why TestB uses the A assembly. These are integration tests, and the A assembly has several services that are used to communicate with the database. In the B assembly, we need to use those services in order to perform the tests. – NinjaCoder Jun 01 '17 at 07:46
  • You can imagine that we have posts and child posts; both kinds of post entity are stored in the same database table. So class A has some database services to get parent posts and class B has some services to get child posts; therefore we have a TestDataHelper class that lives inside the TestB assembly and uses both class A and class B. That creates strong dependencies. – NinjaCoder Jun 01 '17 at 07:46
  • In this case, which is the best way to remove the dependencies? My initial thought is to make some fake A classes/services in TestB, because we have to accept duplicated code at this stage. What do you think? – NinjaCoder Jun 01 '17 at 07:48
  • @NinjaCoder Welcome to the world of [mocking](https://stackoverflow.com/questions/3622455/what-is-the-purpose-of-mock-objects). Basically, yes, use fake objects. There is a series of similar [patterns used for testing](http://xunitpatterns.com/Mocks,%20Fakes,%20Stubs%20and%20Dummies.html). Doing that will allow you to convert your integration tests to unit tests. Note that the integration tests could be using your concrete classes in ways that you aren't testing otherwise. You may want to add new tests to compensate for that. See [code coverage](https://msdn.microsoft.com/en-us/library/dd537628.aspx). – Theraot Jun 01 '17 at 15:56
  • @NinjaCoder Your mock or fake objects do not need to be temporary. That is, there is a chance that you don't need to return to integration testing. That said, unit tests can't catch everything; there are [emergent behaviors](https://en.wikipedia.org/wiki/Emergence) that only integration tests will catch, such as threading and performance problems. Consider a library to ease writing mocks (such as [moq](https://github.com/Moq/moq4)). Also consider using and referencing [interfaces instead of concrete classes](https://stackoverflow.com/a/44155493/402022), yet you need to discuss that with the team. – Theraot Jun 01 '17 at 16:03