0

How should Reference Data be implemented in State Based Continuous Database Deployment solutions?

The typical State Based solution I have come across is to use a series of post deploy SQL scripts that truncate reference data Tables and then proceed to INSERT reference data one line at a time.

The issue with this typical solution is that environments can sometimes be changed by people outside of development and these changes need to be persisted. As a result a lot of manual comparisons are required to identify these changes during deploy cycles and less and less trust is put into automated deployments.

Are there any alternative solutions to this typical implementation that cater to situations where environments may have important changes that are not in source control?

-- Example of typical solution

-- drop existing data
TRUNCATE myReferenceData

-- insert all data to produce specific state
INSERT INTO myReferenceData (Col1, Col2) Values ("one", 1)
INSERT INTO myReferenceData (Col1, Col2) Values ("two", 2)
-- etc 
Arkiliknam
  • 1,765
  • 1
  • 17
  • 34
  • 1
    What is your question here? You haven't asked anything. – Larnu Jul 14 '20 at 11:34
  • "Are there good alternative solutions for managing Reference Data in State Based Continuous Deployment set ups?" – Arkiliknam Jul 14 '20 at 11:35
  • I meant what was your specific Programming Question, which you don't ask. That's just a recommendation question, which is off topic for Stack Overflow. – Larnu Jul 14 '20 at 11:37
  • I'm asking if there are other accepted ways to manage Reference Data in State Based Continuous Delivery scenarios. The "typical solution" is the typical answer in other stack overflow questions, so I want to differentiate my question from those. – Arkiliknam Jul 14 '20 at 11:40
  • @Larnu I'm trying to reword the question, as its something I would very much like answers to and I'd rather not have the question down voted and closed prematurely. Does it ask a legitimate question now? – Arkiliknam Jul 14 '20 at 12:26

1 Answers1

1

I think that post-deployment scripts are the best option (assuming you are using SSDT, not RedGate).
To maintain reference data I usually use following pattern:

  1. Create a temp table with information which has to be always overwritten from source control.
  2. Use a merge statement to update production data.

This way you can preserve some columns or rows added outside of source control.

Also it is very important to ensure that everybody understand which tables are maintained by deployment scripts and can't be updated manually outside the source control. So if someone don't follow the process, blame on them. You need to clearly define which tables can't be updated outside of development.

Piotr
  • 2,753
  • 1
  • 6
  • 15
  • I was also thinking of using a combination temp tables and merge statements to apply reference data without necessarily deleting live data (after reading this article https://sqlplayer.net/2019/10/script-deploy-the-data-for-database-from-ssdt-project/). I'll spend a bit more time looking at this as a solution. Any helpful tips on how you've achieved this? – Arkiliknam Jul 15 '20 at 10:20
  • I would focus on clear communication what can be changed outside development. – Piotr Jul 15 '20 at 16:55