DBAs - Step away from production

I have worked as a DBA and also as a Sql Server developer and in many roles had full access to production, it is OK though I know what I am doing and I normally don’t make mistakes (we’ll come back to this!).

To be clear, this isn’t a moan at DBA’s, I am one and can and do make changes myself - I am trying to improve processes that I have input into. This also applies to anyone who has access to production, ops teams tweaking config settings take note!

I butted my way into a conversation on twitter that ended up being about source control and how sometimes developers make changes that remove indexes and such.

Generally I would say that this is because the index isn’t in source control if that is being used or just not on the development database, I image this scenario:

Developer prepares a release and in doing so:

  • Developer does a schema compare of either a dev database or source control to a production database
  • Developer gets script and runs it into a QA environment
  • Developer gives it a smoke test and hands it to QA
  • QA tests and says it is good
  • The Developer hands the release off to the ops team (including the dba)

The Ops team deploy the update:

  • Code blah blah is deployed
  • DB Scripts are run

The update is live and performance sucks, really bad.

DBA diagnoses issue and puts back index that he put on table weeks ago.

The update is now fast again, wahoo, cheers all round.

Who is at fault here? The developer who deleted the index or the DBA who put the index directly on a production table weeks ago and not putting it on the other environments or in source control?

Let’s look at another scenario, what would happen if the developer decided that he wanted to make a change, he decided that instead of encrypting credit card numbers he could get much better performance if he didn’t. He also decided that it would be more convenient to keep the credit card numbers handy so he appended them to the user names in memory (he forgot that everyone’s usernames are displayed on the site). He wrote his code changes and decided that because he knew what he was doing he would just deploy straight to production, remember he knows what he is doing so it will be fine.

It doesn’t take long for customers to start complaining that a bunch of random numbers are messing with the formatting on the site and so forth - eeek.

I would say that this scenario is unlikely because developers, very rarely these days, deploy straight to production without any constraints.

Back to our DBA, he is looking at another performance issue and thinks that he can fix it by deleting an index, he accidentally also deletes a foreign key and in doing so introduces data corruption.

Why can the DBA cause data corruption but the developer can’t deploy code?

DBA’s shouldn’t make code changes, whether that be schema changes, changes to t-sql code or modifying data without going through development and QA - if you use source control then the change should be checked-in and migrated through the deployment process.

Wait - we’re DBA’s we know what we are doing and sometimes stuff is urgent. This is true, the problem with data is that data is a living thing, it grows, morphs and changes at the whim of users or other systems. Sometimes things do need to change and change quickly so have a process for going straight to production but then all changes MUST go back through source control and the deployment process otherwise they will get lost and can’t be QA’d.

To finish off, I said that I have had full access to production and I have made changes that have been perfectly fine, saving the day in many situations but if I look back honestly, I have also made mistakes, all recoverable but mistakes none the less and mitigating it by wrapping changes in production isn’t foolproof in itself.