Our database has a very large table with several million rows. Some of the code that touches it was written naively by me several years ago and doesn't handle poor database performance gracefully. The schema itself was also designed at a point when my real-world knowledge was limited.
This table (by necessity of the use-case) accumulates large numbers of superfluous rows that should (I think) be cleaned up over time. Eventually I'll write routines to do this cleanup automatically on a daily basis (thus spreading out the cost).
In the meantime however, I need to come up with an optimization approach. I don't have any experience writing indexes for a database, so I think I'm going to have to do some learning there.
My approach thus far has been to delete the superfluous rows in batches, like so:
SET NOCOUNT ON;
DECLARE @r INT = 1;

WHILE @r > 0
BEGIN
    BEGIN TRANSACTION;

    DELETE TOP (100)
    FROM [Shift Offer]
    WHERE offer_timestamp IS NULL
      AND callout_id_fk < 18605
      AND call_round_id_fk IS NULL;

    SET @r = @@ROWCOUNT;

    COMMIT TRANSACTION;
END
Previously I had TOP (1000), which caused some pretty serious performance degradation.
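A variant I'm considering keeps the small batch size but pauses briefly between batches so other sessions can get their locks in; the one-second delay is an arbitrary starting point I haven't benchmarked, not a recommendation:

```sql
SET NOCOUNT ON;
DECLARE @r INT = 1;

WHILE @r > 0
BEGIN
    -- Each DELETE TOP (...) statement is its own implicit transaction,
    -- so locks are released between batches.
    DELETE TOP (100)
    FROM [Shift Offer]
    WHERE offer_timestamp IS NULL
      AND callout_id_fk < 18605
      AND call_round_id_fk IS NULL;

    SET @r = @@ROWCOUNT;

    -- Give concurrent readers/writers a window between batches.
    WAITFOR DELAY '00:00:01';
END
```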
After doing some reading, I saw that viewing the execution plan can give some insight here. From the plan, I think the problem with this query is that maintaining the clustered index is what's making the delete (write) operations slow.
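To put numbers behind what the plan suggests, I've been meaning to measure one batch with STATISTICS IO/TIME on a test copy of the table (not production); something like:

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run a single batch and check the 'logical reads' and elapsed-time
-- figures in the Messages output to see how much work one delete does.
DELETE TOP (100)
FROM [Shift Offer]
WHERE offer_timestamp IS NULL
  AND callout_id_fk < 18605
  AND call_round_id_fk IS NULL;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```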
The table is denormalized such that it's not doing a ton of joins (if any) during read or update operations. Each chunk of the data (defined by callout_id_fk) is only worked on for a few days at most, and is then kept purely for record-keeping purposes.
As the table has grown, though, some performance issues have arisen. One of them I was actually able to reproduce when I accidentally degraded performance with my delete operation. So that tells me we certainly need to do some database tuning here, in addition to writing the application code to be a little more robust about handling bad performance.
So I'm left with these questions. Is deleting the offending superfluous rows a bad approach? Could performance be improved by putting some thought into indexing the offending table ourselves (rather than letting Azure's automatic tuning handle the indexing)? Should I do both: delete rows and create indexes?
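For context, the kind of index I have in mind is a filtered nonclustered index matching the delete's WHERE clause, so the delete can seek straight to the candidate rows instead of scanning the clustered index. The index name is just illustrative, and I haven't verified this is the right design for our workload:

```sql
-- Hypothetical filtered index covering the cleanup predicate.
-- The callout_id_fk < 18605 bound changes over time, so it stays
-- out of the filter and becomes the index key instead.
CREATE NONCLUSTERED INDEX IX_ShiftOffer_CleanupCandidates
ON [Shift Offer] (callout_id_fk)
WHERE offer_timestamp IS NULL
  AND call_round_id_fk IS NULL;
```

My understanding is that a filtered index like this stays small (it only contains the not-yet-cleaned rows) and so should add relatively little write overhead to the rest of the workload, but I'd welcome correction on that.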
Lastly, should I drop the indexes, run the delete operation, and then recreate each index? I'm unsure whether dropping the indexes would exacerbate the performance issue while this operation runs, so I'm curious what other folks think is a good approach here.