
I have a column that I would like to store a lot of text data in (XML data). Approx 8,000 chars per row, and about 100-500 rows per minute.

That much data means I will have to purge the column out fairly aggressively. (Since I have to host my SQL Server on our company's SAN, storage space is quite expensive.) But if I can find a way to compress this data down, I can keep it around longer.

I have seen things like this article on using CLR Integration to compress BLOBs in SQL 2005.

I have also seen the tool SQLCompress.NET for SQL Server 2005.

Both seem to say they do exactly what I want: compress the data of a single column while it is stored.

However, the tool has been abandoned (no updates since 2008) and I don't know much about CLR Integration, except I have heard that it can cause problems. Also, both of these solutions are for SQL Server 2005.

So, here is my question. I am using SQL Server 2008 R2. Will either of these SQL Server 2005 solutions work well for me?

Or is there another solution that I can use to compress my data?

NOTE: Row Compression and Page Compression will not help with what I need (at least as far as I can see). Row Compression stores fixed-length data in variable-length fields, and Page Compression reduces instances of redundant data. Neither of these will help with large blocks of text.

NOTE II: I saw this question, but its answer uses row and page compression or FILESTREAMs. I don't want to use FILESTREAMs because I lose the ability to mirror my database.

Vaccano
  • What are you using to write the data to (and read data from) SQL Server? Some C# application? – Chris Shain Jan 27 '12 at 18:41
  • @ChrisShain - I am using a C# application (WCF/NServiceBus hosted in IIS) to write to it. I use SSMS to read it. – Vaccano Jan 27 '12 at 19:28
  • 1
    Considering the current two answers, could someone comment on "I don't know much about CLR Integration, except I have heard that it can cause problems"? I have recently used CLR integration myself (without knowing much about it, I admit), and there's been no problems. – bfavaretto Jan 27 '12 at 19:42
  • I cannot really answer your question, but an alternative that will always work is to have the client app compress the data. You would store a varbinary(max) blob. And you can use any compression library that you want including the strong ones like 7-zip (LZMA SDK). Those will not work inside SQL Server I guess. – usr Jan 27 '12 at 18:52
  • I had thought of that. But then I cannot query the data using SSMS. I will always need a custom application to view the data. Still, that would work and I may end up doing that, if I cannot find a better way. – Vaccano Jan 27 '12 at 19:29
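The client-side approach usr suggests can be sketched as follows. This is a Python stand-in for whatever compression library the actual C# client would use (the sample XML and all names here are illustrative); the point is that a repetitive ~8,000-character XML document shrinks dramatically when compressed before being written to a varbinary(max) column:

```python
import zlib

# An ~8,000-character XML payload; real documents with repeated tag
# names compress well because of their redundancy.
xml_doc = ("<record><name>example</name><value>42</value></record>" * 150)[:8000]

# Compress on the client before writing to a varbinary(max) column.
compressed = zlib.compress(xml_doc.encode("utf-8"), level=9)

# Decompress on the client after reading the column back.
restored = zlib.decompress(compressed).decode("utf-8")

assert restored == xml_doc
print(f"original: {len(xml_doc)} bytes, compressed: {len(compressed)} bytes")
```

The trade-off is exactly the one raised in the comments: once the bytes in the column are compressed, SSMS can no longer display the data directly, so a custom tool (or a decompression step) is needed to view it.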

2 Answers


I think your best bet is to use a client library to compress and decompress the data before you put it into SQL Server. If you want to query by particular elements or attributes of the XML, you can extract those and store them in separate columns or normalized rows (which you'd want to do anyway: querying large XML text columns, especially for nested elements, is slow).
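A minimal sketch of that idea, again with Python as a stand-in for the client code (the `<id>`/`<status>` element names and the promoted columns are hypothetical): parse the document once, lift the queryable fields into ordinary columns, and store the full document compressed.

```python
import zlib
import xml.etree.ElementTree as ET

doc = ("<order><id>123</id><status>shipped</status>"
       "<payload>" + "x" * 500 + "</payload></order>")

# Promote the elements you need to query into ordinary columns...
root = ET.fromstring(doc)
order_id = root.findtext("id")      # would go in an INT column
status = root.findtext("status")    # would go in a VARCHAR column

# ...and store the whole document compressed in a varbinary(max) column.
blob = zlib.compress(doc.encode("utf-8"))

row = {"order_id": order_id, "status": status, "xml_blob": blob}
```

Queries then hit the plain columns, and the compressed blob is only decompressed when the full document is actually needed.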

Chris Shain
  • I don't need to query inside of the XML, just get the whole XML out in an SSMS query. (Just as if it were normal text data.) I would rather not compress at the client side because then I cannot use SSMS to see the data. Is column level compression using CLR integration a bad idea? Is this uncommon territory for SQL Server? – Vaccano Jan 27 '12 at 19:51
  • Went with the client compressing and decompressing. Not as nice as column compression in SQL Server, but I will take what I can get. – Vaccano Feb 10 '12 at 15:58

You could try using FILESTREAM to store the XML docs and let NTFS compress them.

See this article:

Using Filestream in SQL2008

  • 1
    Alas, when you use FILESTREAM you lose the ability to mirror your database. That is a feature that my DBAs need. Otherwise I would probably choose this solution. – Vaccano Feb 10 '12 at 15:57