4

Problem Definition:
I have a Database table with huge amount of data(more than 100,000 rows) , table structure is like

AppID  DocID  DocStatus 
1      100    0
1      101    1    
2      200    0    
2      300    1

Per applicationID there may be thousands of Documents, I have to fetch Count of the documents with status 0 and count of the documents with status 1 grouped by applicationID.

When I am mapping this object using hibernate, it will eat up lot of heap memory due to large amount of table data.

How Can I achieve this using Hibernate query? OR Should I use SQL Query or Stored Procedure for this ?

Note : My Web Application is in JAVA/Tapestry framework and using Hibernate 3. Database is SQL Server 2012.

Lukas Eder
  • 181,694
  • 112
  • 597
  • 1,319
Hars
  • 135
  • 2
  • 10
  • please post the query you are using, if you are trying to load all the object to memory it will definitely use a lot of memory – fmodos Feb 15 '14 at 07:10
  • @fmodos Yes I was trying to load all the object to memory. I am new to hibernate I have written simple queries like `Criteria crit = session.createCriteria(XYZ.class); crit.add( Restrictions.eq("DocumentStatus", 0)); criteria.setProjection(Projections.rowCount()); Integer count = criteria.uniqueResult();` But if I will use this type of queries I need to hit 2 hibernate queries to get count with status 0 and with status 1. Can I write a hibernate query to get both in a single query ? – Hars Feb 15 '14 at 16:09

1 Answers1

16

Every time you have a data-centric problem (as opposed to a Java domain-model-centric one), you should use SQL directly. Your database will be much faster than your Java code, because calculations can be performed closely to the data, instead of transferring all of it through the wire and into your memory. See also "2. Processing Data in Memory" of this blog post.

You can achieve this with JDBC directly, or with a native query, or with any third-party SQL library of your choice, such as MyBatis or jOOQ.

Your problem can be trivially solved with any of these queries:

Using GROUP BY

SELECT [AppID], [DocStatus], count(*)
FROM [MyTable]
GROUP BY [AppID], [DocStatus]

Example on SQLFiddle

Using nested selects

SELECT [AppID],
       (SELECT count(*) FROM [MyTable] [t2]
        WHERE [t1].[AppID] = [t2].[AppID]
        AND [DocStatus] = 0) [Status_0],
       (SELECT count(*) FROM [MyTable] [t2]
        WHERE [t1].[AppID] = [t2].[AppID]
        AND [DocStatus] = 1) [Status_1]
FROM [MyTable] [t1]
GROUP BY [AppID]

Example on SQLFiddle

Using SUM()

SELECT [AppID],
       SUM(IIF([DocStatus] = 0, 1, 0)) [Status_0],
       SUM(IIF([DocStatus] = 1, 1, 0)) [Status_1]
FROM [MyTable] [t1]
GROUP BY [AppID]

Example on SQLFiddle

Using PIVOT

SELECT [AppID], [0], [1]
FROM (
  SELECT [AppID], [DocStatus]
  FROM [MyTable]
) [t]
PIVOT (count([DocStatus]) FOR [DocStatus] IN ([0], [1])) [pvt]

Example on SQLFiddle

Lukas Eder
  • 181,694
  • 112
  • 597
  • 1,319
  • Yes I opted for nested SQL query for this particular situation. It worked for me. Thanks alot. – Hars Feb 18 '14 at 08:53