1

I have to make a reporting utility which takes data from a very old and large table. My search criteria will take out say a million records at a time which later be used for some crap IO operation. I have the option of using JDBC which will give me a ResultSet or Hibernate which will give me List. I want to know will there be a performance difference between the two while iteration.

ares
  • 3,921
  • 4
  • 31
  • 54
  • 3
    Well Hibernate will need to go through the result set to build that list, so what do you think? However it's more likely that your query is the thing you should concentrate on, not the iterating of the results. – Kayaman Jul 16 '14 at 10:43
  • I did some tests on my local machine and hibernate indeed takes an awful lot of time when query.list() is called for very large number of records. I've to come up with a better solution. – ares Jul 16 '14 at 12:32
  • 2
    If the operations you'll be doing are suitable, you should perform them while iterating (i.e. in a streaming fashion). That might not be suitable, but it would be quicker than first reading all the data in the memory and then processing them. – Kayaman Jul 16 '14 at 12:34
  • Possible duplicate of [Fastest way to iterate through large table using JDBC](http://stackoverflow.com/questions/1080852/fastest-way-to-iterate-through-large-table-using-jdbc) – Joey Baruch Oct 12 '15 at 08:12

1 Answers1

2

That is dependent on:

  • Hibernate version
  • DBMS and version
  • JDBC driver and version

How it often works is: if you create your Statements with:

PreparedStatement stmt = con.prepareStatement(sql,
                                              ResultSet.TYPE_FORWARD_ONLY,
                                              ResultSet.CONCUR_READ_ONLY);

then decent DBMS / drivers will stream big queries, and the memory impact will be small, at the cost of holding the Connection for longer. If you get the List for a big query with Hibernate, it will try to load the entire result set in memory at once, and if GC kicks in, the whole thing will crawl at best and will crash at worst. So for big result sets JDBC will be the better option.

Now, if you don't actually mind having a List with Hibernate, you can work with ScrollableResults. See this question: even if the approach did not work for that particular case, that technique will work right on every DBMS/driver combination that the JDBC approach would work right (after all it is just a thin Hibernate layer over the pure JDBC approach explained above). And you also get the ORM part of Hibernate.

Community
  • 1
  • 1
gpeche
  • 20,230
  • 4
  • 32
  • 47
  • Call it a co-incident... I applied the same approach with PreapredStatement by setting Resultset as Type_forward and Concor_read. I tried ScrollableResults too with StatelessSession but it felt like forcing myself to use Hibernate, so I settled for generic JDBC. To avoid any memory issues I'm getting the records in bunches say 20 thousand at a time and then doing the IO operation. It's a little slow but eliminates the risk of OutOfMemeoyError – ares Jul 17 '14 at 13:16