9

I am finding inserting rows in table structures or in normal cells - manually or otherwise - very very slow. Like it takes more than 10 mins to insert 7 rows in a table (containing literal strings only) or in adjacent cells, in a sheet with no conditional formatting.

The workbook has 45 worksheets and 20 tables, with the bigger tables having XML files of about 10KB. There are 33MB worth of spreadsheet XMLs with most being around 300KB with 5 more than 1MB and one being 15MB. Its fairly complex but not massive. All of the calculations flow nicely from left to right up to down, right sheet to left sheet and I've mostly managed to avoid array formulas. All of the tables have regular structures, with the calculated columns having one only formula. Most of the table columns are calculated, with only a couple of smaller ones containing literal data.

I do have a lot of conditional formatting on a couple of sheets but I've been very careful to keep it rational and stopped it from fragmenting: I have about 45 rules for the whole sheet and these are generalised to cover all columns. The main processing for the formating decisions are moved into the tables as helper columns and as I said, very regular in structure.

It seems that these type of edits are not thread safe so only one processor is loading up and there is very light disc activity. I can't understand what excel is doing all that time. Of course I set calculation to manual...

I've seen comments attributing this type of thing to the increased row and column limits, but I don't understand why this should be a factor. If I look at the XML files of the spreadsheets, there is only code for rows and columns that are occupied with values or formulas. So why are the unoccupied cells in play?

This is having a massive effect on my productivity - although I'm learning a lot by reading in sites like this in my new-found spare time. I really need to figure out what the problem is so that I can avoid or work around this issue if possible.

Can anybody help me on that?

Just in case people are wondering about this, the answer is to use power query and power view in excel. I find medium (500k lines) datasets and complex structures and transformations all work without a hitch. I never use formulae in tables anymore. The other thing is that this naturally leads you to power bi which is great. That's my tip.

Cool Blue
  • 6,163
  • 6
  • 21
  • 54
  • Inserts cause implicit rewriting of all formulas in shifted cells and any formulas relying on any data in shifted cells - I believe. I assume the calculation chain is used to determine which cells might need to be rewritten on insert/deletion, but it may just be a straight scan through the entire workbook. – Orbling Oct 23 '13 at 14:57
  • Does insertion only at the end of the table (with a sort if needed) make any difference to the performance? – Orbling Oct 23 '13 at 15:01
  • but why "shift" empty space? And what's to be done with table references which are all regular structures? If you change what's inside a table, or even the number of rows, it should not affect any reference via a regular structure. =tablename[@[column name]] for example. this reference is independent on the table contents or even rows count. So is: =Index(tablename[column name], same for this: match(ref1,tablename[column name 2],0))... what needs to be done? I don't get it... – Cool Blue Oct 23 '13 at 15:05
  • No Sir, re-sizing the table to add rows at the end is just as excruciating... it doesn't seem to matter. – Cool Blue Oct 23 '13 at 15:06
  • Yes, none of those need to be rewritten - but does Excel know that without looking each time? It's hardly going to cache the dependency chain for such a large workbook. Table totals and internal definitions of the table do however need to be altered. Are you inserting piecemeal, or just a single block insertion? – Orbling Oct 23 '13 at 15:10
  • Just a single block, for example 7 rows in a table with 3 columns. This takes about 30 mins... – Cool Blue Oct 23 '13 at 15:23
  • But the spreadsheet is defined by the code in the XML right? and this is what drives and informs the decision model. So the decision model looks in the XML files and adjusts things: thats the extent of it's universe. What has the white space got to do with it Sir? It has the same status as "all the other spreadsheets in the world" for example. Because it is not referenced in the XML structure. So why would it need to go and check them? – Cool Blue Oct 23 '13 at 15:28
  • Might be worth looking at the dependency tree that comes from that table. It does sound a heck of a lot of time for such an operation. I would be interested to watch its file accesses during that time, see what on earth its doing. Use [Process Monitor](http://technet.microsoft.com/en-gb/sysinternals/bb896645) perhaps. – Orbling Oct 23 '13 at 15:31
  • It's only defined by the XML for storage, I doubt it uses it internally in that form. But with such a large data set, it might have to swap - though as you say, it's not giant. There must be some discrepancy. – Orbling Oct 23 '13 at 15:33
  • There is almost zero I/O activity actually, as seen by the indicator light and confirmed by the graph in process monitor. Its completely cpu intensive and as I state above, limited to one cpu. – Cool Blue Oct 23 '13 at 15:33
  • but if it can be uniquely defined by the XML, why would it create a flat, dumb model in memory to include all of the stuff that is not relevant? Like the XML, if a new cell is referenced then it can be added to the model... its an object like any other. I can't imagine why they would do that Sir... – Cool Blue Oct 23 '13 at 15:38
  • Have you looked at why only one core is in use? Are you using non-thread safe functions at all, any custom XLL in use that are not flagged as thread safe? (See: http://msdn.microsoft.com/en-us/library/office/bb687899.aspx) Have you tried performing the insertion from within a VBA function with `Application.ScreenUpdating = False` active? – Orbling Oct 23 '13 at 15:39
  • Yes it'll be defined in memory with the internal object form, some form of sparse table, though that is not XML. – Orbling Oct 23 '13 at 15:40
  • It has the same behaviour on my test sytem which is excel 2010 and common behaviour for any largish spreadsheet that I have including table structures. I have also noticed that some tables where I can't copy and paste into the cells. For example I use control V and nothing happens. – Cool Blue Oct 23 '13 at 15:41
  • I'll try from VBA as you suggest and see what happens... Although VBA is not thread safe. I will check, but I thought I read somewhere that edits like this are not thread safe... – Cool Blue Oct 23 '13 at 15:44
  • And yes Sir, that's my point, it will be a sparse table of included objects. That's why I wanted to challenge any emerging, common orthodoxy about the impact of the extra space. It seems bogus to me. – Cool Blue Oct 23 '13 at 15:46
  • I'll also try it with conditional formatting switched off on the sheets that are dependent on the one I'm working on. This could be the problem because I have a lot of cells (tens of thousands) with conditional formatting. But the CF for whole sheet (about 80k cells) is managed with only 45 rules and all of the decisions are handled by helper columns and passed in with named ranges. – Cool Blue Oct 23 '13 at 16:03
  • Update: I deleted ALL of my beautiful CF and it made no difference. Even the single threading persisted. I have no UDF's on the affected sheet and only about 16 cells in the whole workbook that use UDFs and they are not volatile and not calculated often. As stated earlier, I have calculation switched off anyway so its not about executing the model, its about restructuring it. – Cool Blue Oct 23 '13 at 16:11
  • Out of interest, how long does a full recalculation take? – Orbling Oct 23 '13 at 16:18
  • Thanks for your interest! The main data-assembly sheet is where all of the re-calc time goes and this takes 6.5 seconds from "dirty" but, normal recalc is about 1.5 seconds. I have a dashboard sheet where I manually control the calculations and its about half a second. As I said, I've been very careful to keep the structure regular (or at least I've raked over it many times and evolved it to be so: I wish I could say I got it right first time but: not so much :Q). – Cool Blue Oct 23 '13 at 16:25
  • Ah well, all the more mysterious with the restructuring delay. Many orders of magnitude more intensive than a full recalculation. I know altering table definitions is slow from experience, my thoughts on why are as above. But that could not possibly explain the sort of delays you are facing. I wonder if there is a debug interface... – Orbling Oct 23 '13 at 16:29

3 Answers3

4

Long insertion times may be due to INDEX (or other functions) that reference a whole column, or a whole row.

I had a very similar problem: not too complex worksheet (about 2500 rows, with 15 columns of data (results from a query), and about 10 columns of formulas to extract data from the query results. when I inserted a column, the first columns might insert within 4 seconds or so, but the second insert would take over a minute. Yikes! I searched the internet and found this site http://support.microsoft.com/kb/2755145.

My experience:

I was using a formula like =INDEX(11:11,1,MATCH(AC$5,$10:$10,0)), about 25000 times in my worksheet. You can see that each formula references an entire row twice. Apparently, when I added a column, since each row is affected, and therefore each of my formulas was affected, Excel would dutifully go to work trying to figure out what to do about that.

Based on what I learned form the microsoft website, I changed the formula to =INDEX(QueryResults,ROW()-ROW(QueryHeaders),MATCH(AC$5,QueryHeaders,0)), where the QueryResults and QueryHeaders are simple named ranges.

After I made this change throughout the sheet, inserting a column became almost instantaneous - less than a second.

Chuck Trese
  • 151
  • 2
  • 4
  • Thanks Chuck, No, I never have entire columns or rows in my formulae. That's not the problem in my case. – Cool Blue Nov 02 '14 at 05:15
  • This fixed my issue on a very important document which held a huge table with lots of references to other sheets. Thank you! Took me a surprisingly long time to find this answer. – Zeretil Jul 20 '18 at 13:55
1

This sounds like the problem described here http://fastexcel.wordpress.com/2012/01/30/excel-2010-tableslistobject-slow-update-and-how-to-bypass/

If so you have to break one of the conditions to bypass it:
For this slowdown to occur each of the following conditions must be true:

A cell within the Table must be selected
The sheet containing the Table must be the Active Sheet
The cell being updated must be on the same sheet as the table, but does not have to be within the table
There must be a reasonable number of formulas in the workbook.


Maybe you could do the update indirectly via VBA with another sheet active. Or Maybe moveing all the formulas to a separate workbook would bypass it.
Or convert your Tables back to normal ranges (& use dynamic range names if neccessary)

Charles Williams
  • 21,820
  • 5
  • 34
  • 36
  • Hi, I tried running a macro to insert a row in a list object in the target sheet. After noting the insertion point based on the active cell, the macro activates a sheet with no tables and then does the insertion in the target sheet. Theoretically it then re-Activates the target sheet but I didn't hang around long enough to find out. Basically the same ridiculous behaviour occurs and it takes more than 10 mins to insert one row in a very simple table. I think tables are broken in excel. Its very disappointing. – Cool Blue Oct 24 '13 at 09:30
  • Could you confirm that if you change the Table back to a Range it runs fast? (Otherwise you may have found a different problem). If you confirm I will try to file a Bug report. – Charles Williams Oct 24 '13 at 10:11
  • I just tested inserting a row rather than changing a value: its slow both with Tables and ordinary ranges so your situation is NOT the same as described in my post. Maybe you can bypass by just changing a value in an empty row rather than inserting a row. – Charles Williams Oct 24 '13 at 10:23
  • I can confirm it is still diabolically slow if I first convert to a range. On your second point, I don't understand your suggested work-around Charles... – Cool Blue Oct 24 '13 at 13:27
  • Your problem occurs when a row is inserted. If you just enter values into empty rows at the bottom of the range then it won't be slow. But if you are using Tables you also need to avoid the problem outlined in my post. – Charles Williams Oct 24 '13 at 14:14
0

Try removing conditional formatting and then reapplying it with vba after main code is through. Worked for me.

  • Hi, I'm still interested in this so thanks for the suggestion. My problem however, is not about slow VBA, it's about excel taking a huge ammount of time to insert columns in tables. I have learned a lot since my original post so I now know why it's only hitting one CPU, and that excel has one main thread, but I still don't understand if there's a chronic problem with Tables in excel. With regard to the CF, it's really not a problem coz the same behaviour persisted when I deleted it. And my CF is very clean, I make sure it doesn't fragment and I ensure all of the decisions are in helper columns – Cool Blue Dec 26 '13 at 12:28
  • I have a similar problem. When I add data, I insert a row at the top because the data is about 4000 rows by 50 columns. Sometimes it takes forever, and sometimes it just takes a couple seconds. That is very frustrating when you need to smoke out a bug. All of the math is done on a separate worksheet, but that separate worksheet does use the INDEX function. FWIW, Excel has become my primary computing tool, since Mathcad terminated fifteen years ago. I still don't understand the verbose macro language very well, but I can get it to do a lot of work without understanding. – richard1941 Jul 13 '17 at 12:55