21

I have gotten some reports from users of crashes when they try to use my application on Verizon's 4G/LTE.

Looking at the stack trace, it looks like Android's HttpClient.execute() implementation is throwing an OOM. This happens only on 4G/LTE devices, specifically the HTC Thunderbolt, and only when on 4G/LTE. WiFi, 3G, and UMTS are OK. Sprint's WiMax 4G also works fine.

Two questions:

  • What's the best way to get the attention of Android devs about this? Any better options than reporting on http://code.google.com/p/android/issues?

  • Any ideas on how I can work around this? I don't have a 4G device myself and I can't get this to happen in the emulator, so I need to make some educated guesses here. I can try to catch the OOM in my code and attempt to clean up and force a GC, but I'm not sure if that's a good idea. Comments or other suggestions?

Here's what my code's doing:

    HttpParams params = this.getHttpParams(); // returns params
    ClientConnectionManager cm = new ThreadSafeClientConnManager(params, this.getHttpSchemeRegistry() );
    DefaultHttpClient httpClient = new DefaultHttpClient( cm, params );

    HttpResponse response = null;
    request = new HttpGet( url );

    try {

        response = httpClient.execute(request); // <-- OOM on 4G/LTE. OK otherwise
        int statusCode = response.getStatusLine().getStatusCode();
        Log.i("fetcher", "execute returned, http status " + statusCode );

    ...

Here's the crashing stack trace:

    E/dalvikvm-heap(11639): Out of memory on a 2055696-byte allocation.
    I/dalvikvm(11639): "Thread-16" prio=5 tid=9 RUNNABLE
    I/dalvikvm(11639):   | group="main" sCount=0 dsCount=0 s=N obj=0x48563070 self=0x3c4340
    I/dalvikvm(11639):   | sysTid=11682 nice=0 sched=0/0 cgrp=default handle=3948760
    I/dalvikvm(11639):   | schedstat=( 208709711 74005130 214 )
    I/dalvikvm(11639):   at org.apache.http.impl.io.AbstractSessionInputBuffer.init(AbstractSessionInputBuffer.java:~79)
    I/dalvikvm(11639):   at org.apache.http.impl.io.SocketInputBuffer.<init>(SocketInputBuffer.java:93)
    I/dalvikvm(11639):   at org.apache.http.impl.SocketHttpClientConnection.createSessionInputBuffer(SocketHttpClientConnection.java:83)
    I/dalvikvm(11639):   at org.apache.http.impl.conn.DefaultClientConnection.createSessionInputBuffer(DefaultClientConnection.java:170)
    I/dalvikvm(11639):   at org.apache.http.impl.SocketHttpClientConnection.bind(SocketHttpClientConnection.java:106)
    I/dalvikvm(11639):   at org.apache.http.impl.conn.DefaultClientConnection.openCompleted(DefaultClientConnection.java:129)
    I/dalvikvm(11639):   at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:173)
    I/dalvikvm(11639):   at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:164)
    I/dalvikvm(11639):   at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:119)
    I/dalvikvm(11639):   at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:348)
    I/dalvikvm(11639):   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:555)
    I/dalvikvm(11639):   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:487)
    I/dalvikvm(11639):   at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:465)
    I/dalvikvm(11639):   at com.myapplication.Fetcher.trySourceFetch(Fetcher.java:205)
    I/dalvikvm(11639):   at com.myapplication.Fetcher.run(Fetcher.java:298)
    I/dalvikvm(11639):   at java.lang.Thread.run(Thread.java:1102)
    I/dalvikvm(11639):
    E/dalvikvm(11639): Out of memory: Heap Size=24171KB, Allocated=23142KB, Bitmap Size=59KB, Limit=21884KB
    E/dalvikvm(11639): Extra info: Footprint=24327KB, Allowed Footprint=24519KB, Trimmed=348KB
    W/dalvikvm(11639): threadid=9: thread exiting with uncaught exception (group=0x40025b38)

EJoshuaS - Reinstate Monica
psychotik
  • Just confirmation that I'm tracking this same issue. Only appears on htc_mecha (thunderbolt) on verizon_wwe. Issue first appeared March 17 2011. – DougW Mar 21 '11 at 20:35
  • I went and bought an HTC Thunderbolt to diagnose this issue. What CommonsWare says below is correct. Setting the buffer manually to 8k resolves the crashes. Not sure why HTC decided to change that. Hope they enjoy the phone restocking count++. – DougW Mar 25 '11 at 00:16
  • Awesome, I was experiencing this problem too. Thanks for the confirmation. – Victor Mar 25 '11 at 00:18
  • this also occurs on a LG P920 device as well – petey Nov 26 '12 at 18:46

3 Answers

26

Looking at the stack trace, it looks like Android's HttpClient.execute() implementation is throwing an OOM.

That is not indicated by the stack trace you have on the issue. Of course, you didn't provide the whole stack trace on the issue.

What's the best way to get the attention of Android devs about this? Any better options than reporting on http://code.google.com/p/android/issues?

The odds of this being a pure Android bug are small, though not zero.

Here are some other possibilities, in no particular order:

  1. There is no problem with execute() per se; you are simply running out of memory, and the stack traces you have encountered merely show that execute() is stressing your heap.

  2. The problem is in some modifications that HTC made to Android for the Thunderbolt, possibly only taking effect when on the LTE network.

  3. The problem is somehow caused by the Verizon LTE network itself (e.g., some proxy of theirs sending back screwball information that is causing HttpClient to have a conniption).

Any ideas on how I can work around this?

First, I'd use existing tools (e.g., dumping HPROF and examining with Eclipse MAT) to confirm that you don't have a memory leak in general that the Thunderbolt/LTE combo just seems to be tripping over.

Next, I recommend that you come up with some way to consistently reproduce the error. That could be your existing app with a series of steps to follow, or it could be a dedicated app (e.g., log the URL that triggers the OOM, then create a tiny app that just does that HttpClient request). I wish DeviceAnywhere had a Thunderbolt, but it doesn't look like it. I'll put some feelers out and see if I can get some help on that front.

In terms of working around it, as a stopgap, you can detect that you're running on a Thunderbolt via android.os.Build data, and perhaps that you're on LTE via ConnectivityManager (I'm guessing LTE would list as WiMAX, but that's just a guess), and warn users about the problems with that combo.
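The device check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the match on "mecha" assumes the Thunderbolt identifies itself the way one commenter on the question reports it (htc_mecha). On a device you would pass in android.os.Build.MANUFACTURER and android.os.Build.PRODUCT; they are plain parameters here so the logic can be exercised off-device.

```java
import java.util.Locale;

// Hypothetical helper, not part of any Android API.
final class AffectedDeviceCheck {
    private AffectedDeviceCheck() {}

    /**
     * Returns true when the identifiers look like an HTC Thunderbolt.
     * The "mecha" substring is an assumption taken from a user report.
     */
    static boolean looksLikeThunderbolt(String manufacturer, String product) {
        if (manufacturer == null || product == null) {
            return false;
        }
        String m = manufacturer.toLowerCase(Locale.US);
        String p = product.toLowerCase(Locale.US);
        return m.contains("htc") && p.contains("mecha");
    }

    public static void main(String[] args) {
        // On Android: looksLikeThunderbolt(Build.MANUFACTURER, Build.PRODUCT)
        System.out.println(looksLikeThunderbolt("HTC", "htc_mecha"));
        System.out.println(looksLikeThunderbolt("samsung", "GT-I9000"));
    }
}
```

Whether the connection is LTE would still have to come from ConnectivityManager at runtime, which this sketch does not attempt.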

Beyond that, you can try changing up your HttpClient usage a bit and see if it has an effect, such as:

  • If you are only supporting API Level 8 or higher, you could give AndroidHttpClient a shot as a drop-in replacement
  • Disable multi-threaded access (in general or Thunderbolt-specific) and get rid of the ThreadSafeClientConnManager

I'm sorry that I don't have a "magic bullet" answer for you here.


UPDATE

Now that I have the full stack trace, looking through the source code is...illuminating, somewhat.

The problem appears to be that:

HttpConnectionParams.getSocketBufferSize(params);

is returning that 2MB or so value that is triggering the OOM. That's an awfully big buffer, particularly for the Dalvik GC engine, which can get fragmented (yes, there's that word again).

params here is the HttpParams, which you seem to be creating yourself via getHttpParams(). For comparison, AndroidHttpClient sets the socket buffer size to 8192:

HttpConnectionParams.setSocketBufferSize(params, 8192);

If you are setting the socket buffer size yourself, try reducing it. If not, try setting it to 8192 and see if that helps.
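One defensive way to follow that advice is to sanitize whatever buffer size is in effect before handing it to HttpClient. The following is a minimal sketch, not a definitive fix: the class name is hypothetical, the 8192 default mirrors the AndroidHttpClient value mentioned above, and the 64 KB upper bound is an arbitrary assumption. On Android the result would be passed to HttpConnectionParams.setSocketBufferSize(params, size).

```java
// Hypothetical helper for picking a sane socket buffer size.
final class SocketBufferSizes {
    /** Value AndroidHttpClient is described as using. */
    static final int DEFAULT_BYTES = 8192;
    /** Arbitrary upper bound chosen for this sketch (assumption). */
    static final int MAX_BYTES = 64 * 1024;

    private SocketBufferSizes() {}

    /**
     * Returns the requested size if it is positive and not larger than
     * MAX_BYTES; otherwise falls back to DEFAULT_BYTES.
     */
    static int sanitize(int requestedBytes) {
        if (requestedBytes <= 0 || requestedBytes > MAX_BYTES) {
            return DEFAULT_BYTES;
        }
        return requestedBytes;
    }

    public static void main(String[] args) {
        // The 2055696-byte allocation from the crash log is rejected.
        System.out.println(sanitize(2055696));
        // An unset value (-1) also falls back to the default.
        System.out.println(sanitize(-1));
    }
}
```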

CommonsWare
  • Ah, cut-paste error. The full stack is now there. Thanks. Those are good suggestions - I'll try some of the things you recommended and post an update if I find anything new. – psychotik Mar 18 '11 at 22:20
  • Btw, I did think of #1, but from what I can tell in the logs that users have sent, they are doing nothing special while on LTE that wouldn't happen while on 3G/WiFi. So though there is a chance that this is because of a memory condition induced by my application, it seems unlikely since only using LTE causes it to occur. – psychotik Mar 18 '11 at 22:23
  • @psychotik: I updated the answer based upon further research, itself based upon your amended stack trace. – CommonsWare Mar 18 '11 at 22:33
  • @CommonsWare - thanks, that sounds promising. Btw, can you point me to where you found this code? I want to dig around and see why I might not have seen this on other phones/connection types. I assume you're looking at the stock Android source (http://android.git.kernel.org/) or somewhere else? – psychotik Mar 18 '11 at 23:10
  • @psychotik: "Btw, can you point me to where you found this code." -- use Google Code Search (http://www.google.com/codesearch). Put a class name in the search bar, and `android.git.kernel.org` in the Package field. It's excellent for this sort of problem. The good news is that all the line numbers match the latest stuff in the repo, so there was really no guesswork. I just started from the actual crash point and worked my way backwards, trying to figure out where the buffer size was coming from. – CommonsWare Mar 18 '11 at 23:21
  • @psychotik: "see why I might not have seen this on other phones/connection types" -- well, that's the strange part. Assuming you aren't setting it to 2055696, and since I see no evidence that it normally is 2055696, my best guess would be that HTC is somehow setting it by default to 2055696 via a hacked version of `HttpConnectionParams`. – CommonsWare Mar 18 '11 at 23:24
  • @CommonsWare looks like your suggestion was spot on - maybe HTC did mess with those defaults. This would've been a nasty one for me to figure out without your help above, so thanks again. – psychotik Mar 20 '11 at 23:01
  • @psychotik Have you been able to verify this on a Thunderbolt? Have you had any luck in fixing this in your app, or did it help just to set the buffer size? – Eric Nordvik Mar 21 '11 at 09:10
  • @psychotik @cant0na - Our app was experiencing this same issue. I went and bought an HTC Thunderbolt to test. Setting the buffer size manually does indeed resolve the issue. – DougW Mar 25 '11 at 00:15
  • Had the same problem, confirmed this fixed it. Thanks! – Darrell Jul 15 '11 at 17:46
  • I have an HTC ThunderBolt. I called HttpConnectionParams.getSocketBufferSize(httpClient.getParams()) in my code, but it returned -1, not 2055696. What's going on? – Kai Oct 06 '11 at 22:26
4

here's the fix: https://review.source.android.com/22852

in the meantime, URLConnection is immune. it's only HttpClient that has this problem.
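One plausible reading of why only HttpClient is affected, consistent with the stack trace (the allocation happens while constructing SocketInputBuffer) and with the comment above reporting that getSocketBufferSize() returned -1 on a Thunderbolt: when no buffer size is configured, HttpClient may size its input buffer from the socket's own receive buffer. The sketch below models that fallback. It is an assumption about the behaviour, not a copy of the Android source, and the class name, method name, and 1024-byte floor are hypothetical.

```java
// Hypothetical model of how an unset buffer size could become huge.
final class InputBufferSizing {
    /** Lower bound assumed for this sketch. */
    static final int MIN_BYTES = 1024;

    private InputBufferSizing() {}

    /**
     * configuredBytes: the value from the HTTP params, negative if unset.
     * socketReceiveBufferBytes: what java.net.Socket#getReceiveBufferSize()
     * would report for the connected socket.
     */
    static int effectiveSize(int configuredBytes, int socketReceiveBufferBytes) {
        int size = configuredBytes;
        if (size < 0) {
            // Nothing configured: fall back to the socket's receive buffer.
            size = socketReceiveBufferBytes;
        }
        return Math.max(size, MIN_BYTES);
    }

    public static void main(String[] args) {
        // Unset params plus a large network-dependent socket buffer.
        System.out.println(effectiveSize(-1, 2055696));
        // An explicit 8192 is used regardless of the socket buffer.
        System.out.println(effectiveSize(8192, 2055696));
    }
}
```

Under this model, explicitly setting the size to 8192 sidesteps the fallback, which would match the reports that the manual setting resolves the crash.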

if you're a developer wanting to test this kind of failure, you can use "adb shell setprop" to set, say, "net.tcp.buffersize.wifi" so that the maximum read/write socket buffer sizes are huge when your device is on wifi. something like the following would be a real stress test:

adb shell setprop net.tcp.buffersize.wifi 4096,80999999,80999999,4096,80999999,80999999

it's this kind of configuration change that exercises the HttpClient bug. i don't know what the exact values on the Thunderbolt are, but someone with the device could find out using "adb shell getprop | grep buffersize".
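To inspect values like these programmatically, the property string can be split into its six numbers. This is a rough sketch: the class is hypothetical, and it assumes the fields are minimum, default, and maximum for the read buffer followed by the same three for the write buffer, which is an interpretation rather than something stated above.

```java
// Hypothetical parser for a net.tcp.buffersize.* property value.
final class TcpBufferSizes {
    final int readMin, readDefault, readMax;
    final int writeMin, writeDefault, writeMax;

    private TcpBufferSizes(int[] n) {
        // Field order is an assumption: read min/default/max, then write.
        readMin = n[0];
        readDefault = n[1];
        readMax = n[2];
        writeMin = n[3];
        writeDefault = n[4];
        writeMax = n[5];
    }

    /** Parses six comma-separated integers; rejects anything else. */
    static TcpBufferSizes parse(String value) {
        String[] parts = value.trim().split(",");
        if (parts.length != 6) {
            throw new IllegalArgumentException("expected 6 fields, got " + parts.length);
        }
        int[] numbers = new int[6];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i].trim());
        }
        return new TcpBufferSizes(numbers);
    }

    public static void main(String[] args) {
        TcpBufferSizes sizes =
                parse("4096,80999999,80999999,4096,80999999,80999999");
        System.out.println("read max = " + sizes.readMax);
    }
}
```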

Elliott Hughes
3

Maybe this will help:

import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;

// Timeout in milliseconds until a connection is established.
int timeoutConnection = 5000;

// Default socket timeout (SO_TIMEOUT) in milliseconds: how long to wait for data.
int timeoutSocket = 4000;

// Build the parameters for HttpClient.
HttpParams httpParameters = new BasicHttpParams();
HttpConnectionParams.setConnectionTimeout(httpParameters, timeoutConnection);
HttpConnectionParams.setSoTimeout(httpParameters, timeoutSocket);
// Set the socket buffer size explicitly to 8 KB rather than relying on the platform default.
HttpConnectionParams.setSocketBufferSize(httpParameters, 8192);

// Pass the parameters at construction so they apply from the first request.
DefaultHttpClient httpClient = new DefaultHttpClient(httpParameters);
wormhit