3

We just migrated to Google Cloud Endpoints v2 on Java 8 and found that latency has gone up. We often see this kind of request in traces:

https://servicecontrol.googleapis.com/v1/services/<myapi>.endpoints.<myappid>.cloud.goog:check

Each of these calls takes around 14 ms. Memory usage also went up, and our B2 frontends now frequently block, with delays of around 10 s. That could be a problem with connection pooling not being done right, but it did not occur with Endpoints v1 and Java 7. At the same time, we see 0 errors reported per instance, which is not true: requests are being aborted after around 10-30 s all the time, and we can no longer get stack traces showing where a request was aborted, as we could before.

Killing or restarting an instance fixes the 10 s delays for a while, but that is obviously not a real solution.

Are there any steps we need to take to get the promised performance improvements of v2?

cputoaster
  • I found out how to see the stack trace: you have to search in the GAE logs, not the Endpoints logs. It would be nice to be able to get there directly from the Endpoints page. – cputoaster Aug 10 '17 at 10:23

2 Answers

4

TL;DR - Google Cloud Endpoints (GCE below) 2.0 alone is faster and more reliable than GCE 1.0, but don't use API Management (AM) or you'll give back all those gains and then some.

I too was seeing major slowness when testing GCE 2.0, and I couldn't justify subjecting my users to such a large latency increase, so I set out to determine what was going on.

Here was my methodology:

I set up three minimal App Engine apps, each consisting of a single API call that returns a server timestamp: one using Endpoints 1.0, one using Endpoints 2.0, and one using Endpoints 2.0 with API Management. You can see all the code for these here: https://github.com/ubragg/cloud-endpoints-testing

I deployed each of these to a separate App Engine app and tested the API using the API Explorer at these links (so you can try for yourself):
GCE 1.0
GCE 2.0
GCE 2.0+AM

The results? Here are the client-side latencies for a batch of requests made in rapid succession against each of the APIs:

             GCE 1.0    GCE 2.0    GCE 2.0+AM
average       434 ms      80 ms        482 ms
median         90 ms      81 ms        527 ms
high         2503 ms      85 ms        723 ms
low            75 ms      73 ms        150 ms

As you can see, GCE 2.0 without AM was both fast and consistent. Even GCE 1.0 was usually pretty fast, but it occasionally had some troublesome outliers. GCE 2.0 with AM was pretty much always unacceptably slow, only dipping into the "maybe acceptable" range on rare occasions.
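For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the four summary rows could be produced from raw timings. It is not the script used for the numbers above (those came from the API Explorer); the helper names `time_calls` and `summarize` and the stand-in `fake_request` are hypothetical and exist only for illustration.

```python
import statistics
import time


def time_calls(call, n):
    """Invoke `call` n times back to back and return each duration in ms."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Reduce raw timings to the four rows used in the table."""
    return {
        "average": statistics.mean(samples_ms),
        "median": statistics.median(samples_ms),
        "high": max(samples_ms),
        "low": min(samples_ms),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a real API request; swap in an actual
    # HTTP call against your own deployment.
    def fake_request():
        time.sleep(0.01)

    print(summarize(time_calls(fake_request, 5)))
```

Reporting the median next to the average matters here: a single slow outlier can pull the average far above what a typical request experiences, which is what the GCE 1.0 column shows.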

Note that all of these times are from the client's perspective, as reported by the API Explorer. Here are the server-reported averages for the same requests, taken from the App Engine dashboard over the same time period:

             GCE 1.0    GCE 2.0    GCE 2.0+AM
average        24 ms      14 ms        395 ms
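As a rough cross-check, the two sets of averages can be subtracted to estimate how much of the client-observed time is spent outside the server. This is only a sketch: the figures are copied from the two tables above, the function name `time_outside_server` is made up for illustration, and subtracting averages taken at different measurement points is an approximation at best.

```python
# Averages copied from the two tables above (all in milliseconds).
client_avg_ms = {"GCE 1.0": 434, "GCE 2.0": 80, "GCE 2.0+AM": 482}
server_avg_ms = {"GCE 1.0": 24, "GCE 2.0": 14, "GCE 2.0+AM": 395}


def time_outside_server(client_ms, server_ms):
    """Client-observed time not accounted for by server-reported latency."""
    return {name: client_ms[name] - server_ms[name] for name in client_ms}


outside = time_outside_server(client_avg_ms, server_avg_ms)
for name, ms in outside.items():
    print(f"{name}: {ms} ms outside the server, "
          f"{server_avg_ms[name]} ms inside")
```

By this estimate, about 395 of the 482 ms for GCE 2.0+AM is spent inside the server, whereas the high GCE 1.0 client average comes almost entirely from outside the server-reported time and is skewed by the outliers noted earlier.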

So the bottom line is: if you care about latency, API Management isn't really an option. If you're wondering how to run GCE 2.0 without API Management, simply be sure NOT to follow any of the instructions here: https://cloud.google.com/endpoints/docs/frameworks/python/adding-api-management.

Codiak
  • This answer saved me. I was going insane wondering why my APIs were so slow, and deleting the API management from the app fixed everything. – rhodysurf Jun 18 '18 at 12:54
2

If you use the base API framework without the management library (the 14 ms calls you mentioned are part of that library), you should see some improvement in latency. Memory usage is somewhat higher in the v2 frameworks, because they now incorporate code that was previously a separate service. If you are not using API management, I would suggest removing the library and seeing if that helps. It should eliminate the 14 ms of latency and reduce memory use a fair amount, as you won't be loading as much code or data.

saiyr
  • Can you point me to documentation of what I would lose without API management? Also, the example documentation mentions that one should use a B4_1G instance, which is very costly. Does that mean that Endpoints is now only really supported for large projects and should be avoided for small things? – cputoaster Aug 11 '17 at 11:26
  • You lose data in the Endpoints tab in Cloud Console, third party authentication, and API key support. I would try a normal B4 instance first. If that doesn't work, I think we need to do some optimization--a B4_1G instance shouldn't be required. – saiyr Aug 15 '17 at 17:53