CakeSession::_startSession - Slow on Elasticache

Question

We're running CakePHP 2.9, and using an Elasticache Cluster for Session Storage (which is stored via Memcached).

We've disabled PHP's in-built session garbage collection as recommended here: https://tideways.io/profiler/blog/php-session-garbage-collection-the-unknown-performance-bottleneck

session.gc_probability = 0

We have also set the probability setting to 0 within CakePHP's Cache config.

However, we still occasionally see major slow-downs in CakeSession::_startSession, as reported by New Relic.

The Elasticache Cluster is not showing any metrics which would suggest there is a problem (unless there's some metric I'm not understanding correctly).

Any suggestions on how to diagnose the cause?

@apokryfos Yes - all within the same Security Group - is that what you meant? — user984976, Mar 05 '17 at 05:38
No, a VPC is not the same as a security group. A VPC is like a LAN for the services. Check [the FAQ pages out](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html) — apokryfos, Mar 05 '17 at 07:24
Yeah, it's called "VPC Security Group". The cluster is on the same VPC Security Group as the EC2 Instances. — user984976, Mar 05 '17 at 20:42
If your instances are on the same VPC (which is what's implied by using the same VPC security group) then the only other reason I can think of is that they're `t` type instances and the burst quota is regularly being exceeded. — apokryfos, Mar 06 '17 at 07:05
Sorry they're all c4.large. About a month ago we moved off t2 type instances because we were having issues with credits running out. This issue has persisted since switching instance sizes. — user984976, Mar 06 '17 at 09:07
@srayhunter 2 within the cluster. Spread over 2 availability zones. — user984976, Mar 15 '17 at 06:32
@user984976 I ran into issues where having 2 nodes in the cluster caused a ton of issues. I wonder if you change it to 1 if that would fix the issue. — Ray Hunter, Mar 15 '17 at 23:25
In the graph I can't see the problem with `CakeSession::_startSession`. The whole execution time for `Dispatcher::dispatch` is only 5ms, including `CakeSession::_startSession`. — pbacterio, Mar 17 '17 at 12:01
@pbacterio: Perhaps I'm mis-reading the graph, but my understanding is that it's showing that total execution time was 0.026s up till it hit CakeSession::_startSession, then it took 5.7s to complete that before carrying on with TenantAuthorizeComponent::initialize at timestamp 5.787? — user984976, Mar 18 '17 at 00:14

user984976 · Accepted Answer · 2017-03-24T22:17:16.067

This issue appears to have been caused by session locking, something I wasn't even aware existed.

This article explains how and why Session Locking exists: https://ma.ttias.be/php-session-locking-prevent-sessions-blocking-in-requests/

What's important is that PHP's memcached session handler has session locking turned on by default.

In our case, we don't use sessions for much other than authentication. Our application doesn't store user state in the session (as a shopping cart would), so we simply disabled session locking with this php.ini setting:

memcached.sess_locking = 0

Since making this change, we've seen a huge improvement in response times (average down from ~200ms to ~160ms). This is especially noticeable on AJAX-heavy pages which load a lot of data concurrently. Previously these requests appear to have been served sequentially; they're now all serviced simultaneously, and the difference in speed is incredible.

While there are likely some edge cases we'll uncover over the coming weeks/months as a result of turning off session locking, this appears to be the cause of the issue, and this change seems to have stopped the problem from occurring.
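One thing worth checking after a change like this is that the value is really in effect for the PHP that serves web requests, since the CLI and the web SAPI can read different php.ini files. The following is a minimal sketch, not part of the original fix; the helper name `sessionSettingsReport` is hypothetical, and it only relies on PHP's built-in `ini_get()`.

```php
<?php
// Report the effective values of the session-related settings discussed above.
// sessionSettingsReport() is a hypothetical helper name used for illustration.
// ini_get() returns false for an unknown directive, e.g. when the memcached
// extension is not loaded in the PHP that runs this script.
function sessionSettingsReport(array $names)
{
    $report = array();
    foreach ($names as $name) {
        $value = ini_get($name);
        $report[$name] = ($value === false) ? '(not available)' : $value;
    }
    return $report;
}

$names = array(
    'session.save_handler',
    'session.gc_probability',
    'memcached.sess_locking',
);

foreach (sessionSettingsReport($names) as $name => $value) {
    printf("%-26s %s\n", $name, $value);
}
```

Running this through the web server (rather than only from the command line) shows what the application itself sees.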

rock3t · Answer 2 · 2017-03-18T19:15:45.140

You need to debug each layer in isolation to find out which one is causing the problem.

It could be Cake, the AWS infrastructure, network latency...

Run this small PHP script and tell us how long it takes.

// Memcache extension (skip this part if only Memcached is installed)
$start = microtime(true);
$memcache = new Memcache;
$memcache->connect('myhost.cache.amazonaws.com', 11211);
printf("%.5f\n", microtime(true) - $start);

// Memcached extension
$start = microtime(true);
$memcached = new Memcached();
$memcached->addServer('<elasticache node endpoint>', 11211);

$memcached->set('foo', 100);
var_dump($memcached->get('foo'));
printf("%.5f\n", microtime(true) - $start);

If the time is OK, the problem is likely in Cake.

However, to be honest, I'm fairly certain the problem is the ElastiCache Cluster.

Try pointing to the endpoint of a single node rather than the endpoint of the ElastiCache Cluster, and let me know how it goes.
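Because the slow-down described in the question only happens occasionally, a single timed request may well miss it. As a rough sketch using only the Memcached extension, the round trip can be repeated many times and the minimum, median and maximum compared. The function names, the key `latency_probe`, the iteration count of 200 and the `MEMCACHED_HOST` environment variable are all assumptions made for illustration; the measurement only runs when that variable is set and the extension is loaded.

```php
<?php
// Summarize a non-empty list of timings (in seconds) as min / median / max
// in milliseconds. Hypothetical helper, not part of any library.
function summarizeTimings(array $samples)
{
    sort($samples);
    $count = count($samples);
    $middle = (int) floor($count / 2);
    $median = ($count % 2 === 1)
        ? $samples[$middle]
        : ($samples[$middle - 1] + $samples[$middle]) / 2;

    return array(
        'min_ms' => $samples[0] * 1000,
        'median_ms' => $median * 1000,
        'max_ms' => $samples[$count - 1] * 1000,
    );
}

// Time $iterations set/get round trips against the configured server.
function sampleRoundTrips(Memcached $client, $iterations)
{
    $samples = array();
    for ($i = 0; $i < $iterations; $i++) {
        $start = microtime(true);
        $client->set('latency_probe', $i);
        $client->get('latency_probe');
        $samples[] = microtime(true) - $start;
    }
    return $samples;
}

// MEMCACHED_HOST is an assumed environment variable holding a node endpoint.
$host = getenv('MEMCACHED_HOST');
if ($host !== false && class_exists('Memcached')) {
    $client = new Memcached();
    $client->addServer($host, 11211);
    print_r(summarizeTimings(sampleRoundTrips($client, 200)));
}
```

A maximum far above the median would point at the cache or the network. If all three figures stay low while the application is still slow, the delay is more likely inside the session handling itself.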

"Memcache" is not installed, only "Memcached" - do you know how to perform this with Memcached? — user984976, Mar 18 '17 at 00:55

score 0 · Answer 3 · answered Jan 02 '20 at 13:51

We had a similar problem: the site became slow after we moved sessions to Memcached on AWS (EC2 and Elasticache/Memcached). The following changes fixed the problem.

php.ini - session.lazy_write = Off
memcached.ini - memcached.sess_locking = Off

Now the site is working fine, at the expected speed.

But I am wondering whether there are any adverse effects of turning off these settings?
