6

I have a Django app that connects to Oracle. Every 10 refreshes or so, regardless of what page, it gives an error. First time after restart an ORA-03113 and every subsequent errors ORA-03135.

I have checked to alert.log and listener.log on the database side and can't find anything about this session terminating in the log file.

I've edited site-packages/django/db/init.py and put an try/except around close_if_unusable_or_obsolete. The page does not give an error anymore, and in the logfile no ORA error but I still see the SIGPIPE error.

My guess : Django thinks there is still a connection and tries to close (close_if_unusable_or_obsolete) it but on the Oracle side the connection does not exist. The first ORA-03113 does not result in a record in the listener.log, so it seems it does not even reach the Database server. After the modification in the init.py the result page is correct. So the query for building up the page is successful. It seems to be a connection for administrative purposes of some kind.

I'm lost, does anyone have an idea how to find this error?

cx_oracle==5.1.3
Django==1.8.3
Oracle Client : instantclient-basic-linux.x64-12.1.0.2.0
Database version : 11.2.0.3.0

uwsgi_error.log

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 177, in __call__
    signals.request_started.send(sender=self.__class__, environ=environ)
  File "/usr/lib/python2.7/site-packages/django/dispatch/dispatcher.py", line 189, in send
    response = receiver(signal=self, sender=sender, **named)
  File "/usr/lib/python2.7/site-packages/django/db/__init__.py", line 64, in close_old_connections
    conn.close_if_unusable_or_obsolete()
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 403, in close_if_unusable_or_obsolete
    self.close()
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 198, in close
    self._close()
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 152, in _close
    return self.connection.close()
  File "/usr/lib/python2.7/site-packages/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 152, in _close
    return self.connection.close()
django.db.utils.OperationalError: ORA-03113: end-of-file on communication channel
Process ID: 10345
Session ID: 195 Serial number: 54225

[pid: 30|app: 0|req: 1/4] 123.123.123.123 () {46 vars in 1024 bytes} [Fri May 27 15:58:34 2016] GET /admin/ => generated 0 bytes in 8 msecs (HTTP/1.1 500) 0 headers in 0 bytes (0 switches on core 0)
[pid: 28|app: 0|req: 4/5] 123.123.123.123 () {46 vars in 1024 bytes} [Fri May 27 15:58:38 2016] GET /admin/ => generated 31712 bytes in 291 msecs (HTTP/1.1 200) 6 headers in 251 bytes (1 switches on core 0)
[pid: 28|app: 0|req: 5/6] 123.123.123.123 () {46 vars in 1024 bytes} [Fri May 27 15:58:39 2016] GET /admin/ => generated 31712 bytes in 276 msecs (HTTP/1.1 200) 6 headers in 251 bytes (1 switches on core 0)
[pid: 28|app: 0|req: 6/7] 123.123.123.123 () {46 vars in 1024 bytes} [Fri May 27 15:58:40 2016] GET /admin/ => generated 31712 bytes in 288 msecs (HTTP/1.1 200) 6 headers in 251 bytes (1 switches on core 0)
[pid: 30|app: 0|req: 2/8] 123.123.123.123 () {46 vars in 1024 bytes} [Fri May 27 15:58:40 2016] GET /admin/ => generated 31712 bytes in 608 msecs (HTTP/1.1 200) 6 headers in 251 bytes (1 switches on core 0)
Fri May 27 15:58:41 2016 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /admin/ (ip 131.180.77.1) !!!
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 177, in __call__
    signals.request_started.send(sender=self.__class__, environ=environ)
  File "/usr/lib/python2.7/site-packages/django/dispatch/dispatcher.py", line 189, in send
    response = receiver(signal=self, sender=sender, **named)
  File "/usr/lib/python2.7/site-packages/django/db/__init__.py", line 64, in close_old_connections
    conn.close_if_unusable_or_obsolete()
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 403, in close_if_unusable_or_obsolete
    self.close()
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 198, in close
    self._close()
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 152, in _close
    return self.connection.close()
  File "/usr/lib/python2.7/site-packages/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python2.7/site-packages/django/db/backends/base/base.py", line 152, in _close
    return self.connection.close()
django.db.utils.OperationalError: ORA-03135: connection lost contact
Process ID: 10345
Session ID: 195 Serial number: 54225

SQL*Net trace file contains:

(4043057216) [31-MAY-2016 15:18:51:303] nioqsn: entry
(4043057216) [31-MAY-2016 15:18:51:304] nioqsn: exit
(4043057216) [31-MAY-2016 15:18:51:304] nioqrc: entry
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_bsd: entry
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_bsd: tot=0, plen=13.
(4043057216) [31-MAY-2016 15:18:51:304] nttfpwr: entry
(4043057216) [31-MAY-2016 15:18:51:304] nttfpwr: socket 29 had bytes written=13
(4043057216) [31-MAY-2016 15:18:51:304] nttfpwr: exit
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_bsd: packet dump
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_bsd: 00 0D 00 00 06 00 00 00  |........|
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_bsd: 00 00 03 0F 0F           |.....   |
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_bsd: exit (0)
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_brc: entry: oln/tot=0
(4043057216) [31-MAY-2016 15:18:51:304] nttfprd: entry
(4043057216) [31-MAY-2016 15:18:51:304] ntt2err: entry
(4043057216) [31-MAY-2016 15:18:51:304] ntt2err: Read unexpected EOF ERROR on 29
(4043057216) [31-MAY-2016 15:18:51:304] ntt2err: exit
(4043057216) [31-MAY-2016 15:18:51:304] nttfprd: exit
(4043057216) [31-MAY-2016 15:18:51:304] nserror: entry
(4043057216) [31-MAY-2016 15:18:51:304] nserror: nsres: id=0, op=68, ns=12537, ns2=12560; nt[0]=507, nt[1]=0, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0
(4043057216) [31-MAY-2016 15:18:51:304] nsbasic_brc: exit: oln=0, dln=0, tot=0, rc=-1
(4043057216) [31-MAY-2016 15:18:51:304] nioqer: entry
(4043057216) [31-MAY-2016 15:18:51:304] nioqer:  incoming err = 12151
(4043057216) [31-MAY-2016 15:18:51:304] nioqce: entry
(4043057216) [31-MAY-2016 15:18:51:304] nioqce: exit
(4043057216) [31-MAY-2016 15:18:51:304] nioqer:  returning err = 3113
(4043057216) [31-MAY-2016 15:18:51:304] nioqer: exit
(4043057216) [31-MAY-2016 15:18:51:304] nioqrc: exit
(4043057216) [31-MAY-2016 15:18:51:304] nioqds: entry
(4043057216) [31-MAY-2016 15:18:51:304] nioqds:  disconnecting...

Incomming Error ORA-12151: TNS:received bad packet type from network layer

EDIT Response to Titus

SELECT PROFILE, LIMIT FROM DBA_PROFILES WHERE RESOURCE_NAME = 'IDLE_TIME'; and: 
DEFAULT UNLIMITED

SELECT PROFILE FROM DBA_USERS WHERE USERNAME = 'DJANGO_USER';
DEFAULT

I don't use TNSNAMES.ORA or SQLNET.ORA

Rob van Laarhoven
  • 8,063
  • 1
  • 27
  • 48
  • Perhaps this http://stackoverflow.com/questions/83068/ora-03113-end-of-file-on-communication-channel-after-long-inactivity-in-asp-net helps – trinchet May 31 '16 at 12:44
  • @trinchet Thank you. I have seen the link. There is no error on the service side. There is no firewall between appserver and dbserver. The connection _does_ work but now and then fails. – Rob van Laarhoven May 31 '16 at 13:42
  • You are saying that you checked the alert.log for the database and do not see any errors from that time? It wouldn't be an ORA-3113 in the alert log, it'd be something different. Also, are you using TLS (Transport Layer Security, aka SSL)? – Matthew McPeak Jun 03 '16 at 20:17
  • Can you post the output from this: SELECT PROFILE, LIMIT FROM DBA_PROFILES WHERE RESOURCE_NAME = 'IDLE_TIME'; and: SELECT PROFILE FROM DBA_USERS WHERE USERNAME = DJANGO_USER; – Titus P Jun 06 '16 at 17:38
  • Also, the contents of your SQLNET.ora and, if your using TNS, TNSNAMES.ora. – Titus P Jun 06 '16 at 17:40
  • @TitusP Thanks for your response. This does not seem to be a timeout problem since it occurs right after starting the application. – Rob van Laarhoven Jun 07 '16 at 11:30

1 Answers1

2

This is the workaround that worked for me:

# wsgi.py
from django.db import connection
from uwsgidecorators import postfork

@postfork
def reconnect_to_db():
    print "RECONNECT TO DB!"
    connection.connect()

For some reason, the uwsgi workers need to be manually connected postfork, or the first query of every worker would fail.