Due to several Oracle bugs, Ignite can encounter 'Library Cache Mutex X' waits on the Monitored Database Instance when retrieving SQL Statistics. There are over 30 documented bugs associated with this wait. This article describes what Mutexes are, the Mutex waits and the many issues that may be encountered.
If Ignite encounters Mutex wait or acquires a Mutex to get Statistics, the query to retrieve SQL statistics may timeout, producing a "socket read timed out" error in the Ignite logs. This error in turn causes the Ignite monitor to try to restart. In some environments, the Oracle session which has the Mutex where the timed out occurred, is orphaned by Ignite. This means that the Oracle Instance still sees the session as active, doesn't know that it's orphaned and tries to continue processing the session's request. At this point, if you attempt to kill the ophaned session using the Oracle "alter system kill" command, the system may hang because the Oracle Process Monitor (PMON) also has to aquire a Mutex in order to clean up the session. If PMON is in the 'Library Cache Mutex X' wait, the system will hang indefinitely and the only quick resolution is to restart the database instance.
Note: A better way to clean up the orphaned session is to issue an OS kill command on the system process id. For Linux/Unix sessions, you will need to issue a 'kill -9 pid', or for Windows, 'orakill pid' at a command prompt. This will kill the underlying OS process so the session is not seen as 'active' in Oracle, and PMON will correctly clean it up without the Mutex wait.
A. Upgrade Oracle. The version affected by these issues are: Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 188.8.131.52 - Release: 10.1 to 11.2 Information in this document applies to any platform.
B. If you cannot upgrade Oracle to a version that does not exhibit the bug, you can do the following in Ignite:
SELECT a.sql_id, a.event, a.p1text, a.p1, a.p2text, a.p2, a.last_call_et, a.status, a.state FROM v$session a, v$sql b WHERE a.program = 'IgniteMonitor' AND a.sql_id = b.sql_id AND a.status = 'ACTIVE' AND a.last_call_et > 420 AND b.sql_text like '%FROM x$kglcursor%';