Issue with Disks from (merged) [message #678935] |
Thu, 23 January 2020 00:01 |
janakors
Messages: 232 Registered: September 2009
|
Senior Member |
|
|
Hi there,
I have been facing this issue for quite a few days and have tried to resolve it myself, but now I need an expert opinion.
I am constantly receiving ORA-15081, ORA-27063, ORA-00345 and ORA-00312. The probable cause, as far as I can tell, is that my LUNs (disks) disappear and then reappear, but by then my instance has crashed. This happens only on my primary node; the second node is fine. After the instance crashes, I have to mount my diskgroups first and then start the instance, on node 1 only. Please help me identify the issue.
My alert log is as follows:
Wed Jan 22 18:42:12 2020
Thread 1 cannot allocate new log, sequence 468
Checkpoint not complete
Current log# 1 seq# 467 mem# 0: +ASMSG1/grid3h3/redo01.log
Wed Jan 22 18:42:53 2020
Thread 1 advanced to log sequence 468 (LGWR switch)
Current log# 2 seq# 468 mem# 0: +ASMDG1/grid3h3/redo02.log
Wed Jan 22 18:43:37 2020
Archived Log entry 5149 added for thread 1 sequence 467 ID 0xffffffffcc68b26b dest 1:
Wed Jan 22 18:47:14 2020
Thread 1 cannot allocate new log, sequence 469
Checkpoint not complete
Current log# 2 seq# 468 mem# 0: +ASMDG1/grid3h3/redo02.log
Thread 1 advanced to log sequence 469 (LGWR switch)
Current log# 1 seq# 469 mem# 0: +ASMSG1/grid3h3/redo01.log
Wed Jan 22 18:47:29 2020
Archived Log entry 5151 added for thread 1 sequence 468 ID 0xffffffffcc68b26b dest 1:
Wed Jan 22 18:49:36 2020
Thread 1 cannot allocate new log, sequence 470
Checkpoint not complete
Current log# 1 seq# 469 mem# 0: +ASMSG1/grid3h3/redo01.log
Thread 1 advanced to log sequence 470 (LGWR switch)
Current log# 2 seq# 470 mem# 0: +ASMDG1/grid3h3/redo02.log
Wed Jan 22 18:49:59 2020
WARNING: Write Failed. group:1 disk:2 AU:7403 offset:555008 size:512
Errors in file /u01/app/oracle/diag/rdbms/grid3h3/grid3h31/trace/grid3h31_lgwr_763.trc:
ORA-15080: synchronous I/O operation to a disk failed
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 512
WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 434 in group 1 on disk 2 allocation unit 7403
Errors in file /u01/app/oracle/diag/rdbms/grid3h3/grid3h31/trace/grid3h31_lgwr_763.trc:
ORA-00345: redo log write error block 1084 count 1
ORA-00312: online log 2 thread 1: '+ASMDG1/grid3h3/redo02.log'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
Errors in file /u01/app/oracle/diag/rdbms/grid3h3/grid3h31/trace/grid3h31_lgwr_763.trc:
ORA-00340: IO error processing online log 2 of thread 1
ORA-00345: redo log write error block 1084 count 1
ORA-00312: online log 2 thread 1: '+ASMDG1/grid3h3/redo02.log'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
LGWR (ospid: 763): terminating the instance due to error 340
Wed Jan 22 18:50:00 2020
System state dump requested by (instance=1, osid=763 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/grid3h3/grid3h31/trace/grid3h31_diag_733_20200122185000.trc
Dumping diagnostic data in directory=[cdmp_20200122185000], requested by (instance=1, osid=763 (LGWR)), summary=[abnormal instance termination].
Instance terminated by LGWR, pid = 763
My syslog shows the following:
Jan 21 19:32:53 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85cdf433000000a4 (sd10):
Jan 21 19:32:53 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:53 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85d087b6000000a8 (sd8):
Jan 21 19:32:53 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:54 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f8b1667e6000000b5 (sd12):
Jan 21 19:32:54 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:54 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85cdf433000000a4 (sd10):
Jan 21 19:32:54 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:54 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85cdf433000000a4 (sd10):
Jan 21 19:32:54 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:54 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85cdf433000000a4 (sd10):
Jan 21 19:32:54 db115.k454.x drive offline
Jan 21 19:38:05 db115.k454.xCLSD: [ID 770310 daemon.notice] The clock on host db115 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
Jan 21 20:00:57 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85d1542d000000b3 (sd0):
Jan 21 20:00:57 db115.k454.x Command failed to complete...Device is gone
Jan 21 20:00:57 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f8b1667e6000000b5 (sd12):
Jan 21 20:00:57 db115.k454.x Command failed to complete...Device is gone
The ASM logs on both nodes show the following:
-bash-3.2$ tail -100 /u01/app/oracle/diag/asm/+asm/+ASM2/trace/alert_+ASM2.log
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
Wed Jan 22 18:59:22 2020
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
Wed Jan 22 18:59:34 2020
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
Wed Jan 22 18:59:46 2020
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
Wed Jan 22 18:59:58 2020
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
NOTE: Attempting voting file relocation on diskgroup OCR_DG
NOTE: Failed voting file relocation on diskgroup OCR_DG
Is this due to a loss of LUN connectivity, or to some other issue in the database? That is my question, and I would welcome any remedy.
Please guide.
Regards
|
|
|
Re: Issue with Disks from (merged) [message #678939 is a reply to message #678935] |
Thu, 23 January 2020 01:40 |
John Watson
Messages: 8960 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
I am not clear what you are asking. What is the difference between what you call your "primary" node and the "second" node? There is no concept of primary and secondary in RAC. It looks as though one of them has problems with the shared disc connection.
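One way to check the shared disc connection theory is to tally the device warnings in the syslog per device. The following is only a minimal sketch in Python, assuming the syslog lines have the two-line shape shown in the excerpt posted above (a scsi WARNING line naming the device, then a line carrying the message). The names `SYSLOG`, `device_events` and `per_device` are made up for illustration.

```python
import re
from collections import Counter

# Sample lines copied from the syslog excerpt posted in this thread.
SYSLOG = """\
Jan 21 19:32:53 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85cdf433000000a4 (sd10):
Jan 21 19:32:53 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:53 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85d087b6000000a8 (sd8):
Jan 21 19:32:53 db115.k454.x Command failed to complete...Device is gone
Jan 21 19:32:54 db115.k454.xscsi: [ID 107833 kern.warning] WARNING: /nxup/disk@g648f8db1009e7b8f85cdf433000000a4 (sd10):
Jan 21 19:32:54 db115.k454.x drive offline
"""

# Matches the WARNING line and captures the device name, e.g. "sd10".
DEVICE_RE = re.compile(r"WARNING: \S+ \((sd\d+)\):\s*$")


def device_events(text):
    """Pair each scsi WARNING line with the message on the line after it."""
    events = []
    device = None
    for line in text.splitlines():
        match = DEVICE_RE.search(line)
        if match:
            device = match.group(1)
        elif device is not None:
            # Assumed layout: month, day, time, host, then the message.
            parts = line.split(None, 4)
            if len(parts) == 5:
                events.append((device, parts[4]))
            device = None
    return events


def per_device(events):
    """Count how many warnings each device produced."""
    return Counter(device for device, _ in events)


if __name__ == "__main__":
    for device, count in per_device(device_events(SYSLOG)).items():
        print(device, count)
```

If the same few devices account for most of the warnings, and they map to the disks in the affected diskgroup, that would point at the storage path on that node rather than at the database.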
|
|
|
|
Re: Issue with Disks from (merged) [message #678942 is a reply to message #678940] |
Thu, 23 January 2020 04:57 |
John Watson
Messages: 8960 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
janakors wrote on Thu, 23 January 2020 08:13:
"Can you elaborate on this shared disk connection point? I have the same suspicion."
I do not know anything about your shared discs, other than the obvious: they are not configured correctly.
|
|
|
Re: Issue with Disks from (merged) [message #678963 is a reply to message #678942] |
Thu, 23 January 2020 22:54 |
janakors
Messages: 232 Registered: September 2009
|
Senior Member |
|
|
What do these ORA-xxx errors signify? This is what I am asking.
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
Errors in file /u01/app/oracle/diag/rdbms/grid3h3/grid3h31/trace/grid3h31_lgwr_763.trc:
ORA-00340: IO error processing online log 2 of thread 1
ORA-00345: redo log write error block 1084 count 1
ORA-00312: online log 2 thread 1: '+ASMDG1/grid3h3/redo02.log'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
LGWR (ospid: 763): terminating the instance due to error 340
ORA-00345: redo log write error block 1084 count 1
|
|
|
Re: Issue with Disks from (merged) [message #678964 is a reply to message #678963] |
Fri, 24 January 2020 01:03 |
|
Michel Cadot
Messages: 68712 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
ORA-00345: redo log write error block %s count %s
*Cause: An IO error has occurred while writing the log
*Action: Correct the cause of the error, and then restart the system.
If the log is lost, apply media/incomplete recovery.
ORA-15081: failed to submit an I/O operation to a disk
*Cause: A submission of an I/O operation to a disk has failed.
*Action: Make sure that all the disks are operational.
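To see which errors were raised and in what order, the ORA- lines can be pulled out of the alert log and summarised. This is a rough sketch in Python, assuming each error sits on its own line as `ORA-nnnnn: message`, as in the excerpt posted earlier. `ALERT_LOG` and `ora_summary` are illustrative names, not part of any Oracle tool.

```python
import re

# Lines copied from the alert log excerpt posted in this thread.
ALERT_LOG = """\
WARNING: Write Failed. group:1 disk:2 AU:7403 offset:555008 size:512
ORA-15080: synchronous I/O operation to a disk failed
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
ORA-00345: redo log write error block 1084 count 1
ORA-00312: online log 2 thread 1: '+ASMDG1/grid3h3/redo02.log'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
ORA-00340: IO error processing online log 2 of thread 1
LGWR (ospid: 763): terminating the instance due to error 340
"""

# An error code followed by its message text.
ORA_RE = re.compile(r"^(ORA-\d{5}): (.*)$")


def ora_summary(text):
    """Map each ORA code to [count, first message], in order of appearance."""
    summary = {}
    for line in text.splitlines():
        match = ORA_RE.match(line.strip())
        if not match:
            continue
        code, message = match.groups()
        if code in summary:
            summary[code][0] += 1
        else:
            summary[code] = [1, message]
    return summary


if __name__ == "__main__":
    for code, (count, message) in ora_summary(ALERT_LOG).items():
        print(code, count, message)
```

In this excerpt the I/O errors (ORA-15080, ORA-27063) come first and the redo log errors (ORA-00345, ORA-00312, ORA-00340) follow, which fits the reading that the redo write failed because the disk could not be written to.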
|
|
|
|
|