ORA-07445: [SIGFPE] [Integer divide by zero] Internal Error: A Case Study

Author: Maclean Liu, posted on April 19th, 2011
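[Unless marked as a repost, every article on this site is original work written or compiled by the site.]
When reposting, please credit the source: Oracle Clinic – Maclean Liu's personal technical blog [http://www.oracledatabase12g.com/]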

A single-node 10.2.0.3 database running on SunOS 5.10 hit the internal error ORA-07445: exception encountered: core dump [SIGFPE] [Integer divide by zero] [42788866] [] [] []. The trace log is shown below:

mon_ora_17633.trc

Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/oracle/product/10.2.0
System name: SunOS
Node name: monitor-a
Release: 5.10
Version: Generic_139556-08
Machine: i86pc

ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [SIGFPE] [Integer divide by zero] [42788866] [] [] []
Current SQL statement for this session:
select req_time into :b0 from t_FIX_TranSerial where
((((tran_bank=:b1 and tran_type=:b2) and term_no=:b3)
and trace_no=:b4) and local_time='0')
----- Call Stack Trace -----

sigsetjmp <- call_user_handler
<- sigacthandler <- kpopfr <- kposdi <- kpopsdi <- opiefn0
<- kpoal8 <- opiodr <- ttcpip <- opitsk <- opiino
<- opiodr <- opidrv <- sou2o <- opimai_real <- main
<- 0000000000E54FE7

PROCESS STATE
-------------
O/S info: user: monitor, term: pts/5, ospid: 17621, machine: monitor-a
program: CTPDATA@monitor-a (TNS V1-V3)
application name: CTPDATA@monitor-a (TNS V1-V3), hash value=0
last wait for 'SQL*Net message to client' blocking sess=0x0 seq=6213 wait_time=1 seconds since wait started=0
driver id=62657100, #bytes=1, =0

Searching MOS for the arguments of the ORA-07445 error above turns up the Note <ORA-7445 [KPOPFR] [SIGFPE] [INTEGER DIVIDE BY ZERO] When Repeatedly Executing a Query (Doc ID 421203.1)>:

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3
This problem can occur on any platform.
Symptoms

1. Repeatedly executing a query can lead to the following error: 

ORA-7445 [kpopfr] [SIGFPE] [INTEGER DIVIDE BY ZERO]

2. The call stack from the ORA-07445 trace file should contain the following functions:
kposdi  kpopsdi
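
The call stack check described in the note can be sketched in a few lines of Python. This is only an illustration: the helper names `stack_functions` and `matches_signature` are hypothetical, and the sample stack is the one from the trace file at the top of this article.

```python
import re

# Functions that the note says should appear in the call stack.
SIGNATURE_FUNCTIONS = {"kposdi", "kpopsdi"}


def stack_functions(stack_text):
    """Split a '<-' separated call stack into bare function names."""
    tokens = re.split(r"<-|\s+", stack_text)
    # Drop empty tokens and any "()+offset" suffix such as "kpopfr()+536".
    return [token.split("(")[0] for token in tokens if token]


def matches_signature(stack_text):
    """Return True if every signature function is present in the stack."""
    return SIGNATURE_FUNCTIONS.issubset(stack_functions(stack_text))


sample_stack = """sigsetjmp <- call_user_handler
<- sigacthandler <- kpopfr <- kposdi <- kpopsdi <- opiefn0
<- kpoal8 <- opiodr <- ttcpip <- opitsk <- opiino
<- opiodr <- opidrv <- sou2o <- opimai_real <- main
<- 0000000000E54FE7"""

print(matches_signature(sample_stack))
```

A match only suggests that the trace is worth comparing against the note; it does not confirm the bug by itself.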

The error is caused by BUG 5753629.
Abstract: QUERY FAILS BY ORA-7445 [KPOPFR]
Repeatedly executing a query can lead to an ORA-7445[kpopfr] error.

Solution
To implement the solution, do one of the following: 

1. Upgrade to 11.1 or 10.2.0.4, when available.
At the time this note was written (July 2007), these versions were not yet available.
2. Apply one-off Patch 5753629 from MetaLink, if available for your platform and version.

There is no known workaround available for this bug.

References

BUG:5753629 - QUERY FAILS BY ORA-7445 [KPOPFR].

Hdr: 5753629 10.2.0.2 RDBMS 10.2.0.2 PRG INTERFACE PRODID-5 PORTID-23 ORA-7445
Abstract: QUERY FAILS BY ORA-7445 [KPOPFR].

*** 01/09/07 06:12 pm ***
TAR:
----

PROBLEM:
--------
When a query is executed again and again from one session, the query fails
with ORA-7445[kpopfr].

  ====================================================================
  Sat Dec 30 00:22:39 2006
  Errors in file /var/log/oracle/trace/felica2_ora_6156.trc:
  ORA-7445: exception encountered: core dump [kpopfr()+536] [SIGFPE]
  [Integer divide by zero] [0x1023BCE18] [] []
  ====================================================================

DIAGNOSTIC ANALYSIS:
--------------------
From the disassembly, %o3 is divided by %o0, and %o0 seems to be 0x0.

  0x1023f43d0 :       umul  %g5, %g4, %o0
  0x1023f43d4 :       mov  %g0, %y
  0x1023f43d8 :       udiv  %o3, %o0, %o3

%o0 is calculated as %g5 x %g4 at kpopfr+528.

From our trace file, this value (%g5) is 0x100000.

  ub4 kponc_p [FFFFFFFF7B22AAAC, FFFFFFFF7B22AAB0) = 00100000

And %g4 seems to be 0x1000, judging from the trace file output below.
 ========== FRAME [6] (kpopsdi()+148 -> kposdi()) ==========
  %l0 0000000100302B00 %l1 0000000000000002 %l2 FFFFFFFF7B263CA8
  %l3 00000003C36AFB80 %l4 0000000105D7CE20 %l5 0000000000000001
  %l6 0000000000000007 %l7 0000000105D7A920 %i0 0000000000000000
  %i1 FFFFFFFF7FFFB9EC %i2 0000000000001000 %i3 0000000105E03220
                       ~~~~~~~~~~~~~~~~~~~~ <--(*) here
  %i4 0000000000800000 %i5 0000000000105C00 %fp FFFFFFFF7FFFB241 

If %g4=0x1000 and %g5=0x100000, %g4 x %g5 = 0x100000000.
0x100000000 truncates to 0x0 as a ub4 (unsigned 32-bit value), and this may cause the divide by zero and the ORA-7445.
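
The wraparound in this analysis is easy to check by hand. The Python sketch below assumes that the divisor is held as an unsigned 32-bit value (ub4), so only the low 32 bits of the product survive; the helper `ub4_mul` and the stand-in dividend are illustrative, not Oracle code.

```python
# Assumption: the product is kept as a ub4 (unsigned 4-byte integer),
# so anything above bit 31 is discarded.
UB4_MASK = 0xFFFFFFFF


def ub4_mul(a, b):
    """Multiply two values and keep only the low 32 bits."""
    return (a * b) & UB4_MASK


g5 = 0x100000  # value of kponc_p reported in the trace file
g4 = 0x1000    # value marked in frame [6] of the trace file

full_product = g5 * g4      # needs 33 bits
divisor = ub4_mul(g5, g4)   # what is left after truncation to 32 bits

print(hex(full_product), divisor)

try:
    0x1234 // divisor  # 0x1234 is an arbitrary stand-in for %o3
except ZeroDivisionError:
    print("integer divide by zero")
```

Python integers do not overflow, which is why the mask is applied explicitly to mimic a 32-bit register.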

I can reproduce a similar problem in-house, so I'll upload a test case.
The problem reproduces under the following conditions.

 * sum of all column sizes is 1048576(0x100000)
 * run query again and again from one session (about 4096(0x1000) times)

From these results, the guess above seems to be correct.
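
Why roughly 4096 executions? Assuming, as the analysis guesses, that the second factor grows with the number of executions and that the product is truncated to 32 bits, the smallest multiplier that makes the product wrap to zero can be computed directly. The function name below is hypothetical.

```python
from math import gcd


def smallest_zero_multiplier(size, bits=32):
    """Smallest n > 0 such that (n * size) mod 2**bits == 0."""
    modulus = 1 << bits
    return modulus // gcd(size, modulus)


row_size = 0x100000  # sum of all column sizes in the test case
print(smallest_zero_multiplier(row_size))
```

For a total size of 0x100000 this gives 0x1000, in line with the reported repetition count. A size that is not a multiple of a large power of two would need far more repetitions before the product wraps to exactly zero.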

WORKAROUND:
-----------
n/a

RELATED BUGS:
-------------
n/a

REPRODUCIBILITY:
----------------
I have confirmed that this problem reproduces in the environments below.

 * Linux x86 32bit, 10.2.0.3 : ORA-7445[kpopfr()+300]
 * Linux x86 64bit, 10.2.0.2 : ORA-7445[kpopfr()+339]
 * Solaris 64bit, 10.2.0.2   : ORA-7445[kpopfr()+536]
 * HP-UX Itanium, 10.2.0.2   : ORA-7445[_div32U()+34]

TEST CASE:
----------
First, create a table as follows.

conn scott/tiger
drop table test;
create table test
( c000 char(2000),
  c001 char(2000),
    ... 
  c523 char(2000),
  c524 char(576));

  --> sum of all column sizes is 1048576(0x100000).
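
The column arithmetic can be verified, and the full DDL generated, with a short Python sketch. It assumes that the columns hidden behind the "..." (c002 through c522) are all char(2000), like the columns shown around them; that assumption is not stated in the bug text, although it is the layout that makes the stated total come out.

```python
def build_columns():
    """Return (name, size) pairs for the test table.

    Assumption: c000..c523 are all char(2000); only c524 is char(576).
    """
    columns = [(f"c{i:03d}", 2000) for i in range(524)]
    columns.append(("c524", 576))
    return columns


columns = build_columns()
total_size = sum(size for _, size in columns)

# Assemble a CREATE TABLE statement in the same layout as the test case.
ddl = (
    "create table test\n( "
    + ",\n  ".join(f"{name} char({size})" for name, size in columns)
    + ");"
)

print(len(columns), total_size, hex(total_size))
```

Printing `ddl` produces a statement that can be pasted into SQL*Plus in place of the abbreviated one above.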

Then run the following shell script.

  while [ 1 ]
  do
  echo "set feedback off"
  echo "select * from test where c001 = 'A';"
  done | sqlplus -s scott/tiger

It takes 3-10 minutes to reproduce the problem.
The time required depends on the hardware.

STACK TRACE:
------------
 ksedmp ssexhd sighndlr call_user_handler kposdi kpopsdi kpoal8
 opiodr ttcpip opitsk opiino opiodr opidrv sou2o opimai_real
 main start

After an SR was filed with Oracle GCS, the problem was confirmed to be BUG 5753629. Oracle GCS offered two solutions:
1. Upgrade to 10.2.0.4 or later
2. Apply one-off Patch 5753629
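
As a quick triage aid, the affected range given in the note (10.2.0.1 to 10.2.0.3) can be checked against the banner line at the top of a trace file. The sketch below is a rough helper with hypothetical names; it cannot tell whether the one-off patch has already been applied.

```python
import re

# Affected range as stated in Doc ID 421203.1.
AFFECTED_LOW = (10, 2, 0, 1)
AFFECTED_HIGH = (10, 2, 0, 3)


def parse_release(banner):
    """Extract the release number from a banner line as a tuple of ints."""
    match = re.search(r"Release (\d+(?:\.\d+)+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_in_affected_range(banner):
    """True if the first four release components fall in the affected range."""
    release = parse_release(banner)
    if release is None:
        return False
    return AFFECTED_LOW <= release[:4] <= AFFECTED_HIGH


banner = "Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production"
print(is_in_affected_range(banner))
```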

