Oracle: Rolling Database Upgrade with minimal downtime using PHYSRU

Hi.
The article will list my efforts to do a rolling database upgrade using the Oracle Supplied Script physru to automate the rolling upgrade process partially. The advantage of using the script is it streamlines the switch over between primary and standby databases reducing the chance of an error and amount of downtime required to patching a database. The script can be found in Metalink note: 949322.1

If the primary database downtime is not a concern in your environment, then using this method is going to unnecessarily complicate the upgrade process.

In this example I’m going to upgrade an 11.2.0.2.0 Oracle database to 11.2.0.3.0 on Redhat Linux using out of place (separate homes for the new and old Oracle binaries).

Assumptions
1) You have a single node primary – single node physical standby, setup using Oracle 11.2.0.2.0
2) You have enough space for 2 Oracle homes
3) You know how to install multiples Oracle homes and switch between them
4) You can install Oracle 11.2.0.3.0 or can clone from another 11.2.0.3.0 home
5) You have a backup of the 11.2.0.2.0 binaries and the database before you begin
6) You have root access (in case you are doing a 11.2.0.3.0 install)
7) You have a fair bit of Oracle DBA experience.

Environment
1) Primary hostname : primserver.eng.ham.uk.myco or primserver/ Database SID: PRIMDB1
2) Standby hostname : stbyserver.eng.ham.uk.myco or stbyserver/ Database SID: STBYDB1

# On stbyserver check unsupported objects (for when the db is converted to logical standby) using SQLPLUS
SQL> select * from dba_logstdby_unsupported;
SQL> SELECT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_NOT_UNIQUE WHERE (OWNER, TABLE_NAME) NOT IN (SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED) AND BAD_COLUMN = ‘Y’;
You will have to make a decision on what to do with these objects and if they are show stoppers. If so, then this method may not be appropriate for you

If your application ensures the rows in a table are unique, you can create a disabled primary key RELY constraint on the table. Use ALTER TABLE command to add a disabled primary-key RELY constraint.
The following example creates a disabled RELY constraint on a table named mytab, for which rows can be uniquely identified using the id and name columns
ALTER TABLE mytab ADD PRIMARY KEY (id, name) RELY DISABLE;

Steps
Pre-Start

# Install/Clone 11203 in its own home on the primserver and stbyserver
# Download the latest physru script and copy it to the primserver and stbyserver (say in /var/tmp)
# Create Scripts for switching Oracle Homes on the primserver and stbyserver
example:

11202.sh on primserver
export ORACLE_HOME11203=/opt/oracle/product/db/11.2.0.3
export ORACLE_HOME11202=/opt/oracle/product/db/11.2.0.2
export ORACLE_HOME=$ORACLE_HOME11202
export ORACLE_SID=PRIMDB1
export PATH=$ORACLE_HOME/bin:$PATH
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH

11203.sh on primserver
export ORACLE_HOME11203=/opt/oracle/product/db/11.2.0.3
export ORACLE_HOME11202=/opt/oracle/product/db/11.2.0.2
export ORACLE_HOME=$ORACLE_HOME11203
export ORACLE_SID=PRIMDB1
export PATH=$ORACLE_HOME/bin:$PATH
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH

You can use the above to substitute the relevant values for the Standby Database Environment.

# To source the environment file
$ . 11202.sh

# Check your DB version using SQLPLUS on PRIMDB1 and STBYDB1
SQL> select * from v$version;
SQL> select status from v$instance;

# Execute the script utlu112i_5.sql as sysdba using SQLPLUS. Read the recommendations and action them, if needed.
SQL> @/var/tmp/utlu112i_5.sql

# Get a list of invalid objects. If you have run utlu112i_5.sql, then the list can also be seen in registry$nonsys_inv_objs and registry$sys_inv_objs
SQL> SELECT owner, object_type, object_name, status FROM dba_objects WHERE status = ‘INVALID’ ORDER BY owner, object_type, object_name ;

# Purge DBA Recycle pool
SQL> PURGE DBA_RECYCLEBIN

# Gather dictionary stats
SQL> EXECUTE dbms_stats.gather_dictionary_stats;

# Check current replication status on standby
– primserver
SQL> select max(sequence#) from v$archived_log where applied = ‘YES’ ;
– Stbyserver
SQL> select max(sequence#) from v$archived_log where applied = ‘YES’ ;
SELECT ARCH.THREAD# “Thread”, ARCH.SEQUENCE# “Last Sequence Received”, APPL.SEQUENCE# “Last Sequence Applied” FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ORDER BY 1;

# Check and turn Flashback on
– primserver
SQL> select log_mode,flashback_on from v$database;
SQL> alter database flashback on;
SQL> select log_mode,flashback_on from v$database;
– Stbyserver
SQL> select log_mode,flashback_on from v$database;
SQL> alter database recover managed standby database cancel ;
SQL> alter database flashback on;
SQL> select log_mode,flashback_on from v$database;
SQL> alter database recover managed standby database disconnect from session ;

# Create Standby Redo Logs on primserver and stbyserver
# Below is a script to help you generate the commands to create standby redo logs. Make sure there’s enough space in the folder you are creating the standby logs in. Also use the statements generated to make 1 or 2 extra standby redo log groups (as is recommended by Oracle)

SQL> set lines 400 pages 100
SQL> column COMMAND format a170
SQL> select decode( mod( rownum ,members) ,1, a1.command2 , a1.command1) command from (
SELECT members, l.group# grp, ‘alter database add standby logfile group ‘
|| to_char( l.group# + ((SELECT MAX (group#) FROM v$log)) )
|| ‘ ‘
|| ””
|| REGEXP_SUBSTR (MEMBER, ‘/.+/’)
|| sys_context(‘userenv’,’instance_name’)
|| ‘_stdbyredo_’
|| to_char( l.group# + ((SELECT MAX (group#) FROM v$log)) )
|| decode( mod( rownum ,members) ,0,’a’,’b’)
|| ‘.rdo’
|| ”’ size ‘
|| bytes/1024/1024
|| ‘M ;’
command1 ,
‘alter database add standby logfile member ‘
|| ””
|| REGEXP_SUBSTR (MEMBER, ‘/.+/’)
|| sys_context(‘userenv’,’instance_name’)
|| ‘_stdbyredo_’
|| to_char( l.group# + ((SELECT MAX (group#) FROM v$log)) )
|| decode( mod( rownum ,members) ,0,’a’,’b’)
|| ‘.rdo’ || ”” || ‘ to group ‘
|| to_char( l.group# + ((SELECT MAX (group#) FROM v$log)) )
|| ‘;’
command2
FROM v$logfile lf, v$log l
WHERE l.group# = lf.group# ) a1 order by grp, mod( rownum ,members) ;

# Check and add tnsnames entry on primserver and stbyserver IN 11202 AND 11203 ORACLE HOMES (in this case in 11202 and 11203 homes)
PRIMDB1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = primserver.eng.ham.uk.myco)(PORT = 1521))
)
(CONNECT_DATA =
(SID = PRIMDB1)
(SERVER = DEDICATED)
)
)

STBYDB1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = stbyserver.eng.ham.uk.myco)(PORT = 1521))
)
(CONNECT_DATA =
(SID = STBYDB1)
(SERVER = DEDICATED)
)
)

# Check tns connectivity on primserver and Stbyserver
$ tnsping PRIMDB1
$ tnsping STBYDB1
$ sqlplus “sys/@PRIMDB1 as sysdba”
$ sqlplus “sys/@STBYDB1 as sysdba”

# Check the listener in use and get what home its running from.
$ ps -ef | grep -i listener
$ lsnrctl status

# Add Static Listener Entry in listener.ora on primserver and stbyserver for their corresponding databases
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PRIMDB1 OR STBYDB1)
(ORACLE_HOME = /opt/oracle/product/db/11.2.0.2)
)
)

$ lsnrctl reload
$ lsnrctl status

Start Main Rolling Upgrade
# Run the physru script – Run 1. The physru script requires six parameters
# $./physru
$ ./physru sys PRIMDB1 STBYDB1 PRIMDB1 STBYDB1 11.2.0.3.0
the script finishes and then lets you upgrade the standby server, which needs to be done manually using DBUA or CATUPGRD

# Upgrade the Stbyserver database
$ . 11203
# On the stbyserver, copy the relevant files from the 11202 home to the new 11203 home
$ cp $ORACLE_HOME11202/dbs/orapwSTBYDB1 $ORACLE_HOME11203/dbs/
$ cp $ORACLE_HOME11202/dbs/spfileSTBYDB1.ora $ORACLE_HOME11203/dbs/
$ cp $ORACLE_HOME11202/dbs/initSTBYDB1.ora $ORACLE_HOME11203/dbs/

$ sqlplus / as sysdba
SQL> startup upgrade
SQL> select protection_level,protection_mode from v$database;
SQL> spool /var/tmp/upgrade.log
SQL> spool /var/tmp/upgrade11203.log
SQL> @?/rdbms/admin/catupgrd
SQL> spool off
SQL> exit

# Execute the post upgrade scripts
$ sqlplus / as sysdba
SQL> startup
SQL> spool /var/tmp/post-upgrade.log
SQL> @?/rdbms/admin/utlrp
SQL> @?/rdbms/admin/utlu112s.sql
SQL> @?/rdbms/admin/catuppst.sql
SQL> shutdown immediate
SQL> spool off
SQL> exit

# Modify the LISTENER.ORA on stbyserver entry for the upgraded transient logical standby database to point to the new ORACLE_HOME and reload the listener
$ lsnrctl reload
# Modify the /etc/oratab file on stbyserver to reflect the new oracle home

# Run the physru script – Run 2
$./physru sys PRIMDB1 STBYDB1 PRIMDB1 STBYDB1 11.2.0.3.0
The script finishes with the standby server as the new primary.

# On the primserver, copy the relevant files from the 11202 home to the new 11203 home
$ cp $ORACLE_HOME11202/dbs/orapwSTBYDB1 $ORACLE_HOME11203/dbs/
$ cp $ORACLE_HOME11202/dbs/spfileSTBYDB1.ora $ORACLE_HOME11203/dbs/
$ cp $ORACLE_HOME11202/dbs/initSTBYDB1.ora $ORACLE_HOME11203/dbs/

# On the primserver, reset your environment and mount the physical standby database using the $ORACLE_HOME of the new release
$ . 11203

# On STBYDB1 database, set the DB parameter for LOG_ARCHIVE_DEST_ so that logs from stbyserver can be sent to primserver, for data replication while the original primary is acting as a standby (temporarily)
SQL> alter system set LOG_ARCHIVE_DEST_3=”SERVICE=PRIMDB1 LGWR ASYNC NOAFFIRM OPTIONAL REOPEN=60 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=PRIMDB1″ ;

# On the primserver, modify the oratab to reflect the new home
# On the primserver, modify the listener.ora to reflect the new home

# Run the physru script – Run 3
$./physru sys PRIMDB1 STBYDB1 PRIMDB1 STBYDB1 11.2.0.3.0
The script finishes with switching the databases back to their original roles.

# In the end , you can get some statistics about the time, the main operations took and the downtime
### Stage 8: Statistics
script start time: 15-Jan-13 11:22:49
script finish time: 15-Jan-13 12:50:54
total script execution time: +00 01:28:05
wait time for user upgrade: +00 00:30:30
active script execution time: +00 00:57:35
transient logical creation start time: 15-Jan-13 11:22:53
transient logical creation finish time: 15-Jan-13 11:23:23
primary to logical switchover start time: 15-Jan-13 12:03:13
logical to primary switchover finish time: 15-Jan-13 12:03:30
primary services offline for: +00 00:00:18
total time former primary in physical role: +00 00:46:02
time to reach upgrade redo: +00 00:00:17
time to recover upgrade redo: +00 00:04:09
primary to physical switchover start time: 15-Jan-13 12:50:09
physical to primary switchover finish time: 15-Jan-13 12:50:43
primary services offline for: +00 00:00:35

SUCCESS: The physical rolling upgrade is complete

Post Upgrade
# On Stbydb1
SQL> alter database recover managed standby database disconnect from session ;

# Check for recovery lag
# On Stbydb1
SQL> column current_scn format 9999999999999
SQL> select current_scn from v$database ;
# On Primdb1
SQL> column current_scn format 9999999999999
SQL> select current_scn from v$database ;

# Check your primary and standby alert logs for any reported errors / warnings
# Decommission the old Oracle Home IF no other databases are using it.

================================================

Possible Errors and Solutions

# Issue
[oracle@hamddb15 ~]$ /var/tmp/physru sys PRIMDB1 STBYDB1 PRIMDB1 STBYDB1 11.2.0.3
-bash: /var/tmp/physru: /bin/sh^M: bad interpreter: No such file or directory
#Solution
dos2unix /var/tmp/physru

# Issue
# The following error was encountered:
alter database recover managed standby database using current logfile through next switchover disconnect
*
ERROR at line 1:
ORA-38500: USING CURRENT LOGFILE option not available without standby redo logs
# Solution
Create Standby redo logs

# Issue
In ALERT LOG on Standby
ORA-1653: unable to extend table SYS.WRM$_SNAPSHOT_DETAILS by 128 in tablespace SYSAUX
MMON Flush encountered SYSAUX out of space error(1653).
# Solution
alter database datafile ‘/u01/oradata/master_dev1/sysaux01.dbf’ resize 5G ;

# Issue (while runing one of the upgrade scripts in logical standby db) – Line 305334 in upgrd.log
SYS@STBYDB1 AS SYSDBA 24-APR-13> EXECUTE dbms_java.loadjava(‘-v -r -grant PUBLIC -s rdbms/jlib/servlet.jar’);
BEGIN dbms_java.loadjava(‘-v -r -grant PUBLIC -s rdbms/jlib/servlet.jar’); END;
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
ERROR:
ORA-03114: not connected to ORACLE
# Solution
rerun the catupgrd script and check spool file

# Issue
### Stage 3: Validate upgraded transient logical standby
Apr 25 15:03:39 2013 [3-1] database STBYDB1 is no longer in OPEN MIGRATE mode
Apr 25 15:03:39 2013 [3-1] —- Amit —- database 11.2.0.3.0 11.2.0.3
Apr 25 15:03:39 2013 [3-1] ERROR: STBYDB1 is not at version 11.2.0.3
# Solution
./physru sys PRIMDB1 STBYDB1 PRIMDB1 STBYDB1 11.2.0.3.0
— Note that initially I had given 11.2.0.3 as the target version

# Issue
# Stage 5: Flashback former primary to pre-upgrade restore point and convert to physical
Apr 25 15:20:33 2013 [5-1] shutting down database PRIMDB1
Apr 25 15:20:52 2013 [5-1] mounting database PRIMDB1
Apr 25 15:20:52 2013 [5-1] ERROR: failed to mount database PRIMDB1
# Solution
Restart the physru bit after manually restarting the Original Primary DB

================================================

Tips
1)
When you see an error during the script execution, to debug the issue, open the physru script and look for the error message (do not search on databases names).
Insert echo / print / display statements that print the values of variables etc at the appropriate places.

2)
Most of the issues are with incorrect config files or connections.
When you see any error just use command prompt to check connectivity using the tnsnames from the correct Oracle Home.
Also check the listener.ora (if it has the right home for the static db details)
Also check the listener status to confirm if the db is being listened for.

3)
CATUPRD script is executed only once in the entire upgrade process. One of the databases (the original primary database) is upgraded using Redo Apply.

 

Underwater shot of a penguin

Underwater shot of a penguin

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s