Thursday, 25 December 2014

SQL Plan Management

SQL Plan Management:

Purpose: 
Prevents performance regressions resulting from sudden changes to the execution plan of a SQL statement by providing components for capturing, selecting, and evolving SQL plan information.

Factors affecting plan Management:
  • New optimizer version
  • Changes to optimizer statistics and optimizer parameters
  • Changes to schema and metadata definitions
  • Changes to system settings
  • SQL profile creation
Steps:
1. Capture Plan Baseline: 2 ways
* Automatic Plan Capture:
Setting OPTIMIZER_CAPTURE_SQL_PLAN_BASELINES initialization parameter to TRUE
* Manual Plan Capture

  • Loading Plans from SQL Tuning Sets and AWR Snapshots.
  • Loading Plans from the Cursor Cache
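The two capture paths above can be sketched as follows; the sql_id and STS name here are placeholders, not values from this post:

```sql
-- Automatic capture: record plans for repeatable statements
ALTER SYSTEM SET optimizer_capture_sql_plan_baselines = TRUE;

-- Manual capture from the cursor cache (sql_id is a placeholder)
DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(sql_id => '1fkh93md0802n');
END;
/

-- Manual capture from a SQL tuning set (STS name is a placeholder)
DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(sqlset_name => 'MY_STS');
END;
/
```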
2. Selecting SQL Baseline
OPTIMIZER_USE_SQL_PLAN_BASELINES
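As a quick sketch, baseline selection is governed by this parameter (TRUE is the default):

```sql
-- When TRUE, the optimizer considers ACCEPTED baseline plans for matching statements
ALTER SYSTEM SET optimizer_use_sql_plan_baselines = TRUE;
```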

3. Evolving baseline plans:
* Manual method: plans loaded from a SQL tuning set or the cursor cache can be evolved with DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE.
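A minimal sketch of a manual evolve call, reusing the sql_handle shown later in this post:

```sql
SET SERVEROUTPUT ON
DECLARE
  l_report CLOB;
BEGIN
  -- Verify the unaccepted plans for this handle and accept the better ones
  l_report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE(
                sql_handle => 'SYS_SQL_209d10fabbedc741',
                verify     => 'YES',
                commit     => 'YES');
  DBMS_OUTPUT.PUT_LINE(l_report);
END;
/
```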

To review SQL Plan Baselines

select * from table( 
    dbms_xplan.display_sql_plan_baseline( 
        sql_handle=>'SYS_SQL_209d10fabbedc741', 
        format=>'basic'));

select sql_handle, plan_name, enabled, accepted, fixed from dba_sql_plan_baselines;

SQL Management Base:

A set of supplied PL/SQL packages facilitates managing the disk usage, retention time, and movement of baselines across environments.
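For example, DBMS_SPM.CONFIGURE can adjust the SMB settings; the values below are illustrative, not recommendations:

```sql
-- Cap the SMB at 30% of SYSAUX and retain unused plans for 53 weeks
BEGIN
  DBMS_SPM.CONFIGURE('space_budget_percent', 30);
  DBMS_SPM.CONFIGURE('plan_retention_weeks', 53);
END;
/

-- Review the current SMB configuration
SELECT parameter_name, parameter_value FROM dba_sql_management_config;
```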

Tried out an example

Test: [Extended from the example above]
1. Dropped the SQL plan baseline which was created for the FTS.
2. Dropped the index SPM_TST_TAB_IDX from the table.
3. At this point a SQL plan baseline exists for the index range scan, but the index itself no longer exists on the table.
4. When I ran the SQL statement again, a new SQL plan baseline was created and used.
5. But the new baseline is not an accepted one.

Oracle Documentation
DBMS_SPM

Sunday, 21 December 2014

How CBO infers stale statistics

Oracle's CBO infers STALE statistics primarily from the view *_TAB_MODIFICATIONS. This view shows the changes to your table, but not immediately. Let's check:

SQL> create table OBJECTS as select * from dba_objects;
Table created.

SQL> insert into objects select * from dba_objects;
72565 rows created.

SQL> commit;
Commit complete.

SQL> select * from user_tab_modifications where table_name='OBJECTS';
no rows selected

SQL> select table_name, stale_stats from user_tab_statistics where table_name='OBJECTS';
TABLE_NAME                     STA
------------------------------ ---
OBJECTS

This is because the monitoring information is kept in the shared pool and is flushed by Oracle only at periodic intervals. When I flush it manually,

SQL> exec dbms_stats.flush_database_monitoring_info;
PL/SQL procedure successfully completed.

SQL> select table_name,inserts, timestamp from user_tab_modifications where table_name='OBJECTS';
TABLE_NAME                        INSERTS TIMESTAMP
------------------------------ ---------- ---------
OBJECTS                             72565 21-DEC-14

Now I see 72565 records inserted. To test further, I run more inserts followed by a few deletions.

SQL> insert into objects select * from dba_objects;
72565 rows created.

SQL> delete from objects where rownum<1000;
999 rows deleted.

SQL> commit;
Commit complete.

SQL> exec dbms_stats.flush_database_monitoring_info;
PL/SQL procedure successfully completed.

SQL> select table_name,inserts, deletes from user_tab_modifications where table_name='OBJECTS';
TABLE_NAME                        INSERTS  DELETES
------------------------------ ---------- ----------
OBJECTS                            145130        999

SQL> select table_name, stale_stats from user_tab_statistics where table_name='OBJECTS';
TABLE_NAME                     STA
------------------------------ ---
OBJECTS
So if a table is under monitoring (MONITORING = YES in DBA_TABLES), its DMLs are tracked, and the counts are flushed to the *_TAB_MODIFICATIONS view when we call DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO manually.

The same flush also happens when a GATHER_*_STATS procedure is called. Let's test that. Note that STALE_STATS is NULL before collecting statistics, as shown above.

SQL> exec dbms_stats.gather_table_stats('STATS','OBJECTS');
PL/SQL procedure successfully completed.

SQL> select * from user_tab_modifications where table_name='OBJECTS';
no rows selected
As the documentation says, the *_TAB_MODIFICATIONS view is populated with the modifications for tables that have MONITORING enabled, and is purged as soon as statistics are collected.

Actually, the staleness recorded in DBA_TAB_STATISTICS uses this data from *_TAB_MODIFICATIONS to decide whether the statistics are stale. For example,

SQL> select table_name, stale_stats from user_tab_statistics where table_name='OBJECTS';
TABLE_NAME                     STA
------------------------------ ---
OBJECTS                        NO

SQL> update objects set owner='SRI';
216696 rows updated.

SQL> commit;
Commit complete.

SQL> select table_name, stale_stats from user_tab_statistics where table_name='OBJECTS';
TABLE_NAME                     STA
------------------------------ ---
OBJECTS                        NO

SQL> exec dbms_stats.flush_database_monitoring_info;
PL/SQL procedure successfully completed.

SQL> select table_name, stale_stats from user_tab_statistics where table_name='OBJECTS';
TABLE_NAME                     STA
------------------------------ ---
OBJECTS                        YES
In the above case, STALE_STATS is initially NO. I then updated the whole table, but when I check again the statistics are still not marked stale, because the monitoring data has not been flushed. Once I flush it manually, STALE_STATS changes to YES. We can also control at what percentage of change STALE_STATS flips to YES, via the STALE_PERCENT preference (visible in DBA_TAB_STAT_PREFS).
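A sketch of setting that preference for the test table used above (the 20% value is just an example):

```sql
-- Mark statistics stale only after 20% of the rows change (default is 10%)
BEGIN
  DBMS_STATS.SET_TABLE_PREFS('STATS', 'OBJECTS', 'STALE_PERCENT', '20');
END;
/

-- Verify the preference
SELECT DBMS_STATS.GET_PREFS('STALE_PERCENT', 'STATS', 'OBJECTS') FROM dual;
```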

Monday, 8 December 2014

Manual creation of Oracle RAC DB on Windows

Manual RAC DB creation on Windows.

Assumption:
1. The clusterware is installed and is up and running on both the nodes.
2. The Oracle Home is installed on both the nodes.

Steps:
1. First we need shared storage to store the Oracle DB files, redo logs, control files, etc. So we need to create ASM diskgroups on which the mentioned files can be accommodated. CREATE ASM DISKGROUPS

2. Create the password file (orapwd) and the Windows service (oradim). Please note that a service is created per instance name, so the SID will differ between the 1st and 2nd node.

3. Create the init.ora parameters as we do for normal single instance and create the DB using CREATE DATABASE statement. Please exercise caution while mentioning

  • MAXINSTANCES
  • MAXLOGFILES
  • MAXLOGMEMBERS
Also make sure to point the control file to the right file name. Just pointing the control_files parameter to a disk group will not help. After DB creation, find the exact file name of the control files from

select * from v$controlfile;


Update the value in init.ora.
At this point you'll have one instance fully functional in open mode.

4. Now make the necessary parameter changes to the init.ora.

  • Add cluster_database=true
  • Add cluster_database_instances=n where n is the number of instances
  • Then add instance_name, instance_number, Undo_tablespace & Thread specific to instances
5. Create respective Undo tablespace & Threads for the second instance.

6. Copy the init.ora & pwd file to node 2 and start the instance. The instance should come up fine w/o issues.

7. Once started, stop the instances and add them to the OCR through the srvctl command so that the instances can be controlled as a clustered resource.

8. Create a central spfile on shared storage and update the same in the OCR so that the CRS agent knows which file to refer to when starting the DB.
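Steps 7 and 8 might look like the following; the database name, instance names, node names and spfile path are placeholders, not values from this setup:

```shell
REM Register the database and its instances in the OCR (names are placeholders)
srvctl add database -d racdb -o %ORACLE_HOME% -p +DATA/racdb/spfileracdb.ora
srvctl add instance -d racdb -i racdb1 -n node1
srvctl add instance -d racdb -i racdb2 -n node2
srvctl start database -d racdb
```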

Hope this helps!!!

Saturday, 6 December 2014

PGA Study

PGA Study:

Whenever we talk about the memory of an Oracle DB, the emphasis is usually on the SGA rather than the PGA. Last month I was asked to look into an issue with an application that was exhausting server memory through high PGA consumption, and I realised I lacked the basic details of the PGA, so I set out on a detailed study of it. In this blog I am hoping to cover the need for the PGA, sizing of the PGA, and the performance impact of improper PGA sizing.

What is PGA?
From documentation, A PGA is a memory region that contains data and control information for a server process. It is non-shared memory created by Oracle Database when a server process is started. Access to the PGA is exclusive to the server process. There is one PGA for each server process. Background processes also allocate their own PGAs. The total memory used by all individual PGAs is known as the total instance PGA memory, and the collection of individual PGAs is referred to as the total instance PGA, or just instance PGA. You use database initialization parameters to set the size of the instance PGA, not individual PGAs.

An analogy for a PGA is a temporary countertop workspace used by a file clerk. The file clerk is the server process doing work on behalf of the customer (client process). The clerk clears a section of the countertop, uses the workspace to store details about the customer request and to sort the folders requested by the customer, and then gives up the space when the work is done.

What are contents of PGA?
1. Session memory (UGA):
* Holds session variables and logon information. In shared server mode this is stored in the SGA.
2. Private SQL area:
* Holds bind variable values, query state information & query execution work areas. Located in the PGA when using dedicated servers and in the SGA when using shared servers.
* A cursor is a name or handle to a specific private SQL area. You can think of a cursor as a pointer on the client side and as a state on the server side. Because cursors are closely associated with private SQL areas, the terms are sometimes used interchangeably.
* The client process is responsible for managing private SQL areas. The allocation and deallocation of private SQL areas depends largely on the application, although the number of private SQL areas that a client process can allocate is limited by the initialization parameter OPEN_CURSORS.
* Sub-divided into:
a. Run-time area: contains query execution state information. For example, the run-time area tracks the number of rows retrieved so far in a full table scan. The run-time area is freed when the SQL statement is closed.
b. Persistent area: contains bind variable values. A bind variable value is supplied to a SQL statement at run time when the statement is executed. The persistent area is freed only when the cursor is closed.
3. SQL Work area:
* A work area is a private allocation of PGA memory used for memory-intensive operations. A few examples: the sort area for a sort operation, the hash area used by a hash join to build its hash table, and the bitmap merge area for a bitmap merge.

Documentation Reference

Enough long text; now we'll jump into a few queries for a better understanding.
From v$sesstat or v$mystat we can view the PGA and UGA allocated to a session.

Here below, we see the PGA stats for my current session.
SQL> select sid||'--'||name||'--'||value from v$mystat natural join v$statname where name like '%memory%' and sid=sys_context('USERENV','SID');

SID||'--'||NAME||'--'||VALUE
--------------------------------------------------------------------------------
130--session uga memory--1302772
130--session uga memory max--1302772
130--session pga memory--1970404
130--session pga memory max--1970404
130--redo k-bytes read (memory)--0
130--redo k-bytes read (memory) by LNS--0
130--workarea memory allocated--1204
130--sorts (memory)--0
Now we'll see the allocation at the process level.
SQL> select s.sid,p.pid from v$session s,v$process p where s.paddr=p.addr and sid=sys_context('USERENV','SID');
       SID        PID
---------- ----------
       130         22

SQL> select pid,pga_used_mem,pga_alloc_mem,pga_freeable_mem,pga_max_mem from v$process where pid=22;
       PID PGA_USED_MEM PGA_ALLOC_MEM PGA_FREEABLE_MEM PGA_MAX_MEM
---------- ------------ ------------- ---------------- -----------
        22       660114       1280022                0     2263062

SQL> select * from v$process_memory where pid=22;
       PID    SERIAL# CATEGORY         ALLOCATED       USED MAX_ALLOCATED
---------- ---------- --------------- ---------- ---------- -------------
        22         13 SQL                  46760      11856       1262732
        22         13 PL/SQL                2024        136          2024
        22         13 Other              1231238                  1231238

SQL> select sum(allocated),sum(max_allocated) from v$process_memory where pid=22;
SUM(ALLOCATED) SUM(MAX_ALLOCATED)
-------------- ------------------
       1280022            2487838
Looking at v$process & v$process_memory, we can equate the sum of the individually allocated memory with PGA_ALLOC_MEM from v$process, though we see a slight difference in the max allocated. The view below gives us an overview at the instance level.
SQL> select * from v$pgastat;
NAME                                                    VALUE UNIT
-------------------------------------------------- ---------- ------------
aggregate PGA target parameter                      306184192 bytes
aggregate PGA auto target                           254923776 bytes
global memory bound                                  61236224 bytes
total PGA inuse                                      22934528 bytes
total PGA allocated                                  29167616 bytes
maximum PGA allocated                                52337664 bytes
total freeable PGA memory                                   0 bytes
process count                                              26
max processes count                                        32
PGA memory freed back to OS                                 0 bytes
total PGA used for auto workareas                           0 bytes
maximum PGA used for auto workareas                   1341440 bytes
total PGA used for manual workareas                         0 bytes
maximum PGA used for manual workareas                       0 bytes
over allocation count                                       0
bytes processed                                      53265408 bytes
extra bytes read/written                                    0 bytes
cache hit percentage                                      100 percent
recompute count (total)                                  1428
Documentation Reference

In the next post, I'll discuss a few parameters that determine the size of the PGA allocated to a DB.

Tuesday, 19 August 2014

NLJ(Nested Loop Join) Batching

We recently upgraded our production databases from 10.2.0.4 to 11.2.0.4. Though the transition was smooth, after a month we started hearing about drops in performance. Most of the issues were reported from ETL loads, and I am planning to discuss a few of the cases here.

While analysing the plan of one of the queries, we could see a difference in plan between 10g & 11g. The difference was an additional nested loop.

Execution Plan
----------------------------------------------------------
Plan hash value: 2756736908

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |  4938 |    38M|  9681   (2)| 00:01:57 |
|   1 |  NESTED LOOPS                |           |       |       |            |          |
|   2 |   NESTED LOOPS               |           |  4938 |    38M|  9681   (2)| 00:01:57 |
|   3 |    TABLE ACCESS FULL         | T2        |   929K|  3585M|  4655   (1)| 00:00:56 |
|*  4 |    INDEX UNIQUE SCAN         | T3_IDX_ID |     1 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T3        |     1 |  4043 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------
When the same query is executed with 10g optimizer the plan is as follows.

SQL> select /*+ optimizer_features_enable('10.2.0.5')*/* From t2, t3 where t2.id=t3.id;

4999 rows selected.

Elapsed: 00:00:01.29

Execution Plan
----------------------------------------------------------
Plan hash value: 4148881243

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |  4938 |    38M|  9681   (2)| 00:01:57 |
|   1 |  NESTED LOOPS                |           |  4938 |    38M|  9681   (2)| 00:01:57 |
|   2 |   TABLE ACCESS FULL          | T2        |   929K|  3585M|  4655   (1)| 00:00:56 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T3        |     1 |  4043 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | T3_IDX_ID |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

What is NLJ Batching?

From the documentation "When an index or a table block is not in the buffer cache and is needed to process the join, a physical I/O is required. In Oracle Database 11g Release 1 (11.1), Oracle Database can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time. As part of the new implementation for nested loop joins, two NESTED LOOPS join row sources might appear in the execution plan where only one would have appeared in prior releases. In such cases, Oracle Database allocates one NESTED LOOPS join row source to join the values from the table on the outer side of the join with the index on the inner side. A second row source is allocated to join the result of the first join, which includes the rowids stored in the index, with the table on the inner side of the join"

Going by the definition in this case:

|   2 |   NESTED LOOPS               |           |
|   3 |    TABLE ACCESS FULL         | T2        |
|*  4 |    INDEX UNIQUE SCAN         | T3_IDX_ID |

One NESTED LOOPS row source [Id 2] joins the values from the table (T2) [Id 3] on the outer side of the join with the index [Id 4] on the inner side.

|   1 |  NESTED LOOPS                |           |
|   2 |   NESTED LOOPS               |           |
|   5 |   TABLE ACCESS BY INDEX ROWID| T3        |

A second row source [Id 1] is allocated to join the result of the first join [Id 2], which includes the rowids stored in the index, with the table (T3) [Id 5] on the inner side of the join.

To find out whether IO batching is really happening, and how effective it has been, the session statistics can be checked.

SQL> select name,value from v$mystat natural join v$statname where name like '%Ba%';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
Batched IO vector read count                                              0
Batched IO vector block count                                             0
Batched IO single block count                                             0
Batched IO zero block count                                               0
Batched IO block miss count                                               0
Batched IO double miss count                                              0
Batched IO (full) vector count                                            0
Batched IO (space) vector count                                           0
Batched IO (bound) vector count                                           0
Batched IO same unit count                                                0
Batched IO buffer defrag count                                            0
Batched IO slow jump count                                                0
The NLJ batching behaviour can be controlled by a hint (no_nlj_batching) or a hidden parameter (_nlj_batching_enabled); it also gets disabled when the optimizer version is set to a pre-11g release.
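A few ways to sketch that control, reusing the tables from the example above; note that setting a hidden parameter should be reserved for test systems:

```sql
-- Disable batching for this join via a hint
SELECT /*+ no_nlj_batching(t3) */ * FROM t2, t3 WHERE t2.id = t3.id;

-- Disable it session-wide via the hidden parameter (test systems only)
ALTER SESSION SET "_nlj_batching_enabled" = 0;

-- Or revert the whole optimizer to a pre-11g feature set
ALTER SESSION SET optimizer_features_enable = '10.2.0.4';
```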

more to come....

Thursday, 31 July 2014

Dataguard - STANDBY_FILE_MANAGEMENT

In this blog, I am planning to discuss the impact of the Data Guard parameter STANDBY_FILE_MANAGEMENT. From the Oracle documentation, the description of this parameter is as follows:
"STANDBY_FILE_MANAGEMENT enables or disables automatic standby file management. When automatic standby file management is enabled, operating system file additions and deletions on the primary database are replicated on the standby database. STANDBY_FILE_MANAGEMENT is only applicable to physical standby databases."

Now let's see the behavior for various values of this parameter.

1. STANDBY_FILE_MANAGEMENT=AUTO

Now am adding a tablespace in primary:
SQL> create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbf' size 30M;

File added in Primary:
Alert log:
Thu Jul 31 05:17:39 2014
Archived Log entry 36 added for thread 1 sequence 26 ID 0xb4d94b9e dest 1:
Thu Jul 31 09:50:00 2014
create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbf' size 30M
Completed: create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbf' size 30M
 
File System:
SQL> !ls /u02/app/oracle/oradata/dbdg/
archivelog     redo01.log  sysaux01.dbf  test1.dbf
control01.ctl  redo02.log  system01.dbf  undotbs01.dbf
example01.dbf  redo03.log  temp01.dbf    users01.dbf

Checking the Standby:

Alert log:

Thu Jul 31 05:17:41 2014
Media Recovery Log /u02/app1/oracle/flash_recovery_area/DBDG_STBY/archivelog/2014_07_31/o1_mf_1_26_9xm10tnm_.arc
Media Recovery Waiting for thread 1 sequence 27 (in transit)
File System
SQL> !ls /u02/app1/oracle/oradata/dbdg
archivelog     stbyredo1.log  stbyredo4.log  temp01.dbf
control01.ctl  stbyredo2.log  sysaux01.dbf   undotbs01.dbf
example01.dbf  stbyredo3.log  system01.dbf   users01.dbf


Nothing has happened yet. Now executing a log file switch on the primary:


SQL> alter system switch logfile;
System altered.
 

Primary Alert log:
Thu Jul 31 09:59:15 2014
Thread 1 advanced to log sequence 28 (LGWR switch)
  Current log# 1 seq# 28 mem# 0: /u02/app/oracle/oradata/dbdg/redo01.log
Thu Jul 31 09:59:15 2014
LNS: Standby redo logfile selected for thread 1 sequence 28 for destination LOG_ARCHIVE_DEST_2
Thu Jul 31 09:59:15 2014
Archived Log entry 38 added for thread 1 sequence 27 ID 0xb4d94b9e dest 1:

 

Standby alertlog:
Thu Jul 31 09:59:15 2014
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
Thu Jul 31 09:59:15 2014
RFS[2]: Selected log 4 for thread 1 sequence 28 dbid -1260808290 branch 853848671
Archived Log entry 17 added for thread 1 sequence 27 ID 0xb4d94b9e dest 1:
Thu Jul 31 09:59:19 2014
Media Recovery Log /u02/app1/oracle/flash_recovery_area/DBDG_STBY/archivelog/2014_07_31/o1_mf_1_27_9xmkjvjr_.arc
Recovery created file /u02/app1/oracle/oradata/dbdg/test1.dbf
Successfully added datafile 6 to media recovery
Datafile #6: '/u02/app1/oracle/oradata/dbdg/test1.dbf'
Media Recovery Waiting for thread 1 sequence 28 (in transit)

 

So the tablespace with its datafile was added on the standby, and it got added only after an alter system switch logfile. The above case is in maximum performance mode; now let's test the same in maximum protection mode.

In Max-Protection Mode:

There is no difference between Max-Protection & Max-Performance modes here. Even in Max-Protection mode the datafile got created only after a log file switch.

With Real-time apply:

Stopping the recovery process and starting the real-time apply.

Standby:
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE;

 

alert log:
MRP0: Background Media Recovery process shutdown (dbdg)
Managed Standby Recovery Canceled (dbdg)
Completed:  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
Thu Jul 31 10:34:15 2014
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE
Media Recovery Start: Managed Standby Recovery (dbdg)
 started logmerger process
Thu Jul 31 10:34:15 2014
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
Media Recovery Waiting for thread 1 sequence 31 (in transit)
Recovery of Online Redo Log: Thread 1 Group 5 Seq 31 Reading mem 0
  Mem# 0: /u02/app1/oracle/oradata/dbdg/stbyredo2.log

 

Now I am dropping & creating a tablespace on the primary:
Primary:
SQL> drop tablespace test1 including contents and datafiles;
Tablespace dropped.

SQL>  create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' size 10M;
Tablespace created.

 

alert log:
Thu Jul 31 10:35:12 2014
drop tablespace test1 including contents and datafiles
Deleted file /u02/app/oracle/oradata/dbdg/test1.dbg
Completed: drop tablespace test1 including contents and datafiles
Thu Jul 31 10:35:27 2014
 create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' size 10M
Completed:  create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' size 10M

 

Standby alert log:
Thu Jul 31 10:35:12 2014
Recovery deleting file #6:'/u02/app1/oracle/oradata/dbdg/test1.dbg' from controlfile.
Deleted file /u02/app1/oracle/oradata/dbdg/test1.dbg
Recovery dropped tablespace 'TEST1'
Thu Jul 31 10:35:27 2014
Recovery created file /u02/app1/oracle/oradata/dbdg/test1.dbg
Successfully added datafile 6 to media recovery
Datafile #6: '/u02/app1/oracle/oradata/dbdg/test1.dbg'


So with real-time apply the changes reach the standby immediately; no switch logfile is necessary. Now let's test the behavior when the parameter is set to MANUAL.

2. STANDBY_FILE_MANAGEMENT=MANUAL

Standby:
SQL> alter system set standby_file_management=manual;
System altered.

Primary:
SQL> create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' size 10M;
Tablespace created.

alert log:
Thu Jul 31 10:42:19 2014
create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' size 10M
Completed: create tablespace test1 datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' size 10M


Standby:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-01274: cannot add datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' - file
could not be created

 

Alert log:
Thu Jul 31 10:42:01 2014
ALTER SYSTEM SET standby_file_management='MANUAL' SCOPE=BOTH;
Thu Jul 31 10:42:20 2014
File #6 added to control file as 'UNNAMED00006' because
the parameter STANDBY_FILE_MANAGEMENT is set to MANUAL
The file should be manually created to continue.
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 1281916
Media Recovery failed with error 1274
Errors in file /u02/app1/oracle/diag/rdbms/dbdg_stby/dbdg/trace/dbdg_pr00_7279.trc:
ORA-00283: recovery session canceled due to errors
ORA-01274: cannot add datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' - file could not be created
Slave exiting with ORA-283 exception
Errors in file /u02/app1/oracle/diag/rdbms/dbdg_stby/dbdg/trace/dbdg_pr00_7279.trc:
ORA-00283: recovery session canceled due to errors
ORA-01274: cannot add datafile '/u02/app/oracle/oradata/dbdg/test1.dbg' - file could not be created
Thu Jul 31 10:42:20 2014
ORA-283 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE...

 

Since the file could not be created, the real-time apply was interrupted and aborted. The same will happen with normal recovery too. If you look at the previous snippet of the alert log, a new datafile was recorded in the control file:

File #6 added to control file as 'UNNAMED00006' because
the parameter STANDBY_FILE_MANAGEMENT is set to MANUAL
The file should be manually created to continue.
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!

That is a dummy file and is not created on the file system. In order to restart the apply process, we have to create the file manually and start the MRP. The following command can be used at the standby to create the file:

ALTER DATABASE CREATE DATAFILE '/u02/app1/oracle/product/db_2/dbs/UNNAMED00006' as '/u02/app1/oracle/oradata/dbdg/test1.dbg';

Once the file is created, recovery can be started again.
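Once the file exists, the remaining steps might look like this on the standby:

```sql
-- Revert to automatic file management so future file additions replicate
ALTER SYSTEM SET standby_file_management = AUTO;

-- Restart the apply process in the background
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
```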
Hope it's useful!!

 

Tuesday, 29 July 2014

11.2 Dataguard Physical Standby

Environment:

Test Machine: OEL 5.5(2 servers) with 6GB RAM

Primary Instance : dbdg
ORACLE_BASE=/u02/app/oracle
ORACLE_HOME=/u02/app/oracle/product/db_2
ORACLE_SID=dbdg

Standby Instance: dbdg_stby
ORACLE_BASE=/u02/app1/oracle
ORACLE_HOME=/u02/app1/oracle/product/db_2
ORACLE_SID=dbdg

This configuration has a different directory structure between primary and standby, so a few extra steps are included which would not be needed in a regular setup.

Configuration (Primary Server):

Enable Archiving:

Update the archive log location to a local destination and enable archiving
SQL> startup mount pfile="/u02/app/oracle/product/db_2/dbs/pfile_dbdg.ora"
ORACLE instance started.
Total System Global Area 2488635392 bytes
Fixed Size                  2215904 bytes
Variable Size            1392508960 bytes
Database Buffers         1073741824 bytes
Redo Buffers               20168704 bytes
Database mounted.
SQL> alter database archivelog;

Database altered.
SQL> alter database open;
Database altered.
SQL> alter database force logging;
Database altered.

SQL> sho parameter name
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string
db_name                              string      dbdg
db_unique_name                       string      dbdg
global_names                         boolean     FALSE
instance_name                        string      dbdg
lock_name_space                      string
log_file_name_convert                string
service_names                        string      dbdg.localdomain

log_archive_dest_1                   string      LOCATION=/u02/app/oracle/orada
                                                 ta/dbdg/archivelog



Parameter Changes:

LOG_ARCHIVE_CONFIG:

"LOG_ARCHIVE_CONFIG enables or disables the sending of redo logs to remote destinations and the receipt of remote redo logs, and specifies the unique database names (DB_UNIQUE_NAME) for each database in the Data Guard configuration"

ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(dbdg,dbdg_stby)';

Archiving to Remote Destination(Log Shipping):

Set the second archival destination to point to the service of the standby so that logs can be shipped.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=dbdg_stby NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dbdg_stby';

ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;

FAL_SERVER:

"FAL_SERVER specifies the FAL (fetch archive log) server for a standby database. The value is an Oracle Net service name, which is assumed to be configured properly on the standby database system to point to the desired FAL server."

alter system set fal_server=dbdg_stby;

STANDBY_FILE_MANAGEMENT:

"STANDBY_FILE_MANAGEMENT enables or disables automatic standby file management. When automatic standby file management is enabled, operating system file additions and deletions on the primary database are replicated on the standby database. STANDBY_FILE_MANAGEMENT is only applicable to physical standby databases."

ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO;

TNS Setup:

Create the TNS names.
dbdg =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.25.164.20)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = dbdg.localdomain)  <<<<<This should reflect the service name in lsnrctl status>>>>
    )
  )
dbdg_stby =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.25.164.21)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = dbdg_stby.localdomain)
    )
  )

To note,
1. The SERVICE_NAME in each entry should match the service name your database registers with the listener (check lsnrctl status on the respective server).
2. The TNS alias dbdg_stby is what the primary references in LOG_ARCHIVE_DEST_2 for remote archival.
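Before relying on these aliases for redo shipping, it is worth verifying name resolution from each server with the Oracle-supplied tnsping utility:

```
$ tnsping dbdg
$ tnsping dbdg_stby
```

Both should report "OK" with the address you configured; a failure here is the same class of problem that surfaces later as ORA-12154 at the archive destination.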

Backup the Primary DB:

Back up the primary database (for the manual duplication method):
$ rman target=/
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
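Before copying anything across, you can confirm what the backup produced (a standard RMAN command; tags and piece names will differ per run):

```
RMAN> LIST BACKUP SUMMARY;
```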

Create Standby controlfile:

ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/dbdg_stby.ctl';

Create standby pfile:

CREATE PFILE='/tmp/initdbdg_stby.ora' FROM SPFILE;

Amend the following params in the standby pfile,

*.db_unique_name='dbdg_stby'
*.fal_server='dbdg'
*.log_archive_dest_2='SERVICE=dbdg ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dbdg'


LOG_ARCHIVE_DEST_2 on the standby is only needed in case of a role reversal; it is not used while the database runs as a standby.

Configuration (Standby Server):

Create the necessary folders on the standby server:

mkdir -p /u02/app1/oracle/flash_recovery_area/DBDG
mkdir -p /u02/app1/oracle/flash_recovery_area/dbdg
mkdir -p /u02/app1/oracle/oradata/dbdg
mkdir -p /u02/app1/oracle/admin/dbdg/adump


Update the parameters in the standby pfile to suit the file locations on the standby server. Here the path differs between primary (/u02/app) and standby (/u02/app1), so parameters such as CONTROL_FILES, DB_RECOVERY_FILE_DEST and DIAGNOSTIC_DEST have to be updated accordingly for the standby.
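As an illustration, the path-related entries in the standby pfile might look like the following (the control file name here is an assumption based on the directories created above, not taken verbatim from this setup):

```
*.control_files='/u02/app1/oracle/oradata/dbdg/control01.ctl'
*.db_recovery_file_dest='/u02/app1/oracle/flash_recovery_area'
*.audit_file_dest='/u02/app1/oracle/admin/dbdg/adump'
```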

Move files from primary to standby:

The following files have to be moved from primary to standby:
1. Standby control file -- to the path referenced by CONTROL_FILES in the standby pfile
2. DB backupsets -- to the FRA location recorded in the standby control file
3. Password file -- default path $ORACLE_HOME/dbs
4. Standby pfile -- default path $ORACLE_HOME/dbs

Start Standby Instance:

1. Create an spfile from the standby pfile copied over from the primary.
2. Start the standby instance in mount state.
$ sqlplus sys as sysdba
SQL> startup mount
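Once mounted, a quick check that the instance really is running as a standby:

```sql
-- On the standby: should report PHYSICAL STANDBY / MOUNTED
SELECT database_role, open_mode FROM v$database;
```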

Restore Database:

Since the location of the files has changed (see the flash recovery area and oradata paths), the backup references in the control file can no longer be used. Hence we catalog the backup pieces with the standby control file.

bash-3.2$ pwd
/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29
bash-3.2$ ls -lrt
total 1211524
-rw-r----- 1 oracle oinstall    9830400 Jul 29 15:42 o1_mf_ncsnf_TAG20140729T145324_9xgt24kn_.bkp
-rw-r----- 1 oracle oinstall 1199472640 Jul 29 15:42 o1_mf_nnndf_TAG20140729T145324_9xgt0dn1_.bkp
-rw-r----- 1 oracle oinstall     509952 Jul 29 15:42 o1_mf_annnn_TAG20140729T145421_9xgt25w9_.bkp
-rw-r----- 1 oracle oinstall   29550080 Jul 29 15:42 o1_mf_annnn_TAG20140729T145322_9xgt0bso_.bkp
bash-3.2$ rman target /

Recovery Manager: Release 11.2.0.1.0 - Production on Tue Jul 29 17:46:33 2014
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: DBDG (DBID=3034159006, not open)
RMAN> catalog backuppiece '/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_ncsnf_TAG20140729T145324_9xgt24kn_.bkp';
using target database control file instead of recovery catalog
cataloged backup piece
backup piece handle=/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_ncsnf_TAG20140729T145324_9xgt24kn_.bkp RECID=5 STAMP=854214423

RMAN> catalog backuppiece '/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_nnndf_TAG20140729T145324_9xgt0dn1_.bkp';
cataloged backup piece
backup piece handle=/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_nnndf_TAG20140729T145324_9xgt0dn1_.bkp RECID=6 STAMP=854214440

RMAN> catalog backuppiece '/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_annnn_TAG20140729T145421_9xgt25w9_.bkp';
cataloged backup piece
backup piece handle=/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_annnn_TAG20140729T145421_9xgt25w9_.bkp RECID=7 STAMP=854214460

RMAN>  catalog backuppiece '/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_annnn_TAG20140729T145322_9xgt0bso_.bkp';
cataloged backup piece
backup piece handle=/u02/app1/oracle/flash_recovery_area/DBDG/backupset/2014_07_29/o1_mf_annnn_TAG20140729T145322_9xgt0bso_.bkp RECID=8 STAMP=854214481


After cataloging the backups, they have to be restored to the new location.

The control file has path references to /u02/app, but the files have to be restored to /u02/app1, hence SET NEWNAME is used to restore the files to the new paths. The SWITCH command then updates the new locations in the standby control file.

run
{
set newname FOR DATAFILE '/u02/app/oracle/oradata/dbdg/system01.dbf' TO '/u02/app1/oracle/oradata/dbdg/system01.dbf';
#restore tablespace system;
set newname FOR DATAFILE '/u02/app/oracle/oradata/dbdg/sysaux01.dbf'TO '/u02/app1/oracle/oradata/dbdg/sysaux01.dbf';
#restore tablespace sysaux;
set newname FOR DATAFILE '/u02/app/oracle/oradata/dbdg/undotbs01.dbf' TO '/u02/app1/oracle/oradata/dbdg/undotbs01.dbf';
#restore tablespace UNDOTBS1;
set newname FOR DATAFILE '/u02/app/oracle/oradata/dbdg/users01.dbf' TO '/u02/app1/oracle/oradata/dbdg/users01.dbf';
#restore tablespace users;
set newname FOR DATAFILE '/u02/app/oracle/oradata/dbdg/example01.dbf' TO '/u02/app1/oracle/oradata/dbdg/example01.dbf';
#restore tablespace example;
restore database;
switch datafile all;
}
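After the restore and SWITCH, you can confirm the standby control file now points at the /u02/app1 locations:

```sql
-- On the standby: datafile names should all show the /u02/app1 paths
SELECT file#, name FROM v$datafile;
```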


Create standby redo logs:

ALTER DATABASE ADD STANDBY LOGFILE ('/u02/app1/oracle/oradata/dbdg/stbyredo1.log') SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE ('/u02/app1/oracle/oradata/dbdg/stbyredo2.log') SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE ('/u02/app1/oracle/oradata/dbdg/stbyredo3.log') SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE ('/u02/app1/oracle/oradata/dbdg/stbyredo4.log') SIZE 50M;


Please note there are four standby redo logs created. You should have n+1 standby redo logs, where n is the number of online redo log groups on the primary.
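The standby redo logs just created can be verified from the standby:

```sql
-- List standby redo logs with their size and status
SELECT group#, bytes/1024/1024 AS size_mb, status FROM v$standby_log;
```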

Online redo logs:

Create online redo logs (optional, needed for a role reversal). This cannot be done while STANDBY_FILE_MANAGEMENT is set to AUTO, so temporarily change it to MANUAL, create the online log files, then set it back to AUTO.

ALTER DATABASE ADD LOGFILE ('/u02/app1/oracle/oradata/dbdg/onlineredo1.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE ('/u02/app1/oracle/oradata/dbdg/onlineredo2.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE ('/u02/app1/oracle/oradata/dbdg/onlineredo3.log') SIZE 50M;

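Putting the note above together, the full sequence on the standby would be along these lines:

```sql
-- Online redo logs cannot be added while standby file management is AUTO
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=MANUAL;

ALTER DATABASE ADD LOGFILE ('/u02/app1/oracle/oradata/dbdg/onlineredo1.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE ('/u02/app1/oracle/oradata/dbdg/onlineredo2.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE ('/u02/app1/oracle/oradata/dbdg/onlineredo3.log') SIZE 50M;

-- Restore automatic standby file management afterwards
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO;
```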

Start recovery:

alter database recover managed standby database disconnect from session;

Now the MRP (Managed Recovery Process) starts on the standby and applies the shipped redo.
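To confirm the MRP is up and see where it is in the redo stream, query v$managed_standby on the standby:

```sql
-- MRP0 should show APPLYING_LOG (or WAIT_FOR_LOG) with the current sequence
SELECT process, status, sequence# FROM v$managed_standby WHERE process LIKE 'MRP%';
```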

Verification:

One common issue with a Data Guard setup is that log shipping fails due to incorrect service configuration. We can check this with the following query.
select dest_id, status, error from v$archive_dest;

In this setup I had an issue with log shipping, and the above query clearly shows it:

SQL> select dest_id, status, error from v$archive_dest;

   DEST_ID STATUS    ERROR
---------- --------- -----------------------------------------------------------------
         1 VALID
         2 ERROR     ORA-12154: TNS:could not resolve the connect identifier specified


So the error on the primary makes it clear that the second archive log destination is invalid because of a TNS issue. Further checking showed the problem was in the TNS names: the SERVICE_NAME was incorrect as it was missing the domain. After correcting it, MRP started applying redo and destination 2 became VALID.

You can further check whether logs are shipped and applied by running the following from both the primary and the standby:

SQL> archive log list
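For a numeric comparison instead of archive log list, the highest received and applied sequences can be checked on the standby:

```sql
-- On the standby: latest archived vs latest applied sequence
SELECT max(sequence#) AS last_archived FROM v$archived_log;
SELECT max(sequence#) AS last_applied  FROM v$archived_log WHERE applied = 'YES';
```

If last_applied keeps pace with the primary's current sequence, the standby is in sync.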