Tuesday, May 24, 2011

Another RMAN Confusion

There is a general belief that as long as we back up the database using RMAN, whether to tape or disk, we should be able to use the RMAN restore command to restore the backupset to whichever server we want, and use that backupset to recover the database on a different server.

This is completely false. The RMAN restore command is not the same as an OS-level 'restore a file' command. RMAN restore reads the datafiles out of the backupset and lays them down under the directories specified by db_file_name_convert and log_file_name_convert.

In other words, there is no way to simply copy an RMAN backupset from tape to disk and then use it to build the database.

When RMAN creates a backup directly on tape, its header is different from that of a backupset created by RMAN on disk. Thus, if that tape backup is manually copied to disk, RMAN will not recognize the backup piece header and will report it as corrupted.

So the real solution is to configure the Media Management software on the alternate server and restore the database there using the RMAN "duplicate" command.
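With the media manager client configured on the alternate host, the duplicate can then read the backup pieces straight from tape. As a rough sketch only (the connect strings, the auxiliary database name and the SBT PARMS string are placeholders that depend entirely on your media management software; the PARMS shown here follow the NetBackup style):

$ rman target sys@prod auxiliary sys@dupdb catalog rman@rcat

run {
  allocate auxiliary channel t1 device type sbt
    parms 'ENV=(NB_ORA_SERV=media_server, NB_ORA_CLIENT=source_host)';
  duplicate target database to dupdb;
}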

Thursday, November 04, 2010

How to create Datafiles in a Data Guard (10g) environment

I recently came across a Data Guard situation where a datafile was added to the primary database but failed to be created on the standby. The database version was 10.2.0.3 and standby_file_management was set to AUTO.

standby_file_management is a parameter that, when set to AUTO on the standby site, automatically creates a datafile on the standby for every datafile created on the primary.
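A quick way to confirm (and, if needed, adjust) the setting on the standby, shown here only as a sketch:

SQL> show parameter standby_file_management
SQL> alter system set standby_file_management=AUTO scope=both;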

Within a few minutes I started getting alerts on the standby for the following error:

WARNING: File being created with same name as in Primary Existing file may be overwritten
File #309 added to control file as 'UNNAMED00309'.
Originally created as: '/disk1/oradata/primarydb01/sys06.dbf'
Recovery was unable to create the file as: '/disk1/oradata/primary/sys06.dbf'
Errors with log /disk1/archive/stbydb/sf2_1_215423_528599320.arc
MRP0: Background Media Recovery terminated with error 1119
Tue Feb 5 07:42:09 2008
Errors in file /disk1/oracle/admin/stbydb2/bdump/stbydb2_mrp0_12552.trc:
ORA-01119: error in creating database file '/disk1/oradata/sf12/sys06.dbf'
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
SVR4 Error: 13: Permission denied
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Tue Feb 5 07:42:13 2008
Errors in file /disk1/oracle/admin/stbydb2/bdump/stbydb2_mrp0_12552.trc:
ORA-01119: error in creating database file '/ora-disk/oradata/sf12/sys06.dbf'
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
SVR4 Error: 13: Permission denied
Tue Feb 5 07:42:13 2008

From the message it was clear that the datafile creation had failed on the standby, even though standby_file_management was set to AUTO. It failed because the directory the file was being created in, "/disk1/oradata/primary/", does not exist on the physical standby. What we want is for "/disk1/oradata/primary/" to be translated to "/disk1/oradata/standby/" on the standby, which is the job of the "DB_FILE_NAME_CONVERT" parameter set on the standby. However, if this parameter holds several pairs of values,

SQL> show parameter db_file_name_convert

may display it as empty. At that point, a reliable way to check the value is to look at the parameter file.
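I went to the parameter file, but another quick check that can save a step is to query v$spparameter, which returns one row per value of a multi-valued parameter (a sketch; adjust the column formatting to taste):

SQL> select ordinal, value from v$spparameter where name = 'db_file_name_convert';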

So log in to the standby database and issue:

Step 1: SQL> create pfile from spfile;

Now go to the $ORACLE_HOME/dbs directory to find that initialization parameter file.

In our case, we had clearly missed the conversion entry for the directory the datafile was being created in on the standby.

Edit the pfile to add that mapping:

db_file_name_convert = '/disk1/oradata/primary','/disk1/oradata/standby'

Step 2: Now log in to the standby and cancel the recovery.

SQL> alter database recover managed standby database cancel;

Database altered

Step 3: Shut down and start up the standby database using the new parameter file.

SQL> shutdown immediate;
ORA-01507: database not mounted

ORACLE instance shut down.

SQL> startup mount pfile=$ORACLE_HOME/dbs/initstandby.ora
ORACLE instance started.

SQL> create spfile from pfile;

Step 4: Start the managed recovery on the standby.

SQL> alter database recover managed standby database disconnect from session;

Database altered.

The above command tells the standby to start applying archivelogs from where it left off. However, I got the following errors in the alert log this time:

MRP0: Background Managed Standby Recovery process started (stbydb2)
Managed Standby Recovery not using Real Time Apply
MRP0: Background Media Recovery terminated with error 1111
Wed Feb 6 13:50:05 2008
Errors in file /disk1/oracle/admin/stbydb2/bdump/stbydb2_mrp0_22855.trc:
ORA-01111: name for data file 309 is unknown - rename to correct file
ORA-01110: data file 309: '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309'
ORA-01157: cannot identify/lock data file 309 - see DBWR trace file
ORA-01111: name for data file 309 is unknown - rename to correct file
ORA-01110: data file 309: '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309'
Wed Feb 6 13:50:05 2008
Errors in file /disk1/oracle/admin/stbydb2/bdump/stbydb2_mrp0_22855.trc:
ORA-01111: name for data file 309 is unknown - rename to correct file
ORA-01110: data file 309: '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309'
ORA-01157: cannot identify/lock data file 309 - see DBWR trace file
ORA-01111: name for data file 309 is unknown - rename to correct file
ORA-01110: data file 309: '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309'
Wed Feb 6 13:50:05 2008
MRP0: Background Media Recovery process shutdown (stbydb2)
Wed Feb 6 13:50:05 2008
Completed: alter database recover managed standby database disconnect from session
Wed Feb 6 13:52:09 2008
Using STANDBY_ARCHIVE_DEST parameter default value as /disk2/archive/stbydb2/
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[1]: Assigned to RFS process 23024

This is because whenever datafile creation fails on the standby, Oracle adds a placeholder entry to the control file named UNNAMEDnnnnn that appears to live under $ORACLE_HOME/dbs, even though no such file exists on disk. When managed recovery is resumed, it looks for that UNNAMED file, and the standby fails again with the message:

ORA-01111: name for data file 309 is unknown - rename to correct file

Now you need to rename that placeholder to the datafile you actually want to create so that the physical standby can continue applying the remaining logs.
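Before renaming anything, a quick query on the standby confirms which file numbers are affected (the file number will of course differ in your environment):

SQL> select file#, name from v$datafile where name like '%UNNAMED%';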

Step 5: So now when you try to issue the "rename" command, you will get this error:

SQL> alter database rename file '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309' to
'/disk1/oradata/stbydb2/sys06.dbf'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01275: Operation RENAME is not allowed if standby file management is automatic.

This is because RENAME (and ADD/DROP/CREATE) operations are not allowed on the standby when standby_file_management is set to AUTO. You need to change the parameter to MANUAL for the above command to work.

Step 6: SQL> alter system set STANDBY_FILE_MANAGEMENT=MANUAL scope=memory;

System altered.

Step 7: SQL> alter database rename file
'/disk1/oracle/product/10.2.0/dbs/UNNAMED00309' to '/disk1/oradata/stbydb2/sys06.dbf';
alter database rename file '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309' to '/disk1/oradata/stbydb2/sys06.dbf'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01141: error renaming data file 309 - new file
'/disk1/oradata/stbydb2/sys06.dbf' not found
ORA-01111: name for data file 309 is unknown - rename to correct file
ORA-01110: data file 309:
'/disk1/oracle/product/10.2.0/dbs/UNNAMED00309'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3



Oh... what now? Well, we cannot rename to a file that doesn't already exist, so we actually need to create the datafile.

Step 8: SQL> alter database create datafile
'/disk1/oracle/product/10.2.0/dbs/UNNAMED00309' as '/disk1/oradata/stbydb2/sys06.dbf';

Database altered.

Yes, finally! Ok, datafile created. Let's put the automatic recovery back on the standby.

Step 9: SQL> alter database recover standby database;
alter database recover standby database
*
ERROR at line 1:
ORA-00279: change 9999250009 generated at 02/05/2008 07:28:43 needed for thread 1
ORA-00289: suggestion : /disk2/archive/stbydb2/sf2_1_215423_528599320.arc
ORA-00280: change 9999250009 for thread 1 is in sequence #215423



Now Oracle wants me to manually specify the archive log files to be applied, since this form of the recover command performs manual rather than managed recovery. I also still had standby_file_management set to MANUAL from earlier and needed to put it back to AUTO.

Step 10: SQL> alter system set STANDBY_FILE_MANAGEMENT=AUTO scope=memory;

System altered.

Step 11: Now issue the command
SQL> recover automatic standby database;

Media recovery complete.

Check the alert log

tail -f alert*

RFS[1]: Assigned to RFS process 27167
RFS[1]: Identified database type as 'physical standby'
Wed Feb 6 14:48:32 2008
RFS LogMiner: Client disabled from further notification
Wed Feb 6 14:48:45 2008
Media Recovery Log /disk2/archive/stbydb2/sf2_1_215425_528599320.arc
Wed Feb 6 14:50:04 2008
RFS[1]: No standby redo logfiles created
Wed Feb 6 14:51:32 2008
RFS[1]: Archived Log: '/disk2/archive/stbydb2/sf2_1_215542_528599320.arc'
Wed Feb 6 14:56:42 2008
Recovery created file /disk1/oradata/stbydb2/users121.dbf
Successfully added datafile 310 to media recovery
Datafile #310: '/disk1/oradata/stbydb2/users121.dbf'



Yes, it successfully created the datafile and the recovery resumed. All 105 missing archive log files were applied against the 2.5TB OLTP database.

Best practices and lessons learned: whenever you add a datafile to the primary, log in to the standby and make sure it was created successfully. The easiest way to do that is to run "tail -f alert*" in the background_dump_dest directory. If for some reason the datafile creation has failed, cancel the recovery on the standby and do only steps 6, 8, 10 and 11 from this document, as summarized below.
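For quick reference, here is a condensed sketch of that fix, reusing the file names from this post as placeholders:

SQL> alter database recover managed standby database cancel;
SQL> alter system set standby_file_management=MANUAL scope=memory;
SQL> alter database create datafile '/disk1/oracle/product/10.2.0/dbs/UNNAMED00309' as '/disk1/oradata/stbydb2/sys06.dbf';
SQL> alter system set standby_file_management=AUTO scope=memory;
SQL> recover automatic standby database;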

Tuesday, November 02, 2010

Oracle 10g Compression vs. 11g's Advanced Compression

This document is meant to show the difference between the 10gR2 table compression feature and 11gR2’s Advanced Compression. Oracle provided a table-level compression feature in 10gR2. While it offered some storage reduction, 10g’s table compression only compressed data during bulk load operations; new and updated data were not compressed. With 11g’s Advanced Compression, newly inserted and updated data are also compressed, achieving the highest level of storage reduction while also improving performance, since compressed blocks mean more data is moved per I/O.

[Please understand that these test cases are only meant as examples to quickly analyze and understand the benefits of Advanced Compression in 11g. Depending on duplicate data in your environments, these results may vary]


Here is the example.

Following test case was executed in 10.2.0.1 database server.

A table called TEST was created without the COMPRESS option.
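For reference, a test table like this can be put together quickly from the data dictionary. The exact statement used for TEST is not shown in this post, so the following is only a sketch of the idea; the multiplier in the inline view simply controls the final size:

create table TEST as
select o.*
from dba_objects o,
     (select level n from dual connect by level <= 20);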

SQL> select table_name,compression from dba_tables where table_name = 'TEST';

TABLE_NAME COMPRESS
------------------------- -------------
TEST DISABLED


SQL> select sum(bytes) from dba_segments where segment_name = 'TEST';

SUM(BYTES)
------------------
92274688


The size of the table was around 92MB.

Now create another table called TEST_COMPRESSED with the COMPRESS option.

SQL> create table TEST_COMPRESSED COMPRESS as select * from test;

Table created.

SQL> select table_name, compression from dba_tables where table_name like 'TEST
%';

TABLE_NAME COMPRESS
------------------------------ ---------------
TEST_COMPRESSED ENABLED
TEST DISABLED


Now let’s check the size of the COMPRESSED table.


SQL> select sum(bytes) from dba_segments where segment_name = 'TEST_COMPRESSED';

SUM(BYTES)
----------------
30408704

Check out the size of the COMPRESSED table. It is only about 30MB, roughly a third of the original size (around a 67% reduction). So far so good.

Now let’s do a plain insert into the COMPRESSED table.

SQL> insert into TEST_COMPRESSED select * from TEST;

805040 rows created.

SQL> commit;

Commit complete.

Let’s check the size of the COMPRESSED table.

SQL> select sum(bytes) from dba_segments where segment_name = 'TEST_COMPRESSED'

2 /

SUM(BYTES)
----------
117440512


Wow! From 30MB to 117MB? So a plain INSERT statement does not COMPRESS the data in 10g.

(You will see this is not the case with 11g)

Now let’s do the same insert with a BULK LOAD

SQL> insert /*+ APPEND */ into TEST_COMPRESSED select * from TEST;

805040 rows created.

SQL> commit;

Commit complete.

SQL> select sum(bytes) from dba_segments where segment_name = 'TEST_COMPRESSED';


SUM(BYTES)
----------
142606336

Ok, now the size of the COMPRESSED table is 142MB, up from 117MB. For the same number of rows, the table grew by only 25MB this time. So a BULK LOAD (direct-path insert) does compress the data.

Let’s check other DML statements such as DELETE and UPDATE against the COMPRESSED table.


SQL> delete from test_compressed where rownum < 100000;

99999 rows deleted.

SQL> commit;

Commit complete.

SQL> select sum(bytes) from dba_segments where segment_name = 'TEST_COMPRESSED';

SUM(BYTES)
----------
142606336

No change in the total size of the table. DELETE has no impact, as expected.


Let’s check UPDATE.

SQL> update test_compressed set object_name = 'XXXXXXXXXXXXXXXXXXXXXXXXXX' where
rownum < 100000;

99999 rows updated.

SQL> commit;


SQL> select sum(bytes) from dba_segments where segment_name = 'TEST_COMPRESSED';

SUM(BYTES)
----------
150994944

The table size increased by 8MB? No compression for the UPDATE statement either.


All this clearly shows that 10g’s table COMPRESSION works great for initial BULK LOADS; however, subsequent UPDATEs, DELETEs and conventional INSERTs will not result in COMPRESSED blocks.
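In 10g, about the only way to compress that conventionally inserted data after the fact is to rebuild the segment, for example with a MOVE (a sketch; keep in mind that a MOVE marks the table's indexes UNUSABLE, so they have to be rebuilt afterwards):

SQL> alter table TEST_COMPRESSED move compress;
SQL> select sum(bytes) from dba_segments where segment_name = 'TEST_COMPRESSED';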







Now, let’s see 11g’s Test Results.

The following SQL statements were executed against an 11.2.0.1 database.


A TEST table of around 100MB was created, as before.



SQL> select bytes from dba_segments where segment_name = 'TEST';

BYTES
----------
100663296

So a 100MB table was created.

Let’s create a table with the COMPRESS FOR ALL OPERATIONS option. This is only available in 11g.


SQL> create table test_compressed compress for all operations as select * from
test;

Table created.

SQL> select bytes from dba_segments where segment_name = 'TEST_COMPRESSED';

BYTES
----------
31457280


Check out the size of the compressed table vs. the uncompressed table: the compressed table is only about 30MB, roughly 70% less space. Not a big difference compared to 10g at this stage.




Let’s check other DML statements.

Let’s do a plain insert into the compressed table.

SQL> insert into TEST_COMPRESSED select * from test;

789757 rows created.

SQL> commit;

Commit complete.

SQL> select bytes from dba_segments where segment_name = 'TEST_COMPRESSED';

BYTES
----------
75497472



11g’s Advanced Compression compressed the additional 100MB of data down to roughly 44MB as it was inserted into the compressed table, WITHOUT the bulk load option.

Now let’s do the BULK LOAD onto 11g’s COMPRESSED table.


SQL> insert /*+ APPEND */ into test_compressed select * from TEST;

789757 rows created.

SQL> commit;

Commit complete.

SQL> select bytes from dba_segments where segment_name = 'TEST_COMPRESSED';

BYTES
----------
109051904
It has roughly the same impact as the plain insert.

What about deletes and updates?


SQL> delete from test_compressed where rownum < 100000;

99999 rows deleted.

SQL> commit;

Commit complete.

SQL> select bytes from dba_segments where segment_name = 'TEST_COMPRESSED';

BYTES
----------
109051904


No change for deletes. This is expected, as blocks are compressed only when newly added rows fill a block up to its internal threshold (related to PCTFREE).


SQL> update test_compressed set object_name = 'XXXXXXXXXXXXXXXXXXXXXXXXXX' where

2 rownum < 100000;

99999 rows updated.

SQL> commit;

Commit complete.

SQL> select bytes from dba_segments where segment_name = 'TEST_COMPRESSED';

BYTES
----------
109051904


There is no change in this case, as the existing blocks were able to accommodate the updated rows. The same update, however, added about 8MB in 10g.


Summary:

The above tests show that 11g’s Advanced Compression compresses table data for all INSERTs and UPDATEs, resulting in significant storage reduction, whereas 10g’s compression only compresses data during bulk load operations; data added or updated afterwards is not compressed.
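One more check worth knowing about in 11g: dba_tables has a COMPRESS_FOR column, so you can confirm which flavor of compression a table uses (a quick sketch; the values reported differ slightly between 11g releases):

SQL> select table_name, compression, compress_for from dba_tables where table_name like 'TEST%';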

Friday, June 18, 2010

I/O Errors using NFS on Netapps storage on Oracle 10.2.0.4

I recently faced the following issue in one of my client environments.

After successfully creating a 10g database on an NFS file system on Linux, I turned it over to the client for production implementation. Developers started populating the data using a schema-level import. When a large table was imported, the import failed with the following errors:

ORA-01115: IO error reading block from file 6 (block # 5631)
ORA-01110: data file 6: '/u0201/oradata/reportdata.dbf'
ORA-27091: unable to queue I/O
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 5630

After several hours of searching Oracle Support, I found a document that talked about setting filesystemio_options to 'setall' on NFS file systems to address redo log corruption. We did not have any corruption, but since we were on NFS I wanted to try it anyway.

Alter system set filesystemio_options=setall scope=spfile;

After setting this, developers did the imports successfully without any problem.
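One thing to keep in mind: filesystemio_options is a static parameter, so the spfile change only takes effect after the instance is bounced. A quick way to verify it afterwards:

SQL> show parameter filesystemio_options

It is also worth double-checking that the NFS mounts themselves use the mount options Oracle expects for datafiles on your platform, since incorrect mount options are a common cause of these I/O errors.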


Parameter: FILESYSTEMIO_OPTIONS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"directIO" - This allows direct I/O to be used where supported by the OS. Direct I/O bypasses the Unix buffer cache. As of 10.2, most platforms will try to use the "directio" option for NFS-mounted disks (and will also check that the NFS mount attributes are sensible).
"setall" - Enables both ASYNC and DIRECT I/O. The 'setall' option chooses async or direct I/O based on the platform and file system used.
Direct I/O should always be enabled for NFS datafiles and redo logs so that no writes are ever cached in the OS buffer cache; the consistency of writes cannot be guaranteed when relying on stateless NFS to perform asynchronous writes over the network.

Saturday, June 05, 2010

Size of RMAN incremental backups

We had a situation recently where the backup destination on a Linux server was 97% full. Our system admin wanted to know why: since the RMAN retention had not changed, we needed to find out whether the database had actually grown enough to make that difference. We were taking a Level 0 backup every Sunday, followed by Level 1 and Level 2 backups during the week, and a Level 0 again the next Sunday.

We were keeping 45 days' worth of backups on disk. Now, we needed to find the size of the backupsets for every week and work out which week had shown enormous growth. The messy way of doing this is at the UNIX level: issue ls -ltr and total up the backup pieces belonging to each week. The database was 500GB and each backupset had a different size, so that would have been a cumbersome effort.

How did we do this elegantly with RMAN? There is a recovery catalog view called rc_backup_datafile that you can use for this purpose.

First Week

select
sum(blocks * block_size)/(1024*1024*1024)
from rc_backup_datafile
where completion_time between ('04-APR-2010') and ('10-APR-2010')
order by 1

SQL> /

SUM(BLOCKS*BLOCK_SIZE)/(1024*1024*1024)

---------------------------------------
255.587982

Second Week

select
sum(blocks * block_size)/(1024*1024*1024)
from rc_backup_datafile
where completion_time between ('11-APR-2010') and ('17-APR-2010')
order by 1
SQL> /



SUM(BLOCKS*BLOCK_SIZE)/(1024*1024*1024)
---------------------------------------
255.705849


Third Week

select
sum(blocks * block_size)/(1024*1024*1024)
from rc_backup_datafile
where completion_time between ('26-APR-2010') and ('03-MAY-2010')
order by 1

SQL> /


SUM(BLOCKS*BLOCK_SIZE)/(1024*1024*1024)
---------------------------------------
283.666035


Fourth Week

select
sum(blocks * block_size)/(1024*1024*1024)
from rc_backup_datafile
where completion_time between ('04-MAY-2010') and ('11-MAY-2010')
order by 1

SQL> /


SUM(BLOCKS*BLOCK_SIZE)/(1024*1024*1024)
---------------------------------------
279.676537

Ok, between 26th Apr and 3rd May, the backups had grown by almost 30GB. It was then admitted to me that there had indeed been a major data upload that week, which would explain the sudden increase in backup directory usage.

Ok, demand for more disk was justified!
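For future checks, the same comparison can be done in one pass by grouping on the ISO week of the backup completion time; a sketch (adjust the date window to your retention period):

select trunc(completion_time, 'IW') week_starting,
       round(sum(blocks * block_size)/(1024*1024*1024), 2) backup_gb
from rc_backup_datafile
where completion_time >= to_date('04-APR-2010', 'DD-MON-YYYY')
group by trunc(completion_time, 'IW')
order by 1;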

Dbca hangs while creating the database

I came across the following weird issue with dbca that I wanted to share.
I had to create a database on a Linux x86_64 server. When I ran ./dbca, I went through the screens, answered all the questions and then clicked the final step that creates the database.

dbca just hung and did not proceed. There were no logs, no errors, just nothing! I was puzzled and didn't know where to start digging. It finally turned out to be a problem with my Xming utility: I was missing several font files that Oracle needs. After the sys admin reinstalled the X utilities, dbca ran to completion.

So if you are stuck with an unexplainable issue with dbca, check your X Windows utilities and make sure you have the complete required set of fonts installed.

Saturday, May 22, 2010

Flashback Query Usage Example

Flashback Query

In an earlier blog I explained the use of the Flashback Database feature. In this blog I want to give an example of using Flashback Query.

I received a call one Friday afternoon from one of my client DBAs, just as I was getting ready to leave the office. He was extremely nervous on the phone and could not speak clearly. I said "oh no.. don't worry" and then asked him "What happened?".

He had just realized that he had deleted a bunch of rows from a critical table in production. He wanted to know how long it would take to restore the database. I asked him if he knew when he performed this operation; he said he didn't know and wasn't in the right state of mind to guess either. I asked him if the previous RMAN backups were good, and he said yes, but he had deleted a bunch of archivelog files that morning. So a point-in-time recovery to just before the delete operation was out of the question.

I asked him if the database had undo_retention set. He said yes, with 24-hour retention. I asked him for the name of the table and told him not to call or IM me for the next 10 minutes.

Here are the steps I followed. I simply guessed a window of about three hours and started querying the table as of points in time within the last three hours, first to see how many rows were there around 3:00 P.M. on Feb 1st.

SELECT count(*) FROM schema_name.card_accounts AS OF TIMESTAMP TO_TIMESTAMP('01-FEB-10 03:00:00 PM','DD-MON-YY HH:MI:SS PM')
/
20004

Ok, there were 20004 records at 3:00 P.M. The table only had 2 records now, at 4:30 P.M. (I have used a dummy schema_name so you won't guess my client.)

Clearly the rows were deleted between 3:00 P.M. and 4:30 P.M. But that is a wide window.

So let's query again:

SELECT count(*) FROM schema_name.card_accounts AS OF TIMESTAMP TO_TIMESTAMP('01-FEB-10 03:30:00 PM','DD-MON-YY HH:MI:SS PM')
/
20004

SELECT count(*) FROM schema_name.card_accounts AS OF TIMESTAMP TO_TIMESTAMP('01-FEB-10 04:00:00 PM','DD-MON-YY HH:MI:SS PM')
/
20004

SELECT count(*) FROM schema_name.card_accounts AS OF TIMESTAMP TO_TIMESTAMP('01-FEB-10 04:30:00 PM','DD-MON-YY HH:MI:SS PM')
/
2

Alright, so the rows were deleted between 4:00 P.M. and 4:30 P.M. Then my phone rang again. I picked up and asked him, "Hey Mike, does 20004 sound good?". "Yeah, that is what I had before," Mike replied.

I inserted those rows into a temporary table.

insert into card_accounts_tmp2 select * from schema_name.card_accounts AS OF TIMESTAMP TO_TIMESTAMP('01-FEB-10 03:00:00 PM','DD-MON-YY HH:MI:SS PM')
/
20004 rows created.

Ok, go ahead and check that temporary table to see if you have everything you need! He said "No". Oh yeah, I forgot to commit.


I issued commit.

SQL> commit;

Commit complete

Ok, "Now?" I asked. He typed away and said "Wow! Yes, they are all there! You are my hero!", slowly raising his voice.

Now I queried the original table again and found 8 rows. Oh yeah, the users were back in and inserting rows into that table.

Mike said that was an easy fix; he could just merge the two tables and insert the missing rows back, as sketched below.
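What that merge-back might look like, as a sketch only; the post never shows the table's columns, so the key column used here (acct_id) is purely hypothetical:

insert into schema_name.card_accounts
select *
from card_accounts_tmp2 t
where not exists (select 1
                  from schema_name.card_accounts c
                  where c.acct_id = t.acct_id); -- acct_id is a hypothetical key column
commit;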

Ok, Problem Solved and Over! So was my Friday evening! :-(