Alter Table Shrink Taking too Long

When you execute “alter table <table_name> shrink space”, there is no direct way to tell how long it will take. A helpful way to estimate when the shrink will finish is the dbms_space.space_usage procedure, which shows block activity between the LHWM and the HHWM. I wrote a pipelined function to display the output of this procedure; you can find it in the pipelined function post below. You will appreciate this little function compared with a long series of dbms_output.put_line calls.

As of 10g, Oracle introduced LHWM and HHWM.

HHWM (High High Water Mark) is the same as the HWM in prior versions: all blocks above this mark are unformatted.

LHWM (Low High Water Mark): all blocks below this mark are formatted.

Between the LHWM and the HHWM there are 5 categories of blocks:

  1. Unformatted
  2. 0% to 25% empty
  3. 25% to 50% empty
  4. 50% to 75% empty
  5. 75% to 100% empty

You can use dbms_space.space_usage to report on the area between the LHWM and the HHWM.

What to look for?

If you are shrinking a table, you will see the following behavior:

Block category        Behavior during the shrink
Unformatted blocks    Number of blocks will not change
0% to 25% empty       Number of blocks will decrease until it reaches zero
25% to 50% empty      Number of blocks will decrease until it reaches zero
50% to 75% empty      Number of blocks will decrease until it reaches zero
75% to 100% empty     Number of blocks will increase
Full blocks           Number of blocks below the LHWM (the actual table size); it increases as the other numbers decrease

Index shrinking behaves differently. Full blocks decrease first while the other block categories increase, then full blocks increase again.
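A minimal sketch of how this monitoring could look in practice. The owner and table name (APP_OWNER, ORDERS) are placeholders and not from a real system, and space_usage is the pipelined function described in the next post. The idea is to run the shrink in one session and poll from another.

```sql
-- Session 1: start the shrink. Row movement must be enabled on the table first.
-- APP_OWNER.ORDERS is a placeholder name used only for illustration.
alter table app_owner.orders enable row movement;
alter table app_owner.orders shrink space;

-- Session 2: poll repeatedly while the shrink runs.
-- The partially empty categories should fall towards zero
-- while blocks_full grows.
select blocks_unformatted,
       blocks_0_25,
       blocks_25_50,
       blocks_50_75,
       blocks_75_100,
       blocks_full
from   table(space_usage('APP_OWNER', 'ORDERS', 'TABLE'));
```

Comparing two polls taken a few minutes apart gives a rough feel for how fast the partially empty categories are draining.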




Hazem

Oracle Pipelined Function for DBMS_SPACE.SPACE_USAGE

This is a pipelined function to show dbms_space.space_usage output in a table instead of using dbms_output.

dbms_space.space_usage will show you the details of blocks between LHWM (Low High Water Mark) and HHWM (High High Water Mark). Based on the output of this function, you can decide whether to shrink the segment or not.

There are many online resources to explain pipelined functions so I will not do this here.

Here is the code:

-- create object type (one row of space usage figures)
create type space_usage_object as object
(
   segment_owner      varchar2(30),
   segment_name       varchar2(30),
   segment_type       varchar2(30),
   partition_name     varchar2(30),
   blocks_unformatted number,
   bytes_unformatted  number,
   blocks_0_25        number,
   bytes_0_25         number,
   blocks_25_50       number,
   bytes_25_50        number,
   blocks_50_75       number,
   bytes_50_75        number,
   blocks_75_100      number,
   bytes_75_100       number,
   blocks_full        number,
   bytes_full         number
);
/

-- create collection type returned by the pipelined function
create type space_usage_type as table of space_usage_object;
/

-- pipelined function
create or replace function space_usage (
  in_segment_owner   in varchar2,
  in_segment_name    in varchar2,
  in_segment_type    in varchar2,
  in_partition_name  in varchar2 := null)
  return space_usage_type pipelined
as
  v_unformatted_blocks number;
  v_unformatted_bytes  number;
  v_fs1_blocks         number;  -- 0% to 25% empty
  v_fs1_bytes          number;
  v_fs2_blocks         number;  -- 25% to 50% empty
  v_fs2_bytes          number;
  v_fs3_blocks         number;  -- 50% to 75% empty
  v_fs3_bytes          number;
  v_fs4_blocks         number;  -- 75% to 100% empty
  v_fs4_bytes          number;
  v_full_blocks        number;
  v_full_bytes         number;
begin
  -- dbms_space.space_usage works only for segments in ASSM tablespaces
  dbms_space.space_usage(segment_owner      => in_segment_owner,
                         segment_name       => in_segment_name,
                         segment_type       => in_segment_type,
                         unformatted_blocks => v_unformatted_blocks,
                         unformatted_bytes  => v_unformatted_bytes,
                         fs1_blocks         => v_fs1_blocks,
                         fs1_bytes          => v_fs1_bytes,
                         fs2_blocks         => v_fs2_blocks,
                         fs2_bytes          => v_fs2_bytes,
                         fs3_blocks         => v_fs3_blocks,
                         fs3_bytes          => v_fs3_bytes,
                         fs4_blocks         => v_fs4_blocks,
                         fs4_bytes          => v_fs4_bytes,
                         full_blocks        => v_full_blocks,
                         full_bytes         => v_full_bytes,
                         partition_name     => in_partition_name);

  -- return the figures as a single row
  pipe row (space_usage_object(in_segment_owner, in_segment_name, in_segment_type, in_partition_name,
                               v_unformatted_blocks, v_unformatted_bytes,
                               v_fs1_blocks, v_fs1_bytes, v_fs2_blocks, v_fs2_bytes,
                               v_fs3_blocks, v_fs3_bytes, v_fs4_blocks, v_fs4_bytes,
                               v_full_blocks, v_full_bytes));

  return;
end;
/


-- selecting 
select * from table(space_usage ('OWNER', 'SEGMENT_NAME', 'SEGMENT_TYPE'));
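To make the placeholder call above more concrete, here are a few example invocations. All object names (APP_OWNER, ORDERS, ORDERS_PK, SALES, SALES_2011_Q1) are made up for illustration; the segment type strings are the ones dbms_space expects.

```sql
-- All owner, segment and partition names below are hypothetical.

-- A non-partitioned table
select * from table(space_usage('APP_OWNER', 'ORDERS', 'TABLE'));

-- An index
select * from table(space_usage('APP_OWNER', 'ORDERS_PK', 'INDEX'));

-- One partition of a partitioned table:
-- pass the partition name as the fourth argument
select *
from   table(space_usage('APP_OWNER', 'SALES', 'TABLE PARTITION', 'SALES_2011_Q1'));
```

For a partitioned segment the partition name is required, so the three-argument form only suits non-partitioned tables and indexes.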


Hazem Ameen
Senior Oracle DBA

Bulk Deletion Performance with Foreign Key Constraints (on delete cascade)

The purpose of this post is to discourage DBAs from creating “Enabled” foreign keys on tables that are expected to have bulk operations. I still suggest creating the foreign keys, but disabled, so that designers, DBAs and developers still know which tables are related.

Our test cases were run in our development environment which is 10.2.0.5 RAC sitting on HP Itanium.

The goal was to delete a few million rows from two tables related one-to-many through an enabled foreign key defined with “on delete cascade”.

The initial plan was simple:

1) Delete from the child table first so we can eliminate lookups from parent to child.

2) Delete from the parent.

3) Use a nice parallel hint for both steps.

We tested our plan with about 35,000 rows.

The first step went fine, but the second step ran for a long time. When we checked, we found the query was waiting on a TX lock, even though we had executed both statements in the same session.

Why wait for TX Lock in the same session?

This is because the first query used a parallel hint, which created parallel sessions. These parallel sessions locked the rows in the child table, and we could not move forward until we issued a commit.

You can reproduce the same behavior in a single-instance environment without any parallelism by following these simple steps:

1) Delete a child row.

2) Open a new session and delete the parent row.

The deletion of the parent row will hang until you commit the child deletion in the first session.
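The two steps above can be sketched as follows. The tables parent_t and child_t, the constraint name and the index are hypothetical and exist only to illustrate the lock wait; the foreign key column is indexed so that the wait comes from the locked child row.

```sql
-- Hypothetical tables for illustration only.
create table parent_t (id number primary key);

create table child_t (
  id        number primary key,
  parent_id number,
  constraint child_parent_fk foreign key (parent_id)
    references parent_t (id) on delete cascade
);

create index child_parent_ix on child_t (parent_id);

insert into parent_t values (1);
insert into child_t  values (10, 1);
commit;

-- Session 1: delete the child row and do NOT commit yet.
delete from child_t where id = 10;

-- Session 2: the cascade tries to delete the same child row
-- and waits for session 1.
delete from parent_t where id = 1;

-- Session 1: committing releases the lock and lets session 2 continue.
commit;
```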
 

Foreign Keys Performance

The following test cases were executed against about 35,000 rows. After every delete, we copied the data back.

We deleted the child data first, committed, and then deleted from the parent table: 75 seconds.

We deleted directly from the parent table: 75 seconds (apparently there is no gain from deleting from the child table first, bummer!).

We disabled the foreign keys, deleted from the child, then deleted from the parent: 12 seconds in total, about 6 times as fast.
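A minimal sketch of the third test case, reusing hypothetical names (parent_t, child_t, child_parent_fk). The predicates are placeholders for whatever identifies the rows to purge.

```sql
-- Hypothetical names: parent_t, child_t, child_parent_fk.

-- See which foreign keys exist on the child table and whether they are enabled.
select constraint_name, status
from   user_constraints
where  table_name = 'CHILD_T'
and    constraint_type = 'R';

-- Disable the foreign key; its definition stays in the data dictionary.
alter table child_t disable constraint child_parent_fk;

-- With the constraint disabled nothing cascades,
-- so the child rows are deleted explicitly.
delete from child_t  where parent_id < 1000;
delete from parent_t where id < 1000;
commit;
```

Because the constraint is only disabled and not dropped, it still shows up in USER_CONSTRAINTS with STATUS = 'DISABLED', which is what keeps the relationship documented.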

 

Conclusion

Foreign keys are excellent for ER diagrams and remind everybody which tables are related, but physically disable them and you should live happily ever after.
 

Oracle Date, Timestamp & Time Zone Explained

Back in 9i, Oracle introduced a new data type called Timestamp.

The Timestamp data type adds two new parts to the regular “Date” data type:

  1. Fractional seconds (6 digits, i.e. microseconds, by default; up to 9 digits can be declared)
  2. Time zone (shown only by selected functions such as systimestamp and current_timestamp, and by data types such as “Timestamp with time zone”)

Breakdown of a Timestamp: 2/9/2011 10:39:36.802604 AM +03:00
“2/9/2011 10:39:36” is exactly like the older Date data type.
“802604” is the fraction of a second, shown only by timestamp data types.
“+03:00” is the Riyadh time zone.

Why do I need a fraction of a second in my date, aren’t seconds enough?

For most applications, precision up to a second is enough, but some applications do require microseconds.

Applications such as trading, bidding (for example eBay), and some physics applications where operations start and finish within microseconds can benefit from timestamp.

What is Time Zone?

The actual definition can be found here: http://en.wikipedia.org/wiki/Time_zone

Time zones such as EST (Eastern Standard Time) or CST (Central Standard Time) are determined by how far you are from the Greenwich meridian.

Here in Riyadh we are at UTC+03:00.

Do I need Time Zone?

Applications that are deployed across multiple time zones benefit greatly from this feature. Worldwide internet applications with one central database, or with replicated databases around the world, are some of the applications that use time zones. Before this feature, we had to create one Date column holding the database server date/time in the server’s time zone and an additional Date column holding the date/time in the client’s time zone. That meant lots of manual conversion (not to forget Daylight Saving Time!). In short, a time zone mess.

Here is a table that shows Timestamp data type compared to the older Date data type.

Date function | Equivalent Timestamp function | Explanation
sysdate | systimestamp | System clock on the database server.
current_date | current_timestamp | Shows the date/time in the client time zone. If the client’s time zone changes, the output of these functions changes accordingly. “current_timestamp” shows the time zone part.
(no equivalent) | localtimestamp | Shows the timestamp in the client’s time zone. If the client’s time zone changes, the output of this function changes accordingly. This function does NOT show the time zone as part of the output.
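A quick way to see the difference between these functions is to change the session time zone and select them side by side. This is only a sketch; the '-05:00' offset is an arbitrary value chosen to make the difference from the server clock visible.

```sql
-- '-05:00' is an arbitrary offset used only for this demonstration.
alter session set time_zone = '-05:00';

-- sysdate and systimestamp follow the database server clock;
-- current_date, current_timestamp and localtimestamp follow
-- the session time zone set above.
select sysdate,
       systimestamp,
       current_date,
       current_timestamp,
       localtimestamp
from   dual;
```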

Data Types

Date data type | Equivalent Timestamp data type | Explanation
Date | Timestamp | Stores date and time without a time zone; Timestamp adds fractional seconds.
No equivalent | Timestamp with Time Zone | Saves the timestamp together with the client time zone. If the client time zone changes, the saved value will NOT be converted to the new time zone. This data type shows the time zone as part of the output.
No equivalent | Timestamp with Local Time Zone | Saves the timestamp normalized to the database time zone. When selected back, the value shown to the user is converted to the client’s current time zone. That means, if the client’s time zone changes, the value shown to the user is converted to the new time zone. This data type does NOT show the time zone as part of the output.
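The behavior of the three data types can be compared with a small test table. This is a sketch under stated assumptions: the table ts_demo and its column names are made up for this example, and the two offsets are arbitrary.

```sql
-- ts_demo and its column names are hypothetical.
create table ts_demo (
  ts      timestamp,
  ts_tz   timestamp with time zone,
  ts_ltz  timestamp with local time zone
);

-- Store the same moment in all three columns from a +03:00 session.
alter session set time_zone = '+03:00';
insert into ts_demo values (current_timestamp, current_timestamp, current_timestamp);
commit;

-- Switch the session to another time zone and read the row back:
-- ts and ts_tz come back as stored (ts_tz still shows +03:00),
-- while ts_ltz is shifted to the new session time zone.
alter session set time_zone = '-05:00';
select ts, ts_tz, ts_ltz from ts_demo;
```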

 
How can I know my database time zone?
SELECT dbtimezone FROM DUAL;

 
How to know client’s time zone?
SELECT sessiontimezone FROM DUAL;

 
How to change database time zone?
ALTER DATABASE SET TIME_ZONE = '+03:00';

Then restart the database.
 
How to change client time zone?
ALTER SESSION SET TIME_ZONE = '+03:00';

OR by using a named region
ALTER SESSION SET TIME_ZONE = 'Asia/Riyadh';
 

This latter “alter” will change the current_timestamp output to:
13-FEB-11 11.54.19.985565000 AM ASIA/RIYADH

OR by using environment variables

export ORA_SDTZ='+05:00'

 
How to initially set up your database time zone?
CREATE DATABASE testdb
. . .
. . .
SET TIME_ZONE='+03:00';

 
Where can I find a list of all time zones?
Look up this view: V$TIMEZONE_NAMES

Hope this helps.
 
 

Insert Append vs CTAS “Create Table as Select” to Copy Data

One of the most basic tasks a DBA is asked to do is to move data across schemas or tablespaces. This post compares two well-known options: CTAS and Insert with the append hint. Our comparison showed that parallel CTAS takes about half the time of Insert Append (44 vs 87 seconds).

Test Details:
Table: 2 GB approx, no LOBs, source and target tables are not partitioned
Oracle version: 10.2.0.4 installed on HP Itanium with 4 CPUs
Instance: Single instance ( no RAC), database in noarchive mode
File System: UFS

Here are the test results:

Insert Append Test Cases

Test Case | SQL | Exec in Secs
no append | insert into dest select * from source1; | 189
append | insert /*+ append */ into dest select * from source1; | 87

 

CTAS Test Cases

Test Case | SQL | Exec in Secs
CTAS, no parallel | create table dest as select * from source1; | 93
CTAS Parallel | alter session force parallel ddl parallel 3; alter session force parallel query parallel 3; create table dest as select * from source1; | 44

 

Data Scrambling in Oracle (including Arabic)

A regular part of a DBA’s job is moving production data to a development environment. This poses a challenge: how do you protect sensitive production data without spending more time than you have to, or buying a 3rd-party tool to generate random values to replace it? This post discusses an approach we currently use in our development environment.

Before detailing our scrambling solution, here is a common question: what is the difference between encryption and scrambling?

Here are some of these differences:

Encryption | Scrambling
An algorithm applied to the data; you usually need an encryption key along with Oracle built-in packages. | Replaces existing characters or numbers with other characters. The end result is a string of the same size but with different characters.
Support for encryption is built into Oracle. | Not built into Oracle. You have to develop your own.
Reversible. | Irreversible.
Encrypted columns require an increase in size for DES and 3DES encryption. | Same size.
The application needs to be aware of encrypted columns so data can be encrypted/decrypted. | The application doesn’t need to be aware of it.
Suitable for production. | Suitable for development.
Near impossible to crack. | Easier to crack unless you generate random values.
New data is also encrypted. | New data is not scrambled, but in development it should be safe.

 

How to Scramble Data in Oracle

The best way to perform scrambling is through the built-in translate function.

Here is an example:

SELECT 'Hazem Ameen' as before_scrambling , TRANSLATE(lower('Hazem Ameen'),
'.@,0123456789abcdefghijklmnopqrstuvxyzابتثجحخدذرزسشصضطظعغفقكلمنوهي', '.@,4214563875qwertyuiop[kjhbvabcdefxzgباتنمكضصثقفغعهخحجدظزوةىرذشسؤ') after_scrambling
FROM dual;
BEFORE_SCRAMBLING  AFTER_SCRAMBLING
Hazem Ameen        iqgtj qjtth

 

How you apply it in your environment is really up to you. In the end, you would have to execute a statement like this for every column that requires scrambling:

update table_name set <column_to_scramble> =
translate (<column_to_scramble>, '.@,0123456789abcdefghijklmnopqrstuvxyzابتثجحخدذرزسشصضطظعغفقكلمنوهي', '.@,4214563875qwertyuiop[kjhbvabcdefxzgباتنمكضصثقفغعهخحجدظزوةىرذشسؤ');

Here are some tips on how to apply this in your environment:

  1. If you have many columns, or the scrambling is repeated often, insert the table names along with the columns to be scrambled into a table, then develop a PL/SQL procedure that reads this table and executes the update statements dynamically.
  2. Updating a large amount of data with a single update statement will take a very long time; instead, open a cursor and loop through the data.
  3. Be aware that triggers on the table will slow this procedure down dramatically. If you can, disable them before running it.
  4. This scrambling process doesn’t scramble dates or LOBs.
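Tip 1 could be sketched roughly as below. The driver table scramble_columns is hypothetical, and the mapping strings are shortened to the Latin part only for readability; you would substitute your full from/to strings. For brevity the block issues one update per column; on large tables the cursor-based approach from tip 2 would take its place.

```sql
-- Hypothetical driver table listing what to scramble (not from the original post).
create table scramble_columns (
  table_name  varchar2(30),
  column_name varchar2(30)
);

declare
  -- Shortened Latin-only mapping; replace with your full from/to strings.
  c_from constant varchar2(100) := '0123456789abcdefghijklmnopqrstuvxyz';
  c_to   constant varchar2(100) := '4214563875qwertyuiop[kjhbvabcdefxzg';
begin
  for r in (select table_name, column_name from scramble_columns) loop
    -- dbms_assert.simple_sql_name rejects anything that is not a plain identifier,
    -- which guards the dynamically built statement.
    execute immediate
      'update ' || dbms_assert.simple_sql_name(r.table_name) ||
      ' set '   || dbms_assert.simple_sql_name(r.column_name) ||
      ' = translate(' || dbms_assert.simple_sql_name(r.column_name) || ', :f, :t)'
      using c_from, c_to;
    commit;
  end loop;
end;
/
```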

 


Tuning PL/SQL with Multithreading & DBMS_SCHEDULER

We had a piece of PL/SQL code that copies data from a remote view to a local table. This code executed once a day and took about 4 hours. As if that were not bad enough, we received a request to run it 3-4 times daily, which would total up to about 16 hours per day.

The original code to copy data from a remote view consists of 3 steps:

  1. Copy the employee numbers (one column only) from the remote view to a local temp table. This step takes only a few seconds.
  2. Use the locally copied employee numbers to look up each employee serially (row by row) in the remote view. This step takes about 4 hours to copy 20,000 rows. It takes that long because one column of the view is actually a function call. Most of the time is spent executing this function, and for many reasons (not all of them technical) we can’t get at the function to tune it.
  3. Commit once at the end.

Here is how the original code looked:

declare
  cursor c1 is
    select emp_no from temp_emp_no;
begin
  -- Step 1:
  -- truncate local temp table so we can copy employee numbers
  -- copy over employee numbers to local temp table from the remote view
  execute immediate 'truncate table temp_emp_no';
  execute immediate 'truncate table employees';

  insert into temp_emp_no
  select distinct employee_number from remote_employee_view@remote_link;
  commit;

  -- Step 2:
  -- loop through copied employee numbers and do a remote lookup 1 row at a time
  for i in c1 loop
    insert into employees
    select employee_number, first_name, last_name, rate, period, total_rate
      from remote_employee_view@remote_link
     where employee_number = i.emp_no;
  end loop;

  -- Step 3: commit one time at the end
  commit;
end;
/

At this point we were left with only one option: run the PL/SQL code in parallel (multithreading). As you might know, PL/SQL doesn’t support multithreading natively.

To simulate multithreading we need to accomplish 2 steps:

  1. Break the job in multiple pieces (threads).
  2. Schedule every thread to run concurrently.

Remember the first step, which copies only the employee numbers from the remote view to a local table. We hash-partitioned this local table into 4 partitions. Each partition translates into a thread, as you will see.

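-- partitions are named explicitly because the jobs refer to them by name
CREATE TABLE vb_emp_no
   (emp_no  VARCHAR2(30))
    PARTITION BY HASH (emp_no)
     (PARTITION vb_emp_no_p1,
      PARTITION vb_emp_no_p2,
      PARTITION vb_emp_no_p3,
      PARTITION vb_emp_no_p4)
/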

This is the thread code (stored procedure) which takes in a partition name as a parameter and copies employee data from remote view for that partition only.

CREATE procedure refresh_employee_data_part (p_name in varchar2)
  authid definer
is
  TYPE EmpCurTyp IS REF CURSOR;
  v_emp_cursor   EmpCurTyp;
  sql_stmt       varchar2(2048);
  v_emp_no       varchar2(30);
begin
  -- read employee numbers from the requested partition only
  sql_stmt := 'select emp_no from vb_emp_no partition(' || p_name || ')';
  open v_emp_cursor for sql_stmt;

  loop
    fetch v_emp_cursor into v_emp_no;
    exit when v_emp_cursor%notfound;
    -- remote lookup, one employee at a time
    insert into employees
            (employee_number,
             vacation_balance,
             rate, period, total_rate)
    select distinct employee_number, vacation_balance, rate, period, total_rate
      from remote_employee_view@remote_link
     where employee_number = v_emp_no;
  end loop;

  commit;
  close v_emp_cursor;
end;
/

This procedure glues all parts together. We schedule the previous procedure 4 times and achieve concurrency.

CREATE procedure refresh_employees
  authid definer
is
begin
  execute immediate 'truncate table vb_emp_no';
  insert into vb_emp_no
  select distinct employee_number from remote_employee_view@remote_link;
  commit;
  execute immediate 'truncate table employees';

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P1_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P1''); end;',
   comments => 'Thread 1 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P2_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P2''); end;',
   comments => 'Thread 2 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P3_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P3''); end;',
   comments => 'Thread 3 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P4_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P4''); end;',
   comments => 'Thread 4 to refresh employees',
   enabled => true,
   auto_drop => true);

end;
/
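As a design note, the four nearly identical create_job calls could also be generated in a loop. The block below is only a sketch of that alternative, not the code we ran; it assumes the partitions are named VB_EMP_NO_P1 through VB_EMP_NO_P4 and that refresh_employee_data_part already exists.

```sql
-- Sketch only: same job options as the four explicit calls, generated in a loop.
-- Assumes partitions VB_EMP_NO_P1..VB_EMP_NO_P4 and the procedure above.
begin
  for i in 1 .. 4 loop
    dbms_scheduler.create_job(
      job_name   => dbms_scheduler.generate_job_name('VB_P' || i || '_'),
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P' || i || '''); end;',
      comments   => 'Thread ' || i || ' to refresh employees',
      enabled    => true,
      auto_drop  => true);
  end loop;
end;
/
```

With this form, changing the number of threads means changing the loop bound and the number of partitions together.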

The procedure finishes immediately, even if the jobs hit errors. To check execution status, errors and running time, query DBA_SCHEDULER_JOB_RUN_DETAILS.
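A minimal sketch of such a check. It assumes the jobs were created with the VB_P prefixes used above and that you have access to the DBA_ views.

```sql
-- Finished runs: status, error number and duration for each thread.
-- Job names start with VB_P because of the generate_job_name prefixes.
select job_name, status, error#, actual_start_date, run_duration, additional_info
from   dba_scheduler_job_run_details
where  job_name like 'VB_P%'
order  by log_date desc;

-- Threads that are still executing appear here instead.
select job_name, session_id, elapsed_time
from   dba_scheduler_running_jobs
where  job_name like 'VB_P%';
```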

End result: the job finished in 50 minutes, more than 4x faster.
