Tuning PL/SQL with Multithreading & DBMS_SCHEDULER

We had a piece of PL/SQL code that copies data from remote view to a local table. This code executes once a day and took about 4 hours. As if this is not bad enough, we received a request to run this piece of code 3-4 times daily which will total up to about 16 hours per day.

The original code to copy data from a remote view consists of 3 steps:

  1. Copy over employee numbers (one column only) from remote view to local temp table. This step takes few seconds only.
  2. Use employee numbers copied locally to do a lookup serially (lookup row by row) from remote view. This step takes about 4 hours to copy 20,000 rows. The reason it takes that long is one column from the view is an actual function. Most of the time is spent executing this function and for so many reasons (not all are technical) we can’t get to function to tune it.
  3. Commit one time at the end.

Here how the original code looks like:

— Step 1:
— truncate local temp table so we can copy employee numbers
— copy over employees numbers to local temp table from the remote view

execute immediate 'truncate table temp_emp_no';
execute immediate 'truncate table employees';

insert into temp_emp_no select distinct EMPLOYEE_NUMBER from remote_employee_view@remote_link;
commit;

— Step 2:
— Loop through copied over employee numbers and do a remote lookup 1 row at a time

cursor c1 is
   select emp_no from temp_emp_no;

for i in c1 loop
   exit when c1%notfound;
   insert into employees
   select EMPLOYEE_NUMBER, first_name, last_name, RATE, PERIOD, total_rate
     from remote_employee_view@remote_link;
   where EMPLOYEE_NUMBER = i.emp_no;
end loop;

commit;

At this point we were left with one option only which is run the PL/SQL code in parallel (multithreading), but as you might now, PL/SQL doesn’t support multithreading natively,

To simulate multithreading we need to accomplish 2 steps:

  1. Break the job in multiple pieces (threads).
  2. Schedule every thread to run concurrently.

Remember first step which is coping employee numbers only from remote view to local table. We hash-partitioned this local table into 4 partitions. Each of these partitions will translate into a thread as you will see.

CREATE TABLE vb_emp_no
   (emp_no  VARCHAR2(30))
    PARTITION BY HASH (EMP_NO)
     PARTITIONS 4
/

This is the thread code (stored procedure) which takes in a partition name as a parameter and copies employee data from remote view for that partition only.

CREATE procedure refresh_employee_data_part (p_name in varchar2 )
  authid definer
is
  TYPE EmpCurTyp  IS REF CURSOR;
  v_emp_cursor    EmpCurTyp;
  sql_stmt varchar2(2048);
  v_emp_no varchar2(30);

begin
  sql_stmt := 'select emp_no from vb_emp_no partition(' || p_name || ')';
  open v_emp_cursor for sql_stmt ;

loop
  fetch v_emp_cursor into v_emp_no;
  EXIT WHEN v_emp_cursor%NOTFOUND;
  insert into employees
          (EMPLOYEE_NUMBER,
           VACATION_BALANCE,
           RATE, PERIOD, TOTAL_RATE)
  select distinct employee_number, vacation_balance, rate, period, total_rate
    from   remote_employee_view@remote_link;
  where  EMPLOYEE_NUMBER = v_emp_no;
end loop;

commit;
close v_emp_cursor;
end;
/

This procedure glues all parts together. We schedule the previous procedure 4 times and achieve concurrency.

CREATE procedure refresh_employees
  authid definer
is
begin
  execute immediate 'truncate table vb_emp_no';
  insert into vb_emp_no
  select distinct employee_number from remote_view@db_link;
  commit;
  execute immediate 'truncate table employees';

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P1_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P1''); end;',
   comments => 'Thread 1 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P2_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P2''); end;',
   comments => 'Thread 2 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P3_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P3''); end;',
   comments => 'Thread 3 to refresh employees',
   enabled => true,
   auto_drop = true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P4_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P4''); end;',
   comments => 'Thread 4 to refresh employees,
   enabled => true,
   auto_drop => true);

end;
/

The procedure would finish immediately even if there are errors. To check the execution status look under DBA_SCHEDULER_JOB_RUN_DETAILS to look for errors and running time.

End Result the job finished in 50 minutes more than 4x faster.

Hazem Ameen
Senior Oracle DBA

Advertisements

9 thoughts on “Tuning PL/SQL with Multithreading & DBMS_SCHEDULER

  1. How did you get your partition names to come out like that as when I ran similar DDL I get completely different naming convention. Ideally I’d like to be able to define the partitions with my own name.

  2. I’ve tried this solution, but I get a permission error when I run it. The schema I’m logged on as has scheduler_admin permissions. I can execute an anonymous block to drop a job for the scheduler, but when it’s part of a package it doesn’t seem to work. Any ideas?

      1. I’m using 11G. We ended up finding a fix for the problem. The schema I was using has been grated SCHEDULER_ADMIN privilege because we thought that included the ability to create jobs. Apparently it does….if you’re making the jobs directly in anonymous blocks. But if you’re dropping jobs during a package run then it won’t be the correct permission. We gave CREATE JOB and CREATE ANY JOB permissions to that schema and it fixed things.

  3. A very good article on scheduling jobs in oracle.But i have a scenario where i dont have a table with all the Empno insteasd I have a cursor holding all the empno.
    Can you suggest how to pass the cursor values to the scheduler w.r.t the above scenario.
    It would be grt help.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s