Tuning PL/SQL with Multithreading & DBMS_SCHEDULER

We had a piece of PL/SQL code that copies data from remote view to a local table. This code executes once a day and took about 4 hours. As if this is not bad enough, we received a request to run this piece of code 3-4 times daily which will total up to about 16 hours per day.

The original code to copy data from a remote view consists of 3 steps:

  1. Copy over employee numbers (one column only) from remote view to local temp table. This step takes few seconds only.
  2. Use employee numbers copied locally to do a lookup serially (lookup row by row) from remote view. This step takes about 4 hours to copy 20,000 rows. The reason it takes that long is one column from the view is an actual function. Most of the time is spent executing this function and for so many reasons (not all are technical) we can’t get to function to tune it.
  3. Commit one time at the end.

Here how the original code looks like:

— Step 1:
— truncate local temp table so we can copy employee numbers
— copy over employees numbers to local temp table from the remote view

execute immediate 'truncate table temp_emp_no';
execute immediate 'truncate table employees';

insert into temp_emp_no select distinct EMPLOYEE_NUMBER from remote_employee_view@remote_link;
commit;

— Step 2:
— Loop through copied over employee numbers and do a remote lookup 1 row at a time

cursor c1 is
   select emp_no from temp_emp_no;

for i in c1 loop
   exit when c1%notfound;
   insert into employees
   select EMPLOYEE_NUMBER, first_name, last_name, RATE, PERIOD, total_rate
     from remote_employee_view@remote_link;
   where EMPLOYEE_NUMBER = i.emp_no;
end loop;

commit;

At this point we were left with one option only which is run the PL/SQL code in parallel (multithreading), but as you might now, PL/SQL doesn’t support multithreading natively,

To simulate multithreading we need to accomplish 2 steps:

  1. Break the job in multiple pieces (threads).
  2. Schedule every thread to run concurrently.

Remember first step which is coping employee numbers only from remote view to local table. We hash-partitioned this local table into 4 partitions. Each of these partitions will translate into a thread as you will see.

CREATE TABLE vb_emp_no
   (emp_no  VARCHAR2(30))
    PARTITION BY HASH (EMP_NO)
     PARTITIONS 4
/

This is the thread code (stored procedure) which takes in a partition name as a parameter and copies employee data from remote view for that partition only.

CREATE procedure refresh_employee_data_part (p_name in varchar2 )
  authid definer
is
  TYPE EmpCurTyp  IS REF CURSOR;
  v_emp_cursor    EmpCurTyp;
  sql_stmt varchar2(2048);
  v_emp_no varchar2(30);

begin
  sql_stmt := 'select emp_no from vb_emp_no partition(' || p_name || ')';
  open v_emp_cursor for sql_stmt ;

loop
  fetch v_emp_cursor into v_emp_no;
  EXIT WHEN v_emp_cursor%NOTFOUND;
  insert into employees
          (EMPLOYEE_NUMBER,
           VACATION_BALANCE,
           RATE, PERIOD, TOTAL_RATE)
  select distinct employee_number, vacation_balance, rate, period, total_rate
    from   remote_employee_view@remote_link;
  where  EMPLOYEE_NUMBER = v_emp_no;
end loop;

commit;
close v_emp_cursor;
end;
/

This procedure glues all parts together. We schedule the previous procedure 4 times and achieve concurrency.

CREATE procedure refresh_employees
  authid definer
is
begin
  execute immediate 'truncate table vb_emp_no';
  insert into vb_emp_no
  select distinct employee_number from remote_view@db_link;
  commit;
  execute immediate 'truncate table employees';

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P1_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P1''); end;',
   comments => 'Thread 1 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P2_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P2''); end;',
   comments => 'Thread 2 to refresh employees',
   enabled => true,
   auto_drop => true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P3_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P3''); end;',
   comments => 'Thread 3 to refresh employees',
   enabled => true,
   auto_drop = true);

dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name('VB_P4_'),
   job_type => 'PLSQL_BLOCK',
   job_action => 'begin refresh_employee_data_part(''VB_EMP_NO_P4''); end;',
   comments => 'Thread 4 to refresh employees,
   enabled => true,
   auto_drop => true);

end;
/

The procedure would finish immediately even if there are errors. To check the execution status look under DBA_SCHEDULER_JOB_RUN_DETAILS to look for errors and running time.

End Result the job finished in 50 minutes more than 4x faster.

Hazem Ameen
Senior Oracle DBA

Advertisements