For a 9.2.0.5 database, I have been asked to add few columns and update them with new values from another table. Base table contained 35 million rows.
To accomplish this, we researched 3 solutions: update via insert, merge statement and pl/sql with bulk collect
Update via Insert
With this method, you don’t execute any updates, instead you create a new table with the old and new data.
Here are the steps:
1. Create a new empty table with both old and new columns.
2. Insert data using append hint into the new table.
3. Recreate indexes on new table.
4. Gather statistics on new table.
5. Rename old table or drop.
6. Rename new table to old name.
Pros:
- The fastest method to finish.
- Not complicated, steps are straight forward.
Cons:
- When executing steps 5 & 6 there is a sub-second where the table is not accessible. Even that tiny bit is not acceptable in our environment.
- If you have materialized view log or CDC (change data capture), you would have to drop it and then recreate it.
We didn’t benchmark this method because it wouldn’t work in our environment due the sub-second unavailability of base table.
Merge Statement
Use Oracle built-in function merge introduced in 9i.
Pros:
- Easiest to implement, just 1 SQL statement.
Cons:
- Merge statement was very slow. It took 17 hours to complete which was unacceptable.
Bulk Collect
Using rowid to perform bulk updates on the base table.
Here are the steps:
1. Add/populate rowid of the base table to new table containing new data.
2. If possible, try to sort the new data in the same way like base table. In my case, we sorted the new data by the primary key. This will reduce multiple reads/writes of the same base table blocks.
Pros:
- It took 74 minutes down from 17 hours
Cons:
- Writing pl/sql was a bit of pain and must be adjusted and retested every time there is a change.
- You might have to drop materialized view log and change data capture tables and then recreate them again after the operation finishes.
Here is sample code for bulk collect
DECLARE
– source table cursor (only columns to be updated)
CURSOR base_table_cur IS
SELECT rowid, col1, col2
FROM base_table;
type base_type IS TABLE OF base_table_cur%ROWTYPE INDEX BY PLS_INTEGER;
base_tab base_TYPE;
– new data
CURSOR new_data_cur IS
SELECT row_id, new_col1, new_col2
FROM new_data;
type new_data_type IS TABLE OF new_data_cur%ROWTYPE INDEX BY PLS_INTEGER;
new_data_tab new_data_TYPE;
type row_id_type IS TABLE OF ROWID INDEX BY PLS_INTEGER;
row_id_tab row_id_TYPE;
TYPE rt_update_cols IS RECORD
(
new_col1 new_data.new_col1%type,
new_col2 new_data.new_col1 %type
);
TYPE update_cols_type IS TABLE OF rt_update_cols INDEX BY PLS_INTEGER;
update_cols_tab update_cols_type;
dml_errors EXCEPTION;
PRAGMA EXCEPTION_INIT(dml_errors, -24381);
BEGIN
open base_table_cur;
open new_data_cur;
loop
fetch base_table_cur bulk collect into base_tab limit 5000;
fetch new_data_cur bulk collect into new_data_tab limit 5000;
FOR i in base_tab.FIRST .. base_tab.LAST LOOP
row_id_tab(i) := new_data_tab(i).row_id;
update_cols_tab(i).col1 := new_data_tab(i).new_col1;
update_cols_tab(i).col2 := new_data_tab(i).new_col2;
end loop;
FORALL i IN base_tab.FIRST .. base_tab.LAST SAVE EXCEPTIONS
update (select col1, col2
from base_table )
set row = update_cols_tab(i)
where rowid = row_id_tab(i);
commit;
exit when base_tab.count < 5000;
end loop;
commit;
close base_table_cur;
close new_data_cur;
EXCEPTION
WHEN dml_errors THEN
FOR i IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
dbms_output.put_line(’Output your error here’);
END LOOP;
END;
/
Conclusion
Use insert via update if sub-second of downtime is acceptable
Use merge if you have small set data
User pl/sql bulk collect if you have large set of data and even sub-second of downtime is not acceptable
Hazem Ameen
Senior Oracle DBA