How to Select BLOB Chunk Size
Posted: September 11, 2012 Filed under: Uncategorized | Tags: BLOB, Chunk Size 1 Comment »Chunk size is parameter you set on a BLOB columns only. Setting this parameter correctly can improve BLOB reading performance and/or storage utilized by BLOB segment.
Here are some scenarios to help you determine which size is best for you:
If your BLOB size is less than 4K and you are using inline option (which you should always use anyway regardless of your BLOB size), then chunk size parameter doesn’t apply. “LOGSEGMENT” and “LOGINDEX” size will not change because data is saved within data block.
If your BLOB size is bigger than 4k and less or equal 8k then set chunk size to 8k. Best of both worlds, you should get the best performance and space utilization.
If your BLOB size is bigger than 8k and you care about good performance of BLOB (read/write) over space utilization, then set chunk size to 32K.
If your BLOB size is bigger than 8k and you care about space utilization over performance, then set chunk size to 8K.
If your BLOB size is unknown and you only care about performance (read/write), then set chunk size to 32K.
If your BLOB size is unknown and you only care about saving space, then set chunk size to 8K.
Note: You might think why not aim for speed over storage, after all storage is cheap. I agree with this premise, but, if you are storing TBs of BLOBs like we are, and it is not accessed as much, you actually might start considering saving storage over performance gains.
These finding were based on tests we ran in a 10.2.0.5, 64-bit environment.
Reading BLOBs from Database Tests
First test case is reading 35 MB BLOB from the database, notice about 40% increase in speed with 32K block size over 8K block.
Second test is reading 4 MB BLOB from the database, again notice about 60% increase in speed with 32K block size over 8K block.
|
Read Performance Tests 32k vs 8k |
|||
|
BLOB Size |
Chunk Size |
Number of I/O Reads |
Time in MSEC |
|
35 MB |
32K |
1074 |
300 |
|
|
8K |
4295 |
500 |
|
|
|
|
|
|
4 MB |
32k |
128 |
0.047 |
|
|
8K |
509 |
0.078 |
Writing BLOBs to Database Test
Writing test is inserting in the database a 10K BLOB, repeated 10 times. Now notice LOB segment size is about 2.5 time larger with 32K than 8K.
|
Space Utilization Test 32k vs 8k |
||
|
BLOB Size |
Chunk Size |
LOB Segment Size |
|
10K, 10 times |
32K |
655360 |
|
|
8k |
262144 |
Hazem Ameen
Senior Oracle DBA
Alter Table Shrink Taking too Long
Posted: April 9, 2012 Filed under: Uncategorized Leave a comment »When you execute “alter table shrink <segment>” , there is no direct way to tell you how long it will take. A helpful aid to determine when shrinking will finish is to use dbms_space.space_usage package. This function will show you block activity between LHWM and HHWM. I wrote a pipelined function to see output of this function and you can find it here. You’d appreciate this little pipelined function instead of using so many dbms_output.put_line functions.
As of 10g, Oracle introduced LHWM and HHWM.
HHWM is the same as HWM in prior versions which is all blocks above this mark have not been formatted.
LHWM (Low High Water Mark) which all blocks below this mark have been formatted.
Between LHWM and HHWM there are 5 categories of blocks:
- Unformatted
- 0 to 25 % empty
- 25% to 50% empty
- 50% to 75% empty
- 75% to 100% empty
You can use dbms_space.space_usage to report on the area between LHWM & HHWM
What to look for?
If you are shrinking a table, you will see the following behavior:
| Unformatted Blocks | Number of blocks will not change |
| 0 to 25% blocks | Number of blocks will decrease until it becomes zero |
| 25% to 50% blocks | Number of blocks will decrease until it becomes zero |
| 50% to 75% empty | Number of blocks will decrease until it becomes zero |
| 75% to 100% empty | Number of blocks will increase |
| Full Blocks | This is number of blocks below LHWM (actual table size) and will increase as the other numbers will decrease |
Shrinking index behavior is different. Full blocks will decrease first, while other block categories will increase, then Full blocks will increase again.
Hazem
Oracle Pipelined Function for DBMS_SPACE.FREE_USAGE
Posted: June 27, 2011 Filed under: Oracle | Tags: dbms_space, Pipelined Function Leave a comment »This is a pipelined function to show dbms_space.space_usage output in a table instead of using dbms_output.
dbms_space.space_usage will show you the details of blocks between LHWM (Low High Water Mark) and HHWM (High High Water Mark). Based on the output of this function, you can decide whether to shrink the segment or not.
There are many online resources to explain pipelined functions so I will not do this here.
Here is the code:
-- create object
create type space_usage_object as object
(
Segment_owner varchar2(30),
segment_name varchar2(30),
segment_type varchar2(30),
partition_name varchar2(30),
blocks_unformatted number,
bytes_unformatted number,
blocks_0_25 number,
bytes_0_25 number,
blocks_25_50 number,
bytes_25_50 number,
blocks_50_75 number,
bytes_50_75 number,
blocks_75_100 number,
bytes_75_100 number,
blocks_full number,
bytes_full number
);
-- create type
CREATE TYPE space_usage_type AS TABLE OF space_usage_object;
-- pipelined function
create or replace FUNCTION SPACE_USAGE (
in_segment_owner IN VARCHAR2 ,
in_segment_name IN VARCHAR2 ,
in_segment_type IN VARCHAR2 ,
in_partition_name IN VARCHAR2 := null)
RETURN space_usage_type PIPELINED AS
v_unformatted_blocks number;
v_unformatted_bytes number;
v_fs1_blocks number;
v_fs1_bytes number;
v_fs2_blocks number;
v_fs2_bytes number;
v_fs3_blocks number;
v_fs3_bytes number;
v_fs4_blocks number;
v_fs4_bytes number;
v_full_blocks number;
v_full_bytes number;
begin
dbms_space.space_usage(segment_owner => in_segment_owner,
segment_name => in_segment_name,
segment_type => in_segment_type,
unformatted_blocks => v_unformatted_blocks,
unformatted_bytes => v_unformatted_bytes,
fs1_blocks => v_fs1_blocks,
fs1_bytes => v_fs1_bytes,
fs2_blocks => v_fs2_blocks,
fs2_bytes => v_fs2_bytes,
fs3_blocks => v_fs3_blocks,
fs3_bytes => v_fs3_bytes,
fs4_blocks => v_fs4_blocks,
fs4_bytes => v_fs4_bytes,
full_blocks => v_full_blocks,
full_bytes => v_full_bytes,
partition_name => in_partition_name);
pipe row (space_usage_object (in_segment_owner, in_segment_name, in_segment_type, in_partition_name,
v_unformatted_blocks , v_unformatted_bytes,
v_fs1_blocks, v_fs1_bytes, v_fs2_blocks, v_fs2_bytes, v_fs3_blocks,
v_fs3_bytes, v_fs4_blocks , v_fs4_bytes , v_full_blocks , v_full_bytes ));
RETURN ;
END;
-- selecting
select * from table(space_usage ('OWNER', 'SEGMENT_NAME', 'SEGMENT_TYPE'));
Hazem Ameen
Senior Oracle DBA
Bulk Deletion Performance with Foreign Keys Constraints (on delete cascade)
Posted: March 29, 2011 Filed under: Oracle | Tags: Foreign Keys, TX Locks Leave a comment »The purpose of this post is to discourage DBAs from creating “Enabled” foreign keys on tables that are expected to have bulk operations. I still suggest to create foreign keys but disable them so designers, DBAs and developers will still know which tables are related.
Our test cases were run in our development environment which is 10.2.0.5 RAC sitting on HP Itanium.
The target was to delete a few million rows from 2 tables which were related 1 to many with an enabled foreign key “on delete cascade”.
The initial plan was simple:
1) Delete from child table first so we can eliminate lookups from parent to child.
2) Delete from parent .
3) Use a nice parallel hint for both steps.
We tested our plan with about 35,000 rows.
First step went fine, but second step the query went for a long time, when we checked, we found that query is waiting for TX Lock even though we executed the query in the same session.
Why wait for TX Lock in the same session?
This is because the first query was using a parallel hint which created parallel sessions. These parallel sessions locked the rows in the child table. You couldn’t move forward unless we issue a commit.
You can also produce the same behavior in a single instance environment without using any parallelism by doing the following simple steps:
1) Delete child row
2) Create a new session and then delete the parent row.
The deletion of the parent row will hang until you commit in the child row.
Foreign Keys Performance
The following test cases were executed for about 35,000 rows. Every time we delete, we copy the data back.
We deleted child data first, committed and then deleted from parent table, it took 75 seconds
We deleted directly from parent table, it took 75 seconds (apparently there is no gain from deleting from child table first then deleting from parent table, bummer!)
We disabled foreign keys, delete from child, then delete from parent, total time was 12 seconds only, 6 times as fast.
Conclusion
Foreign Keys are excellent for ERD diagrams and remind everybody which tables are related, but physically disable them and you should live happily every after.
Hazem Ameen
Senior Oracle DBA
Insert Append vs CTAS “Create Table as Select” to Copy Data
Posted: January 4, 2011 Filed under: Oracle 2 Comments »Some of the most basic tasks a DBA is asked to do, is to move data across schemas or tablespaces. This post compares 2 famous options: CTAS and Insert with append hint. Our comparison revealed that parallel CTAS is about 50% faster than Insert Append.
Test Details:
Table: 2 GB approx, no LOBs, source and target tables are not partitioned
Oracle version: 10.2.0.4 installed on HP Itanium with 4 CPUs
Instance: Single instance ( no RAC), database in noarchive mode
File System: UFS
Here are the test results:
|
Insert Append Test Cases |
||
| Test Case | SQL | Exec in Secs |
| no append | insert into dest select * from source1;
|
189 |
| append | insert /*+ append */ into dest select * from source1;
|
87 |
|
CTAS Test Cases |
||
| Test Case | SQL | Exec in Secs |
| CTAS, no parallel | create table dest as select * from source1;
|
93 |
| CTAS Parallel | alter session force parallel ddl parallel 3; alter session force parallel query parallel 3; create table dest as select * from source1;
|
44 |
Hazem Ameen
Senior Oracle DBA
Data Scrambling in Oracle (including Arabic)
Posted: August 21, 2010 Filed under: Oracle | Tags: Scramble Leave a comment »A regular part of a DBA job is moving production data to development environment. This sometimes poses a challenge as how to protect sensitive production data without spending more time more than you have to or buying or 3rd party tool to generate random values to replace existing sensitive production data. This post discusses an idea we are currently using in our development environment.
Before detailing our scrambling solution, here is a common question: What is the difference between encryption and scrambling.
Here are some of these differences:
| Encryption | Scrambling |
| Encryption is an algorithm applied to data usually you would need an encryption key along with Oracle built-in packages | Replace existing characters or numbers with another character. The end result is same string size but with different characters. |
| Support for Encryption is built in Oracle | Not built in Oracle. You have to develop your own. |
| Reversible. | Irreversible. |
| Encrypted column require an increase in size for DES, 3DES, 3DES encryptions | Same size. |
| Application needs to be aware of encrypted columns so data can be encrypted/unencrypted | Application doesn’t need to be aware of it. |
| Suitable for production | Suitable for development |
| Near impossible to crack | Scrambling is easier to crack unless you generate random values. |
| New data is also encrypted | New data is not scrambled but in development it should be safe |
How to Scramble Data in Oracle
The best way to perform scrambling is through translate built-in function.
Here is an example:
SELECT 'Hazem Ameen' as before_scrambling , TRANSLATE(lower('Hazem Ameen'),
'.@,0123456789abcdefghijklmnopqrstuvxyzابتثجحخدذرزسشصضطظعغفقكلمنوهي', '.@,4214563875qwertyuiop[kjhbvabcdefxzgباتنمكضصثقفغعهخحجدظزوةىرذشسؤ') after_scrambling
FROM dual;
| BEFORE_SCRAMBLING |
AFTER_SCRAMBLING |
| Hazem Ameen | iqgtj qjtth |
It is really up to you how you want to apply it in your environment. At the end would have to execute a statement like this on every column that requires scrambling:
Update table_name set <column_name_to_scrambled> = translate (<column_name_to_scrambled>, ‘.@,0123456789abcdefghijklmnopqrstuvxyzابتثجحخدذرزسشصضطظعغفقكلمنوهي’, ‘.@,4214563875qwertyuiop[kjhbvabcdefxzgباتنمكضصثقفغعهخحجدظزوةىرذشسؤ’)Here are some tips on how to apply this in your environment:
- If you have too many columns or the process of scrambling repeats often, insert table name along with columns to be scrambled in a table, then develop a PL/SQL procedure to read this table and execute the update statement dynamically.
- Updating large amount of data via a single update statement will take a very long time; instead you would open a cursor and loop through the data.
- Be aware that triggers on the table will slow down this procedure dramatically. If you can disable them first before running this procedure.
- This scrambling process doesn’t scramble dates or LOBS.
Hazem Ameen Senior Oracle DBA