Agenda
- The Three Layers of ETL
- Think about Hardware
- Database
- Staging area
- Use and Understand the Database features
- SQL*Loader or External Tables
- Direct path
- Parallelism
- Partition exchange load
- Compression
- Q&A
In this session will not cover ETL tools. The main focus will be on the database
And how you can do use a key set of features to ensure your data loads happen
As quickly and efficiently as possible. I will also briefly touch on some hardware
Considerations you should keep in mind when designing your loading strategy.
We have talked about having a balanced configuration for the
Database but there is another key hardware configuration
That will greatly impact the load performance and that is the staging area. In every
Tuning engagement I did while part of the Real World Performance group
I invariably found that the performance of the data load was in fact throttled
Or constricted by the throughput of the staging area. Given the fact the data
Simply lands in the staging a couple of minutes before its loaded tends to make
People feel its not important. BUT it is actually critically import to load performance
So basic guidelines are
Spread the staging area across as many physical spindles as you can. Just with any non Exadata storage
Each physical disk will only be able to read ~20 MB/s. Stuff all your flat files on a single physical disk and its
Maximum throughtput will by the throughput for the entire load.
Use as much network bandwidth as you can. Don’t forget the HBA to CPU ratio you learnt for the database
The same should be applied to staging. In the DBM you have spare IB connections seriously consider using them
For your load.
Internal files systems on the database node are a bad idea for a lot of reason,
Few physical disks
Not visible across all nodes
In the case of the DBM the final bullet point is not feasible but that is not ture of a custom installation.
Oracle offer a varitey of data loading options depending on
The source system. If the data is coming from an other Oracle
Environment both Data Pump or Transportable tablespaces are
Good options, as both allow you to move large amount of data
Quickly and in parallel.
I would avoid using Dblinks as they can’t do parallel query.
However, you are much more likely to see flat files being used
As every other database vender (other tha Oracle)offers a fast unload
Feature that puts data into tha flat file. For the rest of this session I
Will be assuming we are loading from falt files.
If you are loading from flat files, you have two options with
Oracle SQL*Loader and External tables
© 2010, www.oracledatabase12g.com. 版权所有.文章允许转载,但必须以链接方式注明源地址,否则追究法律责任.
相关文章 | Related posts:
- Designing a 300TB Data Warehouse with ASM
- HP-UX: Dynamic (runtime) Loading of Shared Libraries
- How to Make Use of Oracle's Data Compression with Materialized Views.
- Expert Oracle Practices Oracle Database Administration from the Oak Table PDF
- Data gathering for troubleshooting Oracle Real Application Cluster issues
- Oracle Database 11g Unstructured Data
- 【书籍推荐】Expert Oracle Practices
- Oracle 11gR1 Streams Practices and Solutions
- Oracle Data Integrator 11g Product Overview and New Features
- Oracle Real Application Clusters Installation and Configuration Best Practices




最新评论