Concepts, notions and terminology used in this presentation
– Part of Oracle tuning is to reduce the expensive disk I/O (physical reads).
– One way to accomplish this, is to reduce the logical I/O (logical reads).
– A Materialized View (MV) is an Oracle schema object that contains the results of a query.
– Materialized Views (MVs) were first introduced in oracle 8i and are part of a concept known as “Summary Management”.
– The concept of “Summary Management” includes five components:
- Dimensions
- Materialized views
- Refresh
- Query rewrite
- SQL access advisor
– We can think of a “MV” as an object which is a hybrid between a table and a view. It is a table, in the sense that it stores data (the results of a SQL query). It is a view, in the sense that it stores the definition of a SQL query. In the Oracle data dictionary, a “MV” presents itself as both a table and a view.
– The “MV” can act as an “aggregate view”, used in Data Warehousing to improve query performance, or to provide replicated data.
– The “MV” contains data obtained by pre-calculating expensive joins and aggregation operations, prior to execution. The results are stored in a table in the database.
– To create a “MV” we use the “create materialized view”
statement. The “from” clause of the query can reference db objects such as tables, views and other “MV”s. Collectively, these objects are called “detail tables”.
– “Summary Management” provides mechanisms to:
- Fully refresh the data in the “mv”
- Perform a fast refresh of the “mv” by adding / merging what has changed
- Automatically update a “mv” every time changes are made to the source / underlying tables
– These refresh operations may be performed:
- On demand for: specific “mv”s, “mv”s dependent on a table, or for all “mv”s
- On commit, whenever the underlying tables of the “mv” are changed
– A Dimension, within the context of “summary management”, is an oracle object, defined as metadata, which describes a set of related columns, and which defines hierarchical relationships (parent / child) between them. Those columns do not have to belong to the same table.
– When defining a hierarchy, multiple columns may be specified, in order to describe the hierarchy structure. Example: day is a child of month which is a child of quarter which is a child of year etc.
– A dimension can be created by using the sql statement
create dimension time_cust_dim …
– A dimension can be described by using the sql statement
execute dbms_dimension.describe_dimension (‘time_cust_dim’)
– “Query Rewrite” is an oracle optimization technique which transforms a user request that references “detail tables” (see above), into a semantically equivalent request that references one or more “MV”s, and which executes faster.
– This results in significantly improved query response times.
– For performance optimization in a data warehouse environment, the materialized views typically used are “Materialized Aggregate Views”, “Single – Table Materialized Aggregate Views” and “Materialized Join Views”.
– Query Rewrite can use all 3 types of “MV”s mentioned above.
– Query Rewrite using Materialized Views: The database administrator creates one or more materialized views, which are the equivalent of a summary. The end user queries the tables and views at the detail data level. The query rewrite mechanism in the Oracle server automatically rewrites the SQL query to use the summary tables / views.
– There are several types of query rewrite methods, as described below:
– Within the scope of this case study, I have used “rollup using a dimension” as a method of query rewrite. The definition of this method is outlined below:
- When reporting is required at different levels in a hierarchy, “mv”s do not have to be created at each level of the hierarchy, IF relational dimensions have been defined. Query rewrite can use the relationships defined in the dimension metadata, to roll up the data in the “mv” at the levels required.
– The cost is an estimated value proportional to the expected resource use needed to execute the statement with a particular plan.
– The optimizer calculates the cost of access paths and join orders based on the estimated computer resources, which includes I/O, CPU, and memory.
– The value of column COST does not have any particular unit of measurement; it is a weighted value used to compare costs of execution plans. The value of this column is a function of the CPU_COST and IO_COST columns.
– “consistent gets” is the term used by Oracle to describe an Oracle I/O that is done exclusively from the buffer cache.
– Oracle has several types of “consistent gets”, which are all undocumented
– Types of “consistent gets”:
- consistent gets
- consistent gets from cache
- consistent gets – examination
- consistent gets direct
SUMMARY
– The value of column COST does not have any particular unit of measurement; it is a weighted value used to compare costs of execution plans.
– Consistent Gets – a normal reading of a block from the buffer cache. A check will be made if the data needs reconstructing from rollback info to give you a consistent view but most of the time it won’t.
– DB Block Gets – internal processing. When the very latest version of the block is needed. Don’t worry about them unless you are interested in oracle internals and have a lot of time to spend on it.
– Physical Reads – Where Oracle has to get a block from the IO subsystem.
The Relationships Among Segments, Extents, and Data Blocks
- At the finest level of granularity, Oracle stores data in data blocks(also called logical blocks, Oracle blocks, or pages). One data block corresponds to a specific number of bytes of physical database space on disk.
- The next level of logical database space is an extent. An extent is a specific number of contiguous data blocks allocated for storing a specific type of information.
- The level of logical database storage greater than an extent is called a segment. A segment is a set of extents, each of which has been allocated for a specific data structure and all of which are stored in the same tablespace.
- As we can see in Figure 2 and in Figure 3, a segment can be made of extents of various sizes.
For Figure 3, we have: 23 * 1048576 + 16 * 65536 = 25165824
The Oracle Query Optimizer
Depending on some of the database initialization parameters, as well as the modified system or session parameters, either one of, or both, paths / flows may be executed by the Oracle optimizer, as depicted in Figure 4.
There are 2 parameters which directly influence the oracle optimizer with regards to query rewrite:
- query_rewrite_enabled
- query_rewrite_integrity
For example, if “query_rewrite_enabled = false”, then only the left path is executed.
If “query_rewrite_enabled = true” and the query can be rewritten, then both paths / flows are executed, and the plan with the lowest cost is chosen for execution.
There are 3 possible values for the parameter query_rewrite_integrity:
- Stale_tolerated
- Trusted (value used in this case study)
- Enforced (default value)
– When “query_rewrite_integrity = trusted”, the query optimizer trusts that the relationships declared in the dimensions and the rely constraints are correct. The optimizer will also use pre-built “mv”s, even though it cannot verify their content against their defining query. This mode offers greater query rewrite capabilities. To ensure correct results, the declared trusted relationships must be correct.
– When “query_rewrite_integrity = stale_tolerated”, the optimizer uses the same trusted relationships as in “trusted” mode, and it will use “mv”s that are known to be stale.
– When “query_rewrite_integrity = enforced”, the optimizer will only use “mv”s that it knows to contain fresh data, and it will only use validated relationships (all constraints must be validated). Sometimes, query rewrite might not occur by using this setting, but it will occur by using the “trusted” or “stale_tolerated” settings. For example, when using constraints with the “enable novalidate rely
” clause, certain types of query rewrite might not work, in which case we should set the integrity level to “trusted” or ”stale_tolerated”.
– Note: “rely” constraints are not validated, and are not used for data validation. They can enable more sophisticated query rewrites for “mv”s, and can enable DWH tools to retrieve constraint information directly from the data dictionary. They can be created using minimal overhead during DML or data load.
– There are 3 possible values for the parameter query_rewrite_enabled:
- true (enable query rewrite)
- false (disable query rewrite)
- force (this will deactivate any costing evaluation of a rewritten plan, and it will rewrite a query whenever possible)
Continued on next page…
<< Prev Page | Next Page >>