Monday 8 June 2020

Materialized Views in Oracle

Materialized Views in Oracle
A materialized view is a database object that contains the results of a query. They are local copies of data located remotely, or are used to create summary tables based on aggregations of a table's data. Materialized views, which store data based on remote tables are also, know as snapshots.
A materialized view can query tables, views, and other materialized views. Collectively these are called master tables (a replication term) or detail tables (a data warehouse term).
For replication purposes, materialized views allow you to maintain copies of remote data on your local node. These copies are read-only. If you want to update the local copies, you have to use the Advanced Replication feature. You can select data from a materialized view as you would from a table or view.
For data warehousing purposes, the materialized views commonly created are aggregate views, single-table aggregate views, and join views.
In this article, we shall see how to create a Materialized View and discuss Refresh Option of the view.
In replication environments, the materialized views commonly created are primary key, rowid, and subquery materialized views.
Primary Key Materialized Views
The following statement creates the primary-key materialized view on the table emp located on a remote database.
SQL>    CREATE MATERIALIZED VIEW mv_emp_pk
     REFRESH FAST START WITH SYSDATE
     NEXT  SYSDATE + 1/48
     WITH PRIMARY KEY
     AS SELECT * FROM emp@remote_db;

Materialized view created.
Note: When you create a materialized view using the FAST option you will need to create a view log on the master tables(s) as shown below:
SQL> CREATE MATERIALIZED VIEW LOG ON emp;
Materialized view log created.
Rowid Materialized Views
The following statement creates the rowid materialized view on table emp located on a remote database:
SQL>    CREATE MATERIALIZED VIEW mv_emp_rowid
     REFRESH WITH ROWID
     AS SELECT * FROM emp@remote_db;

Materialized view log created.
Subquery Materialized Views
The following statement creates a subquery materialized view based on the emp and dept tables located on the remote database:
SQL> CREATE MATERIALIZED VIEW  mv_empdept
AS SELECT * FROM emp@remote_db e
WHERE EXISTS
  (SELECT * FROM dept@remote_db d
  WHERE e.dept_no = d.dept_no)

REFRESH CLAUSE

[refresh [fast|complete|force]
     [on demand | commit]
     [start with date] [next date]
     [with {primary key|rowid}]]
The refresh option specifies:
  1. The refresh method used by Oracle to refresh data in materialized view
  2. Whether the view is primary key based or row-id based
  3. The time and interval at which the view is to be refreshed
Refresh Method - FAST Clause
The FAST refreshes use the materialized view logs (as seen above) to send the rows that have changed from master tables to the materialized view.
You should create a materialized view log for the master tables if you specify the REFRESH FAST clause.
SQL> CREATE MATERIALIZED VIEW LOG ON emp;

Materialized view log created.
Materialized views are not eligible for fast refresh if the defined subquery contains an analytic function.
Refresh Method - COMPLETE Clause
The complete refresh re-creates the entire materialized view. If you request a complete refresh, Oracle performs a complete refresh even if a fast refresh is possible.
Refresh Method - FORCE Clause
When you specify a FORCE clause, Oracle will perform a fast refresh if one is possible or a complete refresh otherwise. If you do not specify a refresh method (FAST, COMPLETE, or FORCE), FORCE is the default.
PRIMARY KEY and ROWID Clause
WITH PRIMARY KEY is used to create a primary key materialized view i.e. the materialized view is based on the primary key of the master table instead of ROWID (for ROWID clause). PRIMARY KEY is the default option. To use the PRIMARY KEY clause you should have defined PRIMARY KEY on the master table or else you should use ROWID based materialized views.
Primary key materialized views allow materialized view master tables to be reorganized without affecting the eligibility of the materialized view for fast refresh.
Rowid materialized views should have a single master table and cannot contain any of the following:
  • Distinct or aggregate functions
  • GROUP BY Subqueries , Joins & Set operations

Timing the refresh

The START WITH clause tells the database when to perform the first replication from the master table to the local base table. It should evaluate to a future point in time. The NEXT clause specifies the interval between refreshes
SQL>    CREATE MATERIALIZED VIEW mv_emp_pk
     REFRESH FAST
     START WITH SYSDATE
     NEXT  SYSDATE + 2
     WITH PRIMARY KEY
     AS SELECT * FROM emp@remote_db;

Materialized view created.
In the above example, the first copy of the materialized view is made at SYSDATE and the interval at which the refresh has to be performed is every two days.

Summary

Materialized Views thus offer us flexibility of basing a view on Primary key or ROWID, specifying refresh methods and specifying time of automatic refreshes.

Question:  I need to understand how refresh groups work in my Oracle database.  Can you explain the refresh group, how a refresh group works and when to use a refresh group?

Answer:  Sometimes you need multiple Materialized Views to be consistent with each other. This requires that they be refreshed together. To accomplish this, you can assign Materialized Views to refresh groups. When you refresh theMaterialized Views in a refresh group, all of the Materialized Views in that group will be refreshed to the same consistent point and time.
If they are not refreshed, the data in an Materialized View will become "stale" and will not reflect the data as it looks in the target table. As a review, Oracle offers several different methods of refreshing the Materialized View in a refresh group:
  • Fast Refresh – This causes the Materialized View to be updated with only the rows that have changed since it was last refreshed. You must have anMaterialized View Log on the target table in order to be able to fast refresh a view in Oracle. Fast refreshes have the benefit of not taking much time.
  • Complete Refresh – A complete refresh will cause the entire Materialized View to be truncated and then repopulated by data from the master table.This can be a very expensive operation in terms of both resources and time.
  • Force Refresh – This refresh method will cause Oracle to refresh using a fast refresh, if that is possible. If a fast refresh is not possible, Oracle will perform a complete refresh. This refresh method helps eliminate problems in certain cases where the Materialized View fast refresh would fail.
A refresh group refreshes all materialized views in the group together. By creating a refresh group and including both the employee and salary materialized views, the data integrity between the materialized views is maintained.
Refresh groups are created and maintained using the dbms_refresh package. There are a few practical limitations on refresh groups. The number of materialized views that can be in one refresh group is limited by the time required to execute the refresh.
If a refresh group is created with 100 replicated materialized views, the time it takes to refresh the 100 views will probably be prohibitive. None of the tables in the refresh group will be available during the long refresh. Especially with replicated materialized views, consideration must be given to the amount of time it takes to refresh across the network.
Another concern is the amount of redo/undo information generated during a refresh. A refresh group will generate more redo/undo than individual materialized views. For this reason, it is recommended that only materialized views that require referential integrity be placed into groups. Allowing the remainder of the materialized views to refresh on their own will reduce blocking waits caused by the refresh.
To create a refresh group called REP_GROUP1 for the replicated materialized views EMP AND DEPT use the following commands.
BEGIN
DBMS_REFRESH.MAKE(
name => '"SCOTT"."REP_GROUP1"',
list => '',
next_date => SYSDATE,
interval => '/*3:Mins*/ sysdate + 3/(60*24)',
implicit_destroy => TRUE,
lax => FALSE,
job => 0,
rollback_seg => NULL,
push_deferred_rpc => FALSE,
refresh_after_errors => TRUE,
purge_option => NULL,
parallelism => NULL,
heap_size => NULL);
END;
/
The command above creates an empty refresh group in the SCOTT schema called REP_GROUP1. Materialized views could be added using the MAKE command by listing them in the list parameter, such as:
list=>’emp,dept’
Both materialized views EMP and DEPT are added to REP_GROUP1. Because the lax clause is TRUE, if either view belonged to another refresh group, they would be added to REP_GROUP1 and deleted from the other group. Notice that the views are in the list clause. The same results could be obtained using the following command:
BEGIN
DBMS_REFRESH.ADD(
name => '"SCOTT"."REP_GROUP1"',
list => '"SCOTT"."EMP","SCOTT"."DEPT"',
lax => TRUE);
END;
/
To remove SCOTT.DEPT from REP_GROUP1, use the dbms_refresh.subtractprocedure.
BEGIN
DBMS_REFRESH.SUBTRACT(
name => '"SCOTT"."REP_GROUP1"',
list => 'SCOTT"."DEPT"',
lax => FALSE);
END;
/
If a materialized view is removed from a refresh group without being placed in another group, it will be placed in its own refresh group.
The dbms_refresh package handles the jobs that execute the refresh groups.
To remove SCOTT.DEPT from REP_GROUP1, use the dbms_refresh.subtractprocedure.
 BEGIN
  DBMS_REFRESH.SUBTRACT(
  name => '"SCOTT"."REP_GROUP1"',
  list => 'SCOTT"."DEPT"',
  lax => FALSE);
END;
/
If a materialized view is removed from a refresh group without being placed in another group, it will be placed in its own refresh group. 
Warning! - When a materialized view is placed in a refresh group, it will be refreshed at the interval set in the group, not in the materialized view. 
For example, if a materialized view is created with a refresh interval of 3 mins and is then placed in a refresh group with an internal of 5 mins, the materialized view will refresh every 5 mins.  However, the interval setting in the materialized view will still be 3 mins.  Once the materialized view is removed from the refresh group it will again refresh at 3 mins.  This causes confusion, especially when using OEM, because the materialized view indicates it is refreshing every 3 mins when in fact it is not.
The CHANGE procedure in the dbms_refresh package should also be mentioned.  The most common change is the refresh interval.
BEGIN
   DBMS_REFRESH.CHANGE(
  name => '"SCOTT"."REP_GROUP1"',
  next_date => SYSDATE,
  interval => '/*5:Mins*/ sysdate + 5/(60*24)');
END;
/
Here, the next execution date is set to SYSDATE (in other words, NOW) and the interval is set to 5 mins. 



The Secrets of Materialized Views

Overview
Materialized views are a data warehousing/decision support system tool that can increase by many orders of magnitude the speed of queries that access a large number of records. In basic terms, they allow a user to query potentially terabytes of detail data in seconds. They accomplish this by transparently using pre-computed summarizations and joins of data. These pre-computed summaries would typically be very small compared to the original source data.
In this article you'll find out what materialized views are, what they can do and, most importantly, how they work - a lot of the ' magic ' goes on behind the scenes. Having gone to the trouble of creating it, you'll find out how to make sure that your materialized view is used by all queries to which the view is capable of providing the answer. Sometimes, you know Oracle could use the materialized view, but it is not able to do so simply because it lacks important information.
Setup of Materialized Views
There is one mandatory INIT.ORA parameter necessary for materialized views to function, this is the COMPATIBLE parameter. The value of COMPATIBLE should be set to 8.1.0, or above, in order for query rewrites to be functional. If this value is not set appropriately, query rewrite will not be invoked.
There are two other relevant parameters that may be set at either the system-level via the INIT.ORA file, or the session-level via the ALTER SESSION command.
   QUERY_REWRITE_ENABLED
Unless the value of this parameter is set to TRUE, query rewrites will not take place. The default value is FALSE.
   QUERY REWRITE INTEGRITY
This parameter controls how Oracle rewrites queries and may
be set to one of three values:
ENFORCED - Queries will be rewritten using only constraints and rules that are enforced and guaranteed by Oracle. There are mechanisms by which we can tell Oracle about other inferred relationships, and this would allow for more queries to be rewritten, but since Oracle does not enforce those relationships, it would not make use of these facts at this level.
TRUSTED - Queries will be rewritten using the constraints that are enforced by Oracle, as well as any relationships existing in the data that we have told Oracle about, but are not enforced by the database.
STALE TOLERATED - Queries will be rewritten to use materialized views even if Oracle knows the data contained in the materialized view is ' stale ' (out-of-sync with the details). This might be useful in an environment where the summary tables are refreshed on a recurring basis, not on commit, and a slightly out-of-sync answer is acceptable.
The needed privileges are as follows:
   CREATE SESSION
   CREATE TABLE
   CREATE MATERIALIZED VIEW
   QUERY REWRITE
Finally, you must be using the Cost Based Optimizer CBO in order to make use of query rewrite. If you do not use the CBO, query rewrite will not take place.
Example
The example will demonstrate what a materialized view entails. The concept is that of reducing the execution time of a long running query transparently, by summarizing data in the database. A query against a large table will be transparently rewritten into a query against a very small table, without any loss of accuracy in the answer. For the example we create our own big table based on the system view ALL_OBJECTS.
Prepare the large table BIGTAB:
sqlplus scott/tiger
set echo on
set termout off
drop table bigtab;

create table bigtab
  nologging
  as
  select * from all_objects
  union all
  select * from all_objects
  union all
  select * from all_objects
/
insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;
insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;
insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;
analyze table bigtab compute statistics;
select count(*) from bigtab;
   COUNT(*)
----------
708456
Run query against this BIGTABLE
Initially this quewry will require a full scan of the large table.
set autotrace on
set timing on
select owner, count(*) from bigtab group by owner;
OWNER                            COUNT(*)
------------------------------ ----------
CTXSYS                               6264
ELAN                                 1272
HR                                    816
MDSYS                                5640
ODM                                  9768
ODM_MTR                               288
OE                                   2064
OLAPSYS                             10632
ORDPLUGINS                            696
ORDSYS                              23232
OUTLN                                 168
PM                                    216
PUBLIC                             278184
QS                                    984
QS_ADM                                168
QS_CBADM                              576
QS_CS                                 552
QS_ES                                 936
QS_OS                                 936
QS_WS                                 936
SCOTT                                 264
SH                                   4176
SYS                                324048
SYSTEM                              15096
TEST                                 4536
WKSYS                                6696
WMSYS                                3072
XDB                                  6240

28 rows selected.

Elapsed: 00:00:07.06

Execution Plan
----------------------------------------------------------
   0  SELECT STATEMENT Optimizer=CHOOSE
          (Cost=2719 Card=28 Bytes=140)
   1 0   SORT (GROUP BY) (Cost=2719 Card=28 Bytes=140)
   2 1 TABLE ACCESS (FULL) OF 'BIGTAB'
              (Cost=1226 Card=708456 Bytes=3542280)

Statistics
----------------------------------------------------------
       0  recursive calls
       0  db block gets
   19815  consistent gets
   18443  physical reads
       0  redo size
     973  bytes sent via SQL*Net to client
     510  bytes received via SQL*Net from client
       3  SQL*Net roundtrips to/from client
       1  sorts (memory)
       0  sorts (disk)
      28  rows processed
In order to get the aggregate count, we must count 700'000+ records on over 19800 blocks. If you need this summary often per day, you can avoid counting the details each and every time by creating a materialized view of this summary data.
Create the Materialized View
sqlplus scott/tiger
grant query rewrite to scott;
alter session set query_rewrite_enabled=true;
alter session set query_rewrite_integrity=enforced;
create materialized view mv_bigtab
  build immediate
  refresh on commit
  enable query rewrite
as
select owner, count(*)
  from bigtab
 group by owner
/
analyze table mv_bigtab compute statistics;
Basically, what we've done is pre-calculate the object count, and define this summary information as a materialized view. We have asked that the view be immediately built and populated with data. You'll notice that we have also specified REFRESH ON COMMIT and ENABLE QUERY REWRITE. Also notice that we may have created a materialized view, but when we ANALYZE, we are analyzing a table. A materialized view creates a real table, and this table may be indexed, analyzed, and so on.
Now let's see the materialized view in action by issuing the same query again
set timing on
set autotrace traceonly
select owner, count(*)
  from bigtab
 group by owner;
set autotrace off
set timing off
28 rows selected.

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
   0  SELECT STATEMENT Optimizer=CHOOSE
          (Cost=2 Card=28 Bytes=252)
   1 0   TABLE ACCESS (FULL) OF 'MV_BIGTAB'
            (Cost=2 Card=28 Bytes=252)

Statistics
----------------------------------------------------------
      11  recursive calls
       0  db block gets
      17  consistent gets
       0  physical reads
       0  redo size
     973  bytes sent via SQL*Net to client
     510  bytes received via SQL*Net from client
       3  SQL*Net roundtrips to/from client
       4  sorts (memory)
       0  sorts (disk)
      28  rows processed
No physical I/O this time around as the data was found in the cache. Our buffer cache will be much more efficient now as it has less to cache. W could not even begin to cache the previous query's working set, but now I can. Notice how our query plan shows we are now doing a full scan of the MV_BIGTAB table, even though we queried the detail table BIGTAB. When the SELECT OWNER, ... query is issued, the database automatically directs it to the materialized view.
Now, add a new row to the BIGTAB table and commit te change
insert into bigtab
  (owner, object_name, object_type, object_id)
  values ('Martin', 'Zahn', 'Akadia', 1111111);

commit;
set timing on
set autotrace traceonly
select owner, count(*)
  from bigtab
 where owner = 'Martin'
 group by owner;
set autotrace off
set timing off
Execution Plan
----------------------------------------------------------
   0  SELECT STATEMENT Optimizer=CHOOSE
          (Cost=2 Card=1 Bytes=9)
   1 0   TABLE ACCESS (FULL) OF 'MV_BIGTAB'
            (Cost=2 Card=1 Bytes=9)

Statistics
----------------------------------------------------------
       0  recursive calls
       0  db block gets
       4  consistent gets
       0  physical reads
       0  redo size
     439  bytes sent via SQL*Net to client
     499  bytes received via SQL*Net from client
       2  SQL*Net roundtrips to/from client
       0  sorts (memory)
       0  sorts (disk)
       1  rows processed
The analysis shows that we scanned the materialized view MV_BIGTAB and found the new row. By specifying REFRESH ON COMMIT in our original definition of the view, we requested that Oracle maintain synchronization between the view and the details, the summary will be maintained as well.
Uses of Materialized Views
This is relatively straightforward and is answered in a single word - performance. By calculating the answers to the really hard questions up front (and once only), we will greatly reduce the load on the machine, We will experience:
   Less physical reads - There is less data to scan through.
   Less writes - We will not be sorting/aggregating as frequently.
   Decreased CPU consumption - We will not be calculating aggregates and functions on the data, as we will have already done that.
   Markedly faster response times - Our queries will return incredibly quickly when a summary is used, as opposed to the details. This will be a function of the amount of work we can avoid by using the materialized view, but many orders of magnitude is not out of the question.
Materialized views will increase your need for one resource - more permanently allocated disk. We need extra storage space to accommodate the materialized views, of course, but for the price of a little extra disk space, we can reap a lot of benefit.
Materialized views work best in a read-only, or read-intensive environment. They are not designed for use in a high-end OLTP environment. They will add overhead to modifications performed on the base tables in order to capture the changes.
There are concurrency issues with regards to rising the REFRESH ON COMMIT option. Consider the summary example from before. Any rows that are inserted or deleted from this table will have to update one of 28 rows in the summary table in order to maintain the count in real time. This does not preclude the use of materialized views in an OLTP environment. For example if you use full refreshes on a recurring basis (during off-peak time) there will be no overhead added to the modifications, and there would be no concurrency issues. This would allow you to report on yesterday's activities, for example, and not query the live OLTP data for reports.
How Materialized Views Work
Materialized views may appear to be hard to work with at first. There will be cases where you create a materialized view, and you know that the view holds the answer to a certain question but, for some reason, Oracle does not. The more meta data provided, the more pieces of information about the underlying data you can give to Oracle, the better.
So, now that we can create a materialized view and show that it works, what are the steps Oracle will undertake to rewrite our queries? Normally, when QUERY REWRITE ENABLED is set to FALSE, Oracle will take your SQL as is, parse it, and optimize it. With query rewrites enabled, Oracle will insert an extra step into this process. After parsing, Oracle will attempt to rewrite the query to access some materialized view, instead of the actual table that it references. If it can perform a query rewrite, the rewritten query (or queries) is parsed and then optimized along with the original query. The query plan with the lowest cost from this set is chosen for execution. If it cannot rewrite the query, the original parsed query is optimized and executed as normal.
Conclusion
Summary table management, another term for the materialized view, has actually been around for some time in tools such as Oracle Discoverer. If you ran a query in SQL*PLUS, or from your Java JDBC client, then the query rewrite would not (could not) take place. Furthermore, the synchronization between the details (original source data) and the summaries could not be performed or validated for you automatically, since the tool ran outside the database.
Furthermore, since version 7.0, the Oracle database itself has actually implemented a feature with many of the characteristics of summary tables - the Snapshot. This feature was initially designed to support replication, but many would use it to ' pre-answer ' large queries. So, we would have snapshots that did not use a database link to replicate data from database to database, but rather just summarized or pre-joined frequently accessed data. This was good, but without any query rewrite capability, it was still problematic. The application had to know to use the summary tables in the first place, and this made the application more complex to code and maintain. If we added a new summary then we would have to find the code that could make use of it, and rewrite that code.
In Oracle 8.1.5 Oracle took the query rewriting capabilities from tools like Discoverer, the automated refresh and scheduling mechanisms from snapshots (that makes the summary tables ' self maintaining ' ), and combined these with the optimizer's ability to find the best plan out of many alternatives. This produced the materialized view.
With all of this functionality centralized in the database, now every application can take advantage of the automated query rewrite facility, regardless of whether access to the database is via SQL*PLUS, JDBC, ODBC, Pro*C, OCI, or some third party tool. Every Oracle 8i enterprise database can have summary table management. Also, since everything takes place inside the database, the details can be easily synchronized with the summaries, or at least the database knows when they aren't synchronized, and might bypass stale summaries.

No comments:

Post a Comment