Thursday, February 14, 2013

AWR Analysis


http://www.toadworld.com/Experts/GuyHarrisonsImprovingOraclePerformance/ResolvingOracleContention/tabid/257/Default.aspx
http://www.dba-oracle.com/art_sql_tune.htm
http://www.pafumi.net/Tips_for_Effective_Queries.html


Rows processed—Queries that process a large number of rows will have high I/O and may also have impact on the TEMP tablespace.

Buffer gets—High buffer gets may indicate a resource-intensive query.

Disk reads—High disk reads indicate a query that is causing excessive I/O.

Memory KB—The memory allocation of a SQL statement is useful for identifying statements that are doing in-memory table joins.

CPU secs—This identifies the SQL statements that use the most processor resources.

Sorts—Sorts can be a huge slowdown, especially if they’re being done on a disk in the TEMP tablespace.

Executions—The more frequently executed SQL statements should be tuned first, since they will have the greatest impact on overall performance.

http://docs.oracle.com/cd/B10500_01/server.920/a96536/apc2.htm
http://savvinov.com/2012/04/06/awr-reports-interpreting-cpu-usage/
http://dbaregistry.blogspot.in/2010/01/oracle-wait-events.html
http://www.dba-oracle.com/t_online_redo_logs_raid.htm
http://samadhandba.wordpress.com/


STATISTICS_LEVEL should be TYPICAL or ALL for getting AWR report

---------------How to reduce buffer gets
Buffer gets—High buffer gets may indicate a resource-intensive query.

you should be able to reduce buffer gets by reducing read operations This can sometimes be done by indexed lookups, sometimes not. It depends on the situation.

Buffer gets means that the data block is in memory and Oracle is returning data from the buffer cache and NOT having to go to disk, which is bad

when the value of buffer gets for SQL statement increased ,The amount of Oracle CPU has been increased .
-------------------

RAM is indirectly proportional to disk IO
if RAM has high memory then disk IO will be low

RAID 1 is best for Oracle DB.
----------------
Check DB Time
If DB time is significantly higher than the elapsed time then things are waiting.
 example, the Elapsed Time is around 60 minutes while the DB Time is around 700 minutes. This means that 700 minutes of time is spent by the sessions on waiting.
---------------------

Look for the Top 5 Timed Foreground Events. Investigate what is triggering these events? (Ignore idle events)

SQL ordered by Elapsed Time => Always helps to find the time consuming sql (Observe the elapsed time as well as number of executions also).
SQL ordered by Reads should also be noticed for I/O intensive queries.


------------------------
PGA Memory Advisory
When using Auto Memory Mgmt, minimally choose a pga_aggregate_target value where Estd PGA Overalloc Count is 0
when Estd PGA Cache Hit %=99% starts means this is sufficient pga for DB.
------------------------
Tablespace IO Stats

5-10 msec is normal
not exceed from 20msec

------------------------
File IO Stats depends on disk storage
8-10 msec is normal

------------------------
Enqueue Activity
Enqueue Type (Request Reason)
 TM lock
this could be caused by missing indexes on foreign key columns.
Problem with standard edition, fine with enterprise edition
The foreign-key issue is that DML on the child table locks the parent table
-----------------

“Starting with Oracle 10.2,  you may notice a significant degradation in relative performance when you combine a GROUP BY with an ORDER BY on the same columns.

When a GROUP BY is associated with an ORDER BY on the same columns,  the Oracle optimizer may choose a SORT GROUP BY rather than the usually more efficient HASH GROUP BY.

Using the SORT GROUP BY avoids adding a SORT ORDER BY to the plan, but the overall result is usually disappointing.

To get a better result, you can perform the GROUP BY in an in-line view and perform the ORDER BY in the outer query.  Use the NO_MERGE hint to prevent the two operations from being combined.”
-------------

Performance issue is caused by soft parsing, which can be solved by holding cursors (session_cached_cursors). It may be due to oversized shared pool, which is causing delay in searching child cursors. My second thought is, there is large number of reloads, which can be due to under-sized shared pool, if invalidation activities are not going (CBO statistics collection, DDL etc), cursors are being flushed frequently.

-------------
Recursive calls are also generated due to the inavailability of dictionary info in the dictionary cache, firing of database triggers, execution of DDL, execution of SQL within PL/SQL blocks, functions or stored procedures and enforcement of referential integrity constraints.

If the recursive CPU usage percentage is a large proportion of total CPU usage, you may want to make sure the shared pool memory allocation is adequate. However, a PL/SQL-based application will always have a significant amount of recursive CPU usage.

To reduce the recursive CPU usage, make sure that shared pool is sized correctly. Also you can check the parameters like

CURSOR_SHARING – values can be EXACT, SIMILAR, FORCE
SESSION_CACHED_CURSORS – value can be anything numeric like 500.
-------------------
Remove unnecessary large-table full table scans Unnecessary full table scans cause a huge amount of unnecessary I/O, and can drag down an entire database.
---------------------
Load Profile

Parse should be near 0

Parse activity statistics should be checked carefully because they can immediately indicate a problem within the application. For example, a database has been running several days with a fixed set of applications, it should, within a course of time, parse most SQLs issued by the applications, and these statistics should be near zero.

If there are high values of Soft Parses  or especially Hard Parses statistics, such values should be taken as an indication that the applications make little use of bind variables and produce large numbers of unique SQLs. However, if the database serves developmental purposes, high vales of these statistics are not bad.

% Blocks changed per Read:
The % Blocks changed per Read statistic indicates that only 4.85 percent of all blocks are retrieved for update, and in this example, the Recursive Call % statistic is extremely high with about 90 percent. However, this fact does not mean that nearly all SQL statements executed by the database are caused by parsing activity, data dictionary management, space management, and so on.

It is also useful to check the value of the Rollback per transaction % statistic. This statistic reports the percent of transactions rolled back. In a production system, this value should be low. If the output indicates a high percentage of transactions rolled back, the database expends a considerable amount of work to roll back changes made.  This should be further investigated in order to see why the applications roll back so often.

Rollback per transaction %  should be low.

---------------------

CPU time represents time spent on CPU and does not include time waiting for CPU.

CPU usage is described by “CPU time” (or “DB CPU”) statistics. Somewhat counterintuitively, AWR report showing CPU time close to 100% in the top timed events section does not necessarily indicate a problem. It simply means that database is busy using CPU to do work for its users. However, if CPU time (expressed in CPU seconds) becomes commensurate to the total CPU power available on the host (or shows consistent growth patterns), then it becomes a problem, and a serious one: this means that at best, Oracle processes will wait lots of time to get on CPU runqueue. In the worst case scenario, the host OS won’t have adequate resources to run and may eventually hang.

DB CPU usage (% of CPU power available) = CPU time / NUM_CPUS / elapsed time

If DB CPU usage is at 80-90% of the capacity (or 70-80% and growing) then you try to reduce CPU usage or if not possible, buy more CPU power before the system freezes.

----------------------Instance Efficiency Percentage
 these all will be above 90%

Parse CPU to Parse Elapsd %: 0.00 % shows how much time the CPU is parsing SQL statments, so the lower the better.

The Instance Efficiency Percentage report section contains ratios or calculations that may provide information regarding different structures and operations in the Oracle instance. Database tuning must never be driven by hit ratios. Hit ratios only provide additional information to help the DBA understand how the instance is operating.

http://www.dba-oracle.com/oracle10g_tuning/t_instance_efficiency_percentage.htm


The following list includes the meanings of particular hit ratios:
Buffer Hit Ratio: Measures how many times a required block was found in memory rather than having to execute an expensive read operation on disk to get the block.
Buffer Nowait %: Shows the percentage of times when data buffers were accessed directly without any wait time.
Library Hit %: Shows the percentage of times when SQL statements and PL/SQL packages were found in the shared pool.
Execute to Parse %: Shows how often parsed SQL statements are reused without reparsing.
Parse CPU to Parse Elapsd %: Gives the ratio of CPU time spent to parse SQL statements.
Redo NoWait %: Shows whether the redo log buffer has sufficient size.
In-memory Sort %: Shows the percentage of times when sorts are performed in memory instead of using temporary tablespaces.
Soft Parse %: Shows how often sessions issued a SQL statement that is already in the shared pool and how it can use an existing version of that statement.
Latch Hit %: Shows how often latches were acquired without having to wait.
% Non-Parse CPU: Shows the percentage of how much CPU resources were spent on the actual SQL execution.

In the above list of statistics, special attention should be paid to parse-related statistics. The Instance Efficiency Percentage report provided previously shows that about 95 percent of the parses are soft as indicated by the Soft Parse %. This is good enough, indicating that the SQL statements are actively reused by Oracle.


The next interesting item to review is the Parse CPU to Parse Elapsd % statistic. In this case, it is about three percent, which is very low. This fact reveals that Oracle waits for some resources during parsing of SQL statements. This should be investigated further to find the cause.

In this case, % Non-Parse CPU statistic is about 97 percent, which is quite high.  This indicates Oracle utilizes the CPU mostly for statement execution but not for parsing.

As a rule of thumb, one should always minimize the number of hard parses in your production database. This reduction yields the benefit of minimizing CPU overhead spent performing costly parse work.


------------------Shared Pool Statistics ------------------------

Idealy this should be lesser. If it is very high like beyond 90, this shows the contention in the shared pool.

This following sample report section shows shared pool related statistics:

Shared Pool Statistics        Begin    End
                              ------  ------
             Memory Usage %:   92.70   92.49
    % SQL with executions>1:   86.73   84.20
  % Memory for SQL w/exec>1:   84.12   71.86

In this example, the Memory Usage % statistic shows that almost all, approximately 92 percent, of the shared pool memory is consumed. This could indicate that the system experiences some overhead while aging out old shared memory structures like cursors, PL/SQL programs, and so on. This places additional overhead on the CPU to perform reparsing aging-out. The size of the shared pool should be increased appropriately to eliminate such overhead. In general, this statistic should be near 70 percent after the database has been running a long time.
If it is quite low, memory is being wasted.

Memory Usage % >: The sweet point for the shared pool Memory Usage % is at about ~70%-80% because of aging out. You are at ~90%. It would be advisable to increase the shared pool, but if you're using 10g SGA_TARGET and SGA_MAX_SIZE, then that might require increasing those parameters.

Memory Usage % >90% then increase sga

The % SQL with executions>1 statistic indicates how many SQL statements are executed more than one time. This measures how well production applications are tuned and how well they make use of bind variables.

% Memory for SQL w/exec>1: the percentage of SQLs executed more then once, that were in Memory


-------------------------------------------Buffer Pool Advisory-----------
If you check the column Size for Est(M) for Size Factor 1.0 it is 596M which means 596MB of memory has been allocated by Oracle either automatically if sga_target > 0 OR manually in the parameter file.

Simultaneously we need to check the column “Estimated Physical Reads” which has a value of 37,612,741.

You will notice that increasing buffer cache size will reduce the number of “Estimated Physical Reads”.

Size Factor ”1.0” also has Estimated Phys Read Fator ”1.0” with “Estimated Physical Reads” as37,612,741.

Note how ‘Size Factor’ increases and ‘Estimated Phys Read Factor’ decreases.


 In cases when sga_target > 0 and db_cache_size is determined automatically, similar data would mean that allocated by Oracle size of the buffer cache is fairly balanced, i.e. not too small to have excessive number of physical reads and not too big to have unnecessary overhead while serving a bigger buffer cache.


--------------------------------Top 5 Timed Events
Sometimes, a major external problem such as bad disk RAID can cause the entire database to run slowly, so it is always a good idea to start with a listing of the AWR top five wait event listing.

This tell you where the highest resources are being consumed.

db file sequential read would indicate that there where full table scans being performed.

 Interpretation of CPU Time(s) 1033

We have 60*60=3600 CPU Seconds to use in a particular interval for single CPU in 1 hour snap


In the example we have 8 CPU (Num_CPUs under the Operating System Statistics of AWR) which relates to    60*60*8= 28800 CPU seconds to use in 1 hr interval. ( Single Database Machine is running on machine)

(1033/28800)*100 = 3.58% of Total CPU

Network Bottleneck
A slow or un-tuned network can be a nightmare.  Network overload can also be caused by Oracle TNS issues or by an inefficient application that makes too many trips to Oracle to fetch screen data.

A CPU-bound database may be due to a real overload of the CPU banks, in which case the problem can be remedied with more or faster processors.  High CPU consumption can also be due to un-tuned SQL causing excessive logical I/O in the form of consistent gets or by contention within the shared pool, such as the total lack of SQL bind variables.



----------------------------Tablespace IO Stats
This shows the IO Statistics for each tablespaces in the database.

As the thumb rule, the Av Rd(ms) [Average Reads in milliseconds] should not cross beyond 30, add myself(not greater that 30) which is considered to be IO bottleneck.

Tablespace Reads Av Reads/s Av Rd(ms) Av Blks/Rd Writes Av Writes/s Buffer Waits Av Buf Wt(ms)
TEMP 3,316,0 4.91 1 28,840 8 0 0
DAT1 520,120 146 16.06 1.21 185,846 52 902 13
DAT3 93,411 26 42.82 2.98 13,442 4 16 23.13
DAT2 98,171 28 91.97 7.97 5,333 2 325 34.89

In the above example, the Av Rd(ms) is high in all tablespaces indicating the IO contention.

------------------------------------------------Wait Events
---------------db file sequential reads
Reason:
Use of an unselective index
Fragmented Indexes
High I/O on a particular disk or mount point
Bad application design
Index reads performance can be affected by  slow I/O subsystem and/or poor database files layout, which result in a higher average  wait time

Action:
Check indexes on the table to ensure that the right index is being used
Check the column order of the index with the WHERE clause of the Top SQL statements
Rebuild indexes with a high clustering factor
Use partitioning to reduce the amount of blocks being visited
Make sure optimizer statistics are up to date
Relocate ‘hot’ datafiles
Consider the usage of multiple buffer pools and cache frequently used indexes/tables in the KEEP pool

Suggestion
The Oracle process wants a block that is currently not in the SGA, and it is waiting for the database block to be read into the SGA from disk.
Significant db file sequential read wait time is most likely an application issue.
If the DBA_INDEXES.CLUSTERING_FACTOR of the index approaches the number of blocks in the table, then most of the rows in the table are ordered. This is desirable.
However, if the clustering factor approaches the number of rows in the table, it means the rows in the table are randomly ordered and thus it requires more I/Os to complete the operation. You can improve the index’s clustering factor by rebuilding the table so that rows are ordered according to the index key and rebuilding the index thereafter.

----------------------db file scattered reads
Reason:
The Oracle session has requested and is waiting for multiple contiguous database blocks (up to DB_FILE_MULTIBLOCK_READ_COUNT) to be  read into the SGA from disk.
Full Table scans
Fast Full Index Scans

Action:
Optimize multi-block I/O by setting the parameter DB_FILE_MULTIBLOCK_READ_COUNT
Partition pruning to reduce number of blocks visited
Optimize the SQL statement that initiated most of the waits. The goal is to minimize the number of physical and logical reads.
The objective is to reduce the demands for both the logical and physical I/Os, and this is best achieved through SQL and application tuning.
Make sure all statistics are representative of the actual data. Check the LAST_ANALYZED date

----------------log file parallel write
Reason:
LGWR waits while writing contents of the redo log buffer cache to the online log files on disk
I/O wait on sub system holding the online redo log files

Action:
Reduce the amount of redo being generated
Do not leave tablespaces in hot backup mode for longer than necessary
Do not use RAID 5 for redo log files
Use faster disks for redo log files
Ensure that the disks holding the archived redo log files and the online redo log files are separate so as to avoid contention
Consider using NOLOGGING or UNRECOVERABLE options in SQL statements

----------------------------------------------log file sync
Reason:
Oracle foreground processes are waiting for a COMMIT or ROLLBACK to complete
 Oracle metric indicates the process is waiting for LGWR to finish flushing the log buffer to disk.

Action:
Tune LGWR to get good throughput to disk eg: Do not put redo logs on RAID5 Reduce overall number of commits by batching transactions so that there are fewer distinct COMMIT operations
Move the redo logs to faster disks or a faster I/O subsystem (for example, switch from RAID 5 to RAID 1).

See if any activity can safely be done with NOLOGGING / UNRECOVERABLE options in order to reduce the amount of redo being written.
Check the size of the log buffer as it may be so large that LGWR is writing too many blocks at one time.


Log File Sync waits occur when sessions wait for redo data to be written to disk
typically this is caused by slow writes
or committing too frequently in the application
CPU overburning(very high demand => LGWR on run queue)
improper Operating System configuration(check 169706.1)
BUGs in Oracle(especially with RAC option) and 3rd Party software(like ODM/DISM)


Recommendations:
tune LGWR process to get good throughput, especially when ‘log file parallel write‘ high too:
do not put redo logs on RAID 5 without good write cache
do not put redo logs on Solid State Disk (SSD)
if CPUs are overburned(check runqueue with vmstat):
check for non-oracle system activity, like GZIP or BZIP2 running in business hours…
lower instance’s CPU usage(for example, tune SQL for LIOs)
increase LGWR priority(renice or _high_priority_processes),
decrease COMMITs count for applications with many short transactions
use COMMIT [BATCH] NOWAIT(10g+) when possible
do some processing with NOLOGGING(or may be even with _disable_logging=TRUE if just testing performance benchmark/impact), but think about database recoverability
lower system’s CPU usage or increase LGWR priority
check if there is some 3rd party software, or utilities like RMAN, activity on the same disks as redo logs placed, like trace/systemstate dump files, e.t.c
trace LGWR as the last option for troubleshooting OS/3rd party issues


---------------buffer busy waits
Reason:
Buffer busy waits are common in an I/O-bound Oracle system.
The two main cases where this can occur are:
Another session is reading the block into the buffer
Another session holds the buffer in an incompatible mode to our request
These waits indicate read/read, read/write, or write/write contention.
The Oracle session is waiting to pin a buffer.
A buffer must be pinned before it can be read or modified. Only one process can pin a buffer at any one time.
This wait can be intensified by a large block size as more rows can be contained within the block
This wait happens when a session wants to access a database block in the buffer cache but it cannot as the buffer is "busy
It is also often due to several processes repeatedly reading the same blocks (eg: if lots of people scan the same index or data block)

Action:
The main way to reduce buffer busy waits is to reduce the total I/O on the system
Depending on the block type, the actions will differ
Eliminate HOT blocks from the application.
Check for repeatedly scanned / unselective indexes.
Try rebuilding the object with a higher PCTFREE so that you reduce the number of rows per block.
 Check for 'right- hand-indexes' (indexes that get inserted into at the same point by many processes).
 Increase INITRANS and MAXTRANS and reduce PCTUSED This will make the table less dense .
Reduce the number of rows per block
Increase of number of FREELISTs and FREELIST GROUPs

----------------------------------------------free buffer waits
Reason:
This means we are waiting for a free buffer but there are none available in the cache because there are too many dirty buffers in the cache
Either the buffer cache is too small or the DBWR is slow in writing modified buffers to disk
DBWR is unable to keep up to the write requests
Checkpoints happening too fast – maybe due to high database activity and under-sized online redo log files
Large sorts and full table scans are filling the cache with modified blocks faster than the DBWR is able to write to disk
If the  number of dirty buffers that need to be written to disk is larger than the number that DBWR can write per batch, then these waits can be observed

Action:
Reduce checkpoint frequency  - increase the size of the online redo log files
Examine the size of the buffer cache – consider increasing the size of the buffer cache in the SGA
Set disk_asynch_io = true set If not using asynchronous I/O increase the number of db writer processes or dbwr slaves
Pre-sorting or reorganizing data can help

------------------------enqueue waits
Reason:
This wait event indicates a wait for a lock that is held by another session (or sessions) in an incompatible mode to the requested mode.
TX Transaction Lock
Generally due to table or application set up issues
This indicates contention for row-level lock. This wait occurs when a transaction tries to update or delete rows that are currently locked by another transaction.
This usually is an application issue.

TM DML enqueue lock
Generally due to application issues, particularly if foreign key constraints have not been indexed.

ST lock
Database actions that modify the UET$ (usedextent) and FET$ (free extent) tables require the ST lock, which includes actions such as drop, truncate, and coalesce.
Contention for the ST lock indicates there are multiple sessions actively performing dynamic disk space allocation or deallocation in dictionary managed tablespaces

Action:
Reduce waits and wait times
Whenever you see an enqueue wait event for the TX enqueue, the first step is to find out who the blocker is and if there are multiple waiters for the same resource
Waits for TM enqueue in Mode 3 are primarily due to unindexed foreign key columns
Create indexes on foreign keys  < 10g
Following are some of the things you can do to minimize ST lock contention in your database:
Use locally managed tablespaces
Recreate all temporary tablespaces using the CREATE TEMPORARY TABLESPACE TEMPFILE… command.

--------------------------------Cache buffer chain latch
Reason:
This latch is acquired when searching for data blocks Buffer cache is a chain of blocks and each chain is protected by a child latch when it needs to be scanned
Hot blocks are another common cause of cache buffers chains latch contention. This happens when multiple sessions repeatedly access one or more blocks that are protected by the same child cache buffers chains latch.
SQL statements with high BUFFER_GETS (logical reads) per EXECUTIONS are the main culprits
Multiple concurrent sessions are executing the same inefficient SQL that is going after the same data set

Action:
Reducing contention for the cache buffer chains latch will usually require reducing logical I/O rates by tuning and minimizing the I/O requirements of  the SQL involved. High I/O rates could be a sign of a hot block (meaning a block highly accessed).
Exporting the table, increasing the PCTFREE significantly, and importing the data. This minimizes the number of rows per block, spreading them over many blocks. Of course, this is at the expense of storage and full table scans operations will be slower
Minimizing the number of records per block in the table
For indexes, you can rebuild them with higher PCTFREE values, bearing in mind that this may increase the height of the index.


--------------------Direct Path Reads
Reason:
These waits are associated with direct read operations which read data directly into the sessions PGA bypassing the SGA
The "direct path read" and "direct path write" wait events are related to operations that are performed in PGA like sorting, group by operation, hash join
In DSS type systems, or during heavy batch periods, waits on "direct path read" are quite normal
However, for an OLTP system these waits are significant
These wait events can occur during sorting operations which is not surprising as direct path reads and writes usually occur in connection with temporary tsegments
SQL statements with functions that require sorts, such as ORDER BY, GROUP BY, UNION, DISTINCT, and ROLLUP, write sort runs to the temporary tablespace when the input size is larger than the work area in the PGA

Action:
Ensure the OS asynchronous IO is configured correctly.
Check for IO heavy sessions / SQL and see if the amount of IO can be reduced.
Ensure no disks are IO bound.
Set your PGA_AGGREGATE_TARGET to appropriate value (if the parameter WORKAREA_SIZE_POLICY = AUTO)
Whenever possible use UNION ALL instead of UNION, and where applicable use HASH JOIN instead of SORT MERGE and NESTED LOOPS instead of HASH JOIN.
Make sure the optimizer selects the right driving table. Check to see if the composite index’s columns can be rearranged to match the ORDER BY clause to avoid sort entirely.


------------------------Direct Path  Writes
These are waits that are associated with direct write operations that write data from users’ PGAs to data files or temporary tablespaces
Direct load operations (eg: Create Table as Select (CTAS) may use this)
Parallel DML operations
Sort IO (when a sort does not fit in memory

Action:
If the file indicates a temporary tablespace check for unexpected disk sort operations.
Ensure <Parameter:DISK_ASYNCH_IO> is TRUE . This is unlikely to reduce wait times from the wait event timings but may reduce sessions elapsed times (as synchronous direct IO is not accounted for in wait event timings).
Ensure the OS asynchronous IO is configured correctly.
Ensure no disks are IO bound


-------------------------Latch Free Waits
This wait indicates that the process is waiting for a latch that is currently busy (held by another process).
A latch is a fast, inexpensive and non-sophisticated lock. The latch is used when you need to serialize access to operations, functions and data structures in Oracle.
We could check the shared pool tuning to determine whether that might be a valid suggestion to increase the size of the shared pool. Cursor sharing can be implemented to prevent thrashing within the shared pool and the session_cached_cursors can be used to reduce the soft parses. The following Oracle initialization parameter changes resolved the performance degradation that resulted from the latch free wait event.


alter system set session_cached_cursors=150 scope=spfile;
alter system set shared_pool_size=900M scope=spfile;
alter system set cursor_sharing='FORCE' scope=spfile;

--------------------Library cache latch
The library cache latches protect the cached SQL statements and objects definitions held in the library cache within the shared pool. The library cache latch must be acquired in order to add a new statement to the library cache
Application is making heavy use of literal SQL- use of bind variables will reduce this latch considerably
Action:
You can reduce the library cache latch hold time by properly setting the SESSION_CACHED_CURSORS parameter
Note:
Larger shared pools tend to have long free lists and processes that need to allocate space in them must spend extra time scanning the long free lists while holding the shared pool latch
Oversize shared pool is also problem.

--------------------Shared pool latch
The shared pool latch is used to protect critical operations when allocating and freeing memory in the shared pool

Action:
Ways to reduce the shared pool latch are, avoid hard parses when possible, parse once, execute many.
The workaround is to set the initialization parameter CURSOR_SHARING to FORCE. This allows statements that differ in literal values but are otherwise identical to share a cursor and therefore reduce latch contention, memory usage, and hard parse.

-------------Log Buffer Space
Log Buffer Space wait event occurs when server processes write data into the log buffer faster than the LGWR process can write it out. The LGWR process begins writing entries to the online redo log file if any of the following conditions are true:

The log buffer reaches the _log_io_size threshold. By default, this parameter is set to one third of the log buffer size.
A server process performing a COMMIT o ROLLBACK posts to the LGWR process.
The DBWR process posts to the LGWR process before it begins writing.


Solution
Put log files on faster disks.
Look at tuning checkpoint or archive processes (log files switches).
Review application design, use NOLOGGING operations where appropriate, and avoid changing more data than required.
Finally, check refresh methods when using Materialized Views to reduce logging activity.


----------------------------------------------------------------------Wait Event Meaning----------------------------------------------------------------


db file sequential read  => tune indexing, tune SQL (to do less I/O), tune disks, increase buffer cache. This event is indicative of disk contention on index reads. Make sure all objects are analyzed. Redistribute I/O across disks. The wait that comes from the physical side of the database. It related to memory starvation and non selective index use. Sequential read is an index read followed by table read because it is doing index lookups which tells exactly which block to go to.
db file scattered read => disk contention on full table scans. Add indexes, tune SQL, tune disks, refresh statistics, and create materialized view. caused due to full table scans may be because of insufficient indexes or unavailability of updated statistics.
Oracle monitoring tools Make money blogging Web hosting servers Oracle Dba Job Interview
db file parallel read  => tune SQL, tune indexing, tune disk I/O, increase buffer cache. if you are doing a lot of partition activity then expect to see that wait even. it could be a table or index partition.
db file parallel write  => if you are doing a lot of partition activity then expect to see that wait even. it could be a table or index partition.
db file single write  => if you see this event than probably you have a lot of data files in your database.

control file sequential read
control file parallel write

log file sync    => committing too often, archive log generation is more. Tune applications to commit less, tune disks where redo logs exist, try using nologging/unrecoverable options, log buffer could be too large.
log file switch completion => May need more log files per group.
log file parallel write  => Deals with flushing out the redo log buffer to disk. Disks may be too slow or have an I/O bottleneck. Look for log file contention.
log buffer space   => Increase LOG_BUFFER parameter or move log files to faster disks. Tune application, use NOLOGGING, and look for poor behavior that updates an entire row when only a few columns change.
log file switch (checkpoint incomplete) => May indicate excessive db files or slow IO subsystem.
log file switch (archiving needed)    => Indicates archive files are written too slowly.
redo buffer allocation retries  => shows the number of times a user process waited for space in the redo log buffer.
redo log space wait time  => shows cumulative time (in 10s of milliseconds) waited by all processes waiting for space in the log buffer.

buffer busy waits/ read by other session  => Increase DB_CACHE_SIZE. Tune SQL, tune indexing, we often see this event along with full table scans, if the SQL is inserting data, consider increasing FREELISTS and/or INITRANS, if the waits are on segment header blocks, consider increasing extent sizes.
free buffer waits  => insufficient buffers, process holding buffers too long or i/o subsystem is over loaded. Also check you db writes may be getting clogged up.
cache buffers lru chain  => Freelist issues, hot blocks.
no free buffers   => Insufficient buffers, dbwr contention.

latch free
latch: session allocation
latch: in memory undo latch  => If excessive could be bug, check for your version, may have to turn off in memory undo.
latch: cache buffer chains  => check hot objects.
latch: cache buffer handles  => Freelist issues, hot blocks.

direct path write => You wont see them unless you are doing some appends or data loads.
direct path reads => could happen if you are doing a lot of parallel query activity.
direct path read temp or direct path write temp => this wait event shows Temp file activity (sort,hashes,temp tables, bitmap) check pga parameter or sort area or hash area parameters. You might want to increase them.

library cache load lock
library cache pin => if many sessions are waiting, tune shared pool, if few sessions are waiting, lock is session specific.
library cache lock  => need to find the session holding the lock, look for DML manipulating an object being accessed, if the session is trying to recompile PL/SQL, look for other sessions executing the code.

undo segment extension  => If excessive, tune undo.
wait for a undo record   => Usually only during recovery of large transactions, look at turning off parallel undo recovery.

enque wait events   => Look at V$ENQUEUE_STAT





No comments:

Post a Comment

Followers