USING DATA PUMP
TABLE EXPORT IMPORT
expdp scott/tiger@db10g tables=EMP,DEPT directory=TEST_DIR dumpfile=EMP_DEPT.dmp logfile=expdpEMP_DEPT.log
impdp scott/tiger@db10g tables=EMP,DEPT directory=TEST_DIR dumpfile=EMP_DEPT.dmp logfile=impdpEMP_DEPT.log
1. The following SQL statements creates a user, a directory object named dpump_dir1 and grants the permissions to the user.
SQLPLUS ldbo/linux@sns0809srv as sysdba
SQL> create user dpuser identified by dpuser;
SQL> grant connect, resource to dpuser;
SQL> CREATE DIRECTORY dpump_dir1 AS 'e:\dpdirectory';
SQL> grant read, write on directory dpump_dir1 to dpuser
$ expdp dpuser/dpuser@sns0809srv schemas=dpuser directory=dpump_dir1 dumpfile=dpuser.dmp log=dpuser.log
$expdp dpuser/dpuser@sns0809srv schemas=dpuser directory=dpump_dir1 dumpfile=dpuser2.dmp logfile=dpuser.log
$expdp dpuser/dpuser@TDB10G schemas=dpuser directory=dpump_dir1 parallel=4 dumpfile=dpuser_%U.dmp logfile=dpuser.log
---------------difference between traditional exp/imp and data pump-------
The main differences are listed below:
1)Expdp/Impdp access files on the server rather than on the client.
2))Expdp/Impdp operate on a group of files called a dump file set rather than on a single sequential dump file.
3)To improved performance Impdp/Expdp use parallel execution rather than a single stream of execution.
1) Ability to estimate jobs times
2) Ability to restart failed jobs
3) Perform fine-grained object selection
4) Monitor running jobs
5) Directly load a database from a remote instance via the network
6) Remapping capabilities
7) Improved performance using parallel executions
-------
DBMS_DATAPUMP package
SQLPLUS system/manager@TDB10G as sysdba
SQL> create user dpuser identified by dpuser;
grant connect, resource to dpuser;
CREATE DIRECTORY dpump_dir1 AS 'E:\app\kshitij';
grant read, write on directory dpump_dir1 to dpuser;
========
$ expdp dpuser/dpuser@orcl schemas=dpuser
include= TABLE:\"IN (\'EMP\', \'DEPT\')\"
directory=dpump_dir1 dumpfile=dpuser.dmp logfile=dpuser.log
$expdp dpuser/dpuser@TDB10G schemas=dpuser
exclude=TABLE:\"= \'EMP_DETAILS\'\"
directory=dpump_dir1 dumpfile=dpuser2.dmp lo
==============
The following steps list the basic activities involved in using Data Pump API.
1. Execute DBMS_DATAPUMP.OPEN procedure to create job.
2. Define parameters for the job like adding file and filters etc.
3. Start the job.
4. Optionally monitor the job until it completes.
5. Optionally detach from job and attach at later time.
6. Optionally, stop the job
7. Restart the job that was stopped.
Example of the above steps:
Declare
P_handle number; --- -- Data Pump job handle
P_last_job_state varchar2(45); ---- -- To keep track of job state
P_job_state varchar2(45);
P_status ku$_Status ----- -- The status object returned by get_status
BEGIN
P_handle:=DBMS_DATAPUMP.OPEN ('EXPORT','SCHEMA', NULL,'EXAMPLE','LATEST');
-- Specify a single dump file for the job (using the handle just returned)
-- and a directory object, which must already be defined and accessible
-- to the user running this procedure
DBMS_DATAPUMP.ADD_FILE (p_handle,'example.dmp','DMPDIR');
-- A metadata filter is used to specify the schema that will be exported.
DBMS_DATAPUMP.METADATA_FILTER (p_handle,'SCHEMA_EXPR','IN (''dpuser'')');
-- Start the job. An exception will be generated if something is not set up
-- Properly.
DBMS_DATAPUMP.start_job (p_handle);
----The export job should now be running.
The status of the job can be checked by writing a separate procedure and capturing the errors and status until it is completed. Overall job status can also be obtained by querying “SELECT * from dba_datapump_jobs”.
=======================
Data Pump ( expdp / impdp )
Now in 10g we can have total control over the job running (stop it, pause it, check it, restart it). Data pump is a server side technology and it can transfer large amounts of data very quickly using parallel streams to achieve maximum throughput, they can be 15-45% faster than the older import/export utilities.
Advantages using data pump
1) Ability to estimate jobs times
2) Ability to restart failed jobs
3) Perform fine-grained object selection
4) Monitor running jobs
5) Directly load a database from a remote instance via the network
6) Remapping capabilities
7) Improved performance using parallel executions
Note (1) You cannot export to a tape device only to disk, and the import will only work with version of oracle 10.1 or greater.
Note (2) The expdp and impdp are command line tools and run from within the Operating System.
Use of data pump
1) Migrating databases
2) Copying databases
3) Transferring oracle databases between different operating systems
4) Backing up important tables before you change them
5) Moving database objects from one tablespace to another
6) Transporting tablespace's between databases
7) Reorganizing fragmented table data
8) Extracting the DDL for tables and other objects such as stored procedures and packages
Data Pump components
1) dbms_datapump - the main engine for driving data dictionary metadata loading and unloading 2) dbms_metadata - used to extract the appropriate metadata
3) command-line - expdp and impdp are the import/export equivalents
Data Access methods
1) Direct path
2) External table path
Direct Path
bypasses the database buffer cache and writes beyond the high water mark when finished adjusts the high water mark, No undo is generated and can switch off redo as well, minimal impact to users as does not use SGA. Must disable triggers on tables before use.
External Path
Uses the database buffer cache acts as a SELECT statement into a dump file, during import reconstructs statements into INSERT statements, so whole process is like a normal SELECT/INSERT job. Both undo and redo are generated and uses a normal COMMIT just like a DML statement would.
In the following cases oracle will use the external path if any of the below are in use
1) clustered tables
2) active triggers in the table
3) a single partition in a table with a global index
4) referential integrity constraints
5) domain indexes on LOB columns
6) tables with fine-grained access control enabled in the insert mode
7) tables with BFILE or opaque type columns
Data Pump files
All files will be created on the server.
1) Dump files - holds the data and metadata
2) log files - the resulting output from the data pump command
3) sql files - contain the DDL statements describing the objects included in the job but can contain data
Master data pump tables - when using datapump it will create tables within the schema, this is used for controlling the datapump job, the table is removed when finished.
Data Pump privileges
1) exp_full_database 2) imp_full_database
How Data Pump works
The Master Control Process (MCP), has the process name DMnn, only one master job runs per job which controls the whole Data Pump job, it performs the following
create jobs and controls them
creates and manages the worker processes
monitors the jobs and logs the process
maintains the job state and restart information in the master table (create in the users schema running the job)
manages the necessary files including the dump file set
The master process creates a master table which contains job details (state, restart info), this table is created in the users schema who is running the Data Pump job. Once the job has finished it dumps the table contents into the data pump file and deletes the table. When you import the data pump file it re-creates the table and reads it to verify the correct sequence in which the it should import the various database objects.
The worker process is named DWnn and is the process that actually performs the work, you can have a number of worker process running on the same job (parallelism). The work process updates the master table with the various job status.
The shadow process is created when the client logs in to the oracle server it services data pump API requests, it creates the job consisting of the master table and the master process.
The client processes are the expdp and impdp commands.
Examples
Exporting database
expdp vallep/password directory=datapump full=y dumpfile=data.dmp filesize=2G parallel=2 logfile=full.log
Note: increase the parallel option based on the number of CPU's you have
Exporting schema
expdp sys/password schemas=testuser dumpfile=data.dmp logfile=schema.log
table
expdp vallep/password tables=accounts,employees dumpfile=data.dmp content=metadata_only tablespace
expdp vallep/password tablespaces=users dumpfile=data.dmp logfile=tablespace.log
Importing database
impdp system/password full=y dumpfile=data.dmp nologfile=y
Importing schema change
impdp system/password schemas=’HR’ remap_schema=’HR:HR_TEST’ content=data_only
impdp system/passwd remap_schema=’TEST:TEST3’ tables=test log=… dumpfile=… directory=…Other Options
directory
specifies a oracle directory object
filesize
split the dump file into specific sizes (could be used if filesystem has 2GB limit)
parfile
specify the parameter file
content
contents option can be ALL, METADATA_ONLY or DATA_ONLY
compression
compression is used by default but you can stop it
exclude/include
metadata filtering
query
selectively export table data using a SQL statement
estimate
Calculate job estimates where the vaild keywords are blocks and statistics
estimate_only
Calculate job estimates without performing the export
network link
you can perform a export across a network
encryption
you can encrypt data within the data pump file
parallel
increase worker processes to increase throughput, base it on number of CPU's
remap_schema
move objects from one schema to another
remap_datafile
change the name of the datafile when moving across different systems
remap_tablespace
move from one tablespace to another
Useful Views
DBA_DATAPUMP_JOBS
summary information of all currently running data pump jobs
DBA_DATAPUMP_SESSIONS
displays the user currently running data pump jobs
V$SESSION_LONGOPS
display information like totalwork, sofar, units and opname
The units of work done so far
Privileges
IMP_FULL_DATABASE
required if using advanced features
EXP_FULL_DATABASE
required if using advanced features
DBMS_DATAPUMP package
The package dbms_datapump can be used for the following
starting/stopping/restarting a job
monitoring a job
detaching from a job
exporting
declare d1 number;begin d1 := dbms_datapump.open('export','schema',null, 'test1', 'latest'); dbms_datapump.add_file(d1, 'test1.dmp', 'dmpdir'); dbms_datapump.metadata_filter(d1, 'schema_expr','in (''OE'')'); dbms_datapump.start_job(d1); dbms_datadump.detach(d1);end;
importing
declare d1 number;begin d1 := dbms_datapump.open('import','full',null, 'test1'); dbms_datapump.add_file(d1, 'test1.dmp', 'dmpdir'); dbms_datapump.metadata_remap(d1, 'remap_schema', 'oe', 'hr'); dbms_datapump.start_job(d1); dbms_datadump.detach(d1);end;
============================
REMAP
REMAP_TABLESPACE – This allows you to easily import a table into a different tablespace from which it was originally exported. The databases have to be 10.1 or later.
> impdp username/password REMAP_TABLESPACE=tbs_1:tbs_6 DIRECTORY =dpumpdir1 DUMPFILE=employees.dmp
REMAP_DATAFILES – This is a very useful feature when you move databases between platforms that have different file naming conventions. This parameter changes the source datafile name to the target datafile name in all SQL
statements where the source datafile is referenced. Because the REMAP_DATAFILE value uses quotation marks, it’s best to specify the parameter within a parameter file.
The parameter file, payroll.par, has the following content:
DIRECTORY=dpump_dir1
FULL=Y
DUMPFILE=db_full.dmp
REMAP_DATAFILE=”’C:\DB1\HRDATA\PAYROLL\tbs6.dbf’:’/db1/hrdata/pa
yroll/tbs6.dbf’”
You can then issue the following command:
> impdp username/password PARFILE=payroll.par
================
Maximizing the Power of Oracle Data Pump
Data Pump works great with default parameters, but once you are comfortable with Data Pump, there are new capabilities that you will want to explore.
Parallelism
Data Pump Export and Import operations are processed in the database as a Data Pump job, which is much more efficient that the client-side execution of original Export and Import. Now Data Pump operations can take advantage of the server’s parallel processes to read or write multiple data streams simultaneously (PARALLEL is only available in the Enterprise Edition of Oracle Database.) The number of parallel processes can be changed on the fly using Data Pump’s interactive command-line mode. You may have a certain number of processes running during the day and decide to change that number if more system resources become available at night (or vice versa).
For best performance, you should do the following:
1. Make sure your system is well balanced across CPU, memory, and I/O.
2. Have at least one dump file for each degree of parallelism. If there aren’t enough dump files, performance will not be optimal because multiple threads of execution will be trying to access the same dump file.
3. Put files that are members of a dump file set on separate disks so that they will be written and read in parallel.
4. For export operations, use the %U variable in the DUMPFILE parameter so multiple dump files can be automatically generated.
> expdp username/password DIRECTORY=dpump_dir1 JOB_NAME=hr DUMPFILE=par_exp%u.dmp PARALLEL=4
-------