5. What is DataStage?
Design jobs for Extraction, Transformation, and Loading (ETL)
Ideal tool for data integration projects such as data warehouses, data marts, and system migrations
Import, export, create, and manage metadata for use within jobs
Schedule, run, and monitor jobs all within DataStage
Administer your DataStage development and execution environments
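To make the three ETL steps concrete, here is a minimal sketch in plain Python. It is not DataStage code: the inline CSV text, the `sales` table, and the function names `extract`, `transform`, and `load` are all assumptions chosen for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read a file or a database.
SOURCE = "id,name,amount\n1,ada,10.50\n2,grace,7.25\n"


def extract(text):
    """Extraction: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: convert types, tidy names, store amounts as cents."""
    return [
        (int(r["id"]), r["name"].title(), round(float(r["amount"]) * 100))
        for r in rows
    ]


def load(conn, rows):
    """Loading: write the transformed records to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER, name TEXT, cents INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE)))
loaded = conn.execute("SELECT id, name, cents FROM sales ORDER BY id").fetchall()
print(loaded)
```

In DataStage the same three roles are played by stages and links drawn in a job design rather than by hand-written functions.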
7. DataStage Server and Clients
8. DataStage Server and Clients
Administrator: Administers DataStage projects and conducts housekeeping on the server
Designer: Creates DataStage jobs that are compiled into executable programs
Director: Used to run and monitor the DataStage jobs
Manager: Allows you to view and edit the contents of the repository
10. DataStage Administrator
In DataStage, all development work is done within a project. Projects are created during installation and after installation using Administrator.
Each project is associated with a directory. The directory stores the objects (jobs, metadata, custom routines, etc.) created in the project.
Before you can work in a project you must attach to it (open it).
You can set the default properties of a project using DataStage Administrator.
11. DataStage Administrator
Use the Administrator to specify general server defaults, add and delete projects, and to set project properties.
Use the Administrator Project Properties window to:
· Set job monitoring limits and other Director defaults on the General tab.
· Set user group privileges on the Permissions tab.
· Enable or disable server-side tracing on the Tracing tab.
· Specify a user name and password for scheduling jobs on the Schedule tab.
· Specify hashed file stage read and write cache sizes on the Tunables tab.
13. DataStage Manager
DataStage Manager manages two different types of objects:
· Metadata describing sources and targets:
- Called table definitions in Manager. These are not to be confused with relational tables. DataStage table definitions are used to describe the format and column definitions of any type of source: sequential, relational, hashed file, etc.
- Table definitions can be created in Manager or Designer and they can also be imported from the sources or targets they describe.
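The idea of a table definition as metadata, separate from the data it describes, can be sketched in plain Python. This is only an illustration, not DataStage's internal format: the `Column` and `TableDefinition` classes and the `Customers` example are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Column:
    name: str
    convert: Callable[[str], object]  # how to turn raw text into a typed value


@dataclass
class TableDefinition:
    """Describes the format and columns of a source; holds no data itself."""
    name: str
    columns: List[Column]

    def parse_row(self, line: str, delimiter: str = ",") -> Dict[str, object]:
        """Apply the column definitions to one delimited record."""
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} fields, got {len(fields)}"
            )
        return {col.name: col.convert(raw) for col, raw in zip(self.columns, fields)}


# Hypothetical definition for a comma-delimited sequential file.
customers = TableDefinition(
    name="Customers",
    columns=[Column("id", int), Column("name", str), Column("balance", float)],
)

row = customers.parse_row("42,Ada,10.5")
print(row)
```

The same definition could describe a sequential file or a relational source, which is why it is kept apart from any one physical table.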
14. DataStage Manager
· DataStage components
- Every object in DataStage (jobs, routines, table definitions, etc.) is stored in the DataStage repository. Manager is the interface to this repository.
- DataStage components, including whole projects, can be exported from and imported into Manager.
15. DataStage Manager
Any object in Manager can be exported to a file
Can export whole projects
Use for backup
Sometimes used for version control
Can be used to move DataStage objects from one project to another
Use to share DataStage jobs and projects with other developers
16. DataStage Manager
Import Procedure
In Manager, click “Import>DataStage Components”
Select DataStage objects for import
17. DataStage Manager
Export Procedure
In Manager, click “Export>DataStage Components”
Select DataStage objects for export
Specify type of export: DSX, XML
Specify file path on client machine
19. DataStage Director
Can schedule, validate, and run jobs
Can be invoked from DataStage Manager or Designer
Clear job log
Set Director options
Abort after x warnings
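The "abort after x warnings" option can be pictured with a small sketch in plain Python. It does not call DataStage: the function name, the severity labels, and the choice to abort as soon as the warning count reaches the limit are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Event = Tuple[str, str]  # (severity, message)


def run_with_warning_limit(
    events: Iterable[Event], max_warnings: int
) -> Tuple[str, List[Event]]:
    """Record events in a log and stop once the warning limit is reached."""
    log: List[Event] = []
    warnings = 0
    for severity, message in events:
        log.append((severity, message))
        if severity == "WARNING":
            warnings += 1
            if warnings >= max_warnings:
                # Assumed behavior: the run stops and a control event is logged.
                log.append(("CONTROL", f"Job aborted after {warnings} warnings"))
                return "ABORTED", log
    log.append(("CONTROL", "Job finished"))
    return "FINISHED", log


# Hypothetical events produced by a job run.
events = [
    ("INFO", "Starting job"),
    ("WARNING", "Null value in column 'name'"),
    ("WARNING", "Row rejected"),
    ("INFO", "1000 rows processed"),
]

status, log = run_with_warning_limit(events, max_warnings=2)
print(status, len(log))
```

With a limit of 2 the run stops before the final informational event is reached; raising the limit lets the same events run to completion.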
20. Director Log View
Click the Log button in the toolbar to view the job log. The job log records events that occur during the execution of a job.
These events include control events, such as the starting, finishing, and aborting of a job; informational messages; warning messages; error messages; and program-generated messages.
21. DataStage Director
23. What Is a Job?
Executable DataStage program
Created in DataStage Designer, but can use components from Manager
Built using a graphical user interface
Compiles into Orchestrate shell language (OSH)
24. Create New Job
Several types of DataStage jobs:
Parallel – this course will concentrate on parallel jobs.
Job Sequence – used to create jobs that control execution of other jobs.
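The idea of a job sequence, one job controlling the execution of others, can be sketched in plain Python. This is a conceptual illustration only: the `run_sequence` function, the job names, and the rule that later jobs are skipped after a failure are assumptions, not DataStage behavior taken from the slides.

```python
from typing import Callable, List, Tuple

Job = Tuple[str, Callable[[], bool]]  # (name, callable returning True on success)


def run_sequence(jobs: List[Job]) -> List[Tuple[str, str]]:
    """Run jobs in order; once one fails, skip the remaining jobs."""
    results: List[Tuple[str, str]] = []
    failed = False
    for name, job in jobs:
        if failed:
            results.append((name, "SKIPPED"))
            continue
        ok = job()
        results.append((name, "OK" if ok else "FAILED"))
        failed = not ok
    return results


# Hypothetical jobs: the second one fails, so the third never runs.
sequence = [
    ("extract_customers", lambda: True),
    ("transform_customers", lambda: False),
    ("load_warehouse", lambda: True),
]

results = run_sequence(sequence)
print(results)
```

A real job sequence is drawn graphically in Designer; the sketch only shows the control flow such a design expresses.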