Unit 1: Data Warehouse Fundamentals
- An introduction to Data Warehousing
- Purpose of Data Warehouse
- Data Warehouse Architecture
- Operational Data Store
- OLTP vs. Warehouse Applications
- Data Marts
- Data Marts vs. Data Warehouses
- Data Warehouse Life cycle
- Introduction to Data Modeling
- Entity Relationship model (E-R model)
- Data Modeling for Data Warehouse, Normalization process
- Dimensions and fact tables
- Star Schema and Snowflake Schema
- Introduction to Extraction, Transformation & Loading
- Types of ETL Tools
- Key tools in the market
- DataStage introduction
- IBM Information Server architecture
- DataStage components
- DataStage main functions
- Client components: adding different servers to the workspace
- DataStage project administration
- Editing projects and adding projects
- Deleting projects and cleaning up project files
- Environment variables
- Environment management
- Auto purging
- Runtime Column Propagation (RCP)
- Add checkpoints for sequencer
- NLS configuration
- Generated OSH (Orchestrate engine)
- System formats such as date and timestamp
- Project protection and version details
- Introduction to DataStage Director
- Validating DataStage jobs
- Executing DataStage jobs
- Job execution status
- Monitoring a job
- Job log view
- Job scheduling
- Creating Batches
- Scheduling batches
- Introduction to DataStage Designer
- Importance of Parallelism
- Pipeline Parallelism
- Partition Parallelism
- Partitioning and collecting (in-depth coverage of partitioning and collecting techniques)
- Symmetric Multi Processing (SMP)
- Massively Parallel Processing (MPP)
- Introduction to Configuration file
- Editing a Configuration file
- Partition techniques
- DataStage Repository Palette
- Passive and Active stages
- Job design overview
- Designer work area
- Annotations
- Creating jobs
- Importing flat file definitions
- Managing the Metadata environment
- Dataset management
- Deletion of Dataset
- Routines
- Database Stages
- Oracle
- ODBC
- Dynamic RDBMS
- File Stages
- Sequential file
- Dataset
- File set
- Lookup file set
- Processing Stages
- Copy
- Filter
- Funnel
- Sort
- Remove Duplicates
- Aggregator
- Switch
- Pivot stage
- Lookup
- Join
- Merge
- Differences between Lookup, Join, and Merge
- Change Capture
- External Filter
- Surrogate key generator
- Transformer
- Real-time scenarios using different processing stages; implementing different logic using the Transformer
- Debug Stages
- Head
- Tail
- Peek
- Column generator
- Row generator
- Write Range Map Stage
- Real Time Stages
- XML input
- XML output
- Local and Shared containers
- Routines creation
- Extensive use of job parameters, parameter sets, and environment variables in jobs
- Introduction to predefined environment variables; creating user-defined environment variables and using them in parallel jobs
- Explanation of Type 1 and Type 2 processes
- Implementation of Type 1 and Type 2 logic using the Change Capture stage and the SCD stage
- Range Lookup process
- Surrogate key generator stage
- FTP stage
- Job performance analysis
- Resource estimation
- Performance tuning
- Arrange job activities in Sequencer
- Triggers in Sequencer
- Restartability
- Recoverability
- Notification activity
- Terminator activity
- Wait for file activity
- Start Loop activity
- Execute Command activity
- Nested Condition activity
- Exception handling activity
- User Variable activity
- End Loop activity
- Adding Checkpoints
- Jobs used in different real-time scenarios
- Explanation of Sequence Job stages through different Jobs
- IBM WebSphere DataStage administration
- Opening the IBM Information Server Web console
- Setting up a project in the console
- Customizing the project dashboard
- Setting up security
- Creating users in the console
- Assigning security roles to users and groups
- Managing licenses
- Managing active sessions
- Managing logs
- Managing schedules
- Backing up and restoring IBM Information Server
- Performance Tuning of Parallel Jobs
- DataStage installation process and setup
- Project Explanation