Thursday, 18 October 2012

Basics Of UNIX

The purpose of this post is to provide a single page of frequently used basic commands for getting started with UNIX. Basic UNIX command-line (shell) navigation: directories; moving around the file system; listing directory contents; changing file permissions and attributes; moving, renaming, and copying files; viewing and editing files. Directories: File and directory paths in UNIX...
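The topics listed above can be tried out in a throwaway directory. The following is a minimal sketch, not taken from the original post; the file and directory names (`reports`, `notes.txt`, `copy.txt`) are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Work in a temporary directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1           # move around the file system

mkdir reports                     # create a directory
printf 'hello\n' > notes.txt      # create a small file to work with

cp notes.txt reports/copy.txt     # copy a file
mv notes.txt reports/notes.txt    # move (or rename) a file
chmod 640 reports/notes.txt       # owner: read/write, group: read, others: none

ls -l reports                     # list directory contents with permissions
cat reports/copy.txt              # view a file's contents
pwd                               # print the current working directory
```

Each command maps to one of the topics: `cd` and `pwd` for navigation, `ls` for listing, `chmod` for permissions, `cp` and `mv` for copying and moving, and `cat` for viewing.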

Wednesday, 17 October 2012

Performance Tuning in IBM InfoSphere DataStage

Performance is a key factor in the success of any data warehousing project. Optimization and performance should be considered from the inception of the design and development process. Ideally, a DataStage® job should process large volumes of data within a short period of time. For maximum throughput and performance, a well-performing infrastructure is required, or else the tuning of DataStage® jobs will not make...

Monitoring Datastage Jobs

The Monitor window in DataStage Director: the DataStage Job Monitor is accessible through DataStage Director. The option appears when you right-click any job name in the DataStage Director client. Alternatively, select the job in the Director list window, go to Tools --> View Monitor, and select the job. This basically displays summary information...

Tuesday, 16 October 2012

ETL vs ELT

One of the many useful articles by Vincent McBurney, comparing ETL and ELT. Every now and then I come across a blog entry that reminds me there are people out there who know a lot more about my niche than I do! This is fortunate, as this week it has helped me understand ELT tools. ETL versus ELT and ETLT: The world of data integration has its own Coke versus Pepsi challenge - it's called ETL versus...

Datastage Execution Flow

When you execute a job, the generated OSH and the contents of the configuration file ($APT_CONFIG_FILE) are used to compose a “score”. This is similar to a SQL query optimization plan. At runtime, IBM InfoSphere DataStage identifies the degree of parallelism and node assignments for each operator, and inserts sorts and partitioners as needed...
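To make the role of the configuration file more concrete, here is a minimal sketch that writes a two-node configuration file and counts the logical nodes it defines. The host name (`etlhost`) and the disk and scratch paths are placeholders, not values from the original post, and the count only approximates the parallelism available to a job under this file; the score composed at runtime remains authoritative.

```shell
#!/bin/sh
# Hypothetical two-node configuration; host name and paths are placeholders.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  node "node1"
  {
    fastname "etlhost"
    pools ""
    resource disk "/data/ds/node1" {pools ""}
    resource scratchdisk "/scratch/ds/node1" {pools ""}
  }
  node "node2"
  {
    fastname "etlhost"
    pools ""
    resource disk "/data/ds/node2" {pools ""}
    resource scratchdisk "/scratch/ds/node2" {pools ""}
  }
}
EOF

# Point the environment variable at the file, as a job run would expect.
APT_CONFIG_FILE=$cfg
export APT_CONFIG_FILE

# Count the logical node definitions (lines that begin with the word "node").
node_count=$(grep -c '^[[:space:]]*node[[:space:]]' "$APT_CONFIG_FILE")
echo "logical nodes defined: $node_count"
```

Adding or removing `node` entries in the file changes how many partitions are available without changing the job design itself.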

Saturday, 13 October 2012

Quick Datastage Tips

While designing or coding a job of any kind or level, a few tips can be very handy. They help you design DataStage jobs efficiently; the more efficient the design, the better the job performs once executed. These tips also help reduce job design and coding time and support a better debugging approach. Partitioning: Partitioning...