DataStage Enterprise Edition: Different Versions of DataStage
Introduction
DataStage Enterprise Edition is a package of three products: DataStage Server Edition, the Parallel
Extender with parallel ETL jobs, and the MetaStage product described on the Metadata Workbench entry.
Parallel ETL jobs are the flagship feature of Enterprise Edition.
History
During the 1990s, data integration vendors such as Ascential and Informatica were competing to
deliver tools that provided a wide range of data connectivity and transformation functions in a mostly
code-free environment. Towards the late 1990s data stores were becoming large, and data warehouses and
business intelligence were demanding larger data loads. The physical architecture of these
loads was hitting a limit on the volume that a single server could handle and was moving towards clusters
or grids of servers.
The data integration vendors needed to be able to integrate data across a massively scalable architecture
to keep up with the increased data volumes.
Ascential started to roll out a parallel capability in the DataStage Server Edition product called
multiple-instance jobs. This allowed data to be partitioned and processed in parallel, at the cost of some
additional manual programming. In November 2001 Ascential switched to a buy approach and purchased
Torrent Systems for $46 million. Torrent had the capability to run tools on a massively parallel
processing (MPP) platform.
Versions
This section lists each major release of DataStage Enterprise Edition and the enhancements for
DataStage parallel jobs. For a list of enhancements to the client tools, see the versions on the DataStage
Server Edition page, as it is the edition that has been delivered with every release going back to
DataStage 1.
All releases of DataStage 7 can import and upgrade DataStage 6 export files. DataStage 8 can only import
and upgrade DataStage 7.5.1 or 7.5.2 jobs.
DataStage 6
Released in September 2002, ten months after the acquisition of Torrent, it was the first version of
DataStage to feature the Parallel Extender (PX), the parallel platform that allows processes to run in
parallel across a multiple processor environment.
New parallel job type with a new set of parallel stages, some with the same names as server job
stages but with different properties and options.
Server job shared container for parallel jobs.
CPU-based licensing instead of server-based licensing.
Support for SAS 6.12 and 8.2.
This release was followed by the client-only 6.0.1 release, which fixed a number of problems.
DataStage 7
Released in September 2003, it uses much the same architecture as the previous version, with
improvements to usability. This was the first release to have no server job improvements but many
parallel job improvements.
XML Pack 2.0 provides improved XML metadata support for parallel jobs.
National Language Support (NLS) for parallel jobs but not for all parallel stages.
Parallel shared and local containers.
Enhanced transformer with improved reject row handling, string handling, timestamp conversion
and compile performance.
Modify, Switch and Filter stages added.
Multiple-instance parallel jobs.
Non-blocking Funnel stage.
DataStage 7.5
Released in December 2004, this was the first release of parallel jobs that could run on Windows. The
server runs on all the same Unix and Linux platforms as 7.5.1 and adds Windows 2003 Standard or
Enterprise on the Intel x86 processor family.
There were no changes to parallel jobs in this release apart from the capability to compile and run them
on Windows.
DataStage 8
Released in October 2006 for Windows and April 2007 for Unix, this is the first version to run on
IBM Information Server. There are a number of parallel job improvements in this release:
Lookup stage now supports two new lookup types: range lookup and caseless lookup.
New Slowly Changing Dimension stage.
New QualityStage stages for parallel jobs.
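To make the two new lookup types concrete, here is a minimal conceptual sketch in plain Python. It is not DataStage code and does not reflect how the Lookup stage is implemented; the reference data and the function names range_lookup and caseless_lookup are hypothetical and exist only for illustration. A range lookup matches a value against lower and upper bounds in the reference data, and a caseless lookup matches keys without regard to letter case.

```python
from typing import Optional

# Hypothetical reference data, invented for this illustration only.
RATE_BANDS = [
    (0, 999, "low"),
    (1000, 4999, "medium"),
    (5000, 9999, "high"),
]

# Keys are stored in lower case so that lookups can ignore case.
COUNTRY_NAMES = {"au": "Australia", "us": "United States"}


def range_lookup(value: int) -> Optional[str]:
    """Return the label of the first band whose inclusive bounds contain value."""
    for low, high, label in RATE_BANDS:
        if low <= value <= high:
            return label
    return None  # no band matched


def caseless_lookup(key: str) -> Optional[str]:
    """Look up a key while ignoring letter case."""
    return COUNTRY_NAMES.get(key.casefold())


if __name__ == "__main__":
    print(range_lookup(1500))     # medium
    print(caseless_lookup("AU"))  # Australia
```

In this sketch an unmatched value returns None; in a real job, the handling of unmatched rows would depend on how the stage is configured.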
One reason given for IBM's success is that it connects each software tool it purchases to WebSphere,
which is an IBM product. This point is mainly of interest to WebSphere users.