Questions tagged [datastage]

DataStage is the ETL (Extract, Transform, Load) component of the IBM InfoSphere Information Server suite. It is a GUI-based client tool that lets users integrate various data sources and targets in an enterprise environment.

Data sources and targets can be database tables, flat files, data sets, CSV files, and so on. The basic design paradigm is built around a unit of work called a DataStage job. Multiple jobs can be controlled and conditionally sequenced using 'Sequences'.
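
As a rough illustration of conditional sequencing, the sketch below models jobs as named units of work and a sequence as an ordered list of steps, where a step runs only if the job it depends on succeeded. This is plain Python, not DataStage code or any IBM API; the names `Job`, `Step`, and `run_sequence` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Job:
    """Hypothetical stand-in for a job: a named unit of work."""
    name: str
    run: Callable[[], bool]  # returns True on success


@dataclass
class Step:
    """One step of a sequence: a job plus an optional prerequisite."""
    job: Job
    # Name of a job that must have succeeded first (None means always run).
    requires: Optional[str] = None


def run_sequence(steps: List[Step]) -> Dict[str, str]:
    """Run steps in order, skipping any step whose prerequisite did not succeed."""
    status: Dict[str, str] = {}
    for step in steps:
        if step.requires is not None and status.get(step.requires) != "ok":
            status[step.job.name] = "skipped"
            continue
        status[step.job.name] = "ok" if step.job.run() else "failed"
    return status


if __name__ == "__main__":
    steps = [
        Step(Job("extract", lambda: True)),
        Step(Job("transform", lambda: False), requires="extract"),
        Step(Job("load", lambda: True), requires="transform"),
    ]
    # "transform" fails, so "load" is skipped.
    print(run_sequence(steps))
```

In an actual DataStage project this kind of control flow would be designed graphically in a sequence job rather than written by hand; the sketch only shows the idea of running a job conditionally on the outcome of an earlier one.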

IBM® InfoSphere® DataStage® integrates data across multiple systems using a high performance parallel framework, and it supports extended metadata management and enterprise connectivity. The scalable platform provides more flexible integration of all types of data, including big data at rest (Hadoop-based) or in motion (stream-based), on distributed and mainframe platforms.

InfoSphere DataStage provides these features and benefits:

  • Powerful, scalable ETL platform
  • Support for big data and Hadoop
  • Near real-time data integration
  • Workload and business rules management
  • Ease of use

Support for big data and Hadoop

  • Includes support for IBM InfoSphere BigInsights, Cloudera, Apache and Hortonworks Hadoop Distributed File System (HDFS).
  • Offers Balanced Optimization for Hadoop capabilities to push processing to the data and improve efficiency.
  • Supports big-data governance including features such as impact analysis and data lineage

Powerful, scalable ETL platform

  • Manages data arriving in near real-time as well as data received on a periodic or scheduled basis.

  • Provides high-performance processing of very large data volumes.

  • Leverages the parallel processing capabilities of multiprocessor hardware platforms to help you manage growing data volumes and shrinking batch windows.

  • Supports heterogeneous data sources and targets in a single job including text files, XML, ERP systems, most databases (including partitioned databases), web services, and business intelligence tools.

Near real-time data integration

  • Captures messages from Message Oriented Middleware (MOM) queues using Java Message Services (JMS) or WebSphere MQ adapters, allowing you to combine data into conforming operational and historical analysis perspectives.

  • Provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused over the enterprise.

  • Can simultaneously support high-speed, high reliability requirements of transactional processing and the large volume bulk data requirements of batch processing.

Ease of use

  • Includes an operations console and interactive debugger for parallel jobs to help you enhance productivity and accelerate problem resolution.

  • Helps reduce the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.

  • Offers operational intelligence capabilities, smart management of metadata and metadata imports, and parallel debugging capabilities to help enhance productivity when working with partitioned data.

507 questions
2
votes
1 answer

Sparse lookup for tracking used records

I have a scenario where I have a claim application table and an application claimant table. I need to do a sparse lookup for the application claimant ID from the application claimant table using SSN as the key. The problem is that there are multiple application…
blr20
  • 23
  • 7
2
votes
1 answer

DataStage Error: The OCI function OraOCIEnvNIsCreate:OCI_UTF16ID returned status -1

I am trying to perform a simple connection test in DataStage 8.7. I have an Oracle_Connector inside a Parallel job. I know the credentials are good as I can connect with them using something like SQL Developer. However I am seeing the following…
Wes
  • 4,353
  • 6
  • 39
  • 53
2
votes
1 answer

Cross Project Compare option in Data Stage 9.1

There is a utility called Cross Project Compare in the DataStage Designer. Using the Cross Project Compare utility, I can compare two jobs (e.g. two parallel jobs) from different environments (e.g. dev vs. prod). I wondered if there is any…
NIMISH DESHPANDE
  • 453
  • 3
  • 10
  • 30
2
votes
1 answer

Data stage parallel job export options

I am aware that in DataStage, parallel jobs (.pjb) or any other jobs can be exported to .dsx and .isx files. I wondered if I can simply export a .pjb file as is?
NIMISH DESHPANDE
  • 453
  • 3
  • 10
  • 30
2
votes
1 answer

DataStage 11.3 Assembly Editor flash popup

Our organisation is in the process of upgrading from DataStage 9.1 to 11.3. Problem: The DataStage 11.3 Assembly Editor fails to display, and falls over with an error. Backend OS: Red Hat Enterprise Linux Server release 6.6 (Santiago) Linux …
Bruce Smith
  • 41
  • 1
  • 4
2
votes
3 answers

How to write datastage performance stats on a DB2 table?

My DataStage version is 8.5. I have to populate a table in DB2 with the datastage performance data, something like job_name, start_time, finish_time and execution_date. There is a master sequence with A LOT of jobs. The sequence itself runs once a…
LeandroHumb
  • 704
  • 6
  • 20
2
votes
0 answers

Unlock DataStage job

I am planning to disconnect the user session (to unlock a job) through a Unix script, given a session ID as input. Here is the manual procedure I found on the IBM website. IBM procedure starts http://www-01.ibm.com/support/docview.wss?uid=swg21439971 In…
user3686069
  • 91
  • 2
  • 13
2
votes
4 answers

Run database query in datastage with no inputs or outputs

Relatively new to datastage, quite possibly a stupid question. From datastage, I want to run a database query against a SQL Server database. The query is a delete query with a hardcoded WHERE clause (not my decision). What I cannot figure out is…
James Dean
  • 683
  • 1
  • 10
  • 22
2
votes
4 answers

After insert trigger - SQL Server 2008

I have data coming in from datastage that is being put in our SQL Server 2008 database in a table: stg_table_outside_data. The outside source is putting the data into that table every morning. I want to move the data from stg_table_outside_data to…
Azzna
  • 77
  • 2
  • 9
2
votes
1 answer

Dumping dataset (.ds) file contents to a text file

At work we use DataStage, which uses dataset (.ds) files. I can view the contents of the file from within our UNIX environment by using: orchadmin dump -name This only dumps the contents of the file to the screen. What I would like…
mlevit
  • 2,586
  • 8
  • 40
  • 49
1
vote
1 answer

ODBC Config File for Datastage Connection to SQLServer 2008

I have an ODBC config file on a Sun Solaris server, used for IBM DataStage. We need to connect to a SQL Server Express edition. The IP used to connect is xxx.xxx.xxx.xxx\TARGET, the port is 1433, and the database is dbname. A sample of the config file is: …
sangi
  • 479
  • 3
  • 12
  • 23
1
vote
2 answers

Datastage: how to improve the performance load data from oracle to sql server

The platform is IBM Datastage 8.1 RHEL4 16G MEM,4CPU16CORE. When I try to create a job to load data from Oracle to SQL Server the job is running correctly, but slowly. The row count from the source table in Oracle is about 100,000,000 and the speed…
gobird
  • 81
  • 1
  • 7
1
vote
3 answers

Calling WCF service from Datastage - Output to XML file

I have developed a WCF service that returns data serializable objects as [DataContracts]. Other folks in my organization wish to call this web service using DataStage and have it output the response to an XML file. We are able to reference the…
Borophyll
  • 949
  • 2
  • 13
  • 24
1
vote
4 answers

strlen inconsistent with zero length string

I'm creating a DataStage parallel routine, which is a C or C++ function that is called from within IBM (formerly Ascential) DataStage. It is failing if one of the strings passed in is zero length. If I put this at the very first line of the…
PhilHibbs
  • 730
  • 1
  • 10
  • 27
1
vote
1 answer

How to connect Amazon S3 to IBM datastage server which is hosted on premise

I have an IBM DataStage server installed on premises. I want to connect to an Amazon S3 bucket from DataStage to load data. How can I establish a connection to Amazon S3 from the DataStage server?
Db2Cramp
  • 23
  • 5