Wednesday, 10 September 2014

Informatica Designer Interview Questions


Informatica Designer Interview Questions


Q. How to execute PL/SQL script from Informatica mapping?
A. Stored Procedure (SP) transformation can be used to execute PL/SQL Scripts. In SP Transformation PL/SQL procedure name can be specified. Whenever the session is executed, the session will call the pl/sql procedure.

Q. How can you define a transformation? What are different types of transformations available in Informatica?
A. A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Below are the various transformations available in Informatica:
  • Aggregator
  • Application Source Qualifier
  • Custom
  • Expression
  • External Procedure
  • Filter
  • Input
  • Joiner
  • Lookup
  • Normalizer
  • Output
  • Rank
  • Router
  • Sequence Generator
  • Sorter
  • Source Qualifier
  • Stored Procedure
  • Transaction Control
  • Union
  • Update Strategy
  • XML Generator
  • XML Parser
  • XML Source Qualifier

Q. What is a source qualifier? What is meant by Query Override?
A. Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or a flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.
PowerCenter Server generates a query for each Source Qualifier Transformation whenever it runs the session. The default query is SELET statement containing all the source columns. Source Qualifier has capability to override this default query by changing the default settings of the transformation properties. The list of selected ports or the order they appear in the default query should not be changed in overridden query.

Q. What is aggregator transformation?
A. The Aggregator transformation allows performing aggregate calculations, such as averages and sums. Unlike Expression Transformation, the Aggregator transformation can only be used to perform calculations on groups. The Expression transformation permits calculations on a row-by-row basis only.
Aggregator Transformation contains group by ports that indicate how to group the data. While grouping the data, the aggregator transformation outputs the last row of each group unless otherwise specified in the transformation properties.
Various group by functions available in Informatica are : AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.

Q. What is Incremental Aggregation?
A. Whenever a session is created for a mapping Aggregate Transformation, the session option for Incremental Aggregation can be enabled. When PowerCenter performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.

Q. How Union Transformation is used?
A. The union transformation is a multiple input group transformation that can be used to merge data from various sources (or pipelines). This transformation works just like UNION ALL statement in SQL, that is used to combine result set of two SELECT statements.

Q. Can two flat files be joined with Joiner Transformation?
A. Yes, joiner transformation can be used to join data from two flat file sources.

Q. What is a look up transformation?
A. This transformation is used to lookup data in a flat file or a relational table, view or synonym. It compares lookup transformation ports (input ports) to the source column values based on the lookup condition. Later returned values can be passed to other transformations.

Q. Can a lookup be done on Flat Files?
A. Yes.

Q. What is the difference between a connected look up and unconnected look up?
A. Connected lookup takes input values directly from other transformations in the pipeline.
Unconnected lookup doesnít take inputs directly from any other transformation, but it can be used in any transformation (like expression) and can be invoked as a function using :LKP expression. So, an unconnected lookup can be called multiple times in a mapping.

Q. What is a mapplet?
A. A mapplet is a reusable object that is created using mapplet designer. The mapplet contains set of transformations and it allows us to reuse that transformation logic in multiple mappings.

Q. What does reusable transformation mean?
A. Reusable transformations can be used multiple times in a mapping. The reusable transformation is stored as a metadata separate from any other mapping that uses the transformation. Whenever any changes to a reusable transformation are made, all the mappings where the transformation is used will be invalidated.

Q. What is update strategy and what are the options for update strategy?
A. Informatica processes the source data row-by-row. By default every row is marked to be inserted in the target table. If the row has to be updated/inserted based on some logic Update Strategy transformation is used. The condition can be specified in Update Strategy to mark the processed row for update or insert.
Following options are available for update strategy:
  • DD_INSERT : If this is used the Update Strategy flags the row for insertion. Equivalent numeric value of DD_INSERT is 0.
  • DD_UPDATE : If this is used the Update Strategy flags the row for update. Equivalent numeric value of DD_UPDATE is 1.
  • DD_DELETE : If this is used the Update Strategy flags the row for deletion. Equivalent numeric value of DD_DELETE is 2.
  • DD_REJECT : If this is used the Update Strategy flags the row for rejection. Equivalent numeric value of DD_REJECT is 3.

Informatica Transformation Types


Informatica Transformation Types

A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.
Transformations can be of two types:
Active Transformation: An active transformation can change the number of rows that pass through the transformation, change the transaction boundary, can change the row type. For example, Filter, Transaction Control and Update Strategy are active transformations.
The key point is to note that Designer does not allow you to connect multiple active transformations or an active and a passive transformation to the same downstream transformation or transformation input group because the Integration Service may not be able to concatenate the rows passed by active transformations. However, Sequence Generator transformation(SGT) is an exception to this rule. A SGT does not receive data. It generates unique numeric values. As a result, the Integration Service does not encounter problems concatenating rows passed by a SGT and an active transformation.
Passive Transformation: A passive transformation does not change the number of rows that pass through it, maintains the transaction boundary, and maintains the row type.
The key point is to note that Designer allows you to connect multiple transformations to the same downstream transformation or transformation input group only if all transformations in the upstream branches are passive. The transformation that originates the branch can be active or passive.
Transformations can be Connected or UnConnected to the data flow.
Connected Transformation: Connected transformation is connected to other transformations or directly to target table in the mapping.
UnConnected Transformation: An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation.

ETL Tools



Business Case: Why do we need ETL Tools?
Think of GE, the company has over 100+ years of history & presence in almost all the industries. Over these years company’s management style has been changed from book keeping to SAP. This transition was not a single day transition. In transition, from book keeping to SAP, they used a wide array of technologies, ranging from mainframes to PCs, data storage ranging from flat files to relational databases, programming languages ranging from Cobol to Java.This transformation resulted into different businesses, or to be precise different sub businesses within a business, running different applications, different hardware and different architecture. Technologies are introduced as and when invented & as and when required.
This directly resulted into the scenario, like HR department of the company running on Oracle Applications, Finance running SAP, some part of process chain supported by mainframes, some data stored on Oracle, some data on mainframes, some data in VSM files & the list goes on. If one day company requires a consolidated reports of assets, there are two ways.
  • First completely manual, generate different reports from different systems and integrate them.
  • Second fetch all the data from different systems/applications, make a Data Warehouse, and generate reports as per the requirement.
Obviously second approach is going to be the best.
Now to fetch the data from different systems, making it coherent, and loading into a Data Warehouse requires some kind of extraction, cleansing, integration, and load. ETL stands for Extraction, Transformation & Load.
ETL Tools provide facility to Extract data from different non-coherent systems, cleanse it, merge it and load into target systems.
Informatica: ETL (Extract-Transform-Load) tool
  • Extract data from multiple sources
  • Transform the data according to business logic you
     build in the client application
  • Load data into a centralized location, such as a data
     mart, data warehouse, or operational data store
     (ODS)
Informatica has 4 main components
1.Informatica Repository (Server Component)
Set of metadata tables within the repository database
2.Informatica Repository Server
(Server Component)
Manages connections to the repository from
client applications
3.Informatica Server (Server Component)
  Extracts source data, performs data
transformation & load the target
4.Informatica Client (Client Component)
Manage users, build mappings and create
workflows
  • Repository Manager
  • Designer
  • Workflow Manager
  • Workflow Monitor
Components of Informatica Client
Repository Manager
Manage the Informatica repository from the client machine
Designer
The Designer helps you create source definitions, target definitions, and
transformations to build a mapping
Workflow Manager
Allows user to execute a mapping created in the designer
Workflow Monitor
Tool that allows you to monitor workflows and tasks

The Repository Manager client tool allows you to navigate 
through multiple folders and repositories and perform basic
repository tasks such as
  • Create / delete repositories
  • User Administration
  • Monitoring locks
Demo of Repository Manager
Exercise
Create new user
Create new folder
Give access to user

The Designer provides the following tools:
  • Source Analyzer: Use to import or create source
    definitions
  • Warehouse Designer: Use to import or create target
    definitions
  • Transformation Developer: Use to create reusable
    transformations
  • Mapplet Designer: Use to create mapplets
  • Mapping Designer: Use to create mappings

1.Source databases
  • Relational: Oracle, Sybase, Informix, IBM DB2, MS SQL Server &Teradata
  • File: Fixed and delimited flat file, COBOL file, and XML.
  • Application: Additional PowerConnect products to access business sources such as PeopleSoft, SAP R/3, Siebel, IBM MQSeries, and Tibco.
  • Mainframe: PowerConnect for Mainframe (IBM DB2 on MVS).
  • Other: Microsoft Excel and Access.
2.Target databases
  • Relational: Oracle, Sybase, Sybase IQ, Informix, IBM DB2, MS SQL Server & Teradata.
  • File: Fixed and delimited flat file and XML.
  • Application: Additional PowerConnect products to load data into SAP BW, IBM MQSeries
  •   message queues and Tibco
Other: Microsoft Access

Demo of Informatica Designer
Exercise:
Import a source definition
Import a target definition
Create a basic mapping 

Informatica Client - Workflow Manager
To create a workflow, you first create tasks such as a session,
which contains the mapping you build in the Designer. Workflow 
Manager provides 3 tools:
  • Task Developer: Use the Task Developer to create tasks you
    want to execute in the workflow.
  • Workflow Designer: Use the Workflow Designer to create a
    workflow by connecting tasks with links. You can also create
    tasks in the Workflow Designer as you develop the workflow.
  • Worklet Designer: Use the Worklet Designer to create a
    worklet
 Informatica Client – Workflow Monitor
The Workflow Monitor consists of the following windows:
  • Navigator window: Displays monitored repositories, servers,
     and repositories objects
  • Output window: Displays messages from the Informatica
    Server and the Repository Server.
  • Time window: Displays progress of workflow runs.
  • Gantt Chart view: Displays details about workflow runs in
    chronological (Gantt Chart) format.
  • Task view: Displays details about workflow runs in a report
    format, organized by task, folder, or status
Demo of Informatica Workflow Manager and Workflow Monitor

Exercise:
Create a task
Create a workflow
Monitor a workflow

Specialized transformations
  1.     Source Qualifier    
  2.     Update Strategy 
  3.     Lookup transformation
  4.     Sequence Generator 
  5.     Expression transformation
  6.     Stored procedure transformation
   Transformation - Source Qualifier
The Source Qualifier represents the rows that the 
Informatica Server reads when it executes a session. 
(only 1 per mapping)
Join data originating from multiple source tables
Filter data from source tables (add WHERE clause)
Specify an outer join rather than the default inner join
Specify sorted ports: Add an ORDER BY clause to the

default SQL query
Select only distinct values from the source
Create a custom query to issue a special SELECT
Transformation – Update Strategy 
This transformation determines how to handle changes to
existing rows in the target
Within a session:
TREAT SOURCE ROWS AS

    INSERT (default)

    UPDATE

    DELETE

    DATA DRIVEN

Individual target tables

    INSERT (default)

    UPDATE AS UPDATE

    UPDATE AS INSERT

    UPDATE ELSE INSERT

    DELETE
Within a mapping:
Using Update transformation

Transformation – Lookup
Use a Lookup transformation in your mapping to look up data 
in a relational table, view, or synonym
Lookup table – Defines the database table to perform lookup operation
Ports – Define the table columns used for the lookup
Properties – Lookup policy, caching, tracing level
Condition – Lookup condition


Transformation – Sequence Generator
The Sequence Generator transformation generates 
numeric values
NEXTVAL / CURRVAL (default ports)
Properties
        Start value (default = 0)

   End value (default = 2147483647)

   Increment by (default = 1)

   Cycle
Transformation – Expression
Expression transformation are used to calculate values in a
single row before writing to the target
Character operations (Concatenation, Case, Trim etc)

Arithmetic operations (Absolute, Exponential, SQRT etc)

Conversion (TO_CHAR, TO_DATE etc)

Date operations (TO_DATE, ADD_TO_DATE etc)

Special operations (e.g. IIF / DECODE etc)

Test operations (IS_DATE, ISNULL, IS_NUMBER etc)

 Transformation – Stored Procedure
    Used to call Stored procedures within a mapping
    Connected mode: Every row of data in the Source Qualifier

 transformation passes data through the Stored Procedure transformation
    Unconnected mode: Stored procedure runs before or after the session

 and is not connected to the flow of data
    Type of Stored procedure
Normal (Row by row execution)
Pre-load of the Source: Before session retrieves data from source
Post-load of the Source: After session retrieves data from source
Pre-load of the Target: Before session loads data into target
Post-load of the Target: After session loads data into target
 Other Transformations
Aggregator Transformation: The Aggregator transformation allows you to perform 

aggregate calculations, such as averages and sums on data groups
External procedure Transformation: Use Transformation Exchange (TX) dynamic 

invocation interface to call COM components and DLLs
Filter Transformation:  Used to filter rows in a mapping
Joiner Transformation: Joiner transformation joins two related

heterogeneous sources
 residing in different locations or file systems

with at least one common port




Normalizer
 Transformation: Primarily used with COBOL sources which are

often stored
 in a denormalized format.
Rank Transformation: Allows you to select only the top or bottom rank of

data
 Router Transformation:Router transformation tests data for one or more 
conditionsand gives you the option to route rows of data that do not meet any

of  the conditions to a defaultoutput group.

     Sorter Transformation: The Sorter transformation allows you to sort

     data in 
ascending or descending order according to a specified sort key


     XML Source Qualifier:
  The XML Source Qualifier represents the data       
     elements that 
the Informatica Server reads when it executes a session

     with XML sources.


Basic Information

Informatica To ETL:

1)Loading Tables into sources


Report Generating Tools

1)BOXI

2)OBIEE

3)PENTAHO

4)TABLEAU

5)QLICKVIEW

Informatica software

Basics of ETL and Informatica


Basics of ETL and Informatica

Posted by Koti Reddy on September 10, 2014
Thought to provide some basics of ETL and Informatica PowerCenter 8.6
What is ETL?
ETL stands for Extraction, Transformation and Load. It is the process of transferring heterogeneous data from different sources to target system called as data warehouse as per the business requirements for different purpose like management reporting, data analysis etc. ETL Tools provide facility to extract data from different non-coherent systems, cleanse it, merge it and load into target systems.
Informatica is one of the powerful ETL tool, easy to work which supports all the steps of Extraction, Transformation and Load process. It has got a simple visual interface like forms in visual basic. You just need to drag and drop different objects (known as transformations) and design process flow for Data extraction transformation and load.
Main Components of Informatica Power Center
1. Repository:
Repository is the heart of Informatica tool. Repository is a kind of data inventory where all the data related to mappings, sources, targets etc is kept. All the client tools and Informatica Server fetch data from Repository.
2. Informatica PowerCenter Server:
All the executions take place in server, it makes physical connections to sources/targets, fetches data, applies the transformations mentioned in the mapping and loads the data in the target system.
3. Informatica PowerCenter Client Tools:
These tools enable a developer to define transformation process, known as mapping in Designer, define run-time properties for a mapping, known as sessions in Workflow Manager, monitor execution of sessions in Workflow Monitor and manage repository, useful for administrators in Repository Manager.
Clients Tools
Administration Console
It’s a web based client tool which is responsible for creating and managing services, manipulate services, manages to create users/groups and privileges. It creates repositories, services and initialization of the services.
It creates and maintains two important services 1. Repository Services – Interacts with Repository and 2. Integration Services – Main services which used to execute the process through repository services.
Repository Manager
It is responsible for creating the folders to manage the Meta data used by different users.
Designer
It is responsible for defining the flows between source system and target systems through transformation logics called as mappings. There are five components of Designer
1. Source Analyzer: Defines the source definition through ODBC connection.
2. Target Designer: Defines the target definition through ODBC connection
3. Transformation Developer: Transformations help to transform the source data according to the requirements of target system. Sorting, Filtering, Aggregation, Joining are some of the examples of transformation.
4. Mapplet Designer: A Mapplet is a reusable object that you create in the Mapplet Designer. It contains a set of transformations and lets you reuse the transformation logic in multiple mappings.
5. Mappings Designer: Mappings are developed in mappings designer. A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation.
Workflow Manager
Workflow helps to load the data from source to target in a sequential manner. A workflow is a set of instructions that tells the Integration service how to run tasks such as sessions, email notifications, and shell commands. After you create tasks in the Task Developer and Workflow Designer, you connect the tasks with links to create a workflow. There are three components of workflow manager
1. Task Developer: Defines the tasks like sessions, email, command. Session is a object of an repository which sets instructions to the server how to execute by carrying the mapping definitions and production connections.
2. Worklet Designer: It’s an object which carries set of sessions for execution in a serial or parallel flow.
3. Workflow Designer: it’s an object of repository which sets instructions to the server, when to execute the workflow.
Workflow Monitor
This monitor is helpful in monitoring and tracking the workflows created in each Power Center Server.
Simple steps to start with Informatica 8.6
1. Install the Informatica
2. Create a Repository in Admin Console
3. Create Integration Services in Admin Console
4. Connect to Repository Manager (First time, add the repository created and connect it, create the folder)
5. Work on the Designer to create Mappings
6. Work on the Workflow Manager to create workflow
7. Work on the Workflow Monitor to track the changes
- Koti Reddy