Youtube Channel

Joiner Transformation

1. What is a joiner transformation?
A joiner transformation joins two heterogeneous sources. You can also join the data from the same source. The joiner transformation joins sources with at least one matching column. The joiner uses a condition that matches one or more joins of columns between the two sources.

2. How many joiner transformations are required to join n sources?
To join n sources n-1 joiner transformations are required.

3. What are the limitations of joiner transformation?
You cannot use a joiner transformation when input pipeline contains an update strategy transformation.
You cannot use a joiner if you connect a sequence generator transformation directly before the joiner.

4. What are the different types of joins?
Normal join: In a normal join, the integration service discards all the rows from the master and detail source that do not match the join condition.
Master outer join: A master outer join keeps all the rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
Detail outer join: A detail outer join keeps all the rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
Full outer join: A full outer join keeps all rows of data from both the master and detail rows.

5.What is joiner cache?
When the integration service processes a joiner transformation, it reads the rows from master source and builds the index and data cached. Then the integration service reads the detail source and performs the join. In case of sorted joiner, the integration service reads both sources (master and detail) concurrently and builds the cache based on the master rows.

6.How to improve the performance of joiner transformation?
Join sorted data whenever possible.
For an unsorted Joiner transformation, designate the source with fewer rows as the master source.
For a sorted Joiner transformation, designate the source with fewer duplicate key values as the master source.

7.Why joiner is a blocking transformation?
When the integration service processes an unsorted joiner transformation, it reads all master rows before it reads the detail rows. To ensure it reads all master rows before the detail rows, the integration service blocks all the details source while it caches rows from the master source. As it blocks the detail source, the unsorted joiner is called a blocking transformation.
 

8.What are the settings used to configure the joiner transformation
Master and detail source
Type of join
Join condition

9.Why Joiner Transformation t is an Active one?
The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the type of join selected, the Integration Service either adds the row to the result set or discards the row.
The Joiner transformation produces result sets based on the join type, condition, and input data sources. Hence it is an Active transformation.

10.Out of the two input pipelines of a joiner, which one will you set as the master pipeline?
To improve performance for an Unsorted Joiner transformation, use the source with fewer rows as the master source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
To improve performance for a Sorted Joiner transformation, use the source with fewer duplicate key values as the master source.
11.How does Joiner transformation treat NULL value matching?
 The Joiner transformation does not match null values.
For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not consider them a match and does not join the two rows.
To join rows with null values, replace null input with default values in the Ports tab of the joiner, and then join on the default values.
Note: If a result set includes fields that do not contain data in either of the sources, the Joiner transformation populates the empty fields with null values. If we know that a field will return a NULL and we do not want to insert NULLs in the target, set a default value on the Ports tab for the corresponding port.
12.Suppose we configure Sorter transformations in the master and detail pipelines with the following sorted ports in order: ITEM_NO, ITEM_NAME, and PRICE.
If we have sorted both the master and detail pipelines in order of the ports say ITEM_NO, ITEM_NAME and PRICE we must ensure that:
· Use ITEM_NO in the First Join Condition.
· If we add a Second Join Condition, we must use ITEM_NAME.
· If we want to use PRICE as a Join Condition apart from ITEM_NO, we must also use ITEM_NAME in the Second Join Condition.
· If we skip ITEM_NAME and join on ITEM_NO and PRICE, we will lose the input sort order and the Integration Service fails the session.

13.What are the transformations that cannot be placed between the sort origin and the Joiner transformation so that we do not lose the input sort order? 
The best option is to place the Joiner transformation directly after the sort origin to maintain sorted data. However do not place any of the following transformations between the sort origin and the Joiner transformation:
    Custom
    UnsortedAggregator
    Normalizer
    Rank
    Union transformation
    XML Parser transformation
    XML Generator transformation
            Mapplet [if it contains any one of the above mentioned transformations]
14.Describe the scenarios where we go for Joiner transformation instead of Source Qualifier transformation.
While joining Source Data of heterogeneous sources as well as to join flat files we will use the Joiner transformation. Use the Joiner transformation when we need to join the following types of sources:
Join data from different Relational Databases.
Join data from different Flat Files.
Join relational sources and flat files.


0 comments: