Youtube Channel

Aggregator Transformation



1. What is aggregator transformation?

Aggregator transformation performs aggregate calculations like sum, average, count etc. It is an active transformation, changes the number of rows in the pipeline. Unlike expression transformation (performs calculations on a row-by-row basis), an aggregator transformation performs calculations on group of rows.

2. What is aggregate cache?

The integration service creates index and data cache in memory to process the aggregator transformation and stores the data group in index cache, row data in data cache. If the integration service requires more space, it stores the overflow values in cache files.

3. How can we improve performance of aggregate transformation?

·  Use sorted input: Sort the data before passing into aggregator. The integration service uses memory to process the aggregator transformation and it does not use cache memory.
·  Filter the unwanted data before aggregating.
·  Limit the number of input/output or output ports to reduce the amount of data the aggregator transformation stores in the data cache.

4. What are the different types of aggregate functions?

The different types of aggregate functions are listed below:

·                     AVG
·                     COUNT
·                     FIRST
·                     LAST
·                     MAX
·                     MEDIAN
·                     MIN
·                     PERCENTILE
·                     STDDEV
·                     SUM
·                     VARIANCE

5. Why cannot you use both single level and nested aggregate functions in a single aggregate transformation?

The nested aggregate function returns only one output row, whereas the single level aggregate function returns more than one row. Since the numbers of rows returned are not same, you cannot use both single level and nested aggregate functions in the same transformation. If you include both the single level and nested functions in the same aggregator, the designer marks the mapping or mapplet as invalid. So, you need to create separate aggregator transformations.

6. up to how many levels, you can nest the aggregate functions?

We can nest up to two levels only.
Example: MAX (SUM (ITEM))

7. What is incremental aggregation?
The integration service performs aggregate calculations and then stores the data in historical cache. Next time when you run the session, the integration service reads only new data and uses the historical cache to perform new aggregation calculations incrementally.

8. Why cannot we use sorted input option for incremental aggregation?

In incremental aggregation, the aggregate calculations are stored in historical cache on the server. In this historical cache the data need not be in sorted order.  If you give sorted input, the records come as presorted for that particular run but in the historical cache the data may not be in the sorted order. That is why this option is not allowed.

9. How the NULL values are handled in Aggregator?

 
You can configure the integration service to treat null values in aggregator functions as NULL or zero. By default the integration service treats null values as NULL in aggregate functions.

Q37. How an Expression Transformation differs from Aggregator Transformation?
Ans. An Expression Transformation performs calculation on a row-by-row basis. An Aggregator Transformation performs calculations on groups.

Q38. Does an Informatica Transformation support only Aggregate expressions?
Ans. Apart from aggregate expressions Informatica Aggregator also supports non-aggregate expressions and conditional clauses. 

Q39. How does Aggregator Transformation handle NULL values? 
Ans. By default, the aggregator transformation treats null values as NULL in aggregate functions. But we can specify to treat null values in aggregate functions as NULL or zero. 

Q40. What is Incremental Aggregation? 
Ans. We can enable the session option, Incremental Aggregation for a session that includes an Aggregator Transformation. When the Integration Service performs incremental aggregation, it actually passes changed source data through the mapping and uses the historical cache data to perform aggregate calculations incrementally. 

Q41. What are the performance considerations when working with Aggregator Transformation? 
  Filter the unnecessary data before aggregating it. Place a Filter transformation in the mapping before the Aggregator transformation to reduce unnecessary aggregation. 
  Improve performance by connecting only the necessary input/output ports to subsequent transformations, thereby reducing the size of the data cache. 
  Use Sorted input which reduces the amount of data cached and improves session performance. 

Q42. What differs when we choose Sorted Input for Aggregator Transformation? 
Ans. Integration Service creates the index and data caches files in memory to process the Aggregator transformation. If the Integration Service requires more space as allocated for the index and data cache sizes in the transformation properties, it stores overflow values in cache files i.e. paging to disk. One way to increase session performance is to increase the index and data cache sizes in the transformation properties. But when we check Sorted Input the Integration Service uses memory to process an Aggregator transformation it does not use cache files. 

Q43. Under what conditions selecting Sorted Input in aggregator will still not boost session performance? 
  Incremental Aggregation, session option is enabled. 
  The aggregate expression contains nested aggregate functions. 
  Source data is data driven. 

Q44. Under what condition selecting Sorted Input in aggregator may fail the session? 
  If the input data is not sorted correctly, the session will fail. 
  Also if the input data is properly sorted, the session may fail if the sort order by ports and the group by ports of the aggregator are not in the same order. 

Q45. Suppose we do not group by on any ports of the aggregator what will be the output. 
Ans. If we do not group values, the Integration Service will return only the last row for the input rows. 

Q46. What is the expected value if the column in an aggregator transform is neither a group by nor an aggregate expression? 
Ans. Integration Service produces one row for each group based on the group by ports. The columns which are neither part of the key nor aggregate expression will return the corresponding value of last record of the group received. However, if we specify particularly the FIRST function, the Integration Service then returns the value of the specified first row of the group. So default is the LAST function. 

Q47. Give one example for each of Conditional Aggregation, Non-Aggregate expression and Nested Aggregation. 
Use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
SUM( SALARY, JOB = CLERK )
Use non-aggregate expressions in group by ports to modify or replace groups.
IIF( PRODUCT = Brown Bread, Bread, PRODUCT )
The expression can also include one aggregate function within another aggregate function, such as:
MAX( COUNT( PRODUCT )) 

Q48. Under what condition selecting Sorted Input in aggregator may fail the session? 
1. If the input data is not sorted correctly, the session will fail. 
2. Also if the input data is properly sorted, the session may fail if the sort order by ports and the group by ports of the aggregator are not in the same order.




0 comments: