Step-3: Classification of Aggregation Functions

  • How hard is it to compute an aggregate from sub-aggregates?

 

    • Three classes of aggregates:

 

      • Distributive
        • Compute aggregate directly from sub-aggregates
        • Examples: MIN, MAX, COUNT, SUM

 

      • Algebraic
        • Compute aggregate from constant-sized summary of subgroup
        • Examples: STDDEV, AVERAGE
        • For AVERAGE, summary data for each group is SUM, COUNT

 

      • Holistic
        • Require unbounded amount of information about each subgroup
        • Examples: MEDIAN, COUNT DISTINCT
        • Usually impractical for a data warehouse!
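
The distributive and algebraic cases above can be sketched in a few lines of Python. This is only an illustrative sketch: the partition data, the `AvgSummary` class and all variable names are hypothetical and not part of the lecture. It shows that SUM, COUNT and MAX can be rebuilt directly from sub-aggregates (note that the overall COUNT is the sum of the sub-counts), and that AVERAGE can be rebuilt from the constant-sized summary (SUM, COUNT), whereas averaging the sub-averages gives a different, incorrect answer.

```python
from dataclasses import dataclass

# Hypothetical fact values, already split into three subgroups (e.g. by region).
partitions = [[4, 8], [6], [1, 2, 3]]

# Distributive: combine the sub-aggregates directly.
sub_sums = [sum(p) for p in partitions]
sub_counts = [len(p) for p in partitions]
total_sum = sum(sub_sums)        # SUM of the sub-SUMs
total_count = sum(sub_counts)    # COUNT is the SUM of the sub-COUNTs
total_max = max(max(p) for p in partitions)  # MAX of the sub-MAXes


# Algebraic: keep a constant-sized summary per subgroup and merge summaries.
@dataclass
class AvgSummary:
    total: float
    count: int

    def merge(self, other: "AvgSummary") -> "AvgSummary":
        return AvgSummary(self.total + other.total, self.count + other.count)

    def result(self) -> float:
        return self.total / self.count


summaries = [AvgSummary(sum(p), len(p)) for p in partitions]
merged = summaries[0]
for s in summaries[1:]:
    merged = merged.merge(s)
overall_avg = merged.result()    # 24 / 6 = 4.0

# Naive approach: averaging the sub-averages weights every subgroup equally,
# regardless of its size, so it does not match the true average here.
naive_avg = sum(sum(p) / len(p) for p in partitions) / len(partitions)

print(total_sum, total_count, total_max, overall_avg, naive_avg)
```

The summary stays the same size (two numbers) no matter how many rows a subgroup holds, which is what makes the algebraic class practical to pre-aggregate.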

 

We see that calculating aggregates from existing aggregates is desirable, but it is not directly possible for non-additive facts. So we deal with three types of aggregates. Distributive aggregates are additive in nature, so they can be computed directly from sub-aggregates. Algebraic aggregates are non-additive in nature, so they have to be computed from a summary of each subgroup (for example, SUM and COUNT for AVERAGE) to avoid the problem of incorrect results. Then, of course, there are the holistic aggregates, such as the median or the number of distinct values, which depend on the complete data of each subgroup rather than on a fixed-size summary. Such aggregates are not desirable in a data warehouse environment because computing them requires a complete scan of the data, which is highly undesirable as it consumes a lot of time.
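
To see why holistic aggregates resist pre-aggregation, consider the following small sketch. The data and variable names are made up for illustration. It shows that neither the median of sub-medians nor the sum of per-subgroup distinct counts reproduces the answer obtained by scanning all rows.

```python
from statistics import median

# Hypothetical fact values split into three subgroups.
partitions = [[1, 2, 100], [3, 4, 5], [6, 7, 8]]
all_rows = [x for p in partitions for x in p]

# MEDIAN: combining sub-medians does not give the overall median.
true_median = median(all_rows)
sub_medians = [median(p) for p in partitions]
median_of_medians = median(sub_medians)

# COUNT DISTINCT: values shared between subgroups get counted more than once.
regions = [["a", "b"], ["b", "c"], ["c"]]
sub_distinct = [len(set(r)) for r in regions]
summed_distinct = sum(sub_distinct)
true_distinct = len(set().union(*regions))

print(true_median, median_of_medians)   # the two values differ
print(true_distinct, summed_distinct)   # the two values differ
```

Getting the exact answer here means carrying the full set of values (or all rows) for every subgroup, and that state grows with the data instead of staying constant-sized.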

 

     

 

 
