M. Bowen
November, 1999
---
This is the first in a series of tools to enhance the data-management capabilities of Essbase. Together they will form a suite of backend tools which will enable the e-business.
The Bucketmaster will fit into the context of API-based tools which work on Essbase components. These tools should be developed outside of the core product as if they were third-party tools. They can later be integrated as options enabled by our license.
The purpose of this tool is to perform a one-off or two-off analysis of a stream of dimensional data. It should allow DBAs to quickly bucketize a flat dimension into either a dimensional hierarchy or a virtual attribute dimension. The source can be a flat file, another stream directed to it via the rules editor, or an actual built Essbase index, against which the sparsity of a single dimension can be judged. It should also be able to run directly against relational sources.
The overall goal is to give the multidimensional DBA the opportunity to optimize the design of the Essbase database based on the real data provided. It should be possible to perform this analysis before or after loading data into Essbase.
Part of the Bucketizer is a Tokenizer. This tokenizer should be able to create a surrogate key against a single dimensional source which can be output back into a relational database or flat file. This will allow Essbase to quickly process odd files.
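The tokenizer idea can be sketched in a few lines. This is only an illustration and not the Bucketizer's actual design: the function name `tokenize`, the first-seen key order, and the choice to start keys at 1 are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple


def tokenize(members: Iterable[str]) -> Tuple[Dict[str, int], List[int]]:
    """Assign each distinct member a small integer surrogate key.

    Hypothetical helper: keys are handed out in first-seen order,
    starting at 1. Returns the member-to-key map and the input
    stream re-expressed as surrogate keys.
    """
    key_of: Dict[str, int] = {}
    encoded: List[int] = []
    for member in members:
        if member not in key_of:
            key_of[member] = len(key_of) + 1
        encoded.append(key_of[member])
    return key_of, encoded


if __name__ == "__main__":
    key_of, encoded = tokenize(["SKU-9", "SKU-2", "SKU-9", "SKU-7"])
    print(key_of)   # the lookup table to write back to a table or flat file
    print(encoded)  # the source stream expressed as surrogate keys
```

The key map is what would be written back to the relational database or flat file, and the encoded stream is what would be handed on for loading.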
Outputs of the bucketizer will be:
Single Dimension Operations - Against Counts
1. An 'equal bucket' flat-line distribution of source keys in the dimension, with equally sized buckets in a 1-to-n distribution.
2. Size constraint distribution of source keys in the dimension. Buckets would all be of equal
3. Bell Curve distribution of source keys in the dimension.
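As a rough illustration of the first output above, the 'equal bucket' distribution might be computed as follows. The function name `equal_buckets` is hypothetical, and the sketch assumes that keys are de-duplicated and sorted before being split, which the text does not specify.

```python
from typing import Hashable, Iterable, List


def equal_buckets(keys: Iterable[Hashable], n_buckets: int) -> List[list]:
    """Split the distinct keys into n_buckets groups of near-equal size.

    Assumption: keys are de-duplicated and sorted first. When the key
    count is not divisible by n_buckets, the leading buckets each
    receive one extra key.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    ordered = sorted(set(keys))
    base, extra = divmod(len(ordered), n_buckets)
    buckets: List[list] = []
    start = 0
    for i in range(n_buckets):
        size = base + (1 if i < extra else 0)
        buckets.append(ordered[start:start + size])
        start += size
    return buckets


if __name__ == "__main__":
    for number, bucket in enumerate(equal_buckets(range(1, 11), 3), start=1):
        print(number, bucket)
```

Each bucket could then become a parent member in a generated hierarchy, or a member of an attribute dimension, with the original keys as its children.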
Single Dimension Operations - Against Measures
or the same thing against a second dimension.
2. A bell curve distribution in the form of a dimensional hierarchy or virtual attribute dimension.
3. An 80/20 (adjustable) distribution of dimensional data against a pre-defined or dynamic measure.
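The adjustable 80/20 split against a measure could look something like the sketch below. The name `pareto_split`, the `head_share` parameter, and the rule of ranking members by descending measure and cutting once the cumulative share is reached are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def pareto_split(measure_by_member: Dict[str, float],
                 head_share: float = 0.8) -> Tuple[List[str], List[str]]:
    """Separate the members that account for head_share of the measure.

    Hypothetical rule: rank members by descending measure and keep
    adding them to the head bucket until the running total reaches
    head_share of the grand total. Everything else is the tail.
    """
    total = sum(measure_by_member.values())
    ranked = sorted(measure_by_member.items(),
                    key=lambda item: item[1], reverse=True)
    head: List[str] = []
    tail: List[str] = []
    running = 0.0
    for member, value in ranked:
        if running < head_share * total:
            head.append(member)
            running += value
        else:
            tail.append(member)
    return head, tail


if __name__ == "__main__":
    sales = {"A": 60, "B": 25, "C": 8, "D": 4, "E": 3}
    head, tail = pareto_split(sales)
    print("head:", head)
    print("tail:", tail)
```

Changing `head_share` gives the adjustable version, for example 0.9 for a 90/10 split.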
Dual Dimension Operations - (Sparsity Management)
4. A graphical output of the distribution of data within an Essbase cube for a two-dimensional combination, or the same against a relational source. In this regard I am envisioning a two-dimensional field with red dots signifying the hits within the field. This needn't be done by measure, but color could represent a third dimension. It could also be done in three dimensions, which could add some value as a visualization tool.
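A text-only stand-in for the two-dimensional field described above is sketched here. It assumes the hits arrive as (row member, column member) pairs. The name `sparsity_grid` and the `*` and `.` markers are illustrative, and a real tool would draw the dots graphically.

```python
from typing import Iterable, List, Sequence, Tuple


def sparsity_grid(hits: Iterable[Tuple[str, str]],
                  rows: Sequence[str],
                  cols: Sequence[str]) -> Tuple[List[str], float]:
    """Render which member combinations hold data, plus the density.

    Hypothetical helper: '*' marks a combination with at least one
    hit and '.' marks an empty one. Density is the fraction of
    row/column combinations that are occupied.
    """
    occupied = set(hits)
    lines = ["".join("*" if (r, c) in occupied else "." for c in cols)
             for r in rows]
    cells = len(rows) * len(cols)
    filled = sum(1 for r in rows for c in cols if (r, c) in occupied)
    density = filled / cells if cells else 0.0
    return lines, density


if __name__ == "__main__":
    rows = ["East", "West"]
    cols = ["Q1", "Q2", "Q3"]
    hits = [("East", "Q1"), ("West", "Q3"), ("East", "Q1")]
    lines, density = sparsity_grid(hits, rows, cols)
    for label, line in zip(rows, lines):
        print(f"{label:5} {line}")
    print(f"density: {density:.2f}")
```

A low density for a pair of dimensions suggests the combination is sparse, which is the judgment this output is meant to support.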