The best fun in being a hacker and a crank is that you have this internal notion of what you should do that makes perfect sense, but you can't explain it. My new habit is to use Audacity and explain it to myself for the times I get back to needing that code several years later.
For example. I know that I have some code that I put on a timer to go check and see if the Hyperion Reports server is up and running. I forget the guy who wrote it or what modifications I made to it. I just know it's in a big pile somewhere in my new All Customers archive. But I think of it every time I need historical fx data because I know it was hacked out of a general purpose web snarfer. I had that written and carried over from another customer. And interestingly enough I just remembered that part as I wrote that sentence.
Well I just finished unit testing a new Essbase back-end. This one is actually fairly cool. One of my favorite things to do on backends are 'fabs' - little perl programs that fabricate other scripts and then execute them. So this fab generates a calling program, a wrapper around a stored procedure with a couple parameters that I craft.
The back end is like this. I've got a weird proprietary HR database with a wide view of about 25 columns, about a half-dozen demographics, 3 or four geos, 4 dates, some titles and grades, position numbers, and salary. It's small, only about 500 employees in the table, but enough master data to generate 17 dimensions. The HR Source is on super lockdown, and therefore I can't query directly into the transaction system and the specified view even goes through another database with separate security. So basically my account has access down to an absolute minimum. All of that is on Server A. Transacation database, and a 'tunnelvision' database.
I get a push of the tunnelvision view data into my ODS which is on a slightly different form of lockdown. Bascially just an administrative account on the server plus one more account which belongs to SQL Server. The source is pushed into a table. Now I have a workplace to create a bunch of status flags which give various date fields multidimensional meaning. Plus I calculate a tenure metric for calculation of vacation days. And of course I have the facility to incorporate any new business rules necessary.
The tunnelvision is stateless so I have to stamp it - and that's the fun part. I hacked one of my old code archiving programs which gives me incremental version numbers and a date stamp into a new cut field in a historical table. I just pile on that whenever I initiate a manual refresh into the cube.
So I've got one wide table that updates 17 dimension tables, one narrow minimum fact table (there's just one fk that ties to most of the attribute dims) and one historical table. I have a fab that creates the calling SQL for a five step stored procedure and takes parameters from the same date facility that updates the substitution variables.
The coolest thing is that I have the ability to either let the batch program run this whole thing or I can initiate it from the user desktop. That was done via some .NET programming by a guy on staff here. Just double-clich on an icon and a .NET program initiates my batch file from a browser. The only problem with that is that the path changes when it runs from IIS instead of my Essbase batch subdirectory, so I can't use any relative pathing internal to my scripts. But now end users can be on the phone with HR, know that they are about to commit a change to their system and can initiate a refresh of the whole shebang end to end in under 5 minutes.
Stored procedures have generally been a mystery to me precisely because DBAs are jealous of their positions. So being the multidimensional guy I never get access to do with SQL what I would generally want to do - I just provide the specs for the interface, they disappear for weeks and come back grumbling. But it turns out that they are super simple - at least they are in SQL 2K5 Express which is the freebie I use for my ODS.
The most difficult thing to do here was pass the security review. It's a huge help that MaXL now has the security hashing that encrypts your IDs and pws, but my server was closed enough to not require that. I really dig dealing with streams of data like this, mixing perl, SQL and Essbase calcs to each optimize that part of the final picture the end user is presented. It was a great discipline to learn how to manage those kinds of streams independently and asynchronously as they dropped into my world's inbucket. Now being able to do this on a standard batch schedule and on demand using the same code is the new cool thing at my customer.
I was just curious is you're going to tie in your Java-Based outline reader (and data loader) for any automated unit testing purposes?
I was (and still) a web programmer turned Essbase Consultant. In the web world I use nUnit with C# to perform unit testing. I was hoping to take the same methodology to the Essbase world.
Your thoughts?
Posted by: Ed SoftwareEssGeek | February 09, 2009 at 10:33 AM
I'm not familiar with nUnit. Unit testing with Essbase is relatively straightforward. With the average database, what you are doing primarily is reverse engineering spreadsheet macros and calculating numbers on the server side, rather than the client side. So more often than not a simple financial reconciliation is all that's required.
With performance testing, it is actually rather rare that customers will demand a rigorous test plan. There is a checklist of about a dozen things that make the biggest difference, two dozen for old pros. So there aren't many test tools that are setup for Essbase. I've hooked it up to LoadRunner and done a full formal suite of tests - the customer provided 3 standard data sets and that was all we needed.
Somewhere out there are the specifications for the APB-1, which was the OLAP benchmark when there were more engines out there and people were first experiencing OLAP. Nobody pays attention to those, although the people who built them are still around.
I think that most people throw partitions and hardware at Essbase performance problems at the upper end of the application size. 9 times out of 10, there is more data than people can consume.
For my two cents, I think the biggest difficulty with Essbase performance is not Essbase, but with data streams, and consumability. Which is to say understanding how to rightsize a datamart within the context of an area for analysis is the trick. How much do you do in ETL, how much do you do in the DW, how much do you do in Essbase? A well designed Essbase cube easily puts every query imaginable across 4GB of data in the hands of end-users in under 5 seconds. And on a 64bit dedicated server, there's no reason not to have at least double that in memory. Since Essbase uses a virtual memory model and pages data, a 40GB single cube works fine in that size blade. So scalability is a breeze.
The difficulty is something that most IT folks have not deal with that more advanced web guys have - what do you do when you have more than 500 concurrent users. The overwhelming majority of analytic applications never go near that number. So we don't know much about sharding, port sharing, 24/7 availability, hot swapping, or map-reduce. And we almost never think about that on the back-end. So there is a relatively empty area in OLAP where strategies for dealing with very large dataflows from the back-end are presented to a finely scaled and redundant datamart layer.
I think that is because of the rather small number of applications marketed for multidimensional databases. I don't know if that space will ever expand. I don't know how many people think BIG for multidimensional apps.
Here's an example of big. It would be fairly straightforward for a small team of say 8, to develop from scratch, an application that serves up the 10Qs for the S&P 500. So whatever Obama's TARP plan would be, Essbase would be perfect to present the transparency to the public. We could scale it from there. More detail on the financial side, more availability on the query side. Technically, the guts of the application is very easy despite its size.
Posted by: M Bowen | February 10, 2009 at 08:15 AM