It's been a while since I've cranked up my n-dimensional data generator, but I'm probably about to update it again. Now that I'm playing with the new Essbase ASO, I've run out of interesting old outlines to test for performance, so I have to think up crazy stupid models. The next one I'm going to try will have three really fat dimensions: half a million customers, 50k products, and 30k locations. Plus I may throw in some attributes. For example, in my fakeData repository I've got 900 colors, although now that I think about it, I could have easily mocked up RGB combinations.
FYI, here's a snippet of code:
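# Three-dimension case: write one record per member combination
if ($ndims==3) {
    for ($i=0; $i<$top[0]; $i++) {
        for ($j=0; $j<$top[1]; $j++) {
            for ($k=0; $k<$top[2]; $k++) {
                # Value: product of the three member factors, scaled by $adj, truncated, plus 1
                $num = int($df[0][$i]*$df[1][$j]*$df[2][$k]*$adj)+1;
                # Random date taken from slots 1..36 of @dates
                $dat = $dates[int(rand 36)+1];
                # Tab-delimited record: three members, date, value
                $record = "$topdim[0][$i]\t$topdim[1][$j]\t$topdim[2][$k]\t$dat\t$num\n";
                print OUT $record;
            }
        }
    }
    close OUT;
    print "done.\n";
}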
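The snippet leans on a bunch of arrays that get set up elsewhere, so here is a minimal sketch of one way to fill them in if you want to run the loop yourself. Everything in the setup is a stand-in invented for illustration: the tiny dimension sizes, the Cust/Prod/Loc name prefixes, the random weights in @df, the value of $adj, and the 2004 to 2006 month labels. The make_record helper is also hypothetical; it just wraps the inner-loop logic so it can be called on its own. This version prints to STDOUT instead of the OUT filehandle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in setup: sizes, prefixes, and scaling are made up for illustration.
my @top    = (3, 2, 2);                  # members per dimension
my @prefix = ('Cust', 'Prod', 'Loc');    # hypothetical member name prefixes
my $adj    = 100;                        # hypothetical scaling factor

# Build member names and a random weight in [0,1) for each member.
my (@topdim, @df);
for my $d (0 .. $#top) {
    for my $m (0 .. $top[$d] - 1) {
        $topdim[$d][$m] = sprintf '%s%04d', $prefix[$d], $m + 1;
        $df[$d][$m]     = rand();
    }
}

# 36 month labels stored at indexes 1..36, so int(rand 36)+1 always hits one.
my @dates = (undef);
for my $y (2004 .. 2006) {
    push @dates, sprintf('%04d-%02d', $y, $_) for 1 .. 12;
}

# Build one tab-delimited record for a member combination (same math as the loop).
sub make_record {
    my ($i, $j, $k) = @_;
    my $num = int($df[0][$i] * $df[1][$j] * $df[2][$k] * $adj) + 1;
    my $dat = $dates[int(rand 36) + 1];
    return join("\t", $topdim[0][$i], $topdim[1][$j], $topdim[2][$k], $dat, $num) . "\n";
}

# Full cross product of the three dimensions, one record per combination.
for my $i (0 .. $top[0] - 1) {
    for my $j (0 .. $top[1] - 1) {
        for my $k (0 .. $top[2] - 1) {
            print make_record($i, $j, $k);
        }
    }
}
```

Note that a full cross product gets big in a hurry: with the dimension sizes I mentioned up top, every combination would be far too many rows to write out, so the real run has to be pickier about which combinations it emits.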
Once that's done, I'm going to hook it up to my Hyperion Visual Explorer and see what I can see. The next step will be to download some publicly available Census data and try a few of those datasets on for size.
Wouldn't it be interesting if Tableau were already looking at the next level of visualization? Not that they have to; they've already been scooping up awards. Still, I'm trying to think of ways to visualize the actual sparsity of datasets themselves. I'm sure more ideas will pop up as I get into some super dimensional models.
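Before visualizing sparsity, it helps to have a number for it. Here is a rough sketch of one possible measure: the fraction of possible member combinations that actually show up in the data. The density function and its arguments are hypothetical names of my own, and it only counts members that appear in the records, so members that exist in the outline but never get any data are ignored.

```perl
use strict;
use warnings;

# Fraction of possible member combinations that actually have data.
# $rows:  array ref of records, each an array ref of fields
# $ndims: how many leading fields are dimension members
sub density {
    my ($rows, $ndims) = @_;
    return 0 unless @$rows;

    my (%cells, @members);
    for my $r (@$rows) {
        my @key = @{$r}[0 .. $ndims - 1];
        $cells{ join "\t", @key } = 1;                      # distinct populated cells
        $members[$_]{ $key[$_] } = 1 for 0 .. $ndims - 1;   # distinct members per dimension
    }

    # Possible cells = product of the distinct member counts.
    my $possible = 1;
    $possible *= scalar(keys %{ $members[$_] }) for 0 .. $ndims - 1;

    return scalar(keys %cells) / $possible;
}

# Three populated cells out of a possible 2 x 2 grid.
my @rows = ( ['A', 'X', 10], ['A', 'Y', 20], ['B', 'X', 30] );
printf "density: %.2f\n", density(\@rows, 2);   # prints "density: 0.75"
```

Run per pair of dimensions, a number like this could feed a simple heat map of where a model is dense and where it is mostly air.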
That reminds me. The biggest crazy model I ever had to deal with was a survey database for Coke. They looked at dimensions like 'Purchase Container' and 'Consumption Container'. They paid attention not only to where you bought the drink but also to whether you transferred it out of the container you bought it in and whether you drank it there or not. At the time I built it, 10,000 source records blew the limits of Essbase, so we had to use very smart sampling. Today I could probably knock this out in a heartbeat.
Anyway, I'm looking forward to building new massively dimensional models (like 30+ dimensions) in ASO for demographics and whatnot, and doing my mining visually. Plus I'm going to try some visual mining of more traditional retail affinities with fat dims and lots of attributes.