Scientific Computing

STATISTICA 9.1: Continued Excellence in Statistics and Data Mining
The breadth and depth of analytic capabilities is truly astounding 
John A. Wass, Ph.D. 

fig 1

Here we are again with the latest version of an immense software suite that is a workhorse and powerhouse of statistical analysis. Although it has its own programming language and can run R programs, it is menu-driven! For those of us who are programming-challenged (although your editor can code), this is truly a gift. The latest version contains not only an exhaustive analytic suite, upgraded data mining capabilities, and a variety of visualization and automation capabilities, but multi-user solutions as well. As always, it’s the breadth and depth of analytic capabilities that is truly astounding. The software runs on most modern versions of Windows (XP and above), and this review was done on Vista.

Before we delve into the testing and graphics, I need to proffer a word of advice for the casual user of any statistical/mathematical software. For those who use it as a more or less constant tool, as with all software, the more you practice, the more proficient you become. For the technician or scientist who devotes most of their time to laboratory work and only gets to the software at widely spaced intervals, it will be exceedingly difficult to master the software and become facile with any but the simplest statistics. I offer this advice because I had been away from STATISTICA since the last review (evaluation software goes dormant after 30-90 days) and, when reacquainting myself, found some procedures simple and very intuitive and others not. Let’s start with the easy stuff…

Exploratory statistics in this platform are as simple as 1-2-3! For a small data set of cholesterol readings, going to Statistics/Basic Statistics/Tables on the main menu produces a dialog box asking for the variables to be used, followed by this box for the analytic choices:

fig 2
 
This is pretty much the standard appearance of the choice boxes and they soon become old friends. A little experimentation indicates that the most information in the most digestible format comes with the “Graphical Comparative Summary Display” button:
fig 3

The Box-and-Whisker Plots are also useful:

fig 4
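
For readers who want to see what sits behind a descriptive summary and a box-and-whisker plot, here is a minimal sketch in Python using only the standard library. It is not STATISTICA code, and the cholesterol values are invented for illustration; they are not the data set used in this review. The function name `five_number_summary` is likewise hypothetical, and other quartile conventions will give slightly different Q1/Q3 values.

```python
import statistics

# Hypothetical cholesterol readings (mg/dL); NOT the data used in the review.
readings = [182, 195, 201, 210, 188, 240, 176, 223, 199, 205]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot is built from."""
    # "inclusive" treats the data as the whole population of interest;
    # this is one of several common quartile conventions.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


summary = {
    "n": len(readings),
    "mean": statistics.mean(readings),
    "sd": statistics.stdev(readings),  # sample standard deviation (n - 1)
    "five_number": five_number_summary(readings),
}
print(summary)
```

The box spans Q1 to Q3 with a line at the median; how far the whiskers extend depends on the plotting convention chosen.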

As with most software, just right-click anywhere within the graphic and a variety of display options appear. STATISTICA offers a nice menu of color/axis/labeling options. Below is a scatterplot/regression of the data:

fig 5
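
As a rough illustration of what a scatterplot with a fitted regression line represents, the following Python sketch computes an ordinary least-squares line and the Pearson correlation by hand. It is an assumption-laden example rather than anything produced by STATISTICA: the function `fit_line` and the age/cholesterol values are hypothetical and do not come from the review's data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data: age (years) vs. cholesterol (mg/dL); NOT from the review.
ages = [25, 32, 41, 48, 55, 63]
chol = [178, 185, 199, 204, 221, 230]
slope, intercept, r = fit_line(ages, chol)
print(f"cholesterol ~ {intercept:.1f} + {slope:.2f} * age (r = {r:.3f})")
```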

T-tests, ANOVA, regression, and correlation are just as easy. It was in the advanced linear and non-linear models that I ran into problems (admittedly due to lack of practice). In attempting a repeated measures multivariate analysis of variance (RM-MANOVA) with compound structure, I had near-intractable problems filling out the variable selection list and moving through the steps, and this despite a superb level of support from a help desk professional! Upon reflection, and after consulting the ‘Quick Reference Manual,’ it appeared to me that the data were being improperly imported: I was working with a worksheet rather than a workbook. All I had to do was go to the ‘Workbook/Use as active input’ menu selection and suddenly everything worked! It took mere seconds to get to the GLM output dialog:

fig 6

then check the Multivariate test boxes and push the Multiv. tests button in the Within effects area to get:
fig 7
The data could then be further analyzed by discriminant and cluster analysis:
fig 8
fig 9

STATISTICA also quickly produces stunning 3D graphics for yet another look at the data:

fig 10

Extensive as the statistical and graphical menus are (and you can always expand this with the programming language), its offerings in Data Mining are truly, for lack of a more scientific term, awesome! From the main menu:

fig 11

Now for the very pleasant surprise: Most software designers are attempting to make their products more useful by enhancing the ease-of-use features, and StatSoft is no exception. The first choice above, ‘Data Miner Recipes,’ is a highly intelligent Wizard that will complete an analysis of your data set once you have specified the variables and the test data set. Importing a chemical data set from Excel and specifying the role of each column took seconds. Then, by employing the Next Step/Run to Completion buttons, three separate mining techniques were applied to the data to generate the error summaries for prediction and the Lift Curves to explore maximum classification probability. Of course, the old-fashioned manual method leaves the analyst with more control and a nice diagram of the workings.

fig 12
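
For readers unfamiliar with the Lift Curves mentioned above, the idea can be sketched in a few lines of Python. This is a generic illustration of cumulative lift, not a description of how STATISTICA computes or plots it; the function `cumulative_lift`, the scores, and the labels are all hypothetical.

```python
def cumulative_lift(scores, labels, fractions=(0.2, 0.5, 1.0)):
    """Cumulative lift at each fraction of cases, ranked by descending score.

    Lift = response rate among the top-scored fraction / overall response rate.
    labels are 1 for the target class and 0 otherwise.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("at least one positive label is required")
    # Rank the cases from highest to lowest predicted score.
    ranked = [
        label
        for _, label in sorted(
            zip(scores, labels), key=lambda pair: pair[0], reverse=True
        )
    ]
    lift = {}
    for frac in fractions:
        k = max(1, round(frac * len(ranked)))  # number of top-ranked cases
        top_rate = sum(ranked[:k]) / k
        lift[frac] = top_rate / overall_rate
    return lift


# Hypothetical model scores and true classes; NOT from the review's data.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(cumulative_lift(scores, labels))
```

A lift above 1 in the top fractions means the model concentrates the target class among its highest-scored cases; at 100 percent of the data the lift is 1 by construction.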

Summary
As usual, the capabilities are extensive, the power of the advanced and data mining modules is almost breathtaking, and the developers are attempting to flatten the learning curve. The above brief summary is but a minute sampling of the software’s total power.

My only suggestions boil down to making the graphics easier to use (there are too many steps to label the points on the graphs themselves) and highlighting the correct data format upfront, prior to doing analyses. For the price, this package packs a wallop. For both scientific and business types, there is much to recommend here.

Availability
$1,995 advanced + QC $1,995 single user, commercial
StatSoft
2300 East 14th St., Tulsa, OK 74104
918-749-1119; Fax: 918-749-2217
info@statsoft.com; www.statsoft.com

John Wass is a statistician based in Chicago, IL. He may be reached at editor@ScientificComputing.com.    


Scientific Computing
Advantage Business Media
Rockaway NJ 07866

Advantage Business Media © 2010 Advantage Business Media