Friday, July 16, 2010

Getting down to what matters

This fifth week of Immersion Term saw the lift-off of my summer project with Dr. William Frayer, which involves statistical analysis of data on patients from the neonatal ICU. It was a very productive week: on Monday we had only a rough idea of what needed to be done, but by Friday we had obtained a copy of SAS (very expensive statistical analysis software made by SAS Institute Inc.), I had learned the required syntax, and we had started generating some meaningful statistics. Along the way I expanded my own knowledge of statistical techniques. Now for some fun details...

The first goal of the project is to determine which patient factors correlate strongly with a low mental developmental index (MDI, measured at approximately 1 year of age) in low birth weight babies. The second goal is to use these factors to create a discriminant function that will allow physicians to make predictions about the MDI of their patients. A discriminant function is a linear function that sums the products of each independent variable with its coefficient (a weight determined through canonical discriminant analysis). The discriminant function can be created from any data set containing a categorical dependent variable (the variable you want to make a prediction about) and several independent variables (the predictors). This set is called the training set. To validate the function, a second data set with the same variables (the test set) is needed. The test set is run through the discriminant function, the predictions made by the function are compared to the actual values of the dependent variable in the test set, and the frequency of Type I and Type II errors (false positives and false negatives respectively) can be determined. Low error frequencies mean that the discriminant function is a good predictor of the dependent variable.
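For anyone curious what the training and validation steps look like in practice, here is a rough sketch in Python rather than SAS. It is not our actual analysis: the two predictors (birth weight in kg and days on a ventilator), every number in the data, and all of the function names are made up for illustration. It also assumes only two classes (low MDI vs. not low) with equal prior probabilities, and fits the weights with Fisher's two-class linear discriminant.

```python
# Toy two-class linear discriminant (Fisher's method) in pure Python.
# All data are invented for illustration; this is NOT the NICU data set.
# Assumed predictors: (birth weight in kg, days on ventilator).

def mean_vector(rows):
    """Column-wise mean of a list of feature tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_covariance(groups, means):
    """Pooled within-class covariance matrix for two predictors (2x2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    total = 0
    for rows, m in zip(groups, means):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        total += len(rows)
    dof = total - len(groups)  # degrees of freedom: N minus number of classes
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]

def fit_discriminant(low, normal):
    """Return (weights, cutoff) fitted on the training set."""
    m_low, m_norm = mean_vector(low), mean_vector(normal)
    s = pooled_covariance([low, normal], [m_low, m_norm])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [m_low[0] - m_norm[0], m_low[1] - m_norm[1]]
    # weights = inverse pooled covariance times difference of class means
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # With equal priors, the cutoff is the score of the midpoint of the means.
    mid = [(m_low[j] + m_norm[j]) / 2 for j in range(2)]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff

def predict_low(w, cutoff, x):
    """Discriminant score = weighted sum of predictors; above cutoff => low MDI."""
    return w[0] * x[0] + w[1] * x[1] > cutoff

def error_counts(w, cutoff, test_rows):
    """Compare predictions with actual labels; return (false_pos, false_neg)."""
    false_pos = false_neg = 0
    for x, is_low in test_rows:
        predicted = predict_low(w, cutoff, x)
        if predicted and not is_low:
            false_pos += 1
        elif not predicted and is_low:
            false_neg += 1
    return false_pos, false_neg

# Training set (hypothetical values)
train_low = [(0.8, 20), (0.9, 25), (0.7, 30), (1.0, 18)]
train_normal = [(1.3, 5), (1.4, 3), (1.2, 8), (1.5, 2)]
w, cutoff = fit_discriminant(train_low, train_normal)

# Test set (hypothetical values): (predictors, actually low MDI?)
test_set = [((0.85, 22), True), ((1.35, 4), False),
            ((0.9, 24), False), ((1.4, 3), True)]
false_pos, false_neg = error_counts(w, cutoff, test_set)
print("weights:", w, "cutoff:", cutoff)
print("false positives:", false_pos, "false negatives:", false_neg)
```

The last two test rows are deliberately labeled against the pattern in the training data, so the sketch has some errors to count. A real validation would use held-out patients and would report error rates rather than raw counts.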

This type of analysis may sound complicated, but luckily it only takes about 4 lines of code in SAS to complete. I can't imagine how long it would take by hand! Next week will most likely involve fine-tuning our analysis and maybe even some validation if we're lucky. Stay tuned...
