Logistic Regression in SAS

SAS is general-purpose software that offers a wide variety of approaches to statistical analysis. One of its most widely used techniques, logistic regression, is now common in marketing research, finance, and clinical studies whenever the dependent variable is dichotomous. Before modern statistical packages made it practical to model the logit (the transformation of a probability that logistic regression works with), ordinary linear regression was routinely used for such data instead.

Many SAS users, when encountering regression in SAS for the first time, are somewhat alarmed by the seemingly endless options and voluminous output. Several procedures in SAS/STAT can be used to perform logistic regression analysis: CATMOD, GENMOD, LOGISTIC, PHREG, and PROBIT.

Each procedure has special features that make it useful for certain applications. For most applications, however, PROC LOGISTIC is the preferred choice. It fits binary response or proportional odds models, provides various model-selection methods to identify important prognostic variables from a large number of candidate variables, and computes regression diagnostic statistics. Though the LOGISTIC procedure does have its complexities, most problems and much confusion can be avoided by taking a systematic approach to your analyses.
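As an illustration of the model-selection feature, here is a minimal sketch of a stepwise selection run. It assumes the SASHELP.HEART sample data set that ships with SAS is available. The choice of candidate predictors and the 0.05 entry and stay thresholds are illustrative assumptions, not recommendations.

```sas
/* Assumption: the SASHELP.HEART sample data set is installed. */
proc logistic data=sashelp.heart;
  /* Model the probability that Status equals 'Dead'. */
  model Status(event='Dead') = AgeAtStart Weight Height Systolic
                               Diastolic Cholesterol Smoking
        / selection=stepwise   /* add and remove effects stepwise */
          slentry=0.05         /* significance level to enter     */
          slstay=0.05;         /* significance level to stay      */
run;
```

The output includes a summary of which effects entered or left the model at each step, followed by the estimates for the final model.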

Logistic regression analysis is often used to investigate the relationship between a discrete response and a set of explanatory variables. You can fit logistic regression models using either software for generalized linear models (GLMs) or specialized software for logistic regression. PROC GENMOD uses Newton-Raphson, whereas PROC LOGISTIC uses Fisher scoring by default. Here we look at PROC LOGISTIC as implemented in SAS and at a few points about the basic statistical output that help in understanding logistic regression results. The logistic regression curve is S-shaped: the fitted probability stays between 0 and 1 as the linear predictor varies.
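To make the two routes concrete, the sketch below fits the same single-predictor model with the specialized procedure and with the GLM procedure. It assumes the SASHELP.HEART sample data set is available; the predictor AgeAtStart is chosen only for illustration. With the logit link the two procedures should arrive at essentially the same estimates.

```sas
/* Assumption: the SASHELP.HEART sample data set is installed. */

/* Specialized software: PROC LOGISTIC.
   TECHNIQUE=FISHER is the default and is written out only for clarity. */
proc logistic data=sashelp.heart;
  model Status(event='Dead') = AgeAtStart / technique=fisher;
run;

/* GLM software: PROC GENMOD with a binomial distribution and logit link.
   DESCENDING reverses the response order so that 'Dead' is the event. */
proc genmod data=sashelp.heart descending;
  model Status = AgeAtStart / dist=binomial link=logit;
run;
```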

Predictor variables ($x_i$) can take on any form: binary, categorical, and/or continuous.

Logistic regression model with a single continuous predictor:

$$\operatorname{logit}(p_i) = \log(\text{odds}) = \beta_0 + \beta_1 X_1$$

where
$\operatorname{logit}(p_i)$ is the logit transformation of the probability of the event,
$\beta_0$ is the intercept of the regression line,
$\beta_1$ is the slope of the regression line.
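Writing the intercept and slope as $\beta_0$ and $\beta_1$, the model can be inverted to recover the probability itself, which also shows how to read the slope. The lines below are a standard derivation rather than anything specific to SAS:

```latex
\begin{aligned}
\operatorname{logit}(p_i) &= \log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1 X_1 \\
p_i &= \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1)}} \\
\frac{\text{odds}(X_1 + 1)}{\text{odds}(X_1)} &= e^{\beta_1}
\end{aligned}
```

The last line says that a one-unit increase in $X_1$ multiplies the odds of the event by $e^{\beta_1}$, which is the odds ratio for the predictor.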

The following statements are available in PROC LOGISTIC:

BY variables ;
CLASS variable <(v-options)> <variable <(v-options)> … > < / v-options >;
CONTRAST 'label' effect values <,… effect values> < / options >;
FREQ variable ;
MODEL response = < effects > < / options >;
OUTPUT < OUT=SAS-data-set > < keyword=name … keyword=name > / < option >;
UNITS independent1 = list1 < / option >;
WEIGHT variable ;
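A short sketch of how several of these statements fit together in one call is shown here. It assumes the SASHELP.HEART sample data set is available. The output data set name heart_pred and the variable name phat are hypothetical, and the unit sizes are arbitrary illustrative choices.

```sas
/* Assumption: the SASHELP.HEART sample data set is installed. */
proc logistic data=sashelp.heart;
  /* CLASS: treat Sex as categorical, with 'Female' as the reference level. */
  class Sex (ref='Female') / param=ref;

  /* MODEL: probability that Status equals 'Dead'. */
  model Status(event='Dead') = AgeAtStart Sex Cholesterol;

  /* UNITS: report odds ratios per 10 years of age and per 50 units
     of cholesterol instead of per single unit. */
  units AgeAtStart=10 Cholesterol=50;

  /* OUTPUT: save the estimated event probability for each observation.
     heart_pred and phat are hypothetical names. */
  output out=work.heart_pred p=phat;
run;
```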

PROC LOGISTIC has a built-in check of whether logistic regression ML estimates exist. It can detect complete separation of data points with 0 and 1 outcomes, in which case at least one estimate is infinite.
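The check can be seen on a small made-up data set in which every observation with x at or below 4 has outcome 0 and every observation above 4 has outcome 1. The data set name sep and its values are invented for illustration. On data like this the procedure should report that complete separation of data points was detected and warn that the maximum likelihood estimate does not exist.

```sas
/* Hypothetical toy data: x perfectly separates the two outcomes. */
data sep;
  input x y @@;
  datalines;
1 0  2 0  3 0  4 0  5 1  6 1  7 1  8 1
;
run;

/* The fit triggers the built-in separation check. */
proc logistic data=sep;
  model y(event='1') = x;
run;
```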
Further features of the LOGISTIC procedure enable you to control the ordering of the response levels, to test linear hypotheses about the regression parameters, to create a data set for producing a receiver operating characteristic (ROC) curve for each fitted model, and to create a data set containing the estimated response probabilities, residuals, and influence diagnostics. PROC LOGISTIC is therefore well suited to work involving model selection, a choice of link function, and regression diagnostics.
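The sketch below touches each of these features in a single call. It assumes the SASHELP.HEART sample data set is available. The data set names roc_points and diag, the output variable names, and the label Pressure are hypothetical, and the hypothesis being tested is chosen only to show the syntax.

```sas
/* Assumption: the SASHELP.HEART sample data set is installed. */
proc logistic data=sashelp.heart;
  /* EVENT= controls which response level is modeled.
     LINK= selects the link function (logit is the default).
     OUTROC= writes the points of the ROC curve to a data set.
     INFLUENCE prints regression diagnostics for each observation. */
  model Status(event='Dead') = AgeAtStart Systolic Diastolic
        / link=logit outroc=work.roc_points influence;

  /* TEST: joint linear hypothesis that both blood pressure
     coefficients are zero. */
  Pressure: test Systolic = 0, Diastolic = 0;

  /* OUTPUT: estimated probabilities, deviance residuals, and leverage. */
  output out=work.diag p=phat resdev=dev_resid h=leverage;
run;
```

The roc_points data set can then be plotted to draw the ROC curve, and diag can be inspected for observations with large residuals or high leverage.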
