Which statistical test to use
Question 1: Is there a difference between groups which are unpaired? Groups or data sets are regarded as unpaired if there is no possibility of the values in one data set being related to, or being influenced by, the values in the other data sets. Different tests are required for quantitative (numerical) data and qualitative (categorical) data, as shown in the figure. For numerical data, it is important to decide whether they follow the parameters of the normal distribution (Gaussian) curve, in which case parametric tests are applied.
If the distribution of the data is not normal, or if one is not sure about the distribution, it is safer to use non-parametric tests. When comparing more than two sets of numerical data, a multiple group comparison test such as one-way analysis of variance (ANOVA) or the Kruskal-Wallis test should be used first.
Repeatedly applying the t test, or its non-parametric counterpart the Mann-Whitney U test, to a multiple group situation increases the possibility of incorrectly rejecting the null hypothesis; a sketch of both multiple-group approaches appears after the figure caption below.

[Figure: Tests to address the question — is there a difference between groups in the unpaired (parallel and independent groups) situation?]
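As an illustrative sketch in SPSS syntax (the package used in the examples later in this piece), with hypothetical variable names — score for the numerical outcome and group for a three-level grouping variable:

    * One-way ANOVA comparing a numerical outcome across three groups.
    ONEWAY score BY group.
    * Non-parametric counterpart: the Kruskal-Wallis test.
    NPAR TESTS /K-W=score BY group(1,3).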
Question 2: Is there a difference between groups which are paired? Pairing signifies that data sets are derived by repeated measurements on the same subjects. Pairing will also occur if subject groups are different but values in one group are in some way linked or related to values in the other group. A crossover study design also calls for the application of paired group tests when comparing the effects of different interventions on the same subjects. Sometimes subjects are deliberately paired to match baseline characteristics such as age, sex, severity or duration of disease. A scheme similar to the one for unpaired groups applies here: once again, multiple data set comparison should be done through appropriate multiple group tests, followed by post hoc tests. A sketch of the two-group paired tests follows.
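For two paired measurements, the usual choices are the paired t test (parametric) or the Wilcoxon signed-rank test (non-parametric). A minimal SPSS sketch, using hypothetical variables score1 and score2 for repeated measurements on the same subjects:

    * Paired t test on two repeated measurements (hypothetical names).
    T-TEST PAIRS=score1 WITH score2 (PAIRED).
    * Non-parametric counterpart: the Wilcoxon signed-rank test.
    NPAR TESTS /WILCOXON=score1 WITH score2 (PAIRED).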
[Figure: Tests to address the question — is there a difference between groups in the paired situation?]

Question 3: Is there any association between variables? The various tests applicable are outlined in the corresponding figure. It should be noted that the tests meant for numerical data are for testing the association between two variables.
These are correlation tests, and they express the strength of the association as a correlation coefficient. An inverse correlation between two variables is indicated by a minus sign. All correlation coefficients vary in magnitude from 0 (no correlation at all) to 1 (perfect correlation).
The examples that follow illustrate a number of these tests in SPSS, using the hsb2 demonstration data file. The first example is a factorial logistic regression: we will use type of program (prog) and school type (schtyp) as our predictor variables. Because prog is a categorical variable (it has three levels), we need to create dummy codes for it.
SPSS will do this for you by making dummy codes for all variables listed after the keyword with. SPSS will also create the interaction term; simply list the two variables that will make up the interaction separated by the keyword by. In the output, none of the coefficients are statistically significant, which shows that the overall effect of prog is not significant.
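A minimal sketch of the syntax, assuming the hsb2 file is already open and taking female as the dichotomous outcome (the outcome variable is not named in the passage above, so this is an assumption), with an indicator contrast to dummy-code prog:

    * Factorial logistic regression sketch; female as outcome is assumed.
    LOGISTIC REGRESSION female WITH prog schtyp prog BY schtyp
      /CONTRAST(prog)=INDICATOR.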
A correlation is useful when you want to see the relationship between two (or more) normally distributed interval variables. For example, using the hsb2 data file, we can run a correlation between two continuous variables, read and write.
In the second example, we will run a correlation between a dichotomous variable, female, and a continuous variable, write. Although it is assumed that the variables are interval and normally distributed, we can include dummy variables when performing correlations; a sketch of both commands follows.
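Both correlations can be requested with the CORRELATIONS command (a minimal sketch, assuming the hsb2 file is open):

    CORRELATIONS /VARIABLES=read write.
    CORRELATIONS /VARIABLES=female write.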
In the first example above, we see that the correlation between read and write is positive and statistically significant. By squaring the correlation and then multiplying by 100, you can determine what percentage of the variability the two variables share. In the output for the second example, the correlation between write and female is reported in the same way; squaring this number gives the proportion of variability that write and female share.

Simple linear regression allows us to look at the linear relationship between one normally distributed interval predictor and one normally distributed interval outcome variable.
For example, using the hsb2 data file, say we wish to look at the relationship between writing scores (write) and reading scores (read); in other words, predicting write from read. We see that the relationship between write and read is positive; hence, we would say there is a statistically significant positive linear relationship between reading and writing. A sketch of the syntax follows.
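A minimal sketch of the regression syntax (assuming hsb2 is open):

    REGRESSION
      /DEPENDENT write
      /METHOD=ENTER read.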
A Spearman correlation is used when one or both of the variables are not assumed to be normally distributed and interval, but are assumed to be ordinal. The values of the variables are converted into ranks and then correlated.
In our example, we will look for a relationship between read and write, this time without assuming that both of these variables are normal and interval; a sketch of the syntax follows.
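The Spearman version is requested through the non-parametric correlation command (a minimal sketch, assuming hsb2 is open):

    NONPAR CORR
      /VARIABLES=read write
      /PRINT=SPEARMAN.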
Logistic regression assumes that the outcome variable is binary (i.e. coded 0 and 1). We have only one variable in the hsb2 data file that is coded 0 and 1, and that is female. We understand that female is a silly outcome variable (it would make more sense to use it as a predictor variable), but we can use female as the outcome here to illustrate how the code for this command is structured and how to interpret the output. The first variable listed after the logistic command is the outcome or dependent variable, and all of the rest of the variables are predictor or independent variables.
In our example, female will be the outcome variable, and read will be the predictor variable. As with OLS regression, the predictor variables must be either dichotomous or continuous; they cannot be categorical. The results indicate that reading score (read) is not a statistically significant predictor of gender (i.e. of being female), and the likelihood-ratio chi-squared test of the overall model is not statistically significant either. A sketch of the syntax follows.
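A minimal sketch, with the predictor listed after the keyword with as described above (assuming hsb2 is open):

    LOGISTIC REGRESSION female WITH read.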
Multiple regression is very similar to simple regression, except that in multiple regression you have more than one predictor variable in the equation. For example, using the hsb2 data file, we will predict writing score (write) from gender (female) and reading (read), math (math), science (science) and social studies (socst) scores; a sketch of the syntax follows.
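A minimal sketch (assuming hsb2 is open):

    REGRESSION
      /DEPENDENT write
      /METHOD=ENTER female read math science socst.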
In the output, all of the predictor variables are statistically significant except for read.

Analysis of covariance (ANCOVA) is like ANOVA, except that in addition to the categorical predictors you also have continuous predictors.
For example, the one-way ANOVA example used write as the dependent variable and prog as the independent variable; adding a continuous predictor to that model turns it into an analysis of covariance, as sketched below.
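A minimal GLM sketch, taking read as the continuous covariate (the covariate is not named in the passage above, so read is an illustrative assumption; assumes hsb2 is open):

    * ANCOVA sketch: write by prog, with read as the covariate (assumed).
    GLM write BY prog WITH read.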
Multiple logistic regression is like simple logistic regression, except that there are two or more predictors. The predictors can be interval variables or dummy variables, but cannot be categorical variables; if you have categorical predictors, they should be coded into one or more dummy variables. We have only one variable in our data set that is coded 0 and 1, and that is female. The first variable listed after the logistic regression command is the outcome or dependent variable, and all of the rest of the variables are predictor or independent variables, listed after the keyword with.
In our example, female will be the outcome variable, and read and write will be the predictor variables. These results show that both read and write are significant predictors of female. A sketch of the syntax follows.
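A minimal sketch (assuming hsb2 is open):

    LOGISTIC REGRESSION female WITH read write.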
Discriminant analysis is used when you have one or more normally distributed interval independent variables and a categorical dependent variable. It is a multivariate technique that considers the latent dimensions in the independent variables for predicting group membership in the categorical dependent variable.
For example, using the hsb2 data file, say we wish to use read, write and math scores to predict the type of program (prog) a student belongs to. Clearly, the SPSS output for this procedure is quite lengthy, and it is beyond the scope of this page to explain all of it. The main point is that the analysis identifies two canonical variables, the first of which seems to be more related to program type than the second. A sketch of the syntax follows.
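A minimal sketch, assuming hsb2 is open and that prog takes the values 1 through 3 (an assumption about the coding):

    DISCRIMINANT
      /GROUPS=prog(1,3)
      /VARIABLES=read write math.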
A one-way multivariate analysis of variance (MANOVA) is like ANOVA, except that there are two or more dependent variables. For example, using the hsb2 data file, say we wish to examine the differences in read, write and math broken down by program type (prog). The output shows that the students in the different programs differ in their joint distribution of read, write and math. A sketch of the syntax follows.
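In current SPSS, a one-way MANOVA can be run through GLM (a minimal sketch, assuming hsb2 is open):

    GLM read write math BY prog.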