Data Analysis
RACHEL JENKINS
Created on July 9, 2024
The Results Section
The results section can seem like a scary place for beginner researchers. There's a lot of important information that needs to be said in an unbiased way, which means numbers!
Types of Data/Levels of Measurement
Nominal Data
- Data that can be put into categories
- Examples: Gender, eye color, race, specialty
Ordinal Data
- Data defined by order, but the distance between the choices or values is not defined
- Examples: Likert scales, preference scales, rankings
Interval Data
- Data with a defined interval between the values but no true zero value
- Example: Temperature (zero does not indicate a total lack of temperature; the interval between 61°F and 62°F is the same as the interval between –61°F and –62°F)
Ratio Data
- Data with an absolute zero value, where zero means there is a total absence of what is being measured
- Examples: Visual acuity, range of motion, blood pressure, height, weight, pH, PaO2
The higher the level of measurement, the more information can be pulled from the data!
Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise
John Tukey
Running the appropriate statistical test depends on the type of data
Inappropriate statistical analysis can significantly alter results
Statistics can be misleading
Descriptive Statistics
- Measures of central tendency, used to describe the data set
- Single scores used to represent a larger set of scores
- Mean
- Sum of the scores divided by the total number of scores (aka average)
- Interval and ratio data
- Median
- Midpoint of the data
- Ordinal, interval, or ratio data
- Mode
- Most frequently occurring score/data
- All variables
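As a rough illustration of the three measures above, here is a minimal sketch using Python's standard `statistics` module. The heart-rate and eye-color values are made up for the example and do not come from any study.

```python
from statistics import mean, median, mode

# Hypothetical heart-rate readings in beats per minute (ratio data)
heart_rates = [62, 68, 68, 71, 74, 80, 95]

hr_mean = mean(heart_rates)      # sum of the scores / number of scores
hr_median = median(heart_rates)  # midpoint of the sorted scores
hr_mode = mode(heart_rates)      # most frequently occurring score

# Hypothetical eye colors (nominal data): only the mode makes sense here
eye_colors = ["brown", "blue", "brown", "green", "brown"]
eye_mode = mode(eye_colors)

print("mean:", hr_mean, "median:", hr_median, "mode:", hr_mode)
print("most common eye color:", eye_mode)
```

Notice that the single high reading (95) pulls the mean above the median, which is one reason the median is often reported for skewed data.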
Descriptive Statistics
- Measures of variability
- Range
- Difference between the highest and lowest scores
- Standard Deviation (SD)
- Average deviation for each data point from the mean
- Standard Error
- Estimates how far the sample mean is likely to be from the true population mean
- Calculated as the SD divided by the square root of the sample size (SE = SD/√n), so larger samples give smaller standard errors
- Used for confidence intervals
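The measures of variability can be sketched the same way. This example assumes the standard error of the mean is wanted (the sample SD divided by the square root of the sample size); the scores are hypothetical.

```python
from math import sqrt
from statistics import stdev

# Hypothetical heart-rate readings in beats per minute
scores = [62, 68, 68, 71, 74, 80, 95]

# Range: difference between the highest and lowest scores
score_range = max(scores) - min(scores)

# Sample standard deviation (uses n - 1 in the denominator)
sd = stdev(scores)

# Standard error of the mean: SD divided by the square root of n
se = sd / sqrt(len(scores))

print("range:", score_range)
print("SD:", round(sd, 2))
print("SE:", round(se, 2))
```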
Once we've got our descriptive statistics, we need to go back to our research question.
What kind of data are we trying to compare? What are we trying to see a difference in?
Inferential Statistics
- Inferential statistics are used to detect differences between groups (did our independent variable actually make a difference?)
- Instead of describing the data, we are making inferences about the data
- Used to test the hypothesis of the study
- Need a null hypothesis to either reject or fail to reject
- Most RCTs have a null hypothesis that states there is NO difference between the control and the intervention group
- Stating that there is no difference is easier to test
Statistical Hypothesis Testing
- Researcher sets a significance level [aka an alpha level (α)]
- Generally .05 or .01; rarely greater than .05
- Will compare this alpha level to the p-value generated by the data analysis
- The p-value is the probability of getting a result at least as extreme as the one observed, by chance alone, if the null hypothesis is true (you should read that again)
- For example: if α is set at .05, any p-value below .05 is statistically significant
- For example: if p = 0.002, and the α is set at .05, then the result is statistically significant
- For example: if p = 0.12, and the α is set at .05, then the result is NOT statistically significant
- If p < α, then you reject the null (statistically significant)
- If p > α, then you fail to reject the null (not statistically significant)
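The decision rule above is simple enough to write as a tiny function. This is only a sketch; the function name `is_significant` is made up for illustration.

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when p < alpha, i.e. the null hypothesis is rejected."""
    return p_value < alpha

# The two p-values used in the examples above, checked against alpha = .05
for p in (0.002, 0.12):
    if is_significant(p):
        decision = "reject the null (statistically significant)"
    else:
        decision = "fail to reject the null (not statistically significant)"
    print(f"p = {p}: {decision}")
```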
How do I get the p-value?
You must run a statistical test! The type of data you are comparing determines the statistical test.
For example: Does student exposure to 300mg of caffeine have an effect on heart rate?
- Outcome= Heart rate (ratio data, continuous)
- Exposure= 300 mg caffeine (Yes/No, categorical)
- I want to know if there's a statistical difference in HR between groups
- How many groups do I have? Two!
- Students who didn't get caffeine vs.
- Students who did get caffeine
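With a continuous outcome and two independent groups, the usual choice is an independent-samples t-test. To stay inside Python's standard library, the sketch below swaps in a different technique, an exact permutation test on the difference in group means, which answers the same question (how often would a difference this large appear if group labels did not matter?). The heart rates are invented for illustration, and the function name is hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (bpm); not data from any real study
no_caffeine = [66, 70, 64, 72, 68, 69]
caffeine = [85, 90, 82, 88, 91, 86]

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    # Try every way of relabeling the pooled scores into two groups
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count

alpha = 0.05
p_value = permutation_p_value(no_caffeine, caffeine)
print("p =", round(p_value, 4))
print("reject the null" if p_value < alpha else "fail to reject the null")
```

Because the two made-up groups do not overlap at all, only 2 of the 924 possible relabelings are as extreme as the observed split, so the p-value is very small.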
There are a TON of statistical tests; this is not an exhaustive list.
Confidence Intervals (CI)
- Based off alpha (α) level set by researcher
- α = .1, then CI is 90%
- α = .05, then CI is 95%
- α = .01, then CI is 99%
- The higher the confidence level, the wider the interval and the less precise the estimate
- CI= Mean +/- a multiple of the standard error (about 2 SEs for 95%; about 2.6 SEs for 99%, rounded up to 3 in the examples here)
- Example: If the mean number for my data is 50, my CI is 95%, and the standard error is 2, I need to take my mean +/- 4 to find my 95% CI
- Why 4? ---> 2 SEs x a standard error of 2 = 4
- So, (50) +/- (4) = 46-54
- 46-54 is my 95% CI. What does this mean?
- If the CI is 95% with a range of 46-54, I can say that if I repeated the study many times, about 95% of the intervals calculated this way would contain the true population mean.
- 99% CI= about 3 SEs (a conservative rounding of 2.58) = 50 +/- 6= 44-56.
- This interval is wider, making the estimate less precise. If the CI is 99% with a range of 44-56, I can say that about 99% of the intervals calculated this way would contain the true population mean.
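The worked example can be checked with a few lines of Python. This is a sketch that assumes a normal approximation; the helper names `confidence_interval` and `z_multiplier` are made up for illustration. It also shows where the "2 SEs" and "3 SEs" rules of thumb come from: the exact multipliers are close to 1.96 and 2.58.

```python
from statistics import NormalDist

def confidence_interval(mean_value, standard_error, multiplier):
    """Return (lower, upper) as mean +/- multiplier * standard error."""
    margin = multiplier * standard_error
    return (mean_value - margin, mean_value + margin)

def z_multiplier(confidence_level):
    """Number of SEs for a two-sided interval under a normal approximation."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

# Rule-of-thumb multipliers from the example: mean 50, SE 2
ci_95 = confidence_interval(50, 2, 2)
ci_99 = confidence_interval(50, 2, 3)
print("95% CI:", ci_95)
print("99% CI:", ci_99)

# More exact multipliers for each alpha level
for level in (0.90, 0.95, 0.99):
    print(level, "->", round(z_multiplier(level), 2), "SEs")
```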
What You Need to Know about α and CIs: The Main Points
- CI is based on α
- The higher the confidence level (e.g., 99% vs. 90%), the less precise the estimate, because the interval has to cover a larger range of values:
- 90% CI data range: 70-72
- 99% CI data range: 66-76
- I can be 99% confident that my actual mean is in between the range of 66-76, while I can only be 90% confident that my actual mean is between 70-72
- Think about this as target practice:
90% confident that I can hit this one because it's a smaller target
99% confident that I can hit this one because it's larger
Which target are you more confident hitting?
Odds Ratio (OR) and Hazard Ratio (HR)
Odds Ratio (OR)
- Measure for association between an exposure and an outcome
- A measure of the odds of an event happening in one group compared to the odds of the same event happening in another group
- "1" is your middle ground
- If an odds ratio is 1.3, it means there’s a 30% increase in the odds of something happening in the intervention group
- If an odds ratio is 0.8, it means there's a 20% reduction in the odds of something happening in the intervention group
Hazard Ratio (HR)
- Indicates the risk of an event in the intervention group compared with the control group at any particular point in time
- Greater than 1 is increased events for treatment
- Less than 1 is decreased events for treatment compared to control
- For example, if HR is 0.65, there is a 35% event reduction in the treatment group
Odds Ratio (OR) and Hazard Ratio (HR)
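- You'll likely see CIs for OR, HR, and relative risk (RR) that might look like this:
- OR 1.05 [95% CI, 0.56-1.34]
- This means the calculated value is 1.05, and the researchers are 95% confident the true value lies between 0.56 and 1.34
- Because this interval includes 1 (no difference between groups), the result is not statistically significant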
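To see where numbers like these come from, here is a minimal sketch that computes an odds ratio and an approximate 95% CI from a 2x2 table, using the common log-based (Woolf) method. The counts are hypothetical, and the function name is made up for illustration.

```python
from math import exp, log, sqrt

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate CI from a 2x2 table.

    a, b = events / non-events in the intervention group
    c, d = events / non-events in the control group
    z    = number of SEs for the interval (1.96 for about 95%)
    """
    odds_ratio = (a / b) / (c / d)
    # Standard error of log(OR)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 of 100 intervention patients had the event,
# compared with 25 of 100 control patients
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 80, 25, 75)
print(f"OR {or_value:.2f} [95% CI, {ci_low:.2f}-{ci_high:.2f}]")

# If the interval includes 1, the difference is not statistically significant
print("includes 1:", ci_low < 1 < ci_high)
```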
Chi-Squared and Pearson's Coefficient
Chi-Squared (χ2)
- Used with categorical data to determine if an event occurs more or less frequently than it would happen by chance
- Compare the resulting p-value to α!
Pearson's Coefficient
- r value states the strength and direction of the correlation between two things
- -1 to +1 (-1 is a perfect negative correlation, +1 is a perfect positive correlation, 0 is none)
- Rarely exactly -1 or +1, as these are perfect correlations
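Both statistics can be computed by hand in a few lines. The sketch below assumes a 2x2 table for chi-squared (one degree of freedom, no continuity correction) and uses made-up numbers throughout; the function names are hypothetical.

```python
from math import erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic and p-value for a 2x2 table of counts."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the p-value has this closed form
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical counts: event yes/no in group 1 (30/70) and group 2 (50/50)
chi2, p_value = chi_squared_2x2(30, 70, 50, 50)
print("chi-squared:", round(chi2, 2), "p:", round(p_value, 4))

# Hypothetical paired measurements, e.g. cups of coffee vs. hours awake
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print("r:", round(r, 2))
```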
There's a lot of other tests we can run!
Don't be afraid to look something up when reading the results section.
Clinically Significant vs. Statistically Significant
- Just because there is a statistically significant difference in the numbers, that doesn't mean it translates to a meaningful difference clinically.
- Example: When running tests, you find that there's a statistically significant difference in heart rate between your groups of students who either got 300 mg of caffeine, or didn't.
- Students who got caffeine: HR = 87 bpm
- Students who didn't get caffeine: HR = 68 bpm
- Clinically speaking, both of these results are perceived as normal, and the difference does not have clinical relevance.