Data Analyst, Session 11 Slide Deck
Natasha Rose
Created on February 8, 2023
Transcript
Data Analyst Session 11
Time series forecasting
What is Time Series?
“Have you noticed that Bitcoin prices are higher on Fridays? I was looking at the historical daily prices and I spotted a trend.”
“This is the price and how it’s changed over time, this is what they call a ‘time series’. If you look carefully, you can see spikes on various days and those days happened to be Fridays.”
Session Agenda
Hypothesis testing: Including the steps you should follow to complete a hypothesis test, worked examples and a demo experiment
A/B Testing: Including how an A/B test works, and the steps to completing them
Performing statistical hypothesis testing: Including a note on statistical significance and a worked example
Hypothesis Testing
Hypothesis Testing Steps
There are 5 main steps in hypothesis testing, as shown opposite. During which of these steps would you complete each of the following tasks?
- Perform sampling
- Collect data in a way that is designed to test your hypothesis
- Make statistical inferences about the population you are interested in
1: Stating null and alternative hypotheses
2: Collecting data
3: Performing a statistical test
4: Rejecting or accepting your null hypothesis
5: Presenting your findings
Submit your responses to the chat!
Step 1: Stating null & alternative hypotheses
After developing your initial research hypothesis, it is important to restate it as a null (H0) and alternative (Ha) hypothesis so that you can test it mathematically.
- The null hypothesis is a prediction of no relationship between the variables you are interested in
- The alternative hypothesis is usually your initial hypothesis that predicts a relationship between variables
You formulate a hypothesis that a well-designed sales website will generate more revenue than a poorly designed one. To test this hypothesis, you restate it as:
- H0: The average spend would be the same for both websites (null hypothesis)
- Ha: The better-designed website would lead to higher spend on average (alternative hypothesis)
Step 2: Collecting Data
For a statistical test to be valid, it is important to perform sampling and collect data in a way that is designed to test your hypothesis. If your data are not representative, then you cannot make statistical inferences about the population you are interested in. Note: The example of Gender vs height is a known biased example.
To test for a difference in revenue between the two websites, your sample should include an equal number of customers visiting each site. It should also account for any variables that might influence revenue generation (e.g. the size of the sample and any known customer demographics).
Demo Experiment: Sample Size
Here is an example of the need to perform sampling and collect data in a way that will test your hypothesis. This is a demo showing the importance of sample size for making an accurate estimate.
Lake Fish Sampling Interaction
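The effect of sample size can also be sketched in code. This is a minimal simulation, not the interactive demo itself: the lake, the fish lengths, and the sample sizes of 5 and 200 are all hypothetical values chosen for illustration. It repeatedly samples fish and measures how much the estimated average length varies from sample to sample.

```python
import random
import statistics

def estimate_spread(population, sample_size, trials=500, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

# Hypothetical lake: 10,000 fish lengths (cm), invented for this sketch.
lake_rng = random.Random(42)
lake = [lake_rng.gauss(30, 8) for _ in range(10_000)]

small = estimate_spread(lake, sample_size=5)
large = estimate_spread(lake, sample_size=200)
print(f"Spread of estimates with n=5:   {small:.2f} cm")
print(f"Spread of estimates with n=200: {large:.2f} cm")
```

The larger sample should give estimates that cluster much more tightly around the true average, which is why small samples make for unreliable conclusions.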
What is a P-value?
Definition: P-value
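A p-value of 0.05 means that, if the null hypothesis were true, there would be a 5% chance (1 in 20) of seeing results at least as extreme as the ones you observed.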
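As a rough illustration of what a p-value measures, the sketch below uses a hypothetical coin-flip experiment (not an example from the session): the null hypothesis is that the coin is fair, and the p-value is the chance of seeing 60 or more heads in 100 flips if that were true. The function name `binomial_p_value` is made up for this sketch.

```python
from math import comb

def binomial_p_value(successes, trials, p_null=0.5):
    """P(X >= successes) when X follows Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical experiment: 60 heads in 100 flips of a supposedly fair coin.
p_value = binomial_p_value(60, 100)
print(f"p-value = {p_value:.4f}")
```

A result below 0.05 here would mean that 60 or more heads is an unusual outcome for a fair coin, which is evidence against the null hypothesis.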
Step 3: Performing a Statistical Test
Statistical tests that compare groups are based on a comparison of the following:
- Between-group variance - how different the categories are from one another
- Within-group variance - how spread out the data is within a category
- If the between-group variance is large relative to the within-group variance, so that there is little or no overlap between groups, then your statistical test will show a low p-value
- Alternatively, if there is high within-group variance and low between-group variance, then your statistical test will show a high p-value
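One way to see this comparison in practice is Welch's t statistic, which divides the difference between group means by a measure of the spread within the groups. This is an illustrative choice, not necessarily the test used in the session, and the two datasets below are invented: both pairs of groups have the same means (20 and 25) but very different within-group spread.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Difference in means divided by its standard error (Welch's t statistic)."""
    mean_diff = statistics.mean(group_b) - statistics.mean(group_a)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_diff / standard_error

# Hypothetical data: same group means, low within-group variance.
tight_a = [19, 20, 21, 20, 20]
tight_b = [24, 25, 26, 25, 25]

# Hypothetical data: same group means, high within-group variance.
wide_a = [5, 35, 20, 10, 30]
wide_b = [10, 40, 25, 15, 35]

t_tight = welch_t(tight_a, tight_b)
t_wide = welch_t(wide_a, wide_b)
print(f"t with little overlap: {t_tight:.2f}")
print(f"t with heavy overlap:  {t_wide:.2f}")
```

A larger t statistic corresponds to a lower p-value, so the tightly clustered groups give much stronger evidence of a real difference than the widely spread ones.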
Step 4: Rejecting or accepting null hypothesis
- This is done using the p-value to guide your decision
- In most cases, your threshold for rejecting the null hypothesis will be 0.05 (5%)
- That is, a less than 5% chance that you would see these results if the null hypothesis were true
- In some cases, researchers choose a more conservative level of significance, such as 0.01 (1%)
In your analysis of the difference in revenue generation between the 2 websites, you find that the p-value of 0.002 is below your cutoff of 0.05, so you decide to reject your null hypothesis of:
- H0: The average spend would be the same for both websites
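The decision rule can be sketched as follows. The spend figures are hypothetical and will not reproduce the p-value of 0.002 quoted above, and the permutation test is one possible way to obtain a p-value, assumed here for illustration rather than taken from the session. It shuffles the website labels many times and counts how often chance alone gives a difference at least as large as the observed one.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """One-sided p-value for 'group_b has a higher mean than group_a'."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign labels at random
        if mean(pooled[n_a:]) - mean(pooled[:n_a]) >= observed:
            hits += 1
    return hits / n_permutations

# Hypothetical spend per customer; not data from the session.
old_site = [20, 22, 19, 24, 21, 23, 20, 22]
new_site = [27, 30, 26, 29, 31, 28, 27, 30]

ALPHA = 0.05  # threshold for rejecting the null hypothesis
p_value = permutation_p_value(old_site, new_site)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```

Changing `ALPHA` to 0.01 gives the more conservative level of significance mentioned above.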
Step 5: Tips for Presenting Findings
Results will be presented in the results and discussions sections of a research paper, dissertation, or thesis and should follow this guidance:
In the discussion, you can discuss whether initial hypotheses were supported by results.
In the formal language of hypothesis testing, you should talk about rejecting or failing to reject the null hypothesis.
The results section should give a brief summary of the data and results of the statistical test.
Example: “In our comparison of revenue generation between 2 websites, we found the p-value of 0.002 is below our cutoff of 0.05; therefore, we can reject the null hypothesis and conclude that the better-designed website leads to higher spend on average.”
A/B Testing
Introduction
A/B testing is a method of experimentation where two versions of a website, product, or advertisement are compared to determine which one performs better. For example, a company may want to determine which of two versions of an advertisement will lead to more clicks and therefore revenue generation.
Your quick guide to A/B testing is available here: A/B Testing Quick Guide
A/B Testing of Advertisements to a Split Audience
How does an A/B Test work?
The A/B test is one of the best-known and most popular business experiments. The rules for conducting an A/B test are as follows:
Two variants of a single variable are compared to determine which performs better
The variable can be anything, from the colour of a button to the layout of a webpage
To properly conduct an A/B test, both variants must be identical in all respects except for the one being tested
Once the test is conducted and a winner is chosen, that variant becomes the new standard
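Splitting the audience is usually done at random so that the two groups differ only in the variant they see. The sketch below shows one simple way to do this; the function `assign_variants` and the visitor IDs are hypothetical names invented for illustration.

```python
import random

def assign_variants(visitor_ids, seed=0):
    """Randomly assign each visitor to variant 'A' or 'B'."""
    rng = random.Random(seed)
    return {visitor_id: rng.choice(["A", "B"]) for visitor_id in visitor_ids}

# Hypothetical audience of 1,000 visitors.
visitors = [f"visitor_{i}" for i in range(1000)]
assignment = assign_variants(visitors)

count_a = sum(1 for variant in assignment.values() if variant == "A")
count_b = len(assignment) - count_a
print(f"Variant A: {count_a} visitors, Variant B: {count_b} visitors")
```

Random assignment gives roughly, but not exactly, equal group sizes. If exactly equal groups are needed, shuffling the list and cutting it in half is a common alternative.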
Activity: Completing an A/B Test
Our online clothing company wants to determine which of two versions of an advertisement will lead to more clicks, and therefore more potential revenue. What steps should they follow to complete an A/B test? Use the A/B testing quick guide you have been provided to identify the steps that should be taken.
Formulating the experiment
Visualising the data
Plotting the data
Converting the data to a summary table
Submit your responses to the chat!
Performing Statistical Hypothesis Testing
A note on statistical significance
In the context of A/B testing experiments, statistical significance indicates how unlikely it is that the difference between your experiment’s control version and test version is due to error or random chance alone. For example, if you run a test at a 95% confidence level (a significance threshold of 0.05), a difference as large as the one you observed would occur less than 5% of the time if there were no real difference between the versions.
Definition: Statistical Significance
Demo: Statistical Hypothesis Testing
So far we have looked at comparing average values between 2 different groups. In hypothesis testing, we can also compare proportions in 2 separate groups. Use the link to the statistical significance calculator to test the following hypothesis and data:
- H0: There is no difference in the rate (%) of clicks for each advert (null hypothesis)
- Ha: Advert B attracts more clicks and therefore more customer revenue (alternative hypothesis)
A/B Testing of Adverts to a Split Audience
Advert    Visitors    Conversions (Clicks)
A         1000        450
B         1000        500
Statistical Significance Calculator
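The demo figures can also be checked in code rather than with the calculator. This is a minimal sketch using a one-sided two-proportion z-test, which is a standard choice for comparing click rates but may not be the exact method the linked calculator uses. It assumes the 450 clicks belong to Advert A and the 500 clicks to Advert B, each from 1000 visitors.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(clicks_a, visitors_a, clicks_b, visitors_b):
    """One-sided z-test of Ha: advert B has a higher click rate than advert A."""
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Under H0 both adverts share one click rate, so pool the data.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b)
    )
    z = (rate_b - rate_a) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

z, p_value = two_proportion_z_test(450, 1000, 500, 1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

With these figures the z statistic is about 2.24 and the one-sided p-value is roughly 0.013, which is below the 0.05 threshold, so you would reject H0. A two-sided test would double the p-value.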
Topic Summary
You should now be able to:
- Describe the steps of statistical testing
- Explain key concepts relating to hypothesis testing, including null and alternative hypothesis, p-values, and statistical significance
- Carry out statistical hypothesis testing to determine the likelihood of hypotheses and solve business problems
Data Analyst Session 8
Hypothesis Testing End of Lesson