Data Analyst, Session 11 Slide Deck

Natasha Rose

Created on February 8, 2023

Transcript

Data Analyst Session 11

Time series forecasting

What is Time Series?

“Have you noticed that Bitcoin prices are higher on Fridays? I was looking at the historical daily prices and I spotted a trend.”

“This is the price and how it’s changed over time, this is what they call a ‘time series’. If you look carefully, you can see spikes on various days and those days happened to be Fridays.”

Session Agenda

Hypothesis testing: Including the steps you should follow to complete a hypothesis test, worked examples and a demo experiment

A/B Testing: Including how an A/B test works, and the steps to completing them

Performing statistical hypothesis testing: Including a note on statistical significance and a worked example

Hypothesis Testing

Hypothesis Testing Steps

There are 5 main steps in hypothesis testing, as shown opposite. During which of the following hypothesis testing steps would you complete the following?

  • Perform sampling
  • Collect data in a way that is designed to test your hypothesis
  • Make statistical inferences about the population you are interested in

1: Stating null and alternative hypotheses

2: Collecting data

3: Performing a statistical test

4: Rejecting or failing to reject your null hypothesis

5: Presenting your findings

Submit your responses to the chat!

Step 1: Stating null & alternative hypotheses

After developing your initial research hypothesis, it is important to restate it as a null (H0) and alternative (Ha) hypothesis so that you can test it mathematically.

  • The null hypothesis is a prediction of no relationship between the variables you are interested in
  • The alternative hypothesis is usually your initial hypothesis that predicts a relationship between variables

You formulate a hypothesis that a well-designed sales website will generate more revenue than a poorly designed one. To test this hypothesis, you restate it as:

  • H0: The average spend would be the same for both websites (null hypothesis)
  • Ha: The better-designed website would lead to higher spend on average (alternative hypothesis)

Step 2: Collecting Data

For a statistical test to be valid, it is important to perform sampling and collect data in a way that is designed to test your hypothesis. If your data are not representative, then you cannot make statistical inferences about the population you are interested in. Note: The example of Gender vs height is a known biased example.

To test differences in revenue between the two websites, your sample should include an equal number of customers visiting each site. It should also account for any variables that might influence revenue generation (e.g. the size of the sample and any known customer demographics).
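One way to get equal-sized groups is to shuffle the visitors and cut the list in half. The sketch below is only an illustration: the function name `split_visitors`, the visitor IDs, and the fixed seed are all assumptions, not part of the session material.

```python
import random

def split_visitors(visitor_ids, seed=0):
    """Randomly split visitors into two equal-sized groups (A and B)."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(visitor_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # If the count is odd, one leftover visitor is dropped to keep groups equal.
    return shuffled[:half], shuffled[half:2 * half]

# Hypothetical visitor IDs 1..2000
group_a, group_b = split_visitors(range(1, 2001))
print(len(group_a), len(group_b))
```

Random assignment also helps spread customer demographics evenly across the two groups, so that differences in spend are less likely to come from who happened to see each site.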

Demo Experiment: Sample Size

Here is an example of the need to perform sampling and collect data in a way that will test your hypothesis. This is a demo showing the importance of sample size for making an accurate estimate.

Lake Fish Sampling Interaction
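The effect of sample size can also be simulated. The sketch below assumes a hypothetical lake in which 30% of the fish belong to the species of interest, then repeatedly draws samples of different sizes and measures how much the estimated share varies. The 30% figure and the function name are assumptions made for illustration.

```python
import random
import statistics

def estimate_spread(true_share, sample_size, repeats=500, seed=1):
    """Standard deviation of the estimated share across repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        # Each fish caught is the species of interest with probability true_share.
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(hits / sample_size)
    return statistics.stdev(estimates)

TRUE_SHARE = 0.30  # hypothetical: 30% of fish in the lake are the target species
spread_by_size = {n: estimate_spread(TRUE_SHARE, n) for n in (10, 100, 1000)}
for n, spread in spread_by_size.items():
    print(f"sample size {n:>4}: typical error about {spread:.3f}")
```

Larger samples give estimates that cluster more tightly around the true share, which is why small samples can be misleading.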

What is a P-value?

Definition: P-value

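A p-value is the probability of seeing results at least as extreme as those observed if the null hypothesis were true. A p-value of 0.05 means a 5% chance, or 1 in 20.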

    Step 3: Performing a Statistical Test

    Many common statistical tests are based on a comparison of the following:

    1. Between-group variance - how different the categories are from one another
    2. Within-group variance - how spread out the data is within a category

    • If between-group variance is large enough that there is little or no overlap between groups, then your statistical test will show a low p-value
    • Alternatively, if there is high within-group variance and low between-group variance, then your statistical test will show a high p-value
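To make the comparison concrete, the sketch below computes Welch's t statistic, one common way of weighing the gap between group means against the spread within each group. The spend figures are made up for illustration, and the t-test is a choice made for this example rather than the test used in the session.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic: difference in means scaled by within-group spread."""
    mean_diff = statistics.mean(group_b) - statistics.mean(group_a)
    # Within-group variance of each group, divided by its sample size.
    noise = (statistics.variance(group_a) / len(group_a)
             + statistics.variance(group_b) / len(group_b))
    return mean_diff / math.sqrt(noise)

# Hypothetical spend per customer; both scenarios have the same gap in means (10).
tight_a, tight_b = [48, 50, 52, 49, 51], [58, 60, 62, 59, 61]
wide_a, wide_b = [20, 50, 80, 35, 65], [30, 60, 90, 45, 75]

t_tight = welch_t(tight_a, tight_b)
t_wide = welch_t(wide_a, wide_b)
print(f"low within-group variance:  t = {t_tight:.2f}")
print(f"high within-group variance: t = {t_wide:.2f}")
```

With the same difference in means, the tightly clustered groups give a t statistic of about 10, while the widely spread groups give about 0.67. A larger t statistic corresponds to a smaller p-value.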

    Step 4: Rejecting or failing to reject the null hypothesis

    • This is done using the p-value to guide your decision
    • In most cases, your threshold for rejecting the null hypothesis will be 0.05 (5%)
    • That is, a less than 5% chance that you would see these results if the null hypothesis were true
    • In some cases, researchers choose a more conservative level of significance, such as 0.01 (1%)

    In your analysis of the difference in revenue generation between our 2 websites, you find that the p-value of 0.002 is below your cutoff of 0.05, so you decide to reject your null hypothesis of:

    • H0: The average spend would be the same for both websites
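The decision rule in this step amounts to a single comparison. The helper below is a minimal sketch; the function name `decide` and its return strings are assumptions for illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance threshold alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.002))              # the website example, threshold 0.05
print(decide(0.03, alpha=0.01))   # a more conservative threshold
```

Note that the same p-value can lead to different decisions depending on the threshold chosen, so the threshold should be set before looking at the results.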

    Step 5: Tips for Presenting Findings

    Results will be presented in the results and discussions sections of a research paper, dissertation, or thesis and should follow this guidance:

    In the discussion, you can discuss whether initial hypotheses were supported by results.

    In the formal language of hypothesis testing, you should talk about rejecting or failing to reject the null hypothesis.

    The results section should give a brief summary of the data and results of the statistical test.

    Example: “In our comparison of revenue generation between 2 websites, we found the p-value of 0.002 is below our cutoff of 0.05; therefore, we can reject the null hypothesis and conclude that the better-designed website would lead to higher spend on average.”

    A/B Testing

    Introduction

    A/B testing is a method of experimentation where two versions of a website, product, or advertisement are compared to determine which one performs better. For example, a company may want to determine which of two versions of an advertisement will lead to more clicks and therefore revenue generation.

    Your quick guide to A/B testing is available here: A/B Testing Quick Guide

    A/B Testing of Advertisements to a Split Audience

    How does an A/B Test work?

    The A/B test is one of the most popular and best-known business experiments. The rules for conducting an A/B test are as follows:

    Two variants of a single variable are compared to determine which performs better

    The variable can be anything, from the colour of a button to the layout of a webpage

    To properly conduct an A/B test, both variants must be identical in all respects except for the one being tested

    Once the test is conducted and a winner is chosen, that variant becomes the new standard

    Activity: Completing an A/B Test

    Our online clothing company wants to determine which of two versions of an advertisement will lead to more clicks, and therefore more potential revenue. What steps should they follow to complete an A/B test? Use the A/B testing quick guide you have been provided to identify the steps that should be taken.

    Formulating the experiment

    Visualising the data

    Plotting the data

    Converting the data to a summary table

    Submit your responses to the chat!

    Performing Statistical Hypothesis Testing

    A note on statistical significance

    In the context of A/B testing experiments, statistical significance indicates how unlikely it is that the difference between your experiment’s control version and test version is due to error or random chance alone. For example, if you run a test at a 95% confidence level (a significance threshold of 0.05), a difference is called significant only if a result at least that large would occur less than 5% of the time when there is no real difference.

    Definition: Statistical Significance

    Demo: Statistical Hypothesis Testing

    So far we have looked at comparing average values between 2 different groups. In hypothesis testing we can also compare proportions in 2 separate groups. Use the link to the statistical significance calculator to test the following hypothesis and data:

    • H0: There is no difference in the rate (%) of clicks for each advert (null hypothesis)
    • Ha: Advert B attracts more clicks and therefore more customer revenue (alternative hypothesis)
    Data: A/B Testing of Adverts to a Split Audience

    Advert    Visitors    Conversions (Clicks)
    A         1000        450
    B         1000        500

    Statistical Significance Calculator
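As a cross-check on the calculator, the same comparison can be sketched as a one-sided two-proportion z-test. This assumes the first row of data (450 clicks from 1000 visitors) is Advert A and the second (500 clicks from 1000 visitors) is Advert B. The z-test is one standard method for comparing two proportions and may differ from what the linked calculator uses.

```python
import math

def two_proportion_z_test(clicks_a, visitors_a, clicks_b, visitors_b):
    """One-sided z-test of H0: rate_b == rate_a against Ha: rate_b > rate_a."""
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Under H0 both adverts share one click rate, estimated from the pooled data.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Assumed mapping: Advert A = 450/1000 clicks, Advert B = 500/1000 clicks
z, p_value = two_proportion_z_test(450, 1000, 500, 1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

Under these assumptions the p-value falls below the 0.05 threshold, so the null hypothesis of equal click rates would be rejected in favour of Advert B attracting more clicks.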

    Topic Summary

    You should now be able to:

    1. Describe the steps of statistical testing
    2. Explain key concepts relating to hypothesis testing, including null and alternative hypothesis, p-values, and statistical significance
    3. Carry out statistical hypothesis testing to determine the likelihood of hypotheses and solve business problems
    Are there any questions or feedback?

    Data Analyst Session 8

    Hypothesis Testing End of Lesson