STATISTICS

Luisamaria Castano V

Created on March 18, 2021

What are we going to learn about statistics?

types of variables

CENTRAL TENDENCY

measures of location

LEVELS OF MEASURE

GRAPHICAL REPRESENTATIONS

measures of dispersion

FREQUENCY TABLES

measures of correlation

WHAT IS STATISTICS FOR?

COLLECTION OF DATA

DESCRIPTION

ANALYSIS

CONCLUSIONS

TYPES OF DATA

Quantitative data can either be...

SCALE OF MEASURE

RATIO DATA

Distance between categories, with a true zero. EX: WEIGHT

INTERVAL DATA

Distance between categories, no absolute zero EX: CELSIUS TEMPERATURE SCALE

ORDINAL DATA

Ordered categories (rankings, scales) EX: SOCIOECONOMIC STATUS

NOMINAL DATA

Categories (no order or direction) EX: MARITAL STATUS

LET'S PRACTICE

TYPES OF STATISTICAL ANALYSIS

Not as reliable but faster!

More detailed and accurate!

GRAPHICAL REPRESENTATIONS

RATIO DATA

INTERVAL DATA

ORDINAL DATA

NOMINAL DATA

Bar Charts!

  • Ideal for representing categories (nominal)
  • Very good at showing relative size
  • It is better to leave gaps between bars

Pie Charts!

  • Effective for representing categories (nominal or ordinal)
  • Every slice shows a proportion of a whole
  • Cannot be used if any category has a zero or negative value

Dot Plots!

  • Effective for both categorical and quantitative variables
  • Best used with small data sets

Line Graphs!

  • Can only be used with quantitative data
  • Perfect for showing changes over time
  • Good for comparing different data sets that change over the same period of time

Histogram!

  • Can only be used with quantitative data
  • Groups the data into numeric intervals (ranges)
  • The horizontal axis is a continuous number line, so it can include negative and zero values

Once we have decided what type of statistical analysis to do and have collected the data, we classify it, since classified data is easier to work with...

How do we classify the data?

1. Group the information according to the levels of measure
2. Count the frequency of each category
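The two steps above can be sketched in Python. This is only an illustration: the survey answers and the helper name `frequency_table` are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical survey answers (nominal data: marital status)
responses = [
    "single", "married", "single", "divorced", "married",
    "single", "widowed", "married", "single", "married",
]

def frequency_table(values):
    """Return (category, count, relative frequency) rows, most frequent first."""
    counts = Counter(values)   # step 2: count the frequency of each category
    total = len(values)
    return [(cat, n, n / total) for cat, n in counts.most_common()]

table = frequency_table(responses)
for category, count, share in table:
    print(f"{category:<10}{count:>3}{share:>8.0%}")
```

Each row gives a category, how many times it appears, and its share of the whole, which is the information a frequency table holds.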

EXAMPLE OF CLASSIFICATION

HOW DO WE CLASSIFY DATA IN GROUPS?

EX:

Once we have organized the data in groups and counted their frequency, we can take measurements that will give us relevant information about the data...

CENTRAL TENDENCY

MEAN, MEDIAN AND MODE

Mean: add up the values and divide by the number of values!

Median: order the values and choose the one in the middle!

Mode: choose the value that repeats the most!
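As a quick check of the three definitions, here is a small sketch using Python's standard `statistics` module. The list of ages is hypothetical, chosen only for illustration.

```python
import statistics

# Hypothetical ages of six kids (not the data from the slides)
ages = [4, 6, 7, 7, 9, 12]

mean_age = statistics.mean(ages)      # sum of the values / number of values
median_age = statistics.median(ages)  # middle value of the ordered data
mode_age = statistics.mode(ages)      # value that repeats the most

print(mean_age, median_age, mode_age)
```

With an even number of values there is no single middle value, so the median is the average of the two middle ones.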

The Mean

The mean age of the kids attending the party is 7.5 years

The Median

The median age is 13 years

The Mode

The modal age is 13 years

LOCATION MEASUREMENTS

PERCENTILES

Percentile: the value below which a given percentage of the data falls.

DATA NEEDS TO BE ORDERED

QUARTILES: The data is divided into 4 groups with 25% of the data each. The second quartile is the median.

QUINTILES: The data is divided into 5 groups with 20% of the data each.

DECILES: The data is divided into 10 groups with 10% of the data each.
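A minimal sketch of these cut points, using `statistics.quantiles` from the Python standard library on a hypothetical data set. Textbooks and software use slightly different conventions for interpolating between values, so other tools may give slightly different cut points.

```python
import statistics

# Hypothetical data; it must be ordered before reading off positions
data = sorted([7, 3, 11, 1, 9, 5, 2, 10, 4, 8, 6])

quartiles = statistics.quantiles(data, n=4)   # 3 cut points -> 4 groups
quintiles = statistics.quantiles(data, n=5)   # 4 cut points -> 5 groups
deciles = statistics.quantiles(data, n=10)    # 9 cut points -> 10 groups

print("quartiles:", quartiles)
print("median:", statistics.median(data))
```

The second quartile matches the median, as stated above.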

What if you are asked for the best estimate of the percentile of a certain observation in a group of data?

EX:

The dot plot shows the number of hours of daily driving time for 14 different MKS school bus drivers; each dot represents one driver. What is the best percentile estimate for the driver with a daily driving time of 6 hours?

What if you are asked for the value of an observation, given its percentile rank in a group of data?

EX:

A total of 10,000 people attended the music festival StereoPicnic in Bogotá. The table shows the number of people that arrived each hour. Which interval contains the 45th percentile, that is, the moment when 45% of the festival-goers had arrived?
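One way to approach this kind of question is to accumulate the frequencies until the target percentage is reached. The sketch below assumes a made-up arrivals table (the original table is not reproduced in this transcript), and the function name is hypothetical.

```python
def interval_containing_percentile(table, percentile):
    """Return the label of the first interval whose cumulative count
    reaches the given percentile of the total."""
    total = sum(count for _, count in table)
    target = total * percentile / 100
    running = 0
    for label, count in table:
        running += count          # cumulative frequency so far
        if running >= target:
            return label
    return None

# Hypothetical arrivals per hour, adding up to 10000 people
arrivals = [
    ("12:00-13:00", 500),
    ("13:00-14:00", 1500),
    ("14:00-15:00", 2000),
    ("15:00-16:00", 3000),
    ("16:00-17:00", 2000),
    ("17:00-18:00", 1000),
]

print(interval_containing_percentile(arrivals, 45))
```

With these invented numbers, 45% of 10000 is 4500 people, and the cumulative count first passes 4500 during the 15:00-16:00 interval.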

WHAT IF DATA IS GROUPED?

In the previous test:
  • 12% of the group got D
  • 50% of the group got C
  • 30% of the group got B
  • 8% of the group got A

If you got a B, what percentile are you in?

Add up all the percentages below your score, plus half the percentage at your score. Taking half of the B group means we don't assume you got the best B or the worst B, just an average B.

12% + 50% + 0.5(30%) = 77%

You are in the 77th percentile: you did as well as or better than 77% of the class
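The same rule (everything below the score, plus half of the score's own group) can be written as a short Python sketch. The function name `grouped_percentile_rank` is hypothetical; the percentages are the ones from the example above.

```python
def grouped_percentile_rank(shares, score):
    """shares: (score, percent) pairs ordered from lowest to highest score.
    Returns all percentages below the score plus half the percentage at it."""
    below = 0.0
    for label, percent in shares:
        if label == score:
            return below + 0.5 * percent
        below += percent
    raise ValueError(f"unknown score: {score}")

grades = [("D", 12), ("C", 50), ("B", 30), ("A", 8)]

print(grouped_percentile_rank(grades, "B"))  # 12 + 50 + 0.5 * 30
```

The same function works for any other grade, for example an A lands at 12 + 50 + 30 + 0.5 * 8.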

Let's practice!

LOCATION MEASUREMENTS: SABER QUESTIONS

Answer the following question, taking into account all the information provided.

1. A man and a woman, given their weight and height, both fall in the mild obesity range of the BMI, that is, their index is between 30 and 34.9. What can be said about these individuals when they are compared with the population of their same sex between 26 and 60 years old?
A. Both individuals are more obese than 90% of the population of their same sex.
B. The woman is more obese than 88% of the population of her same sex, while the man is more obese than 95% of his population.
C. The woman is more obese than 92% of her population, while the man is more obese than only 88% of his population.
D. Both the man and the woman are part of the least obese 80% of their respective populations.

DISPERSION

MEAN DEVIATION

VARIANCE

STANDARD DEVIATION

RANGE

Range: the difference between the lowest and highest values

Mean Deviation: how far, on average, all values are from the mean

1. Find the mean of all values.
2. Find the distance of each value from that mean: subtract the mean from each value and ignore minus signs.
3. Then find the mean of those distances.
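The three steps translate almost line by line into Python. The data set below is hypothetical and only serves to illustrate the calculation.

```python
def mean_deviation(values):
    mean = sum(values) / len(values)             # 1. mean of all values
    distances = [abs(v - mean) for v in values]  # 2. distances, minus signs ignored
    return sum(distances) / len(distances)       # 3. mean of those distances

# Hypothetical data set
values = [3, 6, 6, 7, 8, 11, 15, 16]

print("range:", max(values) - min(values))
print("mean deviation:", mean_deviation(values))
```

Here the mean is 9, the distances are 6, 3, 3, 2, 1, 2, 6 and 7, and their average is 3.75.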

Standard Deviation: measures the amount of variability among the numbers in a data set. Its symbol is the Greek letter σ.

  • It calculates the typical distance of a data point from the mean of the data
  • If the standard deviation is relatively large, it means the data is quite spread out away from the mean
  • If the standard deviation is relatively small, it means the data is concentrated near the mean

Variance: the average of the squared differences from the mean.

Variance = (Standard Deviation)² = σ²

1. Calculate the mean: the simple average of the numbers.
2. Then, for each number, subtract the mean and square the result (the squared difference).
3. Calculate the average of those squared differences.

We work with squares because squaring prevents positive and negative differences from cancelling each other out.
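A minimal Python sketch of these steps follows, with a hypothetical data set. It divides by the number of values, as the steps describe (the population variance); a sample variance would divide by one less than that. The standard deviation is the square root of the variance.

```python
import math

def variance(values):
    mean = sum(values) / len(values)                  # 1. the mean
    squared_diffs = [(v - mean) ** 2 for v in values] # 2. squared differences
    return sum(squared_diffs) / len(squared_diffs)    # 3. their average

def standard_deviation(values):
    # The standard deviation is the square root of the variance
    return math.sqrt(variance(values))

# Hypothetical data set
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print("variance:", variance(sample))
print("standard deviation:", standard_deviation(sample))
```

For this data the mean is 5, the squared differences add up to 32, the variance is 4 and the standard deviation is 2.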

EX:

These five dogs' heights (at the shoulders) are, from left to right: 600mm, 470mm, 170mm, 430mm and 300mm.

So, using the Standard Deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small. Rottweilers are tall dogs. And Dachshunds are a bit short, right?
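The dog example can be checked with a short sketch. It uses `statistics.pstdev` (the population standard deviation) on the five heights given above. Treating "normal" as "within one standard deviation of the mean" is an assumption made here for illustration, as are the labels.

```python
import statistics

heights_mm = [600, 470, 170, 430, 300]   # the five dogs, left to right

mean = statistics.mean(heights_mm)
sd = statistics.pstdev(heights_mm)       # population standard deviation
low, high = mean - sd, mean + sd

def classify(height):
    # Assumption: "normal" means within 1 standard deviation of the mean
    if height > high:
        return "extra large"
    if height < low:
        return "extra small"
    return "normal"

labels = [classify(h) for h in heights_mm]
print(f"mean = {mean} mm, standard deviation = {sd:.2f} mm")
print(labels)
```

The mean is 394 mm and the standard deviation is about 147 mm, so heights between roughly 247 mm and 541 mm count as normal under this assumption.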

HOW CAN WE REPRESENT THIS GRAPHICALLY?

Many things closely follow a Normal Distribution:

  • heights of people
  • size of things produced by machines
  • errors in measurements
  • blood pressure
  • marks on a test

NORMAL DISTRIBUTION

STANDARD DEVIATION'S PERCENTAGES

EXAMPLE:

Students pass a test if they score 50/100 or more. The marks of a large number of students were sampled, and the mean and standard deviation were calculated as 42/100 and 8/100, respectively.

1. Sketch the bell curve according to this information.
2. Assuming the data is normally distributed, what percentage of students pass the test?

[Figure: bell curve marking the mean and the regions within 1, 2 and 3 standard deviations]

We know that 68% of the scores are within 1 standard deviation (SD) of the mean, and the score 50 is exactly 1 SD above the mean (42 + 8 = 50). So OUTSIDE that region are 32% (100% - 68%) of the scores, half above and half below. Therefore, only 32%/2 = 16% of students score over 50 and actually pass the test!
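The 16% estimate can be compared with the value given by the normal distribution itself. The sketch below builds the normal cumulative distribution from `math.erf`; the helper name `normal_cdf` is an assumption made for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """Fraction of a normal distribution that lies below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

mean, sd, pass_mark = 42, 8, 50

# Rule-of-thumb estimate: half of what lies outside 1 SD
rule_estimate = (100 - 68) / 2

# Percentage above the pass mark according to the normal curve
exact = 100 * (1 - normal_cdf(pass_mark, mean, sd))

print(f"rule of thumb: {rule_estimate}%")
print(f"normal curve:  {exact:.2f}%")
```

The two results are very close, because the 68% figure is itself a rounded value.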

DO IT YOURSELF:

The mean June midday temperature in Desertville is 36°C and the standard deviation is 3°C. Assuming this data is normally distributed, how many days in June would you expect the midday temperature to be between 30°C and 42°C?

DO IT YOURSELF:

The heights of male adults are Normally distributed with mean 1.7 m and standard deviation 0.2 m. In a population of 400 male adults, how many would you expect to have a height between 1.5 and 1.9 m?

practice

https://www.mathopolis.com/questions/q.html?id=2619&t=mif&qs=2619_2620_2621_2622_2623_2624_2625_2626_3844_3845&site=1&ref=2f646174612f7374616e646172642d6e6f726d616c2d646973747269627574696f6e2e68746d6c&title=4e6f726d616c20446973747269627574696f6e#

CORRELATION

Correlation measures how related two sets of values are.

  • POSITIVE: Both values increase together
  • NEGATIVE: While one value increases, the other one decreases
  • 1 is a perfect positive correlation
  • 0 is no correlation (the values don't seem linked at all)
  • -1 is a perfect negative correlation
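The slides do not say which formula is behind these values. The sketch below assumes the usual Pearson correlation coefficient and uses hypothetical data sets, so it is an illustration rather than the method of the presentation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

# Hypothetical data sets
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases together with hours
falling = [10, 8, 6, 4, 2]   # decreases while hours increases
unrelated = [2, 1, 3, 1, 2]  # no linear link with hours

for name, ys in [("rising", rising), ("falling", falling), ("unrelated", unrelated)]:
    print(name, round(pearson_r(hours, ys), 2))
```

The three data sets give results at (or within rounding error of) 1, -1 and 0, matching the three reference values listed above.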