STATISTICS
DEFINITION AND KEY TERMS
DEFINITION
Statistics is a mathematical discipline that focuses on the collection, organization, analysis, interpretation, and presentation of numerical data or information.
What's its purpose?
"To understand patterns, trends, relationships, and variations in data to make informed decisions or to make inferences about a population or a larger set of data from a representative sample."
More information: https://openstax.org/details/books/introducci%C3%B3n-estad%C3%ADstica-empresarial
KEY TERMS
- Population
- Sampling
- Sample
- Statistics
- Parameter
- Variable of interest
- Mean (or average)
- Proportion
- Data
Determine what the key terms refer to in the following study. We want to know the average (mean) amount of money that first-year students at ABC College spend on school supplies, not including books. We randomly surveyed 100 first-year students at ABC College. Three of those students spent $150, $200, and $225, respectively.
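As a small illustration of how a mean is computed from data, the sketch below averages the three amounts quoted in the study. Only three of the 100 responses are given, so the result is the mean of those three data values, not the study's actual sample statistic. The variable names are illustrative.

```python
# Three of the 100 survey responses quoted in the study (US dollars).
spending = [150, 200, 225]

# Mean = sum of the values divided by how many values there are.
mean_spent = sum(spending) / len(spending)

print(f"Mean of the three listed values: ${mean_spent:.2f}")  # $191.67
```

With all 100 responses, the same calculation would give the sample mean, which serves as an estimate of the population mean for all first-year students.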
DATA
The data collected generally comes in different formats: Excel files, CSV, TXT, XML (big data), etc. When you want to analyze the data (for example, in R or Python), it is arranged in a "data frame": a table organized into columns (variables) and rows (observations).
Data organization
Variables
Observations
Types of variables
- Name: categorical (nominal)
- Height: numerical (continuous)
- Blood type: categorical (nominal)
- Education level: categorical (ordinal)
- Shoe size: numerical (discrete)
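The sketch below shows one way the table layout and the variable types could look in Python. It uses only the standard library; in practice a data-frame library would usually be used instead. The CSV content, the column names, and the `VARIABLE_TYPES` mapping are invented for illustration.

```python
import csv
import io

# Made-up CSV data: each column is a variable, each row is an observation.
CSV_TEXT = """name,height_m,blood_type,education_level,shoe_size
Ana,1.62,A,bachelor,37
Luis,1.80,O,high school,43
Mia,1.71,AB,master,39
"""

# Read the rows into a list of dictionaries (one dictionary per observation).
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
variables = list(rows[0].keys())

# Type of each variable, following the classification listed above.
VARIABLE_TYPES = {
    "name": "categorical (nominal)",
    "height_m": "numerical (continuous)",
    "blood_type": "categorical (nominal)",
    "education_level": "categorical (ordinal)",
    "shoe_size": "numerical (discrete)",
}

# Numerical variables must be converted from text before doing arithmetic.
heights = [float(row["height_m"]) for row in rows]
shoe_sizes = [int(row["shoe_size"]) for row in rows]

print(f"{len(rows)} observations, {len(variables)} variables")
for name in variables:
    print(f"{name}: {VARIABLE_TYPES[name]}")
```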
DATA VISUALIZATION
CATEGORICAL DATA
Chart types shown on the slides (source: https://mschermann.github.io/data_viz_reader/):
- Pie chart
- Side-by-side bar chart
- Stacked bar chart
- Other solutions
Colab notebook: https://colab.research.google.com/drive/1Ysh8lQeTChAv2Wo1pydUZOK3ZO_CS2vK?usp=sharing
Interesting links
https://python-graph-gallery.com/
https://r-graph-gallery.com/
https://flourish.studio/
https://www.tableau.com/community/academic
https://mschermann.github.io/data_viz_reader/fundamentals.html#data-visualization-tools
NUMERICAL DATA
DATA: title, year, and rating for movies from 1893 to 2005 (58,788 movies). We want to study the distribution of the variable "rating," which measures the level of enjoyment for each movie on a scale of 0 to 10.
The histogram allows us to visualize how the data is distributed with respect to a certain variable, in this case, the "rating."
Histogram
- The values of the variable under study ("rating") are grouped into classes (bins) of a certain size. Here, the histogram is shown for 19 bins with a width of 0.5.
- The "rating" values are on the horizontal axis. The first bar on the left is centered at the value 1, and it represents rating values from 0.75 to 1.25. The second bar is centered at the value 1.5 and represents "rating" values between 1.25 and 1.75, and so on.
- The data is grouped into classes: for each class, the number of data points with a "rating" value within the class's range is counted.
The highest bars are associated with "rating" values that occur most frequently in the database. Extreme "rating" values occur less frequently; these are the values in the "tails."
- The number of bins can be reduced; here we have 10 bins of width 1. More data is grouped into each bin.
The shape of the histogram may change a little, but its general characteristics do not vary; the highest bar still corresponds to a "rating" around 6.
- You can increase the number of bins; here we have bins with a width of 0.1. Fewer data points are grouped into each class.
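The binning described above can be sketched in a few lines of Python. The ratings below are made up for illustration (they are not the movie data set), and `bin_counts` is a hypothetical helper. The starting point of 0.5 for the width-1 bins is also an assumption, since the slides do not state it.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width classes."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)  # index of the class containing v
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

# Made-up ratings on a 0-10 scale, for illustration only.
ratings = [1.0, 2.4, 5.1, 5.6, 5.9, 6.0, 6.2, 6.3, 6.8, 7.4, 8.1, 9.5]

# 19 bins of width 0.5; the first bin covers 0.75 to 1.25 (centered at 1).
narrow = bin_counts(ratings, start=0.75, width=0.5, n_bins=19)

# 10 bins of width 1 (assumed to start at 0.5, so bins are centered at 1, 2, ...).
wide = bin_counts(ratings, start=0.5, width=1.0, n_bins=10)

print("width 0.5:", narrow)
print("width 1.0:", wide)
```

With wider bins, more values are grouped into each class, but in this example the tallest bar stays at the class centered on 6 for both bin widths.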
Histogram
You can add a color scale related to the frequency of the data. The most frequent data is represented with light blue, and the rarest with dark blue.
Uniform distribution
In each class of the histogram, there is an approximately equal number of observations. The height of the bars is not exactly equal due to the randomness of natural phenomena. Example: the distribution of results obtained by rolling an unloaded (fair) die many times (each result can be a number from 1 to 6).
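The die example can be simulated as a rough sketch. The number of rolls and the random seed are arbitrary choices made here for illustration.

```python
import random
from collections import Counter

random.seed(0)   # fixed seed (arbitrary) so the run is repeatable
N_ROLLS = 6000   # assumed number of rolls; any large number works

# Roll a fair six-sided die N_ROLLS times and count each result.
rolls = [random.randint(1, 6) for _ in range(N_ROLLS)]
counts = Counter(rolls)

for face in range(1, 7):
    print(face, counts[face])
```

Each face should appear roughly 1000 times. The counts are close to each other but not identical, which is the randomness the text refers to.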
Symmetrical distribution
The shape of the distribution is symmetrical, approximately equal on the left and right of the center line. The distribution is not perfectly symmetrical due to the randomness of natural phenomena. The central values are more probable (peak), while the extreme values are less probable (tails).
Example: weight, height, etc.
Skewed distribution (with a right tail)
The shape of the distribution is not symmetrical, with the peak shifted to the left. The variable has a limit on the left side but not on the right. A "tail" is observed to the right. Smaller values of the variable are more frequent. Example: distribution of the number of cars per household.
Skewed distribution (with a left tail)
The shape of the distribution is not symmetrical, with the peak shifted to the right. The variable has a limit on the right side but not on the left. A "tail" is observed to the left. Larger values of the variable are more frequent. Example: distribution of the number of vacation days requested each year at a company.
Scatter plot
DATA: Car characteristics for various models (miles per gallon, number of cylinders, horsepower, weight, etc.); 32 data points. We want to study whether there is a relationship between the two variables "mpg" (miles per gallon) and "wt" (weight).
The scatter plot allows us to analyze the relationship between two variables.
Scatter plot (basic version)
- Each point represents one car (one observation). The X-axis represents the weight (in units of 1000 lbs), and the Y-axis represents the number of miles per gallon.
- We can observe that the two variables are "correlated"; a pattern is seen. A higher number of miles per gallon (greater efficiency) is associated with a lower weight. As the weight increases, the efficiency decreases, and fewer miles per gallon are traveled.
In this case, we say that the variables have a negative relationship (negative slope).
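One common way to put a number on such a relationship is the Pearson correlation coefficient, which is negative when one variable tends to decrease as the other increases. The sketch below is an illustration only: the (weight, mpg) pairs are made up and are not the 32-car data set from the slides, and `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up values for illustration: weight in 1000 lbs and miles per gallon.
wt = [1.6, 1.8, 2.3, 2.6, 3.2, 3.5, 3.6, 5.3]
mpg = [30, 34, 23, 21, 21, 18, 14, 10]

r = pearson_r(wt, mpg)
print(f"correlation between wt and mpg: {r:.2f}")
```

A value of r close to -1 indicates a strong negative relationship: heavier cars in this made-up sample travel fewer miles per gallon.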
Scatter plot ("pro" version 1)
- You can use sizes, colors, and shapes to add information about a third variable. Here, the points are represented with different sizes. The size of the points is associated with the variable "hp," which represents horsepower.
- We can see that cars with higher horsepower generally have higher fuel consumption.
Scatter plot ("pro" version 2)
- You can use sizes, colors, and shapes to add information about a third variable. Here, the data is represented with different shapes and colors. The color and shape are related to the "cyl" variable, which represents the number of cylinders.
- We can see that cars with 8 cylinders are in the region of the graph associated with higher fuel consumption.
Time plot
DATA: average global temperature from 1880 to 2018.
X-axis: time (years, months, hours, etc.)
Y-axis: variable under study
STATISTICAL MEASURES
Inferential statistics
Inferential statistics is used to draw conclusions or make predictions about a larger population based on a sample of data. This involves techniques such as hypothesis testing, interval estimation, and regression analysis.
Descriptive statistics
Descriptive statistics describes and summarizes data in a concise and meaningful way. It includes techniques such as tables, graphs, measures of central tendency (like the mean, median, and mode), measures of dispersion (like the standard deviation), and the representation of data distributions.
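The descriptive measures named above can be computed with Python's standard `statistics` module. The sample values below are made up for illustration.

```python
import statistics

# Made-up sample, for illustration only.
sample = [2, 3, 3, 5, 7]

mean = statistics.mean(sample)      # central tendency: arithmetic average
median = statistics.median(sample)  # central tendency: middle value when sorted
mode = statistics.mode(sample)      # central tendency: most frequent value
stdev = statistics.stdev(sample)    # dispersion: sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}, stdev={stdev:.2f}")
```

Note that `statistics.stdev` computes the sample standard deviation (dividing by n - 1); `statistics.pstdev` is the population version.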
Statistic
TEC MX
Created on September 22, 2025