Artificial Intelligence - PREVENT Project
Theory
Start
Artificial Intelligence Theory
AI Breakthroughs
Image Classification
Machine Translation
As of 2015, computers can be trained to perform better than humans at image classification tasks.
As of 2016, we have achieved near-human performance in translating between languages using advanced AI techniques.
"Je suis étudiant" ("I am a student")
AI Is The New Electricity
"About 100 years ago, electricity transformed every major industry. AI has advanced to the point where it has the power to transform every major sector in coming years."
- Andrew Ng, Stanford University
Definitions
Artificial Intelligence
The broadest concept
Machine Learning
A subset of AI
Deep Learning
A subset of ML
Artificial Intelligence
Merriam-Webster Definition
"A branch of computer science dealing with the simulation of intelligent behaviour in computers."
Intel Definition
"A program that can sense, reason, act, and adapt."
Wikipedia Definition
"Colloquially, the term 'artificial intelligence' is applied when a machine mimics 'cognitive' functions that humans associate with other human minds, such as 'learning' and 'problem solving'."
Machine Learning
"The study and construction of programs that are not explicitly programmed, but learn patterns as they are exposed to more data over time."
Machine Learning
Input Data
Large datasets feed the program
Pattern Recognition
Program identifies patterns without explicit programming
Learning
System improves with more examples
Classification
Makes decisions on new data
These programs learn from repeatedly seeing data, rather than being explicitly programmed by humans.
Machine Learning Terminology
Features
Attributes of the data (input columns)
Target
Column to be predicted (output)
This example is learning to classify a species from a set of measurement features.
Two Main Types of Machine Learning
Supervised Learning
Unsupervised Learning
Dataset: Has a target column
Dataset: Does not have a target column
Goal: Make predictions
Goal: Find structure in the data
Example: Fraud detection
Example: Customer segmentation
Machine Learning Example
Fraud Detection
Key Features
ML algorithms can identify unusual activity in financial transactions.
Machine Learning Limitations
Feature Engineering Challenge
Deep Learning Solution
For complex tasks like image recognition, defining effective features is difficult.
Deep learning overcomes this limitation by automatically learning the most relevant features from raw data.
What features would you use to distinguish a cat from a dog?
Deep Learning
"Machine learning that involves using very complicated models called 'deep neural networks'."
Deep learning models determine the best representation of original data. In classic machine learning, humans must manually engineer these features.
Deep Learning Example
Classic Machine Learning
Deep Learning
Step 1: Determine features manually
Steps 1 and 2 are combined into a single step
Step 2: Feed them through model
The neural network automatically extracts the relevant features
History of AI
Early algorithms
1950s-1960s: Foundations of AI established
First AI Winter
Late 1960s-1970s: Funding cuts after limited progress
Expert systems
1980s: Rule-based systems gained commercial success
Second AI Winter
Late 1980s-1990s: Limited progress led to reduced interest
Machine learning
1990s-2000s: Statistical approaches gained traction
Deep learning
2010s-Present: Neural networks revolutionized the field
1950s: Early AI
1950: Turing Test
1956: Dartmouth Conference
Alan Turing developed a test for machine intelligence
Artificial Intelligence accepted as a formal academic field
1957: Perceptron
1959: Machine Learning
Frank Rosenblatt invented the precursor to neural networks
Arthur Samuel's checkers program learned from experience
The First "AI Winter"
1966: ALPAC Report
Committee evaluated AI techniques for machine translation and found poor return on investment
1969: Perceptron Limitations
Marvin Minsky and Seymour Papert's book Perceptrons highlighted limitations of neural networks, slowing research
1973: Lighthill Report
Highlighted AI's failure to deliver on promises, leading to funding cuts
Impact
These reports led to significant cuts in government funding for AI research
1980s AI Boom
Expert Systems
Systems with programmed rules designed to mimic human experts gained commercial adoption
Mainframe Computing
Ran on specialized hardware using languages like LISP
Commercial Success
Two-thirds of Fortune 500 companies used expert systems at their peak
Neural Network Revival
In 1986, the "Backpropagation" algorithm enabled training of multi-layer networks
Another AI Winter (Late 1980s – Early 1990s)
Technology Integration
Expert systems became features in general business applications
Progress Slowed
Expert systems' impact on business problems plateaued
PC Revolution
Software moved from mainframes to personal computers
Declining Interest
Business enthusiasm for AI waned significantly
Scaling Issues
Neural networks couldn't handle large problems
Late 1990s to Early 2000s: Classical Machine Learning
SVM Algorithm
Practical Applications
Integration
The Support Vector Machine (SVM) became a leading machine learning method
AI solutions succeeded in speech recognition, medical diagnosis, and robotics
AI algorithms became embedded in larger systems across industries
2006: Rise of Deep Learning
2006
Geoffrey Hinton publishes groundbreaking paper on unsupervised pre-training for deeper neural networks
2009
ImageNet database of human-tagged images presented at the CVPR conference
2010
First ImageNet competition launches with algorithms competing on visual recognition tasks
Rebranding
Neural networks rebranded as "deep learning" to reflect their renewed potential
Deep Learning Breakthroughs (2012 – Present)
2012
2013
2014
Deep learning models dramatically outperform previous methods on the ImageNet competition
Deep learning models begin to understand "conceptual meaning" of words
Similar breakthroughs appear in language translation tasks
Impact
Advancements led to improvements in web search, document search, summarization, and translation
Deep Learning Breakthroughs (2012 – Present)
2014
Computer vision algorithms learn to describe photos with natural language
2015
Google releases TensorFlow, making deep learning tools widely accessible
2016
DeepMind's AlphaGo defeats Go master Lee Se-dol, a milestone achievement
Impact
These breakthroughs demonstrated AI's ability to master tasks previously thought to require human intuition
Modern AI (2012 – Present): Deep Learning Impact
Self-driving Cars
Healthcare
Communication
Advanced object detection enables autonomous navigation in complex environments
AI systems improve diagnostic accuracy across various medical specialties
Neural translation systems approach human-level quality in many language pairs
How Is This Era of AI Different?
Faster Computers
Modern computing power, especially GPUs, enables complex model training
Bigger Datasets
Internet-scale data collection provides vast training resources
Advanced Neural Networks
Sophisticated architectures can learn complex patterns autonomously
Cross-disciplinary Results
AI advances benefit multiple fields simultaneously
Other Modern AI Factors
Open Source Ecosystem
Python-based tools have democratized access to machine learning
Open Source Libraries
Leading deep learning frameworks are freely available to researchers and developers
Open Data
Large labeled datasets enable training of more sophisticated models
Collaborative Research
Academic and industry collaboration accelerates progress
Transformative Changes in Healthcare
Enhanced Diagnostics
- AI systems analyze medical images with expert-level accuracy
- Early detection of conditions improves treatment outcomes
- Reduces diagnostic errors and improves patient care
Drug Discovery
- AI accelerates identification of potential therapeutic compounds
- Reduces development time from years to months
- Enables personalized medicine approaches
Patient Care
- Monitoring systems detect subtle changes in patient condition
- Predictive algorithms identify high-risk patients
- Virtual assistants support patient management
Transformative Changes in Finance
Algorithmic Trading
AI systems make high-speed trading decisions based on market patterns
Fraud Detection
ML models identify suspicious transactions with high accuracy
Risk Assessment
AI evaluates loan applications and investment opportunities
Personal Finance
Chatbots and robo-advisors provide financial guidance
Transformative Changes in Government
24/7
Citizen Services
AI-powered systems provide round-the-clock assistance to citizens
50%
Efficiency Gains
Process automation reduces administrative costs and time
90%
Threat Detection
AI systems identify security risks with high accuracy
75%
Resource Optimization
Smart city applications improve urban resource management
Transformative Changes in Transport
Autonomous Vehicles
Logistics Optimization
Emergency Response
Self-driving cars use AI to navigate complex environments safely
AI systems manage fleets and optimize delivery routes
Drones and robots assist in search and rescue operations
Supervised Learning
Labeled Data
Dataset includes input features and desired output
Model Training
Algorithm learns patterns between inputs and outputs
Evaluation
Performance assessed on held-out test data
Prediction
Trained model applied to new, unseen data
Machine Learning
Type
Dataset
Supervised Learning
Data points have known outcome
Unsupervised Learning
Data points have unknown outcome
The study and construction of programs that learn from repeatedly seeing data, rather than being explicitly programmed by humans.
Target vs. Features
Features
Properties of the data used for prediction (non-target columns)
- Input variables that the model uses
- In emergency management: weather data, population density, infrastructure status
Target
Column to predict - the outcome we're interested in
- Output variable that the model learns to predict
- In emergency management: flood risk level, evacuation requirement, resource needs
Example: Supervised Learning Problem
Goal
Predict if an email is spam or not spam
Data
Historical emails labeled as spam or not spam
Features
Email text, subject, time sent, sender information
Target
Binary classification: spam or not spam
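As a toy sketch of this setup (every email below is invented for illustration), a minimal word-overlap classifier: "train" on labeled emails, then score new ones by which label's vocabulary they match best. This stands in for real spam models such as naive Bayes.

```python
# Features: the words of each email. Target: "spam" or "not spam".
train_emails = [
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("project report attached", "not spam"),
]

# "Training": count how often each word appears under each label.
word_counts = {"spam": {}, "not spam": {}}
for text, label in train_emails:
    for word in text.split():
        word_counts[label][word] = word_counts[label].get(word, 0) + 1

def predict(text):
    """Score an unseen email by how strongly each label's vocabulary matches it."""
    scores = {
        label: sum(counts.get(w, 0) for w in text.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

print(predict("free prize money"))        # matches the spam vocabulary
print(predict("monday project meeting"))  # matches the not-spam vocabulary
```

A real system would use many more features (sender, subject, time sent, as the slide lists) and a probabilistic model, but the supervised structure is the same: labeled examples in, a predictor for new emails out.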
Example: Supervised Learning Problem
Object Detection for Emergency Response
AI systems can identify people, vehicles, and damaged structures in disaster zones.
- Goal: Predict location of bounding boxes around objects
- Data: Images with annotated bounding box locations
- Features: Image pixels and patterns
- Target: Coordinates of object bounding boxes
Emergency Management Applications
Disaster Detection
Risk Prediction
Resource Allocation
AI can rapidly analyze satellite and drone imagery to identify disaster impacts and severity
ML models can forecast disaster trajectories based on weather and terrain data
AI optimizes emergency response resources based on real-time needs assessment
Formulating a Supervised Learning Problem
Collect Labeled Dataset
Gather data with features and target labels relevant to your problem
Choose a Model
Select the algorithm best suited to your data type and problem
Define Evaluation Metric
Determine how to measure performance based on your specific goals
Select Optimization Method
Choose how to find the model configuration that maximizes performance
Which Model?
Decision Tree
Nearest Neighbor
Neural Network
Makes predictions by asking a series of yes/no questions about features
Makes predictions based on similarity to training examples
Makes predictions using interconnected layers of artificial neurons
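The nearest-neighbor idea above can be written in a few lines (the points and risk labels below are invented for illustration):

```python
# Training examples: two numeric features per point, one class label each.
train_X = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
train_y = ["low_risk", "low_risk", "high_risk", "high_risk"]

def nearest_neighbor(x):
    """Predict the label of the closest training example (squared Euclidean distance)."""
    def dist(p):
        return (p[0] - x[0]) ** 2 + (p[1] - x[1]) ** 2
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i]))
    return train_y[best]

print(nearest_neighbor((1.2, 1.1)))  # close to the first cluster
print(nearest_neighbor((8.5, 8.0)))  # close to the second cluster
```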
Which Model?
When choosing a model for emergency management applications, consider these key factors. Problem complexity and data requirements often outweigh other considerations due to the critical nature of emergency response.
Evaluation Metric
Accuracy
Proportion of correct predictions; useful when classes are balanced
Mean Squared Error
Average squared difference between predictions and actual values; used for regression problems
Other Metrics
- Precision: Accuracy of positive predictions
- Recall: Ability to find all positive cases
- F1-Score: Harmonic mean of precision and recall
- AUC-ROC: Area under receiver operating characteristic curve
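These metrics can be computed by hand; the predictions below are made up purely to show the arithmetic (1 = positive class, 0 = negative class):

```python
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision = tp / (tp + fp)   # accuracy of positive predictions
recall = tp / (tp + fn)      # share of actual positives found
f1 = 2 * precision * recall / (precision + recall)

# Mean squared error, for regression-style outputs.
preds, actual = [2.0, 3.5], [2.5, 3.0]
mse = sum((a - b) ** 2 for a, b in zip(preds, actual)) / len(preds)

print(accuracy, precision, recall, f1, mse)
```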
Evaluation Metric
The Wrong Metric Can Be Misleading
In Emergency Management
Consider using accuracy for spam detection with 99% spam emails. A model predicting "spam" for every email would have 99% accuracy but miss important legitimate emails.
False negatives (missing an emergency) are often more costly than false positives (false alarms). Metrics should reflect this asymmetric cost.
Context Matters
Choose metrics that align with the real-world impact of predictions. For evacuation decisions, recall (finding all cases requiring evacuation) may be more important than precision.
Training
Training Data
The dataset used to teach the model patterns between features and targets
- Historical emergency situations with outcomes
- Synthetic disaster scenarios
- Data from simulations and exercises
Optimization
The process of configuring the model for best performance
- Adjusts model parameters to minimize errors
- Uses algorithms like gradient descent
- May require multiple iterations
For Emergency Management
Models must be trained on diverse scenarios to handle the unpredictable nature of disasters
Training
Input Data
Labeled examples feed into the model
Forward Pass
Model generates predictions based on current configuration
Error Calculation
Difference between predictions and actual targets is measured
Backward Pass
Model parameters are adjusted to reduce errors
Iteration
Process repeats until performance stops improving
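The five steps above can be sketched for the simplest possible model: one neuron with a single weight and bias, trained by full-batch gradient descent. The data is invented and lies exactly on y = 2x + 1, so the loop should recover those parameters.

```python
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # y = 2x + 1

w, b, lr = 0.0, 0.0, 0.05
losses = []
for epoch in range(500):                      # iteration
    grad_w = grad_b = loss = 0.0
    for x, target in data:
        pred = w * x + b                      # forward pass
        err = pred - target                   # error calculation
        loss += err ** 2
        grad_w += 2 * err * x                 # backward pass (gradients)
        grad_b += 2 * err
    n = len(data)
    w -= lr * grad_w / n                      # parameter update
    b -= lr * grad_b / n
    losses.append(loss / n)

print(round(w, 2), round(b, 2))  # approaches w = 2, b = 1
```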
Inference
New Data
Unseen examples are provided to the trained model
Processing
Model applies learned patterns to analyze the data
Prediction
Model generates outputs based on its training
Decision
Predictions inform emergency management actions
Training vs. Inference
Aspect
Training
Inference
Goal
Learn patterns from data
Apply patterns to new data
Input
Labeled data (features + targets)
Unlabeled data (features only)
Output
Trained model parameters
Predictions
Computation
Intensive, often requires GPUs
Relatively lightweight
Deployment
Typically offline, in development
Real-time, in production
Supervised Learning Overview
Training Phase
Data with answers + Model → Trained Model
Inference Phase
New data + Trained Model → Predictions
Evaluation
Compare predictions to actual results
Refinement
Improve model based on performance
The ultimate goal is to develop a model that performs well on unseen data, making reliable predictions in new emergency situations.
Emergency Management Example
Wildfire Prediction
Flood Risk Assessment
Damage Assessment
AI models predict fire spread based on weather, vegetation, and topography
ML algorithms estimate flooding probability using rainfall and terrain data
Computer vision algorithms rapidly identify structural damage after earthquakes
Curve Fitting: Overfitting vs. Underfitting Example
Goal
Challenge
Fit a curve to the data points to model the underlying relationship
Finding the right complexity for the model to capture the true pattern without fitting to noise
In emergency management: model the relationship between weather conditions and flood severity
Curve Fitting: Underfitting Example
The Curve Is Too Simple
Model fails to capture important patterns in the data
Poor Training Performance
High error even on data used for training
Poor Test Performance
Cannot generalize to new situations
In Emergency Management
An underfitted model might miss critical warning signs of an impending disaster
Curve Fitting: Overfitting Example
The Curve Is Too Complex
Model captures random noise instead of true patterns
Excellent Training Performance
Nearly perfect fit to training data
Poor Test Performance
Cannot generalize to new situations
In Emergency Management
An overfitted model might generate false alarms or miss genuine emergencies in slightly different conditions
Curve Fitting Problem
Challenge
Risk
For Emergency Management
Unseen data isn't available during training, making it difficult to evaluate performance on new scenarios
When measuring performance only on training data, models tend to overfit
Finding the right balance is crucial - models must generalize to new disaster scenarios while maintaining sensitivity to warning signs
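Both failure modes can be shown numerically on invented data: an underfitting model that predicts the global mean (ignoring the input entirely), and an overfitting model that memorizes the nearest training point (1-nearest-neighbor regression). The overfit model is perfect on training data but reproduces its noise on new points; the underfit model is poor everywhere.

```python
# Training data: y = 2x plus alternating noise. Test data: the noise-free trend.
train_pts = [(x, 2.0 * x + (0.5 if x % 2 == 0 else -0.5)) for x in range(10)]
test_pts = [(x + 0.5, 2.0 * (x + 0.5)) for x in range(9)]

mean_y = sum(y for _, y in train_pts) / len(train_pts)

def underfit(x):          # too simple: ignores x entirely
    return mean_y

def overfit(x):           # too complex: reproduces training noise exactly
    nearest = min(train_pts, key=lambda p: abs(p[0] - x))
    return nearest[1]

def mse(model, points):
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

print("underfit train/test:", mse(underfit, train_pts), mse(underfit, test_pts))
print("overfit  train/test:", mse(overfit, train_pts), mse(overfit, test_pts))
```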
Solution: Split Data Into Two Sets
Training Set
Data used for model learning
- Used to adjust model parameters
- Model sees this data during learning
Test Set
Data used for performance evaluation
- Simulates unseen scenarios
- Model never sees this during training
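A minimal sketch of the split (10 invented rows, 80/20 split, fixed seed for reproducibility):

```python
import random

data = list(range(10))          # stand-ins for (features, target) rows
random.seed(42)                 # fixed seed so the split is reproducible
random.shuffle(data)

split = int(len(data) * 0.8)    # 80% train / 20% test
train_set, test_set = data[:split], data[split:]

print(len(train_set), len(test_set))       # 8 2
assert not set(train_set) & set(test_set)  # no leakage between the sets
```

Libraries such as scikit-learn provide the same operation ready-made, but the idea is just a shuffled slice.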
Train-Test Split
Training Phase
Model learns patterns from training data
Model Weight Adjustment
Parameters optimized based on training performance
Testing Phase
Trained model evaluated on unseen test data
Performance Assessment
Test results estimate real-world performance
This approach simulates how the model will perform in real emergency situations it hasn't encountered before.
Cross-Validation for Emergency Models
Split Data
Divide dataset into multiple folds
Iterate
Train on all but one fold, test on remaining fold
Rotate
Repeat using different fold as test set
Average
Calculate performance across all iterations
Cross-validation provides a more robust performance estimate, especially important for emergency management models where data may be limited and variability high.
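The four steps reduce to index bookkeeping; here is a sketch with k = 5 folds over 20 invented samples (the actual training and scoring inside the loop is elided):

```python
k = 5
indices = list(range(20))
fold_size = len(indices) // k
folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]

scores = []
for i in range(k):
    test_fold = folds[i]                                         # held-out fold
    train_folds = [x for j, f in enumerate(folds) if j != i for x in f]
    # Here you would train on train_folds and evaluate on test_fold;
    # for the sketch we just record the held-out fold size.
    scores.append(len(test_fold))

average = sum(scores) / len(scores)
print(average)  # each rotation held out 4 samples
```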
Deep Learning
"Machine learning that involves using very complicated models called 'deep neural networks'."
These sophisticated models automatically determine the best representation of data, eliminating the need for manual feature engineering that traditional machine learning requires.
Deep Learning Differences
Classic Machine Learning
Two distinct steps:
- Humans determine features manually
- Features are fed through model
Deep Learning
Integrated approach:
- Feature extraction and modeling combined
- Raw data processed through multiple layers
- Each layer learns increasingly abstract features
Deep Learning Problem Types
Image Analysis
- Classification of disaster types
- Object detection in affected areas
- Semantic segmentation of damage zones
Natural Language Processing
- Social media monitoring for emergency reports
- Sentiment analysis during crises
- Automated emergency communication
Time Series Analysis
- Weather pattern prediction
- Epidemic spread forecasting
Speech Recognition
- Emergency call processing
- Voice-activated response systems
- Multilingual communication support
Classification and Detection
Object Detection
Emergency Applications
Real-time Processing
Locates and identifies specific objects in images or video frames
Identifies victims, damaged structures, blocked roads, and emergency vehicles
Enables rapid response to developing situations
Semantic Segmentation
Pixel-level Classification
Labels every pixel in an image, creating detailed maps of different elements
In emergency management:
- Precise damage assessment
- Accurate flooding extent mapping
- Detailed wildfire boundary detection
- Identification of safe zones vs hazardous areas
Natural Language Object Retrieval
Text-guided Visual Search
Emergency Applications
Resource Management
Systems can locate objects in images based on natural language descriptions
Enables search and rescue operations based on witness descriptions
Quickly identifies specific infrastructure or resources needed during response
Speech Recognition and Language Translation
Cross-language Communication
AI enables effective communication between responders and affected populations regardless of language barriers
Emergency Call Processing
Automated transcription and analysis of emergency calls helps prioritize response
Voice Commands
Hands-free operation of emergency systems through voice recognition
Radio Communication
Real-time transcription of field radio communications for coordination centers
Fully Connected Network
Formulating a Supervised Learning Problem
Dataset Collection
Gather features and target labels that represent the problem you're solving.
Model Selection
Choose an appropriate architecture based on your problem type.
Evaluation Metric
Define how you'll measure performance and success.
Optimization Method
Determine how to find the optimal model configuration.
Which Model?
Different models represent problems uniquely, each with distinct advantages for specific scenarios.
Biological Inspiration
Neuron Building Blocks
Deep learning models draw inspiration from the human brain and its neural structure.
The core component of neural networks is the artificial neuron, which processes inputs into meaningful outputs.
Neuron Mechanics
Input Features
X1, X2, X3 are numerical inputs representing data features.
Weighted Sum
Each input is multiplied by a weight (W1, W2, W3), then summed.
Output Value
Z = X1·W1 + X2·W2 + X3·W3 (often plus a bias term) is the weighted calculation result.
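The weighted sum above, written out with invented example numbers:

```python
x = [1.0, 2.0, 3.0]   # input features X1, X2, X3
w = [0.5, -1.0, 2.0]  # weights W1, W2, W3

z = sum(xi * wi for xi, wi in zip(x, w))
print(z)  # 0.5 - 2.0 + 6.0 = 4.5
```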
Activation Functions
Purpose
Variety
Non-linearity
Transform the weighted sum into a meaningful output value.
Multiple functions exist, each with specific mathematical properties.
Most activation functions introduce non-linear properties, enabling complex pattern learning.
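A few common activation functions as a sketch (the chosen functions and test values are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))  # squashes z into (0, 1)

def relu(z):
    return max(0.0, z)                 # zero for negatives, identity otherwise

def step(z):
    return 1 if z > 0 else 0           # the perceptron's original activation

print(sigmoid(0.0), relu(-2.0), relu(3.0), step(4.5))
```

The sigmoid and ReLU are non-linear, which is what lets stacked layers learn patterns a single linear model cannot.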
The Perceptron Model
Historical Significance
Linear Separation
Simple Architecture
One of the earliest neural network models, developed in the 1950s.
Can only solve problems where classes can be separated by a straight line.
Uses basic activation functions to classify inputs into binary categories.
Perceptron Limitations
Non-Linear Problems
The XOR Problem
AI Winter Catalyst
Perceptrons fail when data points cannot be separated by a single line.
A famous example where perceptrons fail, requiring multiple decision boundaries.
This limitation contributed to reduced interest and funding in neural networks research.
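The limitation can be demonstrated directly. The sketch below uses the classic perceptron learning rule (integer weights, so the arithmetic is exact): it learns AND, which is linearly separable, but no weight setting can make a single perceptron compute XOR, so errors always remain.

```python
def train_perceptron(targets, epochs=30):
    """Train a single perceptron on the four 2-bit inputs with the given targets."""
    inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    w1 = w2 = b = 0                      # integer weights keep arithmetic exact
    for _ in range(epochs):
        for (x1, x2), t in zip(inputs, targets):
            y = 1 if x1 * w1 + x2 * w2 + b >= 0 else 0
            w1 += (t - y) * x1           # classic perceptron update rule
            w2 += (t - y) * x2
            b += t - y
    # Count remaining misclassifications.
    return sum(
        t != (1 if x1 * w1 + x2 * w2 + b >= 0 else 0)
        for (x1, x2), t in zip(inputs, targets)
    )

print("AND errors:", train_perceptron([0, 0, 0, 1]))  # 0 — linearly separable
print("XOR errors:", train_perceptron([0, 1, 1, 0]))  # at least 1 — not separable
```

Solving XOR requires at least one hidden layer, which is exactly what multi-layer networks provide.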
Fully Connected Networks
Input Layer
Raw data features
Hidden Layers
Complex feature extraction
Output Layer
Final predictions
Fully connected networks organize neurons in layers. Each neuron connects to every neuron in adjacent layers. Every connection has a separate weight. This structure allows solving complex, non-linear problems by transforming data through successive layers.
Deep Learning Architecture
Input Processing
Raw data enters the network
Feature Compression
Each layer summarizes important information
Relevant Extraction
Task-specific patterns are identified
Output Generation
Final predictions emerge
Deep learning uses many layers, often decreasing in width. Modern architectures may contain hundreds of layers, each extracting increasingly abstract features from the data.
Building a Fully Connected Network
Network Architecture
Define layers and neurons
Activation Functions
Choose appropriate functions
Evaluation Metrics
Select performance measures
Weight Training
Learn optimal parameters
When creating a neural network, you must decide on the number of layers, neurons per layer, and appropriate activation functions. The model's weights are automatically learned during training.
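A forward pass through a tiny fully connected network can be sketched in plain Python (the layer sizes and weight values below are invented): 3 inputs, 2 hidden ReLU neurons, 1 linear output.

```python
def dense(inputs, weights, biases):
    """Each output neuron: weighted sum of ALL inputs plus a bias."""
    return [
        sum(i * w for i, w in zip(inputs, ws)) + b
        for ws, b in zip(weights, biases)
    ]

def relu_layer(values):
    return [max(0.0, v) for v in values]

x = [1.0, 0.5, -1.0]                          # input layer: raw features
hidden = relu_layer(dense(x, [[0.2, 0.4, 0.1], [0.7, -0.3, 0.5]], [0.0, 0.1]))
output = dense(hidden, [[1.0, -1.0]], [0.5])  # output layer: final prediction

print(hidden, output)
```

In practice a framework such as PyTorch or Keras builds these layers for you, and training (as described earlier) finds the weight values automatically.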
Evaluation Metrics
Regression
Classification
Multi-Label
Mean Squared Error (MSE) measures the average squared difference between predictions and actual values.
Categorical Cross-Entropy measures how well the model predicts class probabilities.
Binary Cross-Entropy evaluates prediction accuracy when items can belong to multiple classes.
Fully Connected Network Limitations
10^9+
Parameter Count
Large networks can contain billions of weights.
TB
Memory Usage
Significant RAM required for training and inference.
100x
Computation
Much more processing power needed than simpler models.
Low
Feature Detection
Not optimal for spatial patterns in images or sequences.
CNN: Revolution in Visual Processing
Convolutional Neural Networks represent a fundamental shift in how computers process visual information. Inspired by biological visual systems, CNNs have transformed image recognition, object detection, and many other visual tasks.
Convolutional Neural Networks
Localized Connections
Each neuron connects only to a small region of the previous layer.
Weight Sharing
The same set of weights applies across the entire input.
Spatial Features
Excellent at recognizing patterns regardless of position.
Resource Efficiency
Requires fewer connections than fully connected networks.
Convolutions as Feature Detectors
Vertical Line Detector
Horizontal Line Detector
Corner Detector
Convolutions act as local feature detectors that identify specific patterns. Each filter responds to different visual elements in the input image.
Convolution Operation
Filter Application
The convolution kernel slides across the input image, performing element-wise multiplication and summation.
Feature Map Creation
The result is a new image highlighting where specific features appear in the original input.
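The sliding-filter operation can be written out for a tiny grayscale image (values invented). Strictly speaking this is cross-correlation, which is what deep learning libraries compute under the name "convolution":

```python
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # a 1x2 vertical-edge detector: right pixel minus left

kh, kw = len(kernel), len(kernel[0])
out_h = len(image) - kh + 1
out_w = len(image[0]) - kw + 1

# Slide the kernel over every valid position, multiplying and summing.
feature_map = [
    [
        sum(
            image[r + i][c + j] * kernel[i][j]
            for i in range(kh)
            for j in range(kw)
        )
        for c in range(out_w)
    ]
    for r in range(out_h)
]

print(feature_map)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]] — the edge lights up
```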
CNN Architecture
Input Layer
Raw image data enters the network for processing.
Convolutional Layers
Multiple filters extract various features from the input.
Pooling Layers
Downsample feature maps to reduce dimensions and computational load.
Fully Connected Layers
Combine extracted features for final classification or regression.
Transfer Learning: Building on Giants
Transfer learning leverages pre-trained neural networks to solve new problems with limited data. By reusing knowledge from existing models, we can achieve excellent results more efficiently.
Challenges with CNN Development
Data Requirements
Training effective CNNs typically requires massive datasets with millions of examples.
Computational Demands
Model training can take days or weeks, even with specialized GPU hardware.
Hyperparameter Tuning
Finding optimal network configurations involves extensive experimentation.
Expertise Barrier
Building competitive models from scratch requires deep technical knowledge.
Transfer Learning Principles
Early Layer Characteristics
Middle Layer Features
Later Layer Specificity
Initial layers learn universal visual features like edges, corners, and textures. These take longest to train but apply across most image tasks.
Middle layers combine primitive features into more complex shapes and patterns. These have moderate task specificity.
Final layers learn highly task-specific features. These respond quickly to training and are most adaptable to new tasks.
Benefits of Transfer Learning
Reduced Data Requirements
Pre-trained networks need much less data to adapt to new tasks.
Faster Training
Fine-tuning takes hours instead of weeks compared to training from scratch.
Better Performance
Models built on established architectures often achieve superior results.
Portability
Trained weights are easily stored and shared for deployment.
Transfer Learning Implementation
Select Base Model
Choose a pre-trained network like ResNet, VGG, or EfficientNet.
Freeze Early Layers
Lock weights in early layers to preserve general feature detection.
Replace Classification Layers
Add new layers specific to your task (e.g., emergency detection).
Fine-tune on Target Data
Train new layers while keeping frozen layers fixed.
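The freeze-and-fine-tune recipe in miniature (all numbers invented; a real implementation would freeze layers in a framework such as PyTorch or Keras): a fixed "pretrained" feature extractor whose weights are never touched, plus a new output layer trained by gradient descent on a small amount of target data.

```python
def extract_features(x, frozen_w):
    """Frozen early layers: fixed weights, never updated during fine-tuning."""
    return [max(0.0, x * w) for w in frozen_w]  # ReLU features

frozen_w = [0.5, -0.8, 1.2]     # stands in for pretrained early-layer weights
data = [(1.0, 1.7), (2.0, 3.4), (3.0, 5.1)]    # toy regression targets

new_w = [0.0, 0.0, 0.0]         # the replacement output layer, trained below
lr = 0.01
losses = []
for _ in range(300):
    grad = [0.0, 0.0, 0.0]
    loss = 0.0
    for x, target in data:
        feats = extract_features(x, frozen_w)
        pred = sum(f * w for f, w in zip(feats, new_w))
        err = pred - target
        loss += err ** 2
        for k in range(3):
            grad[k] += 2 * err * feats[k]      # gradients for the new layer only
    new_w = [w - lr * g / len(data) for w, g in zip(new_w, grad)]
    losses.append(loss / len(data))

print(losses[0] > losses[-1], frozen_w)  # loss fell; frozen weights untouched
```

Only the new layer's weights change, which is why fine-tuning needs far less data and time than training the whole network.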
Fine-Tuning Strategies
Training Time
Data Required
Performance
The chart compares different fine-tuning approaches on relative scales (1-10). Consider your available data, computational resources, and performance requirements when selecting a strategy. For emergency detection systems, "Last Few Layers" often provides the best balance.
PREVENT Artificial Intelligence Theory (UVIGO) - EN
Cristina López Bravo
Created on June 5, 2025
Start designing with a free template
Discover more than 1500 professional designs like these:
View
Essential Business Proposal
View
Project Roadmap Timeline
View
Step-by-Step Timeline: How to Develop an Idea
View
Artificial Intelligence History Timeline
View
Momentum: First Operational Steps
View
Momentum: Employee Introduction Presentation
View
Mind Map: The 4 Pillars of Success
Explore all templates
Transcript
Artificial Intelligence - PREVENT Project
Theory
Start
Artificial Intelligence Theory
FG
AI Breakthroughs
Image Classification
Machine Translation
As of 2015, computers can be trained to perform better than humans at image classification tasks.
As of 2016, we have achieved near-human performance in translating between languages using advanced AI techniques.
"Je suis étudiant"
AI Is The New Electricity
"About 100 years ago, electricity transformed every major industry. AI has advanced to the point where it has the power to transform every major sector in coming years."
- Andrew Ng, Stanford University
Definitions
Artificial Intelligence
The broadest concept
Machine Learning
A subset of AI
Deep Learning
A subset of ML
Artificial Intelligence
Merriam-Webster Definition
Intel Definition
"A program that can sense, reason, act, and adapt."
"A branch of computer science dealing with the simulation of intelligent behaviour in computers."
Wikipedia Definition
"Colloquially, the term 'artificial intelligence' is applied when a machine mimics 'cognitive' functions that humans associate with other human minds, such as 'learning' and 'problem solving'."
Machine Learning
"The study and construction of programs that are not explicitly programmed, but learn patterns as they are exposed to more data over time."
Machine Learning
Input Data
Large datasets feed the program
Pattern Recognition
Program identifies patterns without explicit programming
Learning
System improves with more examples
Classification
Makes decisions on new data
These programs learn from repeatedly seeing data, rather than being explicitly programmed by humans.
Machine Learning Terminology
Features
Attributes of the data (input columns)
Target
Column to be predicted (output)
This example is learning to classify a species from a set of measurement features.
Two Main Types of Machine Learning
Supervised Learning
Unsupervised Learning
Dataset: Has a target column
Dataset: Does not have a target column
Goal: Make predictions
Goal: Find structure in the data
Example: Fraud detection
Example: Customer segmentation
Machine Learning Example
Fraud Detection
Key Features
ML algorithms can identify unusual activity in financial transactions.
Machine Learning Limitations
Feature Engineering Challenge
Deep Learning Solution
For complex tasks like image recognition, defining effective features is difficult.
Deep learning overcomes this limitation by automatically learning the most relevant features from raw data.
What features would you use to distinguish a cat from a dog?
Deep Learning
"Machine learning that involves using very complicated models called 'deep neural networks'."
Deep learning models determine the best representation of original data. In classic machine learning, humans must manually engineer these features.
Deep Learning Example
Classic Machine Learning
Deep Learning
Step 1: Determine features manually
Steps 1 and 2 are combined into a single step
Step 2: Feed them through model
The neural network automatically extracts the relevant features
History of AI
Early algorithms
1950s-1960s: Foundations of AI established
First AI Winter
Late 1960s-1970s: Funding cuts after limited progress
Expert systems
1980s: Rule-based systems gained commercial success
Second AI Winter
Late 1980s-1990s: Limited progress led to reduced interest
Machine learning
1990s-2000s: Statistical approaches gained traction
Deep learning
2010s-Present: Neural networks revolutionized the field
1950s: Early AI
1950: Turing Test
1956: Dartmouth Conference
Alan Turing developed a test for machine intelligence
Artificial Intelligence accepted as a formal academic field
1957: Perceptron
1959: Machine Learning
Frank Rosenblatt invented the precursor to neural networks
Arthur Samuel's checkers program learned from experience
The First "AI Winter"
1966: ALPAC Report
Committee evaluated AI techniques for machine translation and found poor return on investment
1969: Perceptron Limitations
Marvin Minsky's book highlighted limitations of neural networks, slowing research
1973: Lighthill Report
Highlighted AI's failure to deliver on promises, leading to funding cuts
Impact
These reports led to significant cuts in government funding for AI research
1980s AI Boom
Expert Systems
Systems with programmed rules designed to mimic human experts gained commercial adoption
Mainframe Computing
Ran on specialized hardware using languages like LISP
Commercial Success
Two-thirds of Fortune 500 companies used expert systems at their peak
Neural Network Revival
In 1986, the "Backpropagation" algorithm enabled training of multi-layer networks
Another AI Winter (late 1980s – early 1990s)
Technology Integration
Progress Slowed
Expert systems became features in general business applications
Expert systems' impact on business problems plateaued
PC Revolution
Software moved from mainframes to personal computers
Declining Interest
Scaling Issues
Business enthusiasm for AI waned significantly
Neural networks couldn't handle large problems
Late 1990s to early 2000s: Classical Machine Learning
SVM Algorithm
Practical Applications
Integration
Support Vector Machine became the leading machine learning method
AI solutions succeeded in speech recognition, medical diagnosis, and robotics
AI algorithms became embedded in larger systems across industries
2006: Rise of Deep Learning
2006
Geoffrey Hinton publishes groundbreaking paper on unsupervised pre-training for deeper neural networks
2009
ImageNet database of human-tagged images presented at the CVPR conference
2010
First ImageNet competition launches with algorithms competing on visual recognition tasks
Rebranding
Neural networks rebranded as "deep learning" to reflect their renewed potential
Deep Learning Breakthroughs (2012 – Present)
2012
2013
2014
Deep learning models dramatically outperform previous methods on the ImageNet competition
Deep learning models begin to understand "conceptual meaning" of words
Similar breakthroughs appear in language translation tasks
Impact
Advancements led to improvements in web search, document search, summarization, and translation
Deep Learning Breakthroughs (2012 – Present)
2014
Computer vision algorithms learn to describe photos with natural language
2015
Google releases TensorFlow, making deep learning tools widely accessible
2016
DeepMind's AlphaGo defeats Go master Lee Se-dol, a milestone achievement
Impact
These breakthroughs demonstrated AI's ability to master tasks previously thought to require human intuition
Modern AI (2012 – Present): Deep Learning Impact
Self-driving Cars
Healthcare
Communication
Advanced object detection enables autonomous navigation in complex environments
AI systems improve diagnostic accuracy across various medical specialties
Neural translation systems approach human-level quality in many language pairs
How Is This Era of AI Different?
Faster Computers
Modern computing power, especially GPUs, enables complex model training
Bigger Datasets
Internet-scale data collection provides vast training resources
Advanced Neural Networks
Sophisticated architectures can learn complex patterns autonomously
Cross-disciplinary Results
AI advances benefit multiple fields simultaneously
Other Modern AI Factors
Open Source Ecosystem
Open Source Libraries
Open Data
Large labeled datasets enable training of more sophisticated models
Python-based tools have democratized access to machine learning
Leading deep learning frameworks are freely available to researchers and developers
Collaborative Research
Academic and industry collaboration accelerates progress
Transformative Changes in Healthcare
Enhanced Diagnostics
Drug Discovery
Patient Care
Transformative Changes in Finance
Algorithmic Trading
AI systems make high-speed trading decisions based on market patterns
Fraud Detection
ML models identify suspicious transactions with high accuracy
Risk Assessment
AI evaluates loan applications and investment opportunities
Personal Finance
Chatbots and robo-advisors provide financial guidance
Transformative Changes in Government
24/7
Citizen Services
AI-powered systems provide round-the-clock assistance to citizens
50%
Efficiency Gains
Process automation reduces administrative costs and time
90%
Threat Detection
AI systems identify security risks with high accuracy
75%
Resource Optimization
Smart city applications improve urban resource management
Transformative Changes in Transport
Autonomous Vehicles
Logistics Optimization
Emergency Response
Self-driving cars use AI to navigate complex environments safely
AI systems manage fleets and optimize delivery routes
Drones and robots assist in search and rescue operations
Supervised Learning
Labeled Data
Model Training
Dataset includes input features and desired output
Algorithm learns patterns between inputs and outputs
Evaluation
Prediction
Performance assessed on held-out test data
Trained model applied to new, unseen data
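The four stages above can be sketched end to end with a deliberately simple model. The 1-nearest-neighbor classifier below (toy data, pure NumPy, not tied to any particular library) stores the labeled training set, is evaluated on held-out data, and then predicts on a new point.

```python
import numpy as np

def predict_1nn(X_train, y_train, X_new):
    """Predict by copying the label of the closest training example."""
    preds = []
    for x in X_new:
        distances = np.linalg.norm(X_train - x, axis=1)
        preds.append(y_train[np.argmin(distances)])
    return np.array(preds)

# Labeled data: two feature columns, known outcome (0 or 1).
X = np.array([[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0], [1.1, 1.2], [4.1, 3.9]])
y = np.array([0, 0, 1, 1, 0, 1])

# Model training: here, simply storing the first four labeled examples.
X_train, y_train = X[:4], y[:4]
X_test, y_test = X[4:], y[4:]

# Evaluation: accuracy on the held-out test data.
accuracy = np.mean(predict_1nn(X_train, y_train, X_test) == y_test)

# Prediction: apply the trained model to a new, unseen data point.
label = predict_1nn(X_train, y_train, np.array([[0.9, 1.1]]))[0]
```

Real systems swap in more capable models and larger datasets, but the train / evaluate / predict cycle is identical.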
Machine Learning
Type
Dataset
Supervised Learning
Data points have known outcome
Unsupervised Learning
Data points have unknown outcome
The study and construction of programs that learn from repeatedly seeing data, rather than being explicitly programmed by humans.
Target vs. Features
Features
Target
Properties of the data used for prediction (non-target columns)
Column to predict - the outcome we're interested in
Example: Supervised Learning Problem
Goal
Predict if an email is spam or not spam
Data
Historical emails labeled as spam or not spam
Features
Email text, subject, time sent, sender information
Target
Binary classification: spam or not spam
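A minimal sketch of this spam problem using a hand-rolled Naive Bayes classifier. The four toy emails are invented for illustration; a real system would use a library such as scikit-learn and far more data.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Count word frequencies per class and class priors from labeled emails."""
    word_counts = {c: Counter() for c in set(labels)}
    class_counts = Counter(labels)
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab, len(docs)

def predict_nb(model, doc):
    """Pick the class with the highest smoothed log-probability."""
    word_counts, class_counts, vocab, n_docs = model
    best_class, best_logp = None, -math.inf
    for c in word_counts:
        total = sum(word_counts[c].values())
        logp = math.log(class_counts[c] / n_docs)
        for w in doc.lower().split():
            # Laplace smoothing avoids zero probability for unseen words.
            logp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if logp > best_logp:
            best_class, best_logp = c, logp
    return best_class

# Toy labeled dataset: features are the email text, target is spam / not spam.
emails = ["win a free prize now", "free money claim your prize",
          "meeting agenda for tomorrow", "lunch tomorrow with the team"]
labels = ["spam", "spam", "not spam", "not spam"]

model = train_nb(emails, labels)
print(predict_nb(model, "claim your free prize"))  # → spam
```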
Example: Supervised Learning Problem
Object Detection for Emergency Response
AI systems can identify people, vehicles, and damaged structures in disaster zones.
Emergency Management Applications
Disaster Detection
Risk Prediction
Resource Allocation
AI can rapidly analyze satellite and drone imagery to identify disaster impacts and severity
ML models can forecast disaster trajectories based on weather and terrain data
AI optimizes emergency response resources based on real-time needs assessment
Formulating a Supervised Learning Problem
Collect Labeled Dataset
Gather data with features and target labels relevant to your problem
Choose a Model
Select the algorithm best suited to your data type and problem
Define Evaluation Metric
Determine how to measure performance based on your specific goals
Select Optimization Method
Choose how to find the model configuration that maximizes performance
Which Model?
Decision Tree
Nearest Neighbor
Neural Network
Makes predictions by asking a series of yes/no questions about features
Makes predictions based on similarity to training examples
Makes predictions using interconnected layers of artificial neurons
Which Model?
When choosing a model for emergency management applications, consider these key factors. Problem complexity and data requirements often outweigh other considerations due to the critical nature of emergency response.
Evaluation Metric
Accuracy
Mean Squared Error
Other Metrics
Proportion of correct predictions
Average squared difference between predictions and actual values
Useful when classes are balanced
Used for regression problems
Evaluation Metric
The Wrong Metric Can Be Misleading
In Emergency Management
Consider accuracy for spam detection where 99% of emails are spam: a model that labels every email "spam" scores 99% accuracy while misclassifying every legitimate email.
False negatives (missing an emergency) are often more costly than false positives (false alarms). Metrics should reflect this asymmetric cost.
Context Matters
Choose metrics that align with the real-world impact of predictions. For evacuation decisions, recall (finding all cases requiring evacuation) may be more important than precision.
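The 99%-spam scenario above can be checked directly. In this sketch, a model that always answers "spam" scores 99% accuracy, yet its recall on the legitimate class is zero — the metric must match the asymmetric cost.

```python
# 100 toy emails: 99 spam (label 1) and 1 legitimate (label 0).
y_true = [1] * 99 + [0]
y_pred = [1] * 100  # a "model" that always predicts spam

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall for the legitimate class: of the truly legitimate emails,
# what fraction did the model actually identify?
legit_total = sum(1 for t in y_true if t == 0)
legit_caught = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
recall_legit = legit_caught / legit_total

print(accuracy)      # 0.99
print(recall_legit)  # 0.0
```

For evacuation decisions, the analogous recall (on the "requires evacuation" class) is the number to watch, not overall accuracy.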
Training
Training Data
Optimization
For Emergency Management
The dataset used to teach the model patterns between features and targets
The process of configuring the model for best performance
Models must be trained on diverse scenarios to handle the unpredictable nature of disasters
Training
Input Data
Labeled examples feed into the model
Forward Pass
Model generates predictions based on current configuration
Error Calculation
Difference between predictions and actual targets is measured
Backward Pass
Model parameters are adjusted to reduce errors
Iteration
Process repeats until performance stops improving
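The five steps above map directly onto a minimal gradient-descent loop. This sketch fits a one-weight linear model in NumPy on toy data following y = 2x + 1; a real network runs the same cycle over millions of parameters.

```python
import numpy as np

# Input data: labeled examples (features and targets).
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0  # the "true" pattern the model should learn

w, b = 0.0, 0.0        # current model configuration
learning_rate = 0.05

for step in range(2000):
    preds = w * X + b                  # forward pass: generate predictions
    errors = preds - y                 # error calculation
    grad_w = 2 * np.mean(errors * X)   # backward pass: parameter gradients
    grad_b = 2 * np.mean(errors)
    w -= learning_rate * grad_w        # adjust parameters to reduce error
    b -= learning_rate * grad_b        # iteration: repeat until converged

print(round(w, 2), round(b, 2))  # → 2.0 1.0
```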
Inference
New Data
Unseen examples are provided to the trained model
Processing
Model applies learned patterns to analyze the data
Prediction
Model generates outputs based on its training
Decision
Predictions inform emergency management actions
Training vs. Inference
Aspect
Training
Inference
Goal
Learn patterns from data
Apply patterns to new data
Input
Labeled data (features + targets)
Unlabeled data (features only)
Output
Trained model parameters
Predictions
Computation
Intensive, often requires GPUs
Relatively lightweight
Deployment
Typically offline, in development
Real-time, in production
Supervised Learning Overview
Training Phase
Inference Phase
Data with answers + Model → Trained Model
New data + Trained Model → Predictions
Evaluation
Refinement
Compare predictions to actual results
Improve model based on performance
The ultimate goal is to develop a model that performs well on unseen data, making reliable predictions in new emergency situations.
Emergency Management Example
Wildfire Prediction
Flood Risk Assessment
Damage Assessment
AI models predict fire spread based on weather, vegetation, and topography
ML algorithms estimate flooding probability using rainfall and terrain data
Computer vision algorithms rapidly identify structural damage after earthquakes
Curve Fitting: Overfitting vs. Underfitting Example
Goal
Challenge
Fit a curve to the data points to model the underlying relationship
Finding the right complexity for the model to capture the true pattern without fitting to noise
In emergency management: model the relationship between weather conditions and flood severity
Curve Fitting: Underfitting Example
The Curve Is Too Simple
Model fails to capture important patterns in the data
Poor Training Performance
High error even on data used for training
Poor Test Performance
Cannot generalize to new situations
In Emergency Management
An underfitted model might miss critical warning signs of an impending disaster
Curve Fitting: Overfitting Example
The Curve Is Too Complex
Model captures random noise instead of true patterns
Excellent Training Performance
Nearly perfect fit to training data
Poor Test Performance
Cannot generalize to new situations
In Emergency Management
An overfitted model might generate false alarms or miss genuine emergencies in slightly different conditions
Curve Fitting Problem
Challenge
Risk
For Emergency Management
Unseen data isn't available during training, making it difficult to evaluate performance on new scenarios
When measuring performance only on training data, models tend to overfit
Finding the right balance is crucial - models must generalize to new disaster scenarios while maintaining sensitivity to warning signs
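The trade-off can be demonstrated with polynomial curve fitting in NumPy. Under the assumptions of this sketch (10 noisy samples of a sine curve, seeded randomness), a degree-9 polynomial interpolates the training set almost exactly yet generalizes worse than its near-zero training error suggests, while a degree-1 line underfits everywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of an underlying sine relationship.
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, 10)
x_test = np.linspace(0.05, 0.95, 10)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.1, 10)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

underfit_train, underfit_test = fit_and_score(1)  # too simple: high error everywhere
overfit_train, overfit_test = fit_and_score(9)    # interpolates the noise
```

On this toy data the degree-9 curve drives training error to nearly zero while its test error stays much higher — exactly the generalization gap described above.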
Solution: Split Data Into Two Sets
Training Set
Test Set
Data used for model learning
Data used for performance evaluation
Train-Test Split
Training Phase
Model Weight Adjustment
Testing Phase
Performance Assessment
Model learns patterns from training data
Trained model evaluated on unseen test data
Parameters optimized based on training performance
Test results estimate real-world performance
This approach simulates how the model will perform in real emergency situations it hasn't encountered before.
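A minimal NumPy sketch of the split itself; the 80/20 ratio is a common convention rather than a rule.

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle the indices, then carve off a fraction as the test set."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X = np.arange(20).reshape(10, 2)  # 10 examples, 2 features each
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(X, y)
print(len(X_train), len(X_test))  # 8 2
```

Shuffling before splitting matters: if disaster records are stored chronologically, an unshuffled split would test only on the most recent events.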
Cross-Validation for Emergency Models
Split Data
Iterate
Divide dataset into multiple folds
Train on all but one fold, test on remaining fold
Rotate
Average
Repeat using different fold as test set
Calculate performance across all iterations
Cross-validation provides a more robust performance estimate, especially important for emergency management models where data may be limited and variability high.
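The rotate-and-average procedure can be written as a short manual k-fold loop. The "model" here is a trivial predict-the-training-mean regressor, standing in for whatever emergency model is being validated.

```python
import numpy as np

def k_fold_score(X, y, k=5):
    """Mean squared error of a mean-predictor, averaged over k rotating folds."""
    indices = np.arange(len(X))
    folds = np.array_split(indices, k)  # split data into k folds
    scores = []
    for i in range(k):
        test_idx = folds[i]             # hold out one fold for testing
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        prediction = np.mean(y[train_idx])                 # "train" the model
        scores.append(np.mean((y[test_idx] - prediction) ** 2))
    return np.mean(scores)              # average performance across iterations

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
X = y.reshape(-1, 1)
score = k_fold_score(X, y, k=5)
```

With limited disaster data, this squeezes an evaluation out of every example instead of sacrificing a fixed test set.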
Deep Learning
"Machine learning that involves using very complicated models called 'deep neural networks'."
These sophisticated models automatically determine the best representation of data, eliminating the need for manual feature engineering that traditional machine learning requires.
Deep Learning Differences
Classic Machine Learning
Deep Learning
Two distinct steps: engineer features manually, then train a model on them
Integrated approach: feature extraction and prediction are learned together in one model
Deep Learning Problem Types
Image Analysis
Natural Language Processing
Time Series Analysis
Speech Recognition
Classification and Detection
Object Detection
Emergency Applications
Real-time Processing
Locates and identifies specific objects in images or video frames
Identifies victims, damaged structures, blocked roads, and emergency vehicles
Enables rapid response to developing situations
Semantic Segmentation
Pixel-level Classification
Labels every pixel in an image, creating detailed maps of different elements
In emergency management: mapping flood extent, burned areas, and damaged structures in aerial imagery
Natural Language Object Retrieval
Text-guided Visual Search
Emergency Applications
Resource Management
Systems can locate objects in images based on natural language descriptions
Enables search and rescue operations based on witness descriptions
Quickly identifies specific infrastructure or resources needed during response
Speech Recognition and Language Translation
Cross-language Communication
Emergency Call Processing
Voice Commands
Hands-free operation of emergency systems through voice recognition
AI enables effective communication between responders and affected populations regardless of language barriers
Automated transcription and analysis of emergency calls helps prioritize response
Radio Communication
Real-time transcription of field radio communications for coordination centers
Fully Connected Network
Formulating Supervised Learning Tools
Dataset Collection
Gather features and target labels that represent the problem you're solving.
Model Selection
Choose an appropriate architecture based on your problem type.
Evaluation Metric
Define how you'll measure performance and success.
Optimization Method
Determine how to find the optimal model configuration.
Which Model?
Different models represent problems uniquely, each with distinct advantages for specific scenarios.
Biological Inspiration
Neuron Building Blocks
Deep learning models draw inspiration from the human brain and its neural structure.
The core component of neural networks is the artificial neuron, which processes inputs into meaningful outputs.
Neuron Mechanics
Input Features
X1, X2, X3 are numerical inputs representing data features.
Weighted Sum
Each input is multiplied by a weight (W1, W2, W3), then summed.
Output Value
Z = X1·W1 + X2·W2 + X3·W3 is the resulting weighted sum.
Activation Functions
Purpose
Variety
Non-linearity
Transform the weighted sum into a meaningful output value.
Multiple functions exist, each with specific mathematical properties.
Most activation functions introduce non-linear properties, enabling complex pattern learning.
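The weighted-sum-plus-activation mechanics of the last two slides fit in a few NumPy lines; the input values and weights below are arbitrary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes any value into (0, 1)

def relu(z):
    return np.maximum(0.0, z)         # passes positives, zeroes out negatives

# Input features X1, X2, X3 and their weights W1, W2, W3.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.8, 0.2, -0.5])

z = np.dot(x, w)  # Z = X1*W1 + X2*W2 + X3*W3
a = sigmoid(z)    # the non-linear activation turns Z into the neuron's output
```

Swapping `sigmoid` for `relu` or `np.tanh` changes the neuron's response curve without touching the weighted-sum machinery — which is exactly why activation functions are treated as interchangeable components.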
The Perceptron Model
Historical Significance
Linear Separation
Simple Architecture
One of the earliest neural network models, developed in the 1950s.
Can only solve problems where classes can be separated by a straight line.
Uses basic activation functions to classify inputs into binary categories.
Perceptron Limitations
Non-Linear Problems
The XOR Problem
AI Winter Catalyst
Perceptrons fail when data points cannot be separated by a single line.
A famous example where perceptrons fail, requiring multiple decision boundaries.
This limitation contributed to reduced interest and funding in neural networks research.
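A quick experiment with the classic perceptron learning rule makes the limitation concrete: trained on AND (linearly separable) it reaches perfect accuracy, while on XOR no amount of training can, because no single line separates the classes. Hyperparameters here are illustrative.

```python
import numpy as np

def train_perceptron(X, y, epochs=50, lr=0.1):
    """Classic perceptron rule: nudge the weights whenever a prediction is wrong."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if np.dot(w, xi) + b > 0 else 0  # step activation
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return w, b

def accuracy(w, b, X, y):
    preds = [(1 if np.dot(w, xi) + b > 0 else 0) for xi in X]
    return np.mean(np.array(preds) == y)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])
y_xor = np.array([0, 1, 1, 0])

and_acc = accuracy(*train_perceptron(X, y_and), X, y_and)
xor_acc = accuracy(*train_perceptron(X, y_xor), X, y_xor)
print(and_acc, xor_acc)  # AND is learned perfectly; XOR never is
```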
Fully Connected Networks
Output Layer
Final predictions
Hidden Layers
Complex feature extraction
Input Layer
Raw data features
Fully connected networks organize neurons in layers. Each neuron connects to every neuron in adjacent layers. Every connection has a separate weight. This structure allows solving complex, non-linear problems by transforming data through successive layers.
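To see why stacking layers matters, hand-picked weights for a tiny two-layer network solve exactly the XOR problem that a single perceptron cannot: one hidden neuron detects OR, another detects AND, and the output neuron combines them as "OR and not AND". The weights are constructed by hand for illustration; in practice they are learned.

```python
import numpy as np

def step(z):
    return (z > 0).astype(float)

# Hidden layer: two neurons, each connected to both inputs.
W_hidden = np.array([[1.0, 1.0],    # neuron 1 fires for OR  (threshold 0.5)
                     [1.0, 1.0]])   # neuron 2 fires for AND (threshold 1.5)
b_hidden = np.array([-0.5, -1.5])

# Output layer: XOR = OR and not AND.
W_out = np.array([1.0, -2.0])
b_out = -0.5

def forward(x):
    h = step(W_hidden @ x + b_hidden)   # hidden layer extracts features
    return step(W_out @ h + b_out)      # output layer combines them

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, int(forward(np.array(x, dtype=float))))
```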
Deep Learning Architecture
Feature Compression
Input Processing
Each layer summarizes important information
Raw data enters the network
Relevant Extraction
Output Generation
Task-specific patterns are identified
Final predictions emerge
Deep learning uses many layers, often decreasing in width. Modern architectures may contain hundreds of layers, each extracting increasingly abstract features from the data.
Building a Fully Connected Network
Network Architecture
Define layers and neurons
Activation Functions
Choose appropriate functions
Evaluation Metrics
Select performance measures
Weight Training
Learn optimal parameters
When creating a neural network, you must decide on the number of layers, neurons per layer, and appropriate activation functions. The model's weights are automatically learned during training.
Evaluation Metrics
Regression
Classification
Multi-Label
Mean Squared Error (MSE) measures the average squared difference between predictions and actual values.
Categorical Cross-Entropy measures how well the model predicts class probabilities.
Binary Cross-Entropy evaluates prediction accuracy when items can belong to multiple classes.
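Each of these three metrics is a short formula. Plain NumPy versions, with a small epsilon to keep the logarithms finite:

```python
import numpy as np

EPS = 1e-12  # avoids log(0)

def mse(y_true, y_pred):
    """Regression: average squared difference."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def categorical_cross_entropy(y_true, y_pred):
    """Classification: y_true is one-hot, y_pred holds class probabilities."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return -np.mean(np.sum(y_true * np.log(y_pred + EPS), axis=1))

def binary_cross_entropy(y_true, y_pred):
    """Multi-label: each output is an independent yes/no probability."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return -np.mean(y_true * np.log(y_pred + EPS)
                    + (1 - y_true) * np.log(1 - y_pred + EPS))

print(mse([3.0, 2.0], [2.5, 2.5]))                        # 0.25
print(categorical_cross_entropy([[0, 1]], [[0.1, 0.9]]))  # ≈ 0.105
```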
Fully Connected Network Limitations
10^9+
Parameter Count
Large networks can contain billions of weights.
TB
Memory Usage
Significant RAM required for training and inference.
100x
Computation
Much more processing power needed than simpler models.
Low
Feature Detection
Not optimal for spatial patterns in images or sequences.
CNN: Revolution in Visual Processing
Convolutional Neural Networks represent a fundamental shift in how computers process visual information. Inspired by biological visual systems, CNNs have transformed image recognition, object detection, and many other visual tasks.
Convolutional Neural Networks
Localized Connections
Weight Sharing
Each neuron connects only to a small region of the previous layer.
The same set of weights applies across the entire input.
Spatial Features
Resource Efficiency
Excellent at recognizing patterns regardless of position.
Requires fewer connections than fully connected networks.
Convolutions as Feature Detectors
Vertical Line Detector
Horizontal Line Detector
Corner Detector
Convolutions act as local feature detectors that identify specific patterns. Each filter responds to different visual elements in the input image.
Convolution Operation
Filter Application
Feature Map Creation
The convolution kernel slides across the input image, performing element-wise multiplication and summation.
The result is a new image highlighting where specific features appear in the original input.
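The operation described above, sketched directly: a tiny [-1, 1] kernel slid across a toy image containing a vertical edge yields a feature map that lights up only at the edge column. (Strictly this is cross-correlation, which is what deep learning frameworks also compute under the name "convolution".)

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image; sum of element-wise products at each spot."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark on the left, bright on the right -> one vertical edge.
image = np.array([[0, 0, 0, 1, 1, 1]] * 4, dtype=float)
vertical_edge = np.array([[-1.0, 1.0]])  # responds to left-to-right brightness jumps

feature_map = conv2d(image, vertical_edge)
print(feature_map[0])  # [0. 0. 1. 0. 0.] — peak exactly at the edge
```

Because the same two weights are reused at every position, the detector finds the edge wherever it appears — the weight sharing and spatial invariance described above.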
CNN Architecture
Input Layer
Raw image data enters the network for processing.
Convolutional Layers
Multiple filters extract various features from the input.
Pooling Layers
Downsample feature maps to reduce dimensions and computational load.
Fully Connected Layers
Combine extracted features for final classification or regression.
Transfer Learning: Building on Giants
Transfer learning leverages pre-trained neural networks to solve new problems with limited data. By reusing knowledge from existing models, we can achieve excellent results more efficiently.
Challenges with CNN Development
Data Requirements
Training effective CNNs typically requires massive datasets with millions of examples.
Computational Demands
Model training can take days or weeks, even with specialized GPU hardware.
Hyperparameter Tuning
Finding optimal network configurations involves extensive experimentation.
Expertise Barrier
Building competitive models from scratch requires deep technical knowledge.
Transfer Learning Principles
Early Layer Characteristics
Middle Layer Features
Later Layer Specificity
Initial layers learn universal visual features like edges, corners, and textures. These take longest to train but apply across most image tasks.
Middle layers combine primitive features into more complex shapes and patterns. These have moderate task specificity.
Final layers learn highly task-specific features. These respond quickly to training and are most adaptable to new tasks.
Benefits of Transfer Learning
Reduced Data Requirements
Faster Training
Better Performance
Pre-trained networks need much less data to adapt to new tasks.
Fine-tuning takes hours instead of weeks compared to training from scratch.
Models built on established architectures often achieve superior results.
Portability
Trained weights are easily stored and shared for deployment.
Transfer Learning Implementation
Select Base Model
Choose a pre-trained network like ResNet, VGG, or EfficientNet.
Freeze Early Layers
Lock weights in early layers to preserve general feature detection.
Replace Classification Layers
Add new layers specific to your task (e.g., emergency detection).
Fine-tune on Target Data
Train new layers while keeping frozen layers fixed.
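The freeze-and-replace recipe can be mimicked without any real pretrained network. In this purely conceptual sketch, a fixed ("frozen") random feature layer stands in for the early layers of a model like ResNet or VGG, and only a new linear head is fitted to the target data by least squares; actual transfer learning reuses genuinely pretrained weights through a framework such as Keras or PyTorch.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy target-task data: 40 examples, 3 raw features, a noisy linear target.
X = rng.normal(size=(40, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(0, 0.1, 40)

# "Frozen" early layers: a fixed random projection + tanh, never updated.
W_frozen = rng.normal(size=(3, 16))
features = np.tanh(X @ W_frozen)

# New head: fit only these last-layer weights on the target data.
design = np.hstack([features, np.ones((40, 1))])   # add a bias column
head, *_ = np.linalg.lstsq(design, y, rcond=None)

preds = design @ head
mse_head = np.mean((preds - y) ** 2)
mse_baseline = np.mean((y - y.mean()) ** 2)        # predict-the-mean baseline
```

The frozen layer is never touched by the fit, yet the trained head still beats the constant baseline — the same division of labor as freezing a pretrained backbone and training only a new task-specific output layer.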
Fine-Tuning Strategies
Training Time
Data Required
Performance
The chart compares different fine-tuning approaches on relative scales (1-10). Consider your available data, computational resources, and performance requirements when selecting a strategy. For emergency detection systems, "Last Few Layers" often provides the best balance.