Metrics of Evaluation

Sharon Welburn (Slovina)

Created on October 27, 2025

Transcript

Metrics of Evaluation

Sharon Welburn, PhD

Learning Objectives

  • Define program evaluation
  • Differentiate between the 5 major types of evaluation and their use
  • Identify potential interest holders
  • Describe the GRADE approach in evaluating evidence

Let's Think About It

  • Why do evaluation?
  • What is evaluation?
  • When should you start thinking about evaluating a project?

Why Evaluate?

  • Ensures accountability and continuous improvement
  • Provides data for decision-making in public health programs
  • Informs policy, funding, and practice change
  • Bridges the gap between research and real-world program implementation

"Program Evaluation is the use of social research methods to systematically investigate the effectiveness of social intervention programs in ways that are adapted to their political and organizational environments and are designed to inform social action to improve social conditions

Rossi, Lipsey, Freeman, 2004

5 Major Types of Evaluation

Formative

  • Needs Assessment: who needs the program, how great the need is, and what might work to meet the need
  • Feasibility: is it likely to work?
  • Process: how is the program being delivered? Is the delivery effective?

Summative

  • Outcome: did behavior / knowledge change?
  • Impact: did the program improve health?

"When the cook tastes the soup, that's formative; when the guests taste the soup, that's summative."

NSF Evaluation Handbook
Where does it all fit?

If a "need" is the gap between the ideal and current health status of a target population then a "needs assessment" is the process of gathering data to know what the gap is, and what precedes it.

Things to Consider

  • Specific target population
  • Socio-ecological levels of influence
  • Theory

Some Indicators

Indicators span environmental, mental, social, and physical health domains, for example:

  • Poverty
  • Education
  • Crime
  • Supports
  • Morbidity
  • Mortality
  • Health costs
  • Prevalence of a risk factor
  • Utilization rates
  • Toxins & pollutants
  • Transportation
  • Housing
  • Mental health care costs
  • Medications
  • Lifespan

Don't forget to assess strengths

  • Cultural influences that would support an intervention
  • Faith or spiritual support
  • Availability of resources including effective interventions
  • Community wisdom or experience
  • Resilience

Methods

  • Public health data (secondary data)
  • Interviews, focus groups, etc. with stakeholders (primary data)
  • Mixed methods are ideal: qualitative data can help interpret quantitative findings

Steps

  1. What's your scope?
  2. Gather data; only take what you can use.
  3. Analyze
  4. Report and share!

Feasibility data are collected from pilot situations and recipients while developing an intervention, to obtain feedback about the feasibility of proposed activities and their fit with intended settings and recipients.

Assessing validity & feasibility

  • Design Review
  • Expert Review
  • Resources in place?
  • Pilot it!
  • How will you apply what you've found?

Methods

  • Focus groups
  • Observation
  • Open-ended interviews
  • Expert judgement
  • Equipment trial

Where does it all fit?

Process Objectives

  • Program components are the basis for selecting or developing instruments to measure aspects of the program
    • Extent of implementation
    • Scope of implementation
  • Asks who, what, when, and how many program activities and outputs were accomplished
  • Answers to these questions allow us to assess whether activities are being delivered as intended
  • Helps determine areas where the program needs to be improved

Process Evaluation Questions

  1. Were program activities accomplished?
  2. Were milestones achieved as planned (on time)?
  3. How well were activities implemented?
  4. Was the target audience reached?
  5. How did external factors influence program delivery?

Outcome Monitoring

  • Results focused
  • Short-term outcomes MAY be attainable in 1-3 years
  • Mid-term outcomes MAY be achievable in 4-6 years
  • Connectedness
    • Short-term outcomes must be achieved in order for mid-term outcomes to occur

Short-term vs. Mid-Term

  • Short-term: knowledge, attitudes, beliefs, skills
  • Mid-term: behaviors

Outcome Evaluation Questions

  1. Did the intervention CAUSE the expected outcomes?
  2. How do we know this?

Impact Assessment

  • Deeper, long-term outcomes
  • Causally distal outcomes
  • May occur after the conclusion of project funding

Now... Let's See What You Remember

Interest Holders

Identifying Interest Holders

  • Who would be served or affected by the program?
  • Who is helping plan or implement the program?
  • Who might find the findings useful?
  • Who is skeptical about the program?

Helpful Input

  • Who do they represent and why are they interested in the program?
  • What is important about the program to them?
  • What would they like the program to accomplish?
  • How much progress would they expect the program to make at various times? (milestones?)
  • What do they see as critical evaluation questions?
  • How would they use the results of the evaluation?
  • What resources (time, funds, expertise, access to respondents or policymakers) might they contribute to the evaluation effort?

'GRADE'ing the Evidence Quality

What is GRADE?

  • Grading of Recommendations, Assessment, Development, and Evaluation
    • Widely used in guideline development, systematic reviews, and public-health decision making
  • Study Design
    • RCTs: rating starts at HIGH quality
    • Non-RCTs: rating starts at LOW quality
  • Assesses quality of evidence against 8 criteria (5 possible downgrades, 3 possible upgrades)

Ryan R, Hill S (2016) How to GRADE the quality of the evidence. Cochrane Consumers and Communication Group, available at http://cccrg.cochrane.org/author-resources. Version 3.0 December 2016.

GRADE Criteria

  • Deductions:
    • Risk of Bias
    • Inconsistency
    • Indirectness
    • Imprecision
    • Publication bias
  • Upgrades:
    • Large magnitude of effect
    • Dose response
    • Effect of all plausible confounding factors


GRADE Rating

1. Risk of Bias

  • Degree to which study design may have introduced systematic error
  • RCTs start as high quality but can be downgraded for poor methods
  • Indicators of concern:
    • Lack of randomization or allocation concealment
    • No blinding of participants or assessors
    • High loss to follow-up or selective reporting
    • Incomplete data or deviations from protocol

What are the limitations?

2. Inconsistency

  • Variation in results across different studies (heterogeneity)
  • Consistent direction and magnitude of effect strengthens confidence
  • Indicators of concern:
    • Large variation in point estimates
    • Confidence intervals that barely overlap
    • High I² statistic in meta-analysis (>50%)
    • No clear explanation for variability

How consistent are the results?
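As a rough illustration of the I² heterogeneity statistic mentioned above, it can be computed from Cochran's Q and the number of studies. This is a minimal sketch; the effect estimates and variances are invented for illustration, not drawn from any real meta-analysis:

```python
# Sketch: Cochran's Q and the I^2 heterogeneity statistic for a meta-analysis.
# Effect estimates (e.g., log risk ratios) and variances are hypothetical.

def i_squared(effects, variances):
    """Return (Q, I^2 percent) using inverse-variance fixed-effect weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Five hypothetical studies with similar effects: low heterogeneity expected
q, i2 = i_squared([0.20, 0.25, 0.18, 0.22, 0.21],
                  [0.01, 0.02, 0.015, 0.01, 0.02])
print(f"Q = {q:.2f}, I^2 = {i2:.1f}%")  # I^2 > 50% would be a concern under GRADE
```

When point estimates scatter widely relative to their standard errors, Q grows past its degrees of freedom and I² climbs toward 100%, signaling the inconsistency GRADE asks about.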

3. Indirectness

  • Evidence doesn't directly apply to the research question, population, or intervention of interest (an indirect comparison may be possible: comparing A with C and B with C to infer A vs. B)
  • Indicators of concern:
    • Population differs from target (e.g., adults studied but intervention for children)
    • Surrogate outcomes instead of clinical outcomes
    • Intervention or comparator not identical to the one of interest
    • Setting or implementation context differs

How do these results apply to my review question?

4. Imprecision

  • Results are uncertain due to small sample size or wide confidence intervals
  • Indicators of concern:
    • Confidence interval crosses the threshold for meaningful benefit or harm
    • Small total number of events
    • Studies underpowered to detect an effect

How precise is the effect size?
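To see imprecision concretely, the sketch below computes a 95% confidence interval for a risk ratio using the standard log (Katz) method. The counts are invented; the point is that the same underlying risks yield an interval that crosses the null (RR = 1) when the sample is small:

```python
import math

# Sketch: 95% CI for a risk ratio (log/Katz method). Counts are hypothetical,
# chosen to show how a smaller sample widens the interval until it crosses 1.

def rr_ci(a, n1, c, n2, z=1.96):
    """Risk ratio and 95% CI for events a/n1 (exposed) vs. c/n2 (unexposed)."""
    rr = (a / n1) / (c / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)  # SE of ln(RR)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

print(rr_ci(40, 100, 20, 100))  # larger study: interval excludes 1
print(rr_ci(4, 10, 2, 10))      # same risks, tiny study: interval crosses 1
```

Both tables give RR = 2.0, but only the larger study's interval excludes 1, so the smaller study would be downgraded for imprecision.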

5. Publication Bias

  • The published evidence is systematically unrepresentative of all research conducted
  • Indicators of concern:
    • Non-publication of negative or null studies
    • Selective outcome reporting
    • Funding source bias (e.g., industry-sponsored trials)
    • Funnel plot asymmetry in meta-analysis

Are these all of the relevant studies?

Reasons to Upgrade

  • Rare to upgrade quality of evidence
  • Very rare to upgrade evidence from RCTs that were downgraded
  • For observational studies, only evidence with no important validity threats should be upgraded
  • 3 Major possible reasons to upgrade


6. Large Effect

  • When an effect is so large that bias is unlikely to fully explain it
  • Applies mostly to observational studies
  • Indicators to upgrade:
    • RR or OR > 2 (or < 0.5) with no plausible confounding
    • Clear, consistent direction of effect

Is there a large magnitude of effect?
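To make the RR > 2 threshold concrete, here is a minimal sketch computing a risk ratio from a hypothetical 2x2 table (the counts are invented for illustration):

```python
# Sketch: risk ratio (RR) from a hypothetical 2x2 table. Counts are invented.

def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """RR = risk in the exposed group / risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed

# 40/100 events among exposed vs. 10/100 among unexposed
rr = risk_ratio(40, 100, 10, 100)
print(f"RR = {rr:.1f}")  # RR = 4.0, a large effect (> 2) under GRADE
```

An effect this large (or, symmetrically, RR < 0.5) is hard to explain away by confounding alone, which is why GRADE allows upgrading observational evidence when it appears.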

7. Dose-Response

  • A clear relationship between the amount of exposure and the magnitude of effect increases confidence in causality
  • Indicators to upgrade:
    • Stepwise increases in benefit or harm with higher exposure
    • Linear trend across exposure categories

Is there a dose-response gradient in the findings?

8. All Plausible Confounding Factors

  • When all reasonable sources of bias would diminish (not exaggerate) the observed effect, confidence increases
  • Indicators to upgrade:
    • Known confounders would bias toward the null
    • Effect observed despite conservative bias
    • Direction of residual confounding predictable

Have all plausible confounding factors been accounted for?

Now... Let's See What You Remember

Questions?

Evaluation Participation

  • What would an evaluation plan look like for your project?
    • 1-2 sentences
  • Next week, we'll focus on the CDC framework of program evaluation
    • Make sure to read the Sriram and Pullyblank articles on Strong Hearts, Healthy Communities