CMU Assessment in Social Studies Education
Melissa Stanley
Created on September 3, 2024
Writing Effective Multiple Choice and Written Response Items
Assessment Design in Social Studies
Hi everyone! I am sorry I can't be there with you this week. If you are interested in learning more about the conference I'm attending and presenting at, you can use the link to the right. This presentation has everything you need for your application. There are interactive elements throughout, so please make sure you complete the checks for understanding, click on all buttons, and take advantage of the links. As a reminder, I will not have access to email while I am gone. See you next week!
Note from Dr. Stanley
Use the link to access and read Berwick's (2019) article, What Does the Research Say About Testing. Once you've finished reading, complete the learning checks to the right.
Berwick, 2019
Reading
- Multiple-choice questioning can be useful for assessing and reinforcing learning if items are well designed and well timed, and if feedback on answers is provided.
- Avoid the testing format problems of too much guessing (all/some/none of the above) and short writing that isn't different from multiple choice.
- Consider providing more time for a smaller but richer mix of items to reduce anxiety and support more complex thinking.
Berwick, 2019
Reading Takeaways
Formative
Diagnostic
Ipsative
Criterion-Referenced
Norm-Referenced
Summative
The general purpose of assessments is to gauge the amount and quality of learning that students have done. Many teachers are familiar with only two types of assessment: formative and summative. In reality, there are six types of assessments that we need to be familiar with and will encounter in our careers. Use the buttons to the right to explore and learn more about the six types of assessment.
Assessment involves two psychological principles:
- Recognition: Ability to identify familiar information previously encountered from among possible options
- Recall: Ability to retrieve and use information and ideas from memory without specific guidance
Introduction to Assessment
Types of Assessments
Read the scenario to the left and answer the question to check your understanding of assessments.
Check for Understanding
01
Multiple-Choice Assessment
- Multiple-choice questions are objective and don't require the teacher to make judgements while grading.
- Good multiple-choice questions require time and effort to write, but they are efficient and useful in the long-term (provided students are skilled in taking multiple-choice exams)
- Multiple-choice questions should assess significant learning (not minor details) and need to be aligned to an appropriate level of difficulty
Multiple-Choice Questions
Basic difficulty questions are descriptive questions about foundational information of limited complexity.
- Measures: Baseline knowledge, comprehension (recognition of familiar labels and definitions)
- Who Gets it Right: All participating students
- If Students Get it Wrong: They need remediation on the basics
Multiple-Choice Questions
Basic Difficulty
Standard difficulty multiple-choice questions involve understanding and use of more complex (conceptual) information.
- Measures: Comprehension, application (recognition of associated concepts, recall of explanations)
- Who Gets it Right: Attentive, prepared students
- If Students Get it Wrong: Provide more reinforcement and encouragement to be more attentive
Multiple-Choice Questions
Standard Difficulty
Advanced difficulty multiple-choice questions require sophisticated judgments with the most complex information, interpretations, and/or outcomes.
- Measures: Analysis, synthesis, evaluation (recognition of concepts, generalizations, associations)
- Who Gets it Right: Highly engaged students
- If Students Get it Wrong: Support students' continued growth
Multiple-Choice Questions
Advanced Difficulty
Example (Standard Difficulty)
Which of the following is a check on executive power in the U.S. Constitution?
A. Congress can vote to override the president's veto
B. Vice President casts tie-breaking votes in the Senate
C. President can impeach members of Congress
D. Congress appoints justices to the Supreme Court
Most objective items provide four possible responses, each fulfilling a different function.
A. Desired Answer (demonstrably correct response)
B. Inferior Alternative (demonstrably weaker response)
C. Distracter (tangentially related incorrect response)
D. False Response (demonstrably incorrect response)
Anatomy of a Question
What is the desired answer?
What is the distracter?
What difficulty is the question?
What is the inferior alternative?
Why did Thomas Jefferson believe the purchase of the Louisiana Territory was necessary?
A. The United States feared a war with England
B. The United States needed an outlet to transport goods for trade
C. The United States wished to expand into Spanish Florida
D. The United States needed a better route to the Rocky Mountains
Ninth Grade Assessment
You Try
02
Test Design Considerations
How you structure multiple-choice items and how many answer choices you offer affect the outcome and usefulness of the data you get from students.
- A/B/C/D (25% guess rate): most common
- A/B/C/D/E (20% guess rate): adds an extra inferior or false response
- A/B/C (33% guess rate): easier, usually doesn’t have a distracter
- A/B (50% guess rate): True or False (desired and false response)
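The guess rates listed above follow from a simple rule: with one desired answer among n equally likely choices, a blind guess is right 1/n of the time. The short Python sketch below only illustrates that arithmetic; the function name `guess_rate` is a hypothetical helper, not something from the presentation.

```python
from fractions import Fraction


def guess_rate(num_choices: int) -> Fraction:
    """Chance of a correct blind guess when exactly one choice is the desired answer."""
    if num_choices < 2:
        raise ValueError("An item needs at least two answer choices.")
    return Fraction(1, num_choices)


# The four structures named on the slide.
for labels in ("ABCD", "ABCDE", "ABC", "AB"):
    rate = guess_rate(len(labels))
    print(f"{'/'.join(labels)}: {float(rate):.0%}")
# A/B/C/D: 25%
# A/B/C/D/E: 20%
# A/B/C: 33%
# A/B: 50%
```

Adding a fifth choice lowers the guess rate by only five percentage points, which is one way to weigh whether the extra inferior or false response is worth writing.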
Multiple-Choice Structure Matters
- “Which of the following does not…” = false response is desired, plus usually 3 not-desired (correct) responses
- “Which of the following best…” = typically replaces false response or distracter with extra inferior responses
- “who/what/where/when” tend to be basic
- “how/why” tend to be more advanced
Language & Design Choices Matter
03
Writing Quality Assessments
- Multiple-choice questions are based in recognition (prompts) but can also involve recall (memory to make connections).
- Quality questions need time to answer thoughtfully.
- Allow around 90 seconds per item (e.g., 20-30 minutes for a test of roughly 13-20 items)
- A test of fewer quality items is always superior to a longer test of weaker items.
- Distribute Across Difficulties
- Vary question difficulty across basic, standard, and advanced
- Guess Rate + Basic Items = around 60% of test grade
- A typical A/B/C/D test should have one-third each basic, standard, and advanced difficulty questions.
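The timing and distribution guidelines above can be turned into a quick planning check. The sketch below is one possible way to do that in Python, not a prescribed method: the function names are hypothetical, and reading "Guess Rate + Basic Items" as a simple sum of the 25% guess rate and the one-third share of basic items is an assumption about how the "around 60%" figure is reached.

```python
SECONDS_PER_ITEM = 90  # "around 90 seconds per item" from the slide


def minutes_needed(num_items: int, seconds_per_item: int = SECONDS_PER_ITEM) -> float:
    """Testing time in minutes for a given number of multiple-choice items."""
    return num_items * seconds_per_item / 60


def split_by_difficulty(num_items: int) -> dict:
    """Split items as evenly as possible across the three difficulty levels."""
    base, extra = divmod(num_items, 3)
    levels = ("basic", "standard", "advanced")
    # Any leftover items go to the easier levels first (an arbitrary choice).
    return {level: base + (1 if i < extra else 0) for i, level in enumerate(levels)}


def guess_plus_basic_share(num_choices: int = 4, basic_share: float = 1 / 3) -> float:
    """Assumed reading of the slide: guess rate added to the share of basic items."""
    return 1 / num_choices + basic_share


print(minutes_needed(20))                  # 30.0
print(split_by_difficulty(20))             # {'basic': 7, 'standard': 7, 'advanced': 6}
print(f"{guess_plus_basic_share():.0%}")   # 58%
```

Under that reading, an A/B/C/D test with one-third basic items gives roughly 58%, which matches the "around 60%" on the slide.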
Quality Objective Items
How can we tell whether a test item is valid?
- Questions are clearly written.
- Confusion over answer/response traits undermines the validity
Item Validity
04
Written Responses
Writing Levels of Difficulty
- Basic: Single-part question (often definitional)
- Standard: Can require multiple parts with multiple facts
- Advanced: Requires application of concepts and generalizations with factual support to explain, evaluate, or advance claims
Role of Writing
- Including writing in assessments aids instructors in evaluating students' quality of thinking and recall
Writing in Assessments
- Short Answer: Typically requires a 1-2 sentence answer and takes about 3 minutes to answer. Should be concise, factually oriented, and usually descriptive (basic).
- Extended Response: Typically requires 1-2 paragraphs and takes 5-15 minutes to answer. Should require moderate depth and is usually explanatory and analytical.
- Essay: Should require at least three paragraphs and will take a minimum of 20 minutes to answer. Should require the greatest depth and is usually analytical, evaluative, or persuasive.
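When planning a test that mixes written response types, the time estimates above can be added up to see whether the items fit in a class period. The Python sketch below is a minimal illustration using the lower-bound times from the slide; the dictionary keys and the function name `minimum_minutes` are hypothetical.

```python
# Lower-bound minutes per response, taken from the slide:
# short answer ~3, extended response 5-15, essay 20 or more.
MIN_MINUTES = {
    "short_answer": 3,
    "extended_response": 5,
    "essay": 20,
}


def minimum_minutes(counts: dict) -> int:
    """Smallest amount of time students need for the given mix of written items."""
    return sum(MIN_MINUTES[kind] * number for kind, number in counts.items())


# Example: 4 short answers, 2 extended responses, and 1 essay.
plan = {"short_answer": 4, "extended_response": 2, "essay": 1}
print(minimum_minutes(plan))  # 42
```

Because extended responses can take up to 15 minutes and essays have no stated upper limit, the result is a floor, and a real test plan needs extra time on top of it.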
Types of Written Responses
05
Conclusions
For multiple choice:
- Make sure you can distinguish the response components
- Check validity (ratio, who) of items with high miss rates
- Phrase the prompt in a manner consistent with the intended difficulty and the content learning made available to your students
- Allow adequate time to think and write for each prompt
- Be prepared to defend “subjective” evaluation (be consistent)
Be intentional about your assessment goal:
- Basic know-it-or-not = True/False, Matching
- Identify effective from ineffective responses = Multiple Choice
- Evaluate content understanding = Short Answer
- Evaluate content explanation/analysis = Extended Response
- Evaluate intellectual reasoning about content = Essay
Conclusions
Dr. Stanley's Testing Principles
Make Your Own Exams
Teach Study Skills
Print Exams
Grading
Edit v. Not Edit
Brain Breaks on Tests
Track Testing Data
Teach Testing Vocab & Skills
I made an example and template so you can see what I am looking for. You must use this template for this application. You may only change the font size and content. Access this below.
- Choose a topic of a previous lesson/application and either grade level 7-9 or grade level 10-12
- Practice designing assessment items:
- TASK A - Write a basic-, standard-, and advanced-difficulty multiple-choice question on topic (A-D, response components)
- TASK B - Choose four topics that need to be assessed for a unit and write four different types of questions for each topic (total of 16).
- Tie questions from both tasks to specific MI standards (Task A & B)
Application (Due 11/22 @ 5:00 PM)
Formative
Purpose: Track real-time learning
Examples:
- Exit Ticket
- Think-Pair-Share
- Class Discussion
Learn More: Edutopia: 7 Smart, Fast Ways to do Formative Assessment
Summative
Purpose: Assess what was learned
Examples:
- Unit Exams
- Papers
- Cumulative Projects
Learn More: University of Colorado Boulder: Summative Assessments
Diagnostic
Purpose: To take the "temperature" of students before learning
Examples:
- Pre-Unit Tests
- Skill Level Tests (e.g., language proficiency)
- Mind Maps
Learn More: ICAS Assessments: Diagnostic Assessments
Ipsative
Purpose: To compare results to previous results
Examples:
- Timed Tasks (e.g., typing speed)
- Reading Level
- Annual Physical Fitness Tests
Learn More: Classtime: Ipsative Assessment
Norm-Referenced
Purpose: To compare with peers
Examples:
- SAT
- ASVAB
- IQ Tests
Learn More: Classtime: Norm-Referenced Assessments
Criterion-Referenced
Purpose: Determine performance level against set criteria
Examples:
- Driver's License Exams
- AP Exams
- Citizenship Tests
Learn More: University of Tasmania: Criterion-Referenced Assessment
The desired answer is B.
Thomas Jefferson believed that securing the Louisiana Territory was crucial because it provided control over the Mississippi River and the port of New Orleans, which were vital for transporting goods and supporting American commerce and westward expansion.
The inferior alternative is C.
While territorial expansion was a broader goal of the United States at the time, the Louisiana Purchase specifically focused on acquiring land west of the Mississippi River and securing trade routes, not directly seeking control over Spanish Florida.
The distracter is D.
While westward expansion was a goal, the Louisiana Purchase was focused on securing trade routes along the Mississippi River and New Orleans, not directly aimed at improving access to the Rocky Mountains. The primary motivation was economic control, not specific routes to the west.
This question is standard difficulty.
The question requires students to understand and use more complex information, and it measures their comprehension and application.
Students do not naturally know how to study, and typically no one thinks to teach them. Teaching your students study skills, such as how to set up a good study spot, avoid cramming, and take notes, goes a long way. One of the things I always taught my students is that our brains only retain about 10% of what we learn, so we need to read things at least 10 times to remember them. Frequent small study sessions in which students just read are the most effective.
You are in charge of the learning in your classroom. Why would you let someone else design your assessments? It's okay to pull from other exams (don't reinvent the wheel), but packaged tests, test generators, etc. are not effective to use without considerable editing.
Tests use very specific language, and students are generally not familiar with the terms. Teaching students testing vocabulary as well as tips and tricks for taking exams will help them achieve better outcomes.
It is incredibly time consuming, but it is absolutely essential that you track testing data for your students. This will allow you to identify trends with individual students, have better conversations with parents, make pedagogical decisions, and grow as an educator.
One of the benefits of printing your exams is that you can add brain breaks throughout the test. On each exam I gave, I added cartoons to give students a place to destress and breathe. My social studies teacher did this and I continued the practice. My students reported feeling less test anxiety because they could disengage from the questions for a moment.
As technology use became more accessible in schools, teachers began moving exams to online formats. The appeal is that Google Forms or other platforms will grade multiple-choice items or fill-in-the-blank items for you. Printing exams helps students engage with the content in a different way, reduces the likelihood of cheating, and also ensures that you can have a variety of question formats.
Each teacher needs to decide whether they will permit test corrections in their class or not. If you decide that you want to allow test corrections, make sure you set up your gradebook to add these as a separate score rather than changing the original test score. If you don't, the validity of your grades will be skewed. When considering what you want to do, it's helpful to ask what test corrections will achieve. Are students just looking up information to get the points or are they relearning the material?
How we grade matters. It seems like a small thing, but using + for partial points instead of - makes a massive difference in how students engage with learning in your class and how they think about their own performance on assignments. Pen color also matters. Red pen is a teacher staple, but it creates anxiety for students. If you can grade in another color (e.g., teal, purple, green), it will help reduce that anxiety for your students.