Baking the Perfect Assessment


Students in an online cooking class spent a week of their course reading about how to make several kinds of cookies, including cookies that are dropped onto a baking sheet, cookies that are rolled and cut out, and cookies that are not cooked at all.


The student will be able to accurately, completely, and without assistance, follow a written recipe to make a batch of one kind of cookie studied this week.


  1. Via the learning management system, students completed a 10-question multiple-choice quiz that asked them to recall details about specific cookie recipes.
  2. Students responded to a short-answer essay question in which they described and defended three cookie-making practices they would recommend to a fellow chef.
  3. Each student submitted a 30-minute video of himself or herself following a cookie recipe from the beginning of the recipe to the completion of the first batch of cookies.

Correlation to Objective.

The performance objective “describes an observable event that will indicate that a student has acquired the targeted knowledge” (Oosterhof, Conrad, & Ely, 2008, p. 14). Further, Oosterhof et al. (2008) propose that performance objectives should include four components: type of capability (information, rule, concept, or problem solving), observable behavior, situation or context of the assessment, and any special conditions required. This performance objective can therefore be broken down as follows: the type is procedural knowledge, specifically the application of rules; the behavior is following a written recipe to make a batch of cookies; the situation is a provided cookie recipe; and the special conditions are accurately, completely, and without assistance. The last component is deficient in that the terms “accurately” and “completely” are neither quantified nor defined. How accurate? Every single step of the recipe in a particular order? How complete? From preparation through baking through packaging? Is it all or nothing?

Activity 1 does not correlate well with the performance objective. Although some declarative (informational) knowledge is required for procedural processes, recalling the specifics of any particular recipe is irrelevant because the objective requires preparing and baking cookies by following a written recipe.

Similarly, Activity 2 does not correlate well, as it more appropriately assesses a concept-based objective. However, the answers could be somewhat useful in revealing which cookie-making practices the learner deems important, especially those tied to explicit recipe instructions.

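Activity 3 is appropriate for the performance objective because it has the student carry out the very behavior the objective names: following a recipe from start to finished batch.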

Evaluation of Validity.

The most general definition of validity “pertains to the degree to which a test measures what it is supposed to measure” (Oosterhof et al., 2008, p. 29). Utilizing only this definition, the first two activities are invalid for the performance objective.

“Validity evidence is typically grouped into three interrelated categories: construct-related evidence, content-related evidence, and criterion-related evidence” (Oosterhof et al., 2008, p. 29). Specifically, criterion-related evidence “indicates how well performance on a test correlates with performance on relevant criterion measures that are external to the test” (Oosterhof et al., 2008, p. 31). Activities 1 and 2 are unlikely to correlate well with the actual objective of following directions accurately and completely to make a batch of cookies. Construct-related evidence “establishes a link between the underlying psychological construct we wish to measure and the visible performance we choose to observe” (Oosterhof et al., 2008, p. 32). Activity 1 is not linked to the intended objective because the objective addresses the procedure of following a recipe, not knowledge of any specific recipe. Activity 2 is slightly more valid in that it requires learners to reflect on cookie-making practices and rank them by importance. Activity 3 is clearly the most valid, as it involves the actual procedures outlined in the objective. Lastly, content-related evidence “establishes how well the content of the test corresponds to the learner performance being observed” (Oosterhof et al., 2008, p. 37). Again, Activity 1 is invalid. Activity 2 is not completely invalid because it requires the learner to reflect on and articulate important aspects of cookie-making behavior; however, it is not nearly as valid as the final activity, which requires the learner to actually demonstrate the learned procedures.

Recommendations to improve validity. Activity 1 could be made more valid by restructuring the content of the questions. For instance, rather than quizzing learners on details of the recipes, the questions could present cookie-making scenarios and ask the learner to determine the appropriate procedures, order of events, and/or methods to use in baking cookies from recipes. This could include content such as scaling a recipe up or down to produce a required quantity, which would be especially important if the learner is expected to convert ingredient quantities for a particular recipe. Activity 2 could similarly be made more valid by altering the content of the question. The instructor should provide scenarios with built-in procedural questions that prompt learners to reflect on and articulate their position. For instance, learners could compare the order of procedures in two different types of cookie recipes or explain when and why they would deviate from a recipe's instructions.
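A scaling scenario of the kind suggested for Activity 1 could be worked as follows. This is only an illustrative sketch: the recipe, its quantities, its yield, and the `scale_recipe` helper are all hypothetical and do not come from the course materials.

```python
from fractions import Fraction

def scale_recipe(ingredients, original_yield, target_yield):
    """Scale every ingredient quantity by target_yield / original_yield."""
    factor = Fraction(target_yield, original_yield)
    return {name: qty * factor for name, qty in ingredients.items()}

# Hypothetical drop-cookie recipe that yields 24 cookies (quantities in cups).
recipe = {
    "flour": Fraction(9, 4),   # 2 1/4 cups
    "sugar": Fraction(3, 4),
    "butter": Fraction(1),
}

# Scenario: the learner needs only 12 cookies, so each quantity is halved.
half_batch = scale_recipe(recipe, original_yield=24, target_yield=12)
for name, qty in half_batch.items():
    print(f"{name}: {qty} cups")
```

A quiz item built on this idea would ask the learner to work out the scaled quantities by hand, which tests the procedure of adapting a recipe rather than recall of a particular recipe's details.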

Evaluation of Generalizability.

“Generalizability is concerned with inconsistencies between different samples of student proficiency that are or could be included in an assessment” (Oosterhof et al., 2008, p. 53). One inconsistency discussed in our text occurs when there are “inconsistencies among test items that supposedly measure the same skill” (Oosterhof et al., 2008, p. 45). This can result from learners' inherent interest in the topic, their perception of what the assessment is asking, and/or good or bad luck with guessing. Activities 1 and 2 have poor generalizability as written because they inherently do not correlate well with the performance objective.

Another issue could be “inconsistencies among alternative skills within the same domain” (Oosterhof et al., 2008, p. 47). This could actually pose a threat to Activity 3, depending on the recipe selected. For instance, we are not given information regarding the three types of recipes. It is entirely possible that the recipes differ in difficulty level, number of steps, measurements, number of ingredients, etc. This could pose a generalizability problem. Further, following a cookie recipe is not the same as following recipes for barbecue or broiling. Therefore, it is important to remember that the generalizability of this activity is limited to instructions within the baking domain. Although it would be nice to believe that the skill of following directions precisely would carry over, that could be an erroneous assumption.

Another issue is inconsistency between raters and/or graders. This could pose a problem for Activities 2 and 3 unless the assessments are founded on strict adherence to rubrics outlining the grading system.
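To make the rubric idea concrete, the sketch below scores one Activity 3 video against a short checklist and compares two raters. The checklist items, the raters' marks, and the helper functions are hypothetical, and simple percent agreement is used here only as an illustration; it is not a measure prescribed by the text.

```python
# Hypothetical checklist rubric for the Activity 3 video.
RUBRIC_STEPS = [
    "preheated oven as directed",
    "measured ingredients as written",
    "followed mixing order",
    "baked for the stated time",
]

def rubric_score(marks):
    """Return the fraction of rubric steps a rater marked as done correctly."""
    return sum(marks[step] for step in RUBRIC_STEPS) / len(RUBRIC_STEPS)

def percent_agreement(marks_a, marks_b):
    """Return the fraction of rubric steps on which two raters gave the same mark."""
    same = sum(marks_a[step] == marks_b[step] for step in RUBRIC_STEPS)
    return same / len(RUBRIC_STEPS)

# Two raters watch the same video and disagree on one step.
rater_a = dict(zip(RUBRIC_STEPS, [True, True, False, True]))
rater_b = dict(zip(RUBRIC_STEPS, [True, True, True, True]))

print(rubric_score(rater_a))                # 0.75
print(rubric_score(rater_b))                # 1.0
print(percent_agreement(rater_a, rater_b))  # 0.75
```

A checklist like this also gives one possible way to quantify “accurately” and “completely”: the instructor could state the minimum fraction of steps that must be marked correct for the objective to be met.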

The last type of generalizability discussed is “inconsistencies between earlier and later occasions” (Oosterhof et al., 2008, p. 49). Unfortunately, this is seen quite a bit in schools, as students often cram for exams and then promptly forget the material. Many younger students in particular struggle to grasp the value of review and reflection. Activity 1 is most likely to have this type of generalizability problem. Activity 2 requires reflection and articulation, which will help learners assimilate the information more deeply. Activity 3 combines thought and activity, thereby aiding retention.

Recommendations to improve generalizability. The recommendations from our text include four techniques: “improving the quality of observations, improving the scoring of performances, increasing the number of observations, and expanding the breadth of observations” (Oosterhof et al., 2008, pp. 50-51). First, improving the quality of the observations has been discussed previously in relation to validity. If Activities 1 and 2 are revised to address the performance objective appropriately, increasing their validity, some aspects of their generalizability will improve as well. Further, using a rubric as previously mentioned will reduce inconsistencies among raters. Using all three types of activities, once properly reframed, would expand the breadth of the observations and lend more generalizability to the learner outcomes, because learners would have been assessed by three distinctly different methods: multiple choice, essay, and performance. This particular exercise does not lend itself to many additional types of observations unless the entire course concerns cookie baking and/or learning to follow recipe instructions. If that is the case, then practice following recipes for many different types of dishes would improve generalizability for that particular skill.


Horton, W. (2006). E-learning by design. In Designing for the Virtual Classroom (pp. 268-284). Retrieved from

Laureate Education, Inc. (Producer). (n.d.). Designing effective and appropriate online assessments.

Oosterhof, A., Conrad, R. M., & Ely, D. P. (2008). Assessing learners online. Upper Saddle River, NJ: Pearson.

Sewell, J. P., Frith, K. H., & Colvin, M. M. (2010). Online assessment strategies: A primer. MERLOT Journal of Online Learning and Teaching, 6(1), 1-9. Retrieved from

Suskie, L. (2007). Answering the complex question of “how good is good enough?”. Assessment Update, 19(4), 1-4. Retrieved from EBSCOhost

