Part of the NCA Commission on Accreditation and School Improvement Journal of School Improvement, Volume 2, Issue 1, Spring 2001
Using Grades to Assess Student Performance

John Woodward


About the Author:  Dr. John Woodward is the Director of Research and Development for the NCA Commission on Accreditation and School Improvement.  He can be reached at jwoodwrd@uillinois.edu.

 
 

The Intent of the Article

Grades can be more reliable than norm-referenced tests, and yet they are often not accurate measures of what students know.  At first this sounds like a paradox, but on closer examination the statement reflects the difference between theory and practice.  In theory, grades could be one of the best indicators of student learning if certain conditions were met.  In practice, however, teachers include many factors unrelated to what students know when they assign grades.  This article is not intended to convince teachers that they must change their grading practices, but rather to identify what teachers should and should not do if they wish to use grades as an assessment of what students know.

Basics About Assessment

Before one selects an assessment instrument or format to measure student learning, the content to be learned needs to be clearly identified and articulated to those who are responsible for teaching it, and the achievement standards need to be specified.  In the case of a student performance goal, NCA encourages schools to come to consensus on the “essence” of the goal, by performing a conceptual analysis of the goal.  (See the Spring 2000 Journal of School Improvement article “You Have a Writing Goal?” for information on how to identify the essence of a goal.)

The next step is to identify the assessment instruments to be used to measure student learning.  When considering assessments, it is important to examine validity, reliability, and fairness issues.  That is, the assessments should measure the content we intend to measure (validity); they should yield close to the same results if a person were to take the assessment multiple times (reliability); and they should not be biased against any gender, ethnic, or racial group (fairness).

For an assessment to be valid, it must measure the knowledge, the ability, the behavior, or the attitude that it is intended to measure.  For example, a paper and pencil test that asks students about vitamins and minerals in certain foods does not necessarily measure whether the student could create a meal plan for a family of four for a week that represents good nutrition.  The second task is to look at the assessment and determine what it measures and how well.  In short, assessments themselves are neither valid nor invalid.  Their validity depends upon the purposes for which they are used.  Are the conclusions drawn from the assessment data accurate for a particular use or purpose?  A final exam may be valid for determining whether a student should receive an “A” or a “B” for the class, but it may not be valid for determining which students would most benefit from accelerated instruction.

For an assessment to be reliable, it must be standardized and generalizable.  That is, there need to be precise instructions, clearly written rubrics used by everyone who will evaluate student performance, and standardization of raters.  Also, there must be a sufficient number of questions or tasks to establish whether the student can perform the task.
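One simple way to check whether raters are applying a rubric consistently is to have two of them score the same set of student work and count how often they agree.  The sketch below is only an illustration: the function name, the rubric scores, and the choice of percent agreement as the measure are assumptions, not something the article prescribes.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters gave the same rubric score.

    Illustrative only: percent agreement is one simple consistency check,
    assumed here for the example; the article does not name a measure.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical rubric scores (1-4) for four pieces of student work.
scores_rater_a = [4, 3, 2, 4]
scores_rater_b = [4, 3, 1, 4]
agreement = percent_agreement(scores_rater_a, scores_rater_b)
```

A low agreement rate would suggest that the rubric or the rater training needs attention before the scores are treated as reliable.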

For an assessment task to be fair, its content, context, and performance expectations should:

(a) reflect knowledge, values, and experiences that are equally familiar and appropriate to all students, (b) tap knowledge and skills that all students have had adequate time to acquire in the classroom, and (c) be as free as possible of cultural, gender, ethnic, and age bias.

Unfortunately, any single test is often biased against some group of students because it may depend upon only one approach to determine what students have learned and what they know (Supovitz & Brennan, 1997).  For example, norm-referenced tests tend to be highly correlated with qualities such as IQ or socio-economic status, both of which are independent of teacher influence (Darling-Hammond, 1983).  Hence, NCA has recommended the use of multiple assessments in its school improvement framework.  By combining the use of multiple assessments with teacher knowledge of individual students, there is a greater probability that the assessment will document student learning and knowledge.  In particular, schools will want to use a blend of norm-referenced and criterion-referenced (including locally developed) assessments to take advantage of the strengths of each while compensating for the weaknesses of each.

Therefore, grades (or any other assessment) could be one of a set of assessments when determining student learning in a school improvement process, and each assessment in the set should be made or selected with regard for validity, reliability, and fairness. 

Making Grades A Better Assessment of Student Learning

The first step is to avoid the common practices that reduce accuracy of grades as a measure of learning.  “Teachers consider many extraneous factors when assigning grades” (Cross & Frary, 1996).  “Grades reflect a combination of achievement and social factors” (Parsons, 1959).  That is, if grades are to reflect what students have learned and what students know about specific content, then grades must not be based on factors other than knowledge about that specific content.  The following guidelines are provided to make grades a better assessment of learning:

  1. If grades are used to report progress toward the achievement standards that have been established, teachers need to be clear about the knowledge/skills to be attained and what the grade is to represent relative to content learning.
  2. Grades need to be based upon fixed standards and not adjusted to fit a curve.  That is, grades should not be raised or lowered because everyone did well or because everyone did poorly.  The intent is not to determine how students did in comparison to other students but rather how well they performed compared to a predetermined standard.
  3. Grades should not be based upon attendance, punctuality, or behavior in class.  Although these are important, if the grade is to be a representation of student learning, teachers must not blend these other factors into the rating of student learning.
  4. Grades should not be used to reward or to punish students.  The purpose of the grade is to represent what students have learned.
  5. Homework completion should not be a part of the grade.  For many reasons homework completion is not an indicator of what was learned.  Who did the homework?  Could a student know the content and not do the homework?
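The difference between grading against a fixed standard and grading on a curve (guideline 2) can be made concrete with a small sketch.  Everything specific here is an assumption for illustration: the cutoff values, the letter scale, and the function names are hypothetical and are not taken from the article.  Note that the standards-based function takes only a score on the content being assessed; attendance, behavior, and homework completion are deliberately not inputs.

```python
# Hypothetical cutoffs; the article does not prescribe any particular scale.
FIXED_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def grade_against_standard(score, cutoffs=FIXED_CUTOFFS):
    """Return a letter grade by comparing one score to predetermined cutoffs."""
    for minimum, letter in cutoffs:
        if score >= minimum:
            return letter
    return "F"


def grade_on_curve(scores):
    """Contrast case: rank-based grading, where each grade depends on classmates."""
    letters = "ABCDF"
    ranked = sorted(scores, reverse=True)
    grades = []
    for score in scores:
        rank = ranked.index(score)  # 0 = highest score in the class
        bucket = min(rank * len(letters) // len(scores), len(letters) - 1)
        grades.append(letters[bucket])
    return grades


# A hypothetical class in which every student met the top standard.
class_scores = [95, 92, 91, 90, 93]
standard_grades = [grade_against_standard(s) for s in class_scores]
curved_grades = grade_on_curve(class_scores)
```

With these made-up scores, every student earns the top grade under the fixed standard, while the curve spreads the same students across the whole scale even though all of them demonstrated the required knowledge.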

In summary, grades can be an indicator of learning with very high validity if they directly reflect student achievement of the desired learning.  By following the guidelines above, they can also be reliable, accurate, and fair measures of student learning.  By using such grades as part of a multiple assessment scheme, schools can obtain a very accurate picture of student learning.

References

Austin, S., & McCann, R. (1992, April 23).  Here's another arbitrary grade for your collection: A statewide study of grading policies.  San Francisco: American Educational Research Association, Research for Better Schools.

Cross, L. H., & Frary, R. B. (1996, April).  Hodgepodge grading:  Endorsed by students and teachers alike.  New York: National Council on Measurement in Education.  Virginia Polytechnic Institute and State University.  Educational Leadership and Policy Studies.

Darling-Hammond, L. (1983).  Teacher evaluation in the organizational context:  A review of the literature.  Review of Educational Research, 53(3), 285-328.

Parsons, T. (1959).  The school class as a social system:  Some of its functions in American society.  Harvard Educational Review, 29, 297-318.

Supovitz, J. A., & Brennan, R. T. (1997).  Mirror, mirror on the wall, which is the fairest test of all?  Harvard Educational Review, 67(3), 472-505.


