A comparison of student evaluations of teaching between online and face-to-face courses

by H Kelly, M Ponton, A Rovai
The Internet and Higher Education (2007)

Abstract

Student evaluations of teaching (SET) need to be reliable, valid, and accurate because they are frequently used for high-stakes summative evaluation decisions about instructors, such as promotion, tenure, and merit pay. Although widely used by administrators, SET are often criticized by faculty as being inadequate measures of instructional effectiveness. However, the majority of researchers believe SET are generally reliable, valid, and worthwhile means of evaluating teaching despite evidence that some factors may influence SET independent of instructional effectiveness. One of these potential biasing factors is delivery method, and the literature contains indications of a SET bias against online instruction compared to face-to-face instruction. If such a bias exists, SET of online courses cannot be equitably compared with those of face-to-face courses. This case study used qualitative research techniques to answer the following research question: What are the differences in SET between online and face-to-face courses as evidenced by a thematic analysis of open-ended questions? Participants were students enrolled in 82 class sections taught by 41 instructors, one online and one face-to-face class section for each instructor, at Regent University, Virginia, during academic year 2004-05. Responses to open-ended SET questions were content analyzed and divided into 1,742 text segments. Each text segment was classified in two different sets of categories: appraisal and topical. Crosstabulation resulted in no significant difference in the proportion of appraisal text segments by delivery method. However, there were significant differences in the proportion of text segments for topical themes and topical categories by delivery method. Online students considered the course topical theme and organization and materials topical categories more important than face-to-face students. 
Face-to-face students considered the instructor theme and person and knowledgeable categories more important than online students. MANOVA conducted on responses to closed-ended questions of overall evaluations of the instructor and course found no significant differences between delivery methods. Finally, responses to open- and closed-ended questions were correlated. Classes that provided negative criticism tended to give lower overall evaluations, and classes that commented about the instructor grading unfairly tended to give lower overall evaluations. Implications for research and practice are presented.
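
The crosstabulation step described in the abstract can be illustrated with a small sketch. The abstract does not name the significance test that was applied, so a Pearson chi-square test of independence is assumed here. The counts are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
import math


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a table of counts given as a list of equal-length rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if delivery method and category were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value for a chi-square statistic with exactly 1 degree
    of freedom (valid for 2 x 2 tables only)."""
    return math.erfc(math.sqrt(statistic / 2.0))


# HYPOTHETICAL counts of text segments, NOT taken from the study.
# Rows: delivery method (online, face-to-face).
# Columns: appraisal category (praise, criticism).
illustrative_counts = [[30, 20], [20, 30]]

statistic, dof = chi_square_independence(illustrative_counts)
p_value = p_value_one_dof(statistic)
print(f"chi-square = {statistic:.2f}, dof = {dof}, p = {p_value:.4f}")
```

A table with more than two rows or columns would need the general chi-square distribution for its p-value, which the Python standard library does not provide; a statistics package would be the usual choice in that case.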

Available from linkinghub.elsevier.com
Internet and Higher Education 10 (2007) 89–101

Henry F. Kelly a,*, Michael K. Ponton b,1, Alfred P. Rovai b,2

a Ohio Christian University, 1476 Lancaster Pike, Circleville, OH USA 43113
b Regent University, 1000 Regent University Drive, Virginia Beach, VA USA 23464

* Corresponding author. Tel.: +1 740 420 5924; fax: +1 740 477 7845.
E-mail addresses: hkelly@ohiochristian.edu (H.F. Kelly), alfrrov@regent.edu (A.P. Rovai).
1 Tel.: +1 757 226 4806; fax: +1 757 226 4857.
2 Tel.: +1 757 226 4861; fax: +1 757 226 4318.
1096-7516/$ - see front matter © 2007 Elsevier Inc. All rights reserved.
doi:10.1016/j.iheduc.2007.02.001

Abstract

The literature contains indications of a bias in student evaluations of teaching (SET) against online instruction compared to
face-to-face instruction. The present case study consists of content analysis of anonymous student responses to open-ended SET
questions submitted by 534 students enrolled in 82 class sections taught by 41 instructors, one online and one face-to-face class
section for each instructor. There was no significant difference in the proportion of appraisal text segments by delivery method,
suggesting no delivery method bias existed. However, there were significant differences in the proportion of text segments for
topical themes and topical categories by delivery method. Implications of the findings for research and practice are presented.
© 2007 Elsevier Inc. All rights reserved.

Keywords: Assessment; Distance education; Higher education; Instructional effectiveness; Student evaluation of instruction

1. Introduction
Student evaluations of teaching (SET) need to be reliable, valid, and accurate because they are frequently used for
high-stakes summative evaluation decisions about instructors, such as promotion, tenure, and merit pay. Therefore,
these evaluations should adequately assess the effectiveness of instruction and not be biased by factors outside the
instructor's control. In general SET are believed to be valid, reliable, and worthwhile means of evaluating instructional
effectiveness (Braskamp & Ory, 1994; Cashin, 1995; Centra, 1993; D'Apollonia & Abrami, 1997; Feldman, 1997;
Marsh & Dunkin, 1997; Marsh & Roche, 2000; McKeachie, 1997; Theall & Franklin, 2001). However, the literature
identifies several moderating variables that may bias student evaluations. Some of these variables include academic
discipline, class size, content area, grading leniency, level of course, student motivation, teacher personality, type of
course requirements, and method of course delivery (face-to-face, distance education).
A definition of bias is a characteristic of instructor, course, or student that affects SET, either positively or
negatively, but is unrelated to the criteria of good teaching (Centra & Gaubatz, 2000). Therefore, student evaluations
are biased if some characteristic of high-quality teaching tends to cause low student ratings or some characteristic of
low-quality teaching tends to cause high student ratings (Martin, 1998). Ideally all influences other than instructional
effectiveness are randomly distributed throughout the student pool and, therefore, do not introduce bias in the SET
(Haladyna & Hess, 1994). Unwanted influences are often systematic and introduce bias into the ratings. However, the
existence of a significant correlation between SET and some other variable does not necessarily denote the existence of
a bias or a threat to validity (Martin, 1998). Systematic differences may be caused by bias or by valid instructional
effectiveness. If a variable is related both to student ratings and to other indicators of effective instruction, the validity
of the ratings is supported. Conversely, if a variable is related to student ratings without similarly affecting instructional
effectiveness, a bias is supported (Marsh, 1987).
If there is a SET bias against online courses compared to face-to-face courses, then evaluations may not be equitably
compared. This is of concern because the use of the Internet to deliver courses has risen tremendously over the last
several years. The most recent Sloan Consortium survey (Allen & Seaman, 2006) reported 3.18 million students took
online courses in fall 2005, a 35% increase over the previous year. The majority of institutions surveyed agreed that
online education is critical to their long-term strategy, and there is no evidence that online enrollment has reached a
plateau. Numerous studies of student learning in online and other methods of distance education compared to face-to-
face instruction have resulted in findings of no significant difference (Moore & Thompson, 1997; Russell, 2001), but
perceived effectiveness or satisfaction may differ from academic performance results.
Because SET are often used for high-stakes personnel decisions, it is vital that they accurately assess teaching
effectiveness. However, even if quantitative ratings of online instruction differ significantly from those of
face-to-face instruction, these ratings cannot reveal why. Examining qualitative SET comments may help explain the
differences in how online and face-to-face students evaluate instructional effectiveness. Few
studies of the potential bias in student evaluations of the online delivery of education have been conducted. The
majority of these studies compared SET of only a few online and face-to-face courses; thus, there is a need to review
data from many courses across an institution. Knowledge of a SET bias will inform administrators' use of SET for
evaluations.
The problem addressed in this study is the possibility that the method of course delivery affects SET independent of
instructional effectiveness. Therefore, the purpose of this study is to examine students' responses to open-ended
questions evaluating instructional effectiveness for both online and face-to-face courses in order to determine if a
potential SET bias exists. The research question of this study is: What are the differences in SET between online and
face-to-face courses as evidenced by a thematic analysis of open-ended questions?
2. Literature review
SET are considered by many researchers to be the single most valid source of data on teaching effectiveness (e.g.,
Braskamp & Ory, 1994; Centra, 1993; Marsh & Dunkin, 1997). One view of SET is that they are valid if they
accurately reflect students' assessment of instruction quality regardless of the amount of learning that occurred; a
second view is that SET are valid if they accurately reflect instructional effectiveness (Abrami, d'Apollonia, & Cohen,
1990). In the second view, a significant correlation would exist between SET and instructional effectiveness scores.
Unfortunately, SET are difficult to validate because no single criterion of instructional effectiveness is sufficient or
acceptable (e.g., Kulik, 2001; Marsh, 1987). Researchers have correlated SET with various indicators of instructional
effectiveness: (a) student learning, (b) changes in student behavior, (c) instructor self-evaluations, (d) evaluations by
peers or administrators who attend class sessions, (e) frequency of occurrence of specific behaviors observed by trained
observers, and (f) alumni ratings (Kulik, 2001; Marsh & Dunkin, 1997).
The most widely accepted criterion of instructional effectiveness is student learning. Cohen (1981) and Feldman
(1989) conducted meta-analyses of multi-section courses and found moderate correlations between SET and student
learning as measured by examination scores. These studies indicate that students tend to give higher SET ratings to
instructors from whom they learn much and lower SET ratings to those instructors from whom they learn little, thus
supporting the validity of SET instruments.
Reliability of well-designed SET instruments varies with the number of raters; the more raters, the higher the
reliability (Hoyt & Lee, 2002; Marsh & Dunkin, 1997). Given a sufficient number of raters, the reliability of SET
compares favorably with that of the best objective tests (Marsh & Dunkin, 1997). SET have been shown to be stable
over time (Overall & Marsh, 1980) with ratings of instructors at the end of a course and retrospectively one or more
