C. Standard 2. MOVING TO TARGET LEVEL
Assessment system - Data Collection, Analysis & Evaluation - Use of Data for Program Improvement - Plans
Standard 2a. Assessment system
FSEHD has been moving to the Target Level since 2008 in terms of its practice of routinely evaluating the capacity and effectiveness of its assessment system and making changes consistent with the evaluation results. Since that time, the entire initial programs assessment system has been revised, as have significant portions of the advanced programs assessment system (see summary of changes). The rationale for evaluating the capacity and effectiveness of the unit assessment system arose from multiple sources. First, analyses of unit assessment data revealed little variability in candidate scores over time points or across conceptual framework competencies and professional/state standards, making it difficult to identify areas for program or unit improvement, instances of candidate growth, or the value added by initial teacher preparation or advanced professional training. For example, mean ratings of both initial and advanced candidate performance at admissions, mid-point, exit, and post-graduation exhibited little variability and generally averaged 3.5 or higher on a scale of 1 to 4. While these results appeared very positive, faculty questioned their validity and were hard-pressed to use the data for program improvement. Second, validity and reliability studies called into question the validity of some of the constructs that were purportedly being measured, as well as the clarity and consistency of performance expectations. Finally, qualitative feedback from faculty members overwhelmingly supported the notion that the unit assessment system be evaluated and refined.
Issues repeatedly raised by initial and advanced faculty included: their feedback had not been adequately sought or incorporated in the existing design of the assessment system; rubrics were vague and interpreted in different ways across and within programs; performance expectations were implicitly understood and differed within and across programs; important candidate performances and data were not included in the assessment system; and the system did not capture candidates' effects on student learning.
What evolved from these observations and subsequent discussion was an assessment system revision process that generally followed this course of action:
Following analysis of quantitative and qualitative data regarding an existing unit assessment, the assessment committee convened and asked, “What evidence do we need about candidates at various time points?” From there, committee members reviewed the strengths and weaknesses of existing assessments and assessment models used nationwide. Using this information, a new or revised assessment was developed. This was presented to faculty by way of a retreat, feedback was gathered from faculty, and the assessment was revised. Next, faculty volunteered to pilot the new assessment. Based on pilot feedback, continuing solicitations to other faculty for feedback, and analysis of data collected in pilots, the assessment was revised further. The process then began again, as the revised, improved assessment was presented to faculty at a subsequent retreat. Pilot and revision activities were repeated as needed until faculty and the committee recommended that the new assessment undergo full-scale implementation. For more information, see details describing the evaluations of various components of FSEHD's unit assessment system and the process implemented to improve the system.
Validity and Utility of Data
FSEHD is also moving to the Target Level in terms of the comprehensiveness with which it defines and examines the validity and utility of its assessment system and the inferences that the system yields. Validity is a multifaceted concept and the most important technical consideration for assessments and assessment system design and use. Furthermore, the notion of usefulness, or utility, of assessment results is inherent in any examination of validity. Critical components of validity and utility have been taken into account in the design and monitoring of the Initial and Advanced Programs Assessment Systems. They include:
- Content-related validity: Do assessment items/components adequately and representatively sample the content area(s) to be measured?
- Construct validity: Do assessments and the assessment system measure the content they purport to measure?
- Prediction: How well do assessment instruments predict candidates' performance in future situations?
- Fairness: Are all candidates afforded a fair opportunity to demonstrate their skills, knowledge, and dispositions?
- Utility: How useful are the data generated from unit assessments?
- Consequences: Are assessment uses and interpretations contributing to increased student achievement and not producing unintended negative consequences? (Linn, 1994)
With regard to content-related validity, assessments are aligned with the RIPTS, Conceptual Framework elements, and/or FSEHD Advanced Competencies, and Cultural Competency Areas identified by the larger professional community as important. Detailed alignment documents attesting to this high degree of alignment exist. Additionally, new unit assessments were designed based on best practice in teacher candidate assessment as identified in the literature and on the professional knowledge, experience, and consensus of FSEHD faculty, many of whom are developers and definers of best practice in their professional areas. Input from experienced practitioners (i.e., cooperating teachers) was also sought to help establish the validity of these assessment measures.
A next step in the examination of the content-related validity of the assessments in the FSEHD Unit Assessment System concerned “balance of representation,” or whether the content of the assessments in the assessment system is balanced or, on the other hand, weighted to represent the relative importance of relevant standards (Webb, 2005). FSEHD has analyzed the balance of representation of assessment indicators in its exit assessments (TCWS and OPR) according to RIBTS, Conceptual Framework element, Cultural Competence Area, and Professional Disposition. The goal of this exercise was to ensure that every standard was represented among the exit assessments and that the balance of representation was weighted to represent the relative importance of standards. Results of this analysis indicate that every RIBTS standard, Conceptual Framework element, Cultural Competence Area, and Professional Disposition standard is represented among the two exit assessments. Additionally, the balance of representation for each set of expectations was more heavily weighted in the following areas: RIBTS (#9-Teachers Use Appropriate Formal/Informal Assessment Strategies, 15%; #8-Teachers Use Effective Communication, 14%; #6-Teachers Create a Supportive Learning Environment, 13%; #4-Teachers Create Instructional Opportunities that Reflect Respect for Diversity of Learners, 11%); Conceptual Framework (Pedagogy, 52%; Professionalism, 29%); Cultural Competence Areas (Planning & Instruction, 46%; Communication, 25%); Professional Dispositions (Commitment to Equity, 27%; Work Ethic, 24%; Caring Nature, 20%). This balance of representation among the standards sets assures that multiple, important qualities/skills of candidates are assessed at the Exit transition point. Together, they provide a very comprehensive picture of a FSEHD candidate at the end of his/her program.
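The balance-of-representation analysis described above amounts to tallying how many assessment indicators align to each standard and expressing each tally as a share of the total. A minimal Python sketch, using a small set of hypothetical indicator-to-standard alignments (the actual exit assessments map many more indicators):

```python
# Balance-of-representation tally (Webb, 2005) over hypothetical
# indicator alignments; each entry names the standard an exit-assessment
# indicator is aligned to.
from collections import Counter

indicator_alignments = [
    "RIBTS 9", "RIBTS 9", "RIBTS 8", "RIBTS 6", "RIBTS 4",
    "RIBTS 8", "RIBTS 9", "RIBTS 6", "RIBTS 4", "RIBTS 2",
]

counts = Counter(indicator_alignments)
total = sum(counts.values())

# Report each standard's share of all indicators, heaviest first.
for standard, n in counts.most_common():
    print(f"{standard}: {n / total:.0%}")
```

A first check on coverage falls out of the same tally: any standard absent from `counts` is unrepresented among the exit assessments.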
The unit also consulted research-based evidence regarding the content validity of its new assessments. For example, findings from a validity study conducted by Denner, Norman, Salzman, Pankratz, and Evans (2004) revealed that a panel of experts judged the processes targeted by the Renaissance Teacher Work Sample (upon which FSEHD's TCWS is based) to be very similar to those addressed by the INTASC Standards. Similarly, this expert panel determined the Teacher Work Sample tasks to be authentic and critical to success as a classroom teacher.
An assessment has construct validity if it accurately measures a theoretical, non-observable construct or trait. The construct validity of an assessment is established over a period of time on the basis of an accumulation of evidence. FSEHD is investigating the construct validity of its unit assessments and accumulating validity evidence in a number of ways. High internal consistency is one type of evidence used to establish construct-related validity. That is, if an assessment or scale has construct validity, scores on the individual items/indicators should correlate highly with the total test score; this is evidence that the test is measuring a single construct. Internal consistency of TCWS components is high and in fact has increased over the past two years: Fall 2008/Spring 2009, n=48; Fall 2009, n=120; and Spring 2010, n=253. Estimates of internal reliability (coefficient alpha) during these time periods for the seven TCWS constructs have been as follows: Contextual Factors, α=.89, .93, .94; Learning Goals & Objectives, α=.83, .96, .94; Assessment Plan, α=.75, .96, .94; Design for Instruction, α=.91, .94, .91; Instructional Decision Making, α=.87, .94, .95; Analysis of Student Learning, α=.87, .96, .94; Self Evaluation, α=.85 (Fall 2008/Spring 2009 only); Candidate Reflection on Student Teaching Experience, α=.61, .94, .87. These findings support the construct validity of the TCWS.
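The coefficient alpha estimates above follow the standard internal-consistency formula: α = k/(k−1) · (1 − Σ item variances / variance of total scores). A minimal sketch, using hypothetical rubric ratings rather than FSEHD data:

```python
# Coefficient (Cronbach's) alpha for a set of rubric indicators.
# Ratings below are hypothetical, not actual TCWS data.
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list of candidate ratings per rubric indicator."""
    k = len(item_scores)
    sum_item_vars = sum(pvariance(item) for item in item_scores)
    # Total score per candidate = sum of that candidate's indicator ratings.
    totals = [sum(candidate) for candidate in zip(*item_scores)]
    return k / (k - 1) * (1 - sum_item_vars / pvariance(totals))

# Rows = indicators of one construct; columns = six candidates.
ratings = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 2, 4, 3],
    [3, 3, 5, 2, 5, 3],
]
print(round(cronbach_alpha(ratings), 2))
```

Indicators that track each other closely across candidates, as above, yield an alpha near 1, consistent with the high values reported for the TCWS constructs.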
The unit has also begun to investigate the construct validity of the Mini TCWS, TCWS, ILP and OPR by examining whether the developmental changes expected to occur from Preparing to Teach to Exit are actually evident. While not many candidates have yet completed both a Mini TCWS and a TCWS, a small scale study of the performance of candidates who completed both a Mini TCWS in Spring 2010 and a TCWS in Fall 2010 (n=4) reveals that candidates improved their skills in developing learning goals, assessment plans, and designs for instruction during the expected timeframe. (The same indicators and rubrics are used in the Mini TCWS and TCWS.) These findings constitute positive evidence regarding the construct validity of the Mini TCWS and TCWS, but further studies will be implemented to corroborate these results.
The construct validity of the ILP and OPR were analyzed similarly. In this case, ILP results submitted at Preparing to Teach in Spring 2010 from ten students were compared to those same students' OPR results for the same indicators during their third student teaching observation in Fall 2010, in order to determine whether hypothesized developmental changes had taken place. This analysis revealed that candidate skills in Planning, Implementation, Climate, Reflection, and Professional Behaviors increased during this time frame. These findings constitute preliminary positive evidence regarding the construct validity of these scales in the ILP and OPR. In contrast, Content and Classroom Management mean scores decreased very slightly for these students from Preparing to Teach to Exit. However, with the extremely small sample size, it is difficult to say whether such a decrease is meaningful and representative of all FSEHD candidates or a random finding due to chance. As more FSEHD candidates complete both assessments, data will continue to be collected and analyzed.
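The developmental-change check above reduces to comparing each candidate's earlier and later mean scores on matched indicators. A minimal sketch with hypothetical values (the unit's actual comparison used ten students):

```python
# Pre/post comparison on matched indicators for the same candidates.
# Scores are hypothetical illustrations, not FSEHD data.
from statistics import mean

ilp_planning = [2.8, 3.0, 2.5, 3.2, 2.9]   # Preparing to Teach (earlier)
opr_planning = [3.4, 3.3, 3.0, 3.6, 3.1]   # third observation (later)

# Per-candidate gain from the earlier to the later time point.
gains = [post - pre for pre, post in zip(ilp_planning, opr_planning)]
print(f"mean gain: {mean(gains):.2f}")
# With so few candidates, a small mean difference in either direction
# may reflect sampling noise rather than a real developmental trend.
```

The same caution applies to the slight decreases observed for Content and Classroom Management: at n=10, a small negative mean gain is well within what chance alone could produce.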
In 2009, the Director of Assessment conducted a study of the validity of FSEHD's Teacher Candidate Dispositions Assessment. Findings revealed a substantive lack of construct validity in this assessment. Confirmatory factor analyses revealed that the instrument does not comprise six factors, contrary to the intent of the instrument's designers. The Teacher Candidate Dispositions Assessment also was not capable of discriminating among teacher candidates at varying levels of the dispositions. Furthermore, “construct underrepresentation” was found, meaning that the assessment failed to include important aspects of the construct(s) being measured. These results detracted from the construct validity of scores yielded through the use of this assessment, suggesting a need to reconceptualize and revise the unit's assessment of candidate dispositions.
The unit conducts ongoing checks of the predictive validity of assessments in the unit assessment system. While it is not feasible to investigate the predictive validity of every assessment each year, the unit conducts “spot checks” of predictive validity periodically. For example, following questions of whether the Career Commitment Essay admissions requirement might be redundant with the admissions requirement that students pass Writing 100, the Director of Assessment conducted a study in 2009 examining the relationship between Career Commitment Essay scores and Writing 100 grades among applicants to FSEHD teacher preparation programs. The study found almost no correlation between the CCE scores and Writing 100 grades of students who applied to a FSEHD teacher preparation program between January 2006 and May 2008. The almost complete absence of a relationship between these two variables indicates that the CCE and Writing 100 entrance requirements are not redundant and that Writing 100 grades did not predict essay scores.
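A redundancy spot check of this kind typically comes down to a Pearson correlation between the two measures. A minimal sketch, using hypothetical applicant scores rather than the study's data:

```python
# Pearson correlation between two admissions measures; an |r| near 0
# suggests the requirements are not redundant. Scores are hypothetical.
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

writing_100_grades = [3.0, 3.7, 2.3, 4.0, 3.3, 2.7, 3.0, 3.7]
essay_scores       = [2, 4, 3, 2, 4, 3, 2, 3]

r = pearson_r(writing_100_grades, essay_scores)
print(round(r, 2))
```

With real admissions data, the same calculation (plus a significance test appropriate to the sample size) supports the kind of "almost no correlation" conclusion reported for the CCE and Writing 100.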
In response to new state requirements for admission to teacher preparation programs, the Director of Assessment examined the validity of admissions-based PPST scores (math, reading, writing, and sum scores) and the Elementary Education Content Exercise in predicting subsequent candidate PLT (K-6 and 7-12) scores and GPA. Using data from 1200 candidates over the past three years, findings revealed low and statistically nonsignificant correlations between these mandatory admissions measures and subsequent GPA. Further analyses revealed that the new, higher admissions scores would prevent many minority candidates from being admitted to teacher education programs. Based on these two sets of findings, the unit recommended that the state not adopt the new admissions requirements.
Similarly, the Director of Assessment conducted a study of the utility of standardized test scores (MAT, GRE) as admission criteria in predicting subsequent program performance. Results revealed that the MAT is highly and significantly correlated with GPA among advanced program non-completers. Among program completers, the MAT, GRE Verbal, and GRE Analytical tests are significantly correlated with GPA. These findings were used to support the recommendation that the standardized testing requirement be retained at admission to an advanced program.
Finally, in revising and developing assessments, the unit has based its new assessments on models that have been shown through research to have predictive validity. For example, findings from a validity study conducted by Denner, Norman, Salzman, Pankratz, and Evans (2004) revealed that an expert panel determined the Teacher Work Sample tasks to be critical to success as a classroom teacher.
The following components are included in the fairness criterion and contribute to the extent to which inferences and actions on the basis of assessment scores are appropriate and accurate.
- Freedom from bias: The language and form of assessments must be free of cultural and gender bias.
- Transparency of expectations: Assessment instructions and rubrics must clearly state what is expected for successful performance.
- Opportunity to learn: All candidates must have had learning experiences that prepare them to succeed on an assessment.
- Accommodations: Candidates with documented learning differences must be afforded accommodations in instruction and assessment.
- Multiple opportunities: Candidates must have the opportunity to demonstrate their learning in multiple ways and at different times. (Smith & Miller, 2003)
The unit has addressed the fairness criterion as follows:
Freedom from Bias
As unit assessments have been designed and revised, the wording and design of assessment tasks were reviewed by the respective program faculty with a focus on whether the selected tasks were fair and accessible to all candidates. The design, format, wording, and presentation of unit assessments have been reviewed multiple times by internal and external constituents, and changes have been made in an effort to minimize any unintentional bias. For example, the new summative advanced program assessment, the Professional Impact Project, was renamed as such after faculty indicated that the previously suggested title, Professional Intervention Project, suggested a “deficit model” of education. The title and select other terms in the document were subsequently revised to reflect less biased language.
Transparency of Expectations
The unit has also made numerous efforts to make assessment expectations transparent and to keep faculty, cooperating teachers, and candidates informed. First, the Director of Assessment and the assessment committee have strived to keep faculty and cooperating teachers up to date and informed on all aspects of the revised assessment system. Information was shared with them during faculty retreats in February 2007, August 2008, August 2009, February 2010, and March 2010. Information has also been shared with faculty through the Dean's Leadership Committee and electronic correspondence with faculty. Evidence of efforts to inform stakeholders at all times is available. The TCWS and Mini TCWS rubrics and prompts are highly descriptive and provide detailed guidance to the candidate and evaluators about expectations and process. The ILP and OPR contain multiple, observable indicators that make expectations explicit and yield granular information about multiple dimensions of a candidate's performance. The Assessment of Professional Dispositions in the College Classroom also contains clear, observable, behavioral indicators for evaluators to rate.
In order to foster greater transparency of expectations, FSEHD has trained faculty and cooperating teachers in FSEHD unit assessments. In Fall 2009, four workshops for College Supervisors and Cooperating Teachers were successfully conducted. Approval was obtained from the Rhode Island Department of Education (RIDE) for participants to earn two continuing education units for attending one of the four workshops. Three hundred eight cooperating teachers attended the OPR workshop, which taught them how to use the new observation instrument to score teacher candidate teaching behaviors appropriately. The specific objectives of the workshop were to: introduce participants to an assessment instrument that analyzes teaching behaviors and documents the growth of these behaviors; examine the scoring criteria and the six-point rubric of the instrument; teach participants to use the instrument reliably with teacher candidates; expose participants to the proper terminology of scoring (analytic versus holistic scoring, criterion- versus norm-referenced scoring, performance level rubric, scoring criteria, indicators, normative versus developmental); discuss teaching behaviors, using the assessment rubric, in large and then small groups; analyze teaching videos with respect to Implementation, Climate, and Classroom Management; and reflect on how exposure to this assessment would assist classroom teachers with reflective teaching practices in their own classrooms. OPR and TCWS training sessions for faculty were held in October 2009. While all faculty were invited to attend, 20 attended. The sessions focused on understanding the components of the TCWS and scoring expectations, with opportunities for participants to score and discuss actual candidate work. Training evaluation data indicated that participants found the training valuable and useful.
In Fall 2010, the Assistant Dean for Partnerships and Placements and the Assistant Director of Assessment developed an online course entitled EDU 580 Workshop: Professional Development for Cooperating Teachers. This 3-credit graduate workshop contains substantial content related to unit assessment in FSEHD and is anticipated to be instrumental in developing more competent assessors among FSEHD Cooperating Teachers. The course is being piloted in Spring 2011, with plans for full implementation in Fall 2011. In addition, face-to-face trainings on student teaching assessment are being discussed for Spring 2011. Feedback from pilot participants indicates that the course has been invaluable in helping them understand and apply unit assessment instruments and procedures appropriately.
Attempts to maintain transparency of expectations for candidates are ongoing. Program handbooks and web pages clearly delineate program expectations, as well as program and unit assessments. The unit also regularly holds orientation meetings and information sessions for students at each transition point (admission, preparing to teach, and exit). Additionally, the unit conducts specialized trainings aimed at helping candidates understand what is expected of them and how best to meet expectations. For example, the unit holds regular workshops for teacher candidates preparing for the Praxis II tests. Data show that over 90% of students who successfully complete these intensive workshops go on to pass the Praxis II on their next attempt.
Additionally, in Spring 2008, FSEHD instituted a Tips for Writing a Successful Career Commitment Essay workshop. The goal of this workshop was to help prospective teacher candidates understand this component of the FSEHD admissions process. Approximately 40 students attended the Spring 2008 workshop; a similar number attended the Fall 2009 workshop. A study was conducted to examine the relationship between attendance at the Spring 2008 Career Commitment Essay Workshop and subsequent Career Commitment Essay performance. Findings revealed that mean Career Commitment Essay scores were higher for those who attended the Tips for Writing a Successful Career Commitment Essay workshop as compared to those who did not attend the workshop. The mean essay score differences between workshop attendees and non-attendees were statistically significant for students submitting their essays for the second or third time, students submitting their very first essay, and students overall. Additionally, higher proportions of students who attended the workshop passed the essay requirement as compared to students who did not attend the workshop.
Opportunity to Learn
During the assessment design and revision process, assessment tasks were reviewed by the respective program faculty in terms of candidates' opportunity to learn and succeed at the content/skills inherent in the tasks. Furthermore, during the first three semesters in which the TCWS was piloted, the unit was respectful of faculty members' willingness to take a risk with a new assessment and conscious that the instrument was indeed being piloted. Programs were also frank that the focus on analyzing student work and teacher impacts on student learning represented an area that many of them were in the beginning stages of implementing. Hence, FSEHD granted programs (and candidates) some leniency in these areas during the early pilot semesters. The unit did not set a cut-off score for the TCWS; rather, it provided faculty with general guidelines and then allowed faculty members to assign a TCWS as passing or failing based on their professional judgment and the status of the program in relation to emphasizing certain TCWS concepts and skills. Additionally, faculty in FSEHD programs have begun to engage in curriculum mapping and other processes to examine what is taught at different time points to ensure that candidates have opportunities to learn and succeed at the content and skills inherent in unit assessments. Examples of how they have done this are included in Exhibit 8.
Rhode Island College is committed to making reasonable efforts to assist individuals with documented disabilities. Candidates seeking reasonable classroom or assessment accommodations under the ADA of 1990 and/or Section 504 of the Rehabilitation Act of 1973 are required to register with Disability Services in the Student Life Office. To receive accommodations for any class or unit assessment students must obtain a Request for Reasonable Accommodations form and submit it to their professor at the beginning of the semester. This information is shared with all students at the beginning of each course. In addition, this information is provided in each course syllabus.
Multiple opportunities to demonstrate learning and growth are built into the very design of the FSEHD assessment system. The system includes many opportunities for candidates to demonstrate their learning—in multiple ways and at different times. Furthermore, candidates are afforded opportunities to retake or redo all or part of their unit assessments. The use of multiple assessments with multiple formats, as opposed to a single, “one-shot” assessment, increases the validity of the inferences subsequently made regarding the knowledge, skills, and dispositions of FSEHD candidates.
The revision/development of new FSEHD unit assessments stemmed largely from faculty. A goal, therefore, was that faculty would find the new assessment system to be of utility to them as professionals and as trainers of future teachers. For example, the TCWS was designed to address faculty concerns that the existing Exit Portfolio was not cohesive and did not place sufficient emphasis on student learning. The OPR was designed to meet faculty requests that the existing Observation Report be replaced with an instrument with specific, observable behavioral indicators, a more precise rating scale, and less need to script an entire lesson. It was decided that the ILP and Mini TCWS at Preparing to Teach would consist of pieces of the OPR and TCWS in order to maintain consistency in candidate performance expectations over time. The Assessment of Professional Dispositions in the College Classroom was designed in response to faculty concerns that unit assessment of dispositions only took place in field settings, despite the fact that faculty often observed candidate dispositions while candidates were still enrolled in classes, long before going out into the field. Over the past three years, faculty and cooperating teachers have provided ample feedback suggesting that they find FSEHD's revised assessments to be useful, user friendly, and superior to past unit assessments. Typical feedback on the utility of specific unit assessments is available.
The Director of Assessment and the assessment committee have also been informed on numerous occasions that the assessments are considerably easier to implement the second and subsequent times that faculty use them. (The first semester presents the largest learning curve.) Additionally, faculty have commented that the Exit assessments are much easier to implement when teacher candidates have already been exposed to them in their coursework and during the Preparing to Teach phase. These findings are to be expected and are a normal part of the change process.
Linn (1994) states, “it is not enough to provide evidence that the assessments are measuring intended constructs. Evidence is also needed that the uses and interpretations are contributing to enhanced student achievement and, at the same time, not producing unintended negative consequences.” (p. 8) Positive, intended consequences of the FSEHD unit assessments include improved learning on the part of candidates/graduates, as well as program and unit improvement based on the use of assessment data. Negative, unintended consequences might include a narrowing of the curriculum (to focus on preparation for assessments) or increased student dropout due to unanticipated burdens of the Assessment System. Positive, unintended consequences of the system may occur as well, and these should be identified.
Follow-up surveys of graduates are conducted at the “post” transition point and include opportunities for graduates to provide open-ended feedback regarding the strengths and weaknesses of their programs and overall experiences at FSEHD. These qualitative data have been analyzed for clues as to consequences of the assessment system. At this time, data from program graduates do not reveal any negative unintended consequences of the unit assessment system at the initial or advanced levels. Feedback from faculty regarding the functioning of the assessment system and the consequences thereof is always welcome at FSEHD and is solicited on an ongoing basis. Feedback regarding the positive and negative, intended and unintended consequences of the Advanced Program Assessment System is gathered and reflected upon, and to date it reveals no unintended negative consequences of unit assessment.
FSEHD unit assessments draw on multiple formats—“traditional” and “alternative” alike. There are many methods for assessing learning; yet, no single assessment format is adequate for all purposes. (American Educational Research Association, 2000) Consequently, the FSEHD assessment system allows candidates to demonstrate their knowledge, skills, and dispositions using a variety of methodologies. The various assessment methodologies include: Selected Response and Short Answers; Constructed Response; Performance Tasks; and Observation and Personal Communication.
As shown in the Assessment System blueprint, all four assessment formats are utilized throughout the four assessment transition checkpoints at FSEHD. This attempt to “balance” assessment in terms of assessment methods yields multiple forms of diverse and redundant types of evidence that can be used to check the validity and reliability of judgments and decisions. (Wiggins, 1998)
FSEHD also routinely conducts studies to establish consistency of assessment procedures and unit operations. These studies and their results are described in Exhibit 3.
Standard 2b. Data Collection, Analysis & Evaluation
The unit provides regular and comprehensive data on program quality, unit operations, and candidate performance at each stage of its programs, extending into the first years of completers' practice. Assessment data from candidates, graduates, faculty, and other members of the professional community are based on multiple assessments from both internal sources (faculty) and external sources (cooperating teachers, internship mentors, employers, and other field contacts) that are systematically collected and reported as candidates progress through programs. These data are compiled, aggregated, summarized, analyzed, and reported publicly each semester for the purpose of improving candidate performance, program quality, and unit operations. Data are generally reported in terms of descriptive statistics (measures of central tendency, standard deviation, range, frequencies), cross tabulations, correlations, and comparisons of means. Results are presented in table, chart, and graph form.
Candidates complete most unit and program assessments in their courses and therefore receive feedback on their performance within their courses through the scoring rubrics associated with the assessment. Scoring rubrics provide concrete information on the specific standards that were met and not met, and whether they were adequately addressed. Those candidates who do not adequately meet the standards on a unit assessment meet with their faculty member to discuss standards/indicators that have not been met and what the candidate needs to do to meet those standards/indicators. Candidates then have the opportunity to revise and resubmit the assessment.
Initial and advanced programs assessment reports are shared with faculty and published on the FSEHD web site each semester as they are completed by the Data Management Coordinator. They are available under Teaching Resources on the FSEHD web site, where they are easily accessible by faculty and open to the public. In addition, the Data Management Coordinator and Director of Assessment regularly respond to faculty and administrator requests for specialized data sets that they wish to analyze themselves. Faculty are regularly provided with raw data in a format that they request so that they can conduct their own investigations. Access to raw data is crucial to fostering consistent data exploration and use, and research has demonstrated that educators who have ready access to data tend to use data more frequently and more effectively. In addition, educators who explore their own data “invariably want more detailed data, or want data presented in different ways, than paper reports typically provide… Preformatted data reports, while useful, cannot be cross-analyzed or connected with other data.” (McLeod, 2005, p. 2) This underscores the continued need for data that is accessible to FSEHD programs faculty and staff, data that staff can “get their hands on.”
Since 2008, the unit has also been developing and testing different information technologies to improve its assessment system. In 2008, the unit adopted True Outcomes, an assessment software program used to assess, track, analyze, and report on student outcomes. In 2009, when True Outcomes was no longer supported by its parent company, the unit transitioned to Chalk & Wire, a portfolio authoring and data analysis system used extensively in the U.S., Canada, and Australia. Even without full-scale implementation of Chalk & Wire, FSEHD has been moving toward electronic data collection for almost two years. In Fall 2009, all OPR data at Exit were collected via Surveymonkey. In Spring 2010, all Exit, Preparing to Teach, and Student Teaching assessments were loaded into CheckBox, an alternative to Surveymonkey and a user-friendly vehicle for electronic data collection. In Fall 2010, the unit switched to Surveygizmo because it offered features that CheckBox did not. By Spring 2011, the unit had achieved close to 100% electronic data collection. While these changes are positive, the staggered implementation and varying modes of electronic data collection do present challenges in compiling the data.
As FSEHD has moved forward in its efforts to implement electronic data collection, the unit has conducted periodic evaluations of its efforts in this area. Faculty and administration involved in the implementation of Chalk & Wire meet regularly to reflect on the process and lessons learned. Based on their experience learning about and implementing this electronic portfolio/assessment product, the “Chalk & Wire team” has developed a list of considerations they recommend other education units review before selecting such a product. They also shared their recommendations at a presentation at the Association for Authentic, Experiential and Evidence-Based Learning (AAEEBL) Northeast U.S. Conference in March 2011. To gather faculty and cooperating teacher input on the move toward electronic assessment, the Director of Assessment and Assistant Dean for Partnerships and Placements administered an electronic survey to faculty and cooperating teachers in Summer 2010. The purpose of the survey was to gather their input regarding: their student teaching role/experience; their experience and comfort level using computer-related technology; the quality of their experience using online evaluation (specifically, CheckBox) in Spring or Summer 2010; and their overall opinion of the use of online evaluation. These findings have been used to better understand the transition to electronic evaluation and will shape future training.
Standard 2c. Use of Data for Program Improvement
The procedures and findings described above provide evidence that FSEHD has fully developed its evaluations, continuously searches for stronger relationships within them, revises the underlying data systems and analytic techniques as necessary, and makes changes based on the data. For further information on how FSEHD has used data for program improvement, see CHANGES MADE TO COURSES PROGRAMS UNIT and POLICIES AND PROCEDURES THAT ENSURE THAT DATA ARE REGULARLY COLLECTED, COMPILED, AGGREGATED, SUMMARIZED, ANALYZED, AND USED TO MAKE IMPROVEMENTS.
Plans for Continuing to Improve
An important technical criterion for a high quality assessment system is standard setting. In other words, programs and the unit must identify the amount and quality of evidence necessary to demonstrate proficiency on assessments; these are performance standards (Measured Measures, 2000). The standard setting process cannot begin until criteria for levels of student performance (i.e., rubrics) are well-articulated (Smith & Miller, 2003) and reliability is established. With reliability established, FSEHD is now positioned to conduct standard setting processes for the TCWS, OPR, ILP, and Mini TCWS (initial programs) and the PIP (advanced programs). In Fall 2011, selected faculty will be trained in two approaches to standard setting: the Angoff method and the Examination of Student Work method. A subsequent training will be offered in Spring 2012. Additionally, faculty who are trained in standard-setting methods will be encouraged to share their knowledge with peers in their departments. Standard setting sessions will also be held to determine cut scores on relevant FSEHD unit assessments. Skill in using these methods and implementation of these procedures in the unit and within programs will yield more consistent scoring of student work samples at each transition point, as well as within courses, resulting in higher reliability in candidates' final course grades. An additional benefit of the standard-setting process is that it often exposes flaws in scoring rubrics or in the design of assessments; as such, it is part of an iterative process of ongoing revision and improvement.
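As a sketch of the arithmetic behind the Angoff method: each panelist estimates, for every rubric item, the probability that a minimally proficient candidate would succeed on it; each panelist's estimates are summed, and the cut score is the mean of those sums. The panelists and probabilities below are hypothetical, not FSEHD data:

```python
# Hypothetical Angoff ratings: for each panelist, one probability per
# rubric item that a minimally proficient candidate would meet it.
panelist_ratings = {
    "panelist_a": [0.8, 0.6, 0.7, 0.9],
    "panelist_b": [0.7, 0.5, 0.8, 0.8],
    "panelist_c": [0.9, 0.6, 0.6, 0.9],
}

def angoff_cut_score(ratings):
    """Mean, across panelists, of each panelist's summed item estimates."""
    per_panelist = [sum(items) for items in ratings.values()]
    return sum(per_panelist) / len(per_panelist)

# Cut score for this hypothetical 4-item assessment.
print(round(angoff_cut_score(panelist_ratings), 2))
```

In practice, panelists typically discuss discrepant estimates and re-rate before the final cut score is computed; the averaging step itself is unchanged.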
The unit is discussing alternative means to encourage and facilitate faculty use of assessment data. One strategy under consideration is the organization of a “data day” in which faculty are brought together to review assessment and other data, reflect on their significance, and plan how to use and respond to the data.
FSEHD will also continue to revise and improve its unit assessments based on faculty, candidate, and external partner feedback. At the same time, the unit will conduct further investigations into the reliability and validity of its assessment processes. Potential tasks under consideration include:
- An investigation will be conducted into whether/how the OPR and ILP can be shortened. While these instruments exhibit high internal consistency and faculty have found them to have considerable utility, some faculty have expressed a desire to shorten them if possible. Given the near-perfect internal consistency reliability of the scales in these assessments, some redundant items can likely be eliminated with little loss of reliability.
- Based on the model created by Special Education, the unit will annotate the OPR rating scales so that descriptive performance language is available for each indicator. This will provide more guidance to evaluators and candidates.
- Future research will examine the predictive validity of TCWS performance as teacher education candidates enter the profession and become teachers.
- FSEHD will continue to investigate the validity and reliability of its assessment findings on a regular basis, as this work must be ongoing.
- The unit will move toward full implementation of Chalk & Wire. Storing all data in a common location will greatly facilitate the process of compiling, aggregating, and reporting assessment results. It will also make assessment data more transparent and available to faculty, which should foster a stronger culture of assessment within the unit.
- Advanced program faculty will be offered an extended opportunity to review FSEHD Assessment of Candidate Dispositions in the College Classroom to evaluate whether it is indeed appropriate for advanced classrooms or whether it needs to be revised for their purposes. The advanced Professional Impact Project also needs to be reviewed for alignment with field dispositions indicators to ultimately determine whether a field-based instrument is needed for advanced programs.
- The unit will provide further training to faculty and cooperating teachers in order to foster greater inter-rater reliability and increased understanding of assessment processes.
- The Director of Assessment will institute novel ways to engage faculty in the examination, analysis, and interpretation of assessment data. Strategies being considered include regular assessment updates or newsletters and "brown bag" luncheons in which specific assessment findings are presented and discussed.
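The internal consistency checks mentioned in the list above (for example, testing whether redundant OPR or ILP items can be dropped without sacrificing reliability) rest on a statistic such as Cronbach's alpha. A minimal sketch, using hypothetical item scores rather than FSEHD data:

```python
def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """Cronbach's alpha: items is a list of per-item score lists,
    each inner list aligned across the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total per respondent
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

# Hypothetical scores for 3 items rated across 5 respondents (1-4 scale).
items = [
    [3, 4, 3, 2, 4],
    [3, 4, 2, 2, 4],
    [4, 4, 3, 2, 4],
]
print(round(cronbach_alpha(items), 3))
```

In an item-reduction study, alpha would be recomputed with each candidate item removed; items whose removal leaves alpha essentially unchanged are the redundant ones.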