2024-03-29T13:24:43Z
https://ejournals.bc.edu/index.php/jtla/oai
oai:ejournals.bc.edu:article/1601
2011-05-09T22:14:07Z
jtla:ART
nmb a2200000Iu 4500
"100719 2010 eng "
1540-2525
dc
The Effectiveness and Efficiency of Distributed Online, Regional Online, and Regional Face-to-Face Training for Writing Assessment Raters
Wolfe, Edward W.
Pearson
Matthews, Staci
Pearson
Vickers, Daisy
Pearson
This study examined the influence of rater training and scoring context on training time, scoring time, qualifying rate, quality of ratings, and rater perceptions. A total of 120 raters participated in the study and experienced one of three training contexts: (a) online training in a distributed scoring context, (b) online training in a regional scoring context, and (c) stand-up training in a regional context. After training, raters assigned scores to qualification sets, scored 400 student essays, and responded to a questionnaire that measured their perceptions of the effectiveness of, and satisfaction with, the training and scoring process, materials, and staff. The results suggest that the only clear difference in the outcomes for these three groups of raters concerned training time—online training was considerably faster. There were no clear differences between groups concerning qualification rate, rating quality, or rater perceptions.
The Journal of Technology, Learning and Assessment
2010-07-19 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1601
The Journal of Technology, Learning and Assessment; Vol. 10 No. 1 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1602
2011-05-09T22:44:43Z
jtla:ART
nmb a2200000Iu 4500
"100821 2010 eng "
1540-2525
dc
Examining the Feasibility and Effect of Transitioning GED Tests to Computer
Higgins, Jennifer
Nimble Assessment Systems
Patterson, Margaret Becker
American Council on Education, GED Testing Service
Bozman, Martha
American Council on Education, GED Testing Service
Katz, Michael
Nimble Assessment Systems
This study examined the feasibility of administering GED tests using a computer-based testing system with embedded accessibility tools and the impact on test scores and test-taker experience when GED tests are transitioned from paper to computer. Nineteen test centers across five states successfully installed the computer-based testing program, followed the research protocol, and transmitted testing data with minimal issues, providing evidence of the feasibility of administering GED tests on computer. Two hundred and sixteen GED candidates participated in the research by completing two GED mathematics practice test forms and a survey. Participants completed the first form on paper and were randomly assigned to take the second form on computer or paper. The survey asked students to report demographic information, information about their use of computers, and their preference for using a computer to take tests. Regression analyses showed that participants were neither advantaged nor disadvantaged by taking the GED mathematics test on computer. This finding also holds true after accounting for students' reported computer use and preference for taking tests on computer.
The Journal of Technology, Learning and Assessment
2010-08-21 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1602
The Journal of Technology, Learning and Assessment; Vol. 10 No. 2 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1603
2011-05-09T22:14:07Z
jtla:ART
nmb a2200000Iu 4500
"100825 2010 eng "
1540-2525
dc
Performance of a Generic Approach in Automated Essay Scoring
Attali, Yigal
Educational Testing Service
Bridgeman, Brent
Educational Testing Service
Trapani, Catherine
Educational Testing Service
A generic approach in automated essay scoring produces scores that have the same meaning across all prompts, existing or new, of a writing assessment. This is accomplished by using a single set of linguistic indicators (or features), a consistent way of combining and weighting these features into essay scores, and a focus on features that are not based on prompt-specific information or vocabulary. This approach has both logistical and validity-related advantages. This paper evaluates the performance of generic scores in the context of the e-rater® automated essay scoring system. Generic scores were compared with prompt-specific scores and scores that included prompt-specific vocabulary features. These comparisons were performed with large samples of essays written to three writing assessments: the GRE General Test argument and issue tasks and the TOEFL independent task. Criteria for evaluation included level of agreement with human scores, discrepancy from human scores across prompts, and correlations with other available scores. Results showed small differences between generic and prompt-specific scores and adequate performance of both types of scores compared to human performance.
The Journal of Technology, Learning and Assessment
2010-08-25 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1603
The Journal of Technology, Learning and Assessment; Vol. 10 No. 3 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1604
2011-05-09T22:14:08Z
jtla:ART
nmb a2200000Iu 4500
"101108 2010 eng "
1540-2525
dc
Measuring Cognition of Students with Disabilities Using Technology-Enabled Assessments: Recommendations for a National Research Agenda
Bechard, Sue
Measured Progress
Sheinker, Jan
Abell, Rosemary
ROSE Consulting
Barton, Karen
CTB/McGraw-Hill
Burling, Kelly
Pearson
Camacho, Christopher
LanguageMate
Cameto, Renée
SRI International
Haertel, Geneva
SRI International
Hansen, Eric
Educational Testing Service
Johnstone, Chris
National Center on Educational Outcomes
Kingston, Neal
University of Kansas
Murray, Elizabeth (Boo)
Center for Applied Special Technology
Parker, Caroline E
Education Development Center, Inc.
Redfield, Doris
Edvantia, Inc.
Tucker, Bill
Education Sector
This paper represents one outcome from the Invitational Research Symposium on Technology-Enabled and Universally Designed Assessments, which examined technology-enabled assessments (TEA) and universal design (UD) as they relate to students with disabilities (SWD). It was developed to stimulate research into TEAs designed to better understand the pathways to achievement for the full range of the student population through enhanced measurement capabilities offered by TEA. This paper presents important questions in four critical areas that need to be addressed by research efforts to enhance the measurement of cognition for students with disabilities: (a) better measurement of achievement for students with unique cognitive pathways to learning, (b) how interactive-dynamic assessments can assist investigations into learning progressions, (c) improvement of the validity of assessments for students previously in the margins, and (d) the potential consequences of TEA for students with disabilities. The current efforts for educational reform provide a unique window for action, and test designers are encouraged to take advantage of new opportunities to use TEA in ways that were not possible with paper and pencil tests. Symposium participants describe how technology-enabled assessments have the potential to provide more diagnostic information, drawn from various assessment sources, about students' progress toward learning targets, generate better information to guide instruction and identify areas of focus for professional development, and create assessments that are more inclusive and measure achievement with improved validity for all students, especially students with disabilities.
The Journal of Technology, Learning and Assessment
2010-11-08 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1604
The Journal of Technology, Learning and Assessment; Vol. 10 No. 4 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1605
2011-05-09T22:14:08Z
jtla:ART
nmb a2200000Iu 4500
"101108 2010 eng "
1540-2525
dc
Technology-Enabled and Universally Designed Assessment: Considering Access in Measuring the Achievement of Students with Disabilities—A Foundation for Research
Almond, Patricia
University of Oregon
Winter, Phoebe
Pacific Metrics Corporation
Cameto, Renée
SRI International
Russell, Michael
Boston College
Sato, Edynn
WestEd
Clarke-Midura, Jody
Harvard Graduate School of Education
Torres, Chloe
Measured Progress
Haertel, Geneva
SRI International
Dolan, Robert
Pearson
Beddow, Peter
Vanderbilt University
Lazarus, Sheryl
University of Minnesota
This paper represents one outcome from the Invitational Research Symposium on Technology-Enabled and Universally Designed Assessments, which examined technology-enabled assessments (TEA) and universal design (UD) as they relate to students with disabilities (SWD). It was developed to stimulate research into TEAs designed to make tests appropriate for the full range of the student population through enhanced accessibility. Four themes are explored: (a) a construct-centered approach to developing accessible assessments; (b) how technology and UD can provide access to targeted knowledge, skills, and abilities by embedding access and interactive features directly into systems that deliver TEAs; (c) the possibility of incorporating scaffolding directly into innovative assessment items; and (d) the importance of investigating the validity of inferences from TEAs that incorporate accessibility features designed to maximize validity. Throughout the paper, symposium participants and contributing authors share their understanding of issues and offer insights to researchers who conduct studies on the design, development, and validation of technology-enabled and universally designed assessments that include SWD. The paper proposes a focused research agenda and makes it clear that a principled program of research is needed to properly develop and use technology-enabled and universally designed educational assessments that encourage the inclusion of SWD. As research progresses, TEAs need to improve how they assess students' understanding of complex academic content and how they provide equitable access to all students, including SWD.
The Journal of Technology, Learning and Assessment
2010-11-08 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1605
The Journal of Technology, Learning and Assessment; Vol. 10 No. 5 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1606
2011-05-09T22:30:32Z
jtla:ART
nmb a2200000Iu 4500
"100103 2010 eng "
1540-2525
dc
Educational Outcomes and Research from 1:1 Computing Settings
Bebell, Damian
Boston College
O'Dwyer, Laura
Boston College
Despite the growing interest in 1:1 computing initiatives, relatively little empirical research has focused on the outcomes of these investments. The current special edition of the Journal of Technology, Learning and Assessment presents four empirical studies of K–12 1:1 computing programs and one review of key themes in the conversation about 1:1 computing among advocates and critics. In this introduction to our 1:1 special edition, we synthesize across the studies and discuss the emergent themes. Looking specifically across these studies, we summarize evidence that participation in the 1:1 programs was associated with increased student and teacher technology use, increased student engagement and interest level, and modest increases in student achievement.
The Journal of Technology, Learning and Assessment
2010-01-03 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1606
The Journal of Technology, Learning and Assessment; Vol. 9 No. 1 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1607
2011-05-09T22:30:32Z
jtla:ART
nmb a2200000Iu 4500
"100302 2010 eng "
1540-2525
dc
One to One Computing: A Summary of the Quantitative Results from the Berkshire Wireless Learning Initiative
Bebell, Damian
Boston College
Kay, Rachel
Boston College
This paper examines the educational impacts of the Berkshire Wireless Learning Initiative (BWLI), a pilot program that provided 1:1 technology access to all students and teachers across five public and private middle schools in western Massachusetts. Using a pre/post comparative study design, the current study explores a wide range of program impacts over the three years of the project’s implementation. Specifically, the current document provides an overview of the project background, implementation, research design and methodology, and a summary of the quantitative results. The study details how teaching and learning practices changed when students and teachers were provided with laptops, wireless learning environments, and additional technology resources. The results found that both the implementation and outcomes of the program were varied across the five 1:1 settings and over the three years of the student laptop implementation. Despite these differences, there was evidence that the types of educational access and opportunities afforded by 1:1 computing through the pilot program led to measurable changes in teacher practices, student achievement, student engagement, and students’ research skills.
The Journal of Technology, Learning and Assessment
2010-03-02 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1607
The Journal of Technology, Learning and Assessment; Vol. 9 No. 2 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1608
2011-05-09T22:30:32Z
jtla:ART
nmb a2200000Iu 4500
"100103 2010 eng "
1540-2525
dc
After Installation: Ubiquitous Computing and High School Science in Three Experienced, High-Technology Schools
Drayton, Brian
TERC, Inc.
Falk, Joni K.
TERC, Inc.
Stroud, Rena
TERC, Inc.
Hobbs, Kathryn
TERC, Inc.
Hammerman, James
TERC, Inc.
There are few studies of the impact of ubiquitous computing on high school science, and the majority of studies of ubiquitous computing report on the first period of implementation. The present study presents data on three high schools with carefully elaborated ubiquitous computing systems, which have gone through at least one "obsolescence cycle" and are therefore several years past first implementation. The data show how the affordances of a 1:1, wireless environment are being deployed in these science classrooms, and the effects of the environment on science content, data analysis, labs and other uses for visualizations, and classroom interaction. While some positive effects are clearly seen in these classrooms, even five years or more into the innovation, problems remain, and school cultural factors seem to play an important role in teacher uptake and integration of the technology. Implications for teacher learning are discussed.
The Journal of Technology, Learning and Assessment
2010-01-03 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1608
The Journal of Technology, Learning and Assessment; Vol. 9 No. 3 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1609
2011-05-09T22:30:33Z
jtla:ART
nmb a2200000Iu 4500
"100103 2010 eng "
1540-2525
dc
Evaluating the Implementation Fidelity of Technology Immersion and its Relationship with Student Achievement
Shapley, Kelly S.
Shapley Research Associates
Sheehan, Daniel
Texas Center for Educational Research
Maloney, Catherine
Texas Center for Educational Research
Caranikas-Walker, Fanny
Texas Center for Educational Research
In a pilot study of the Technology Immersion model, high-need middle schools were “immersed” in technology by providing a laptop for each student and teacher, wireless Internet access, curricular and assessment resources, professional development, and technical and pedagogical support. This article examines the fidelity of model implementation and associations between implementation indicators and student achievement. Results across three years for 21 immersion schools show that the average levels of school support for Technology Immersion and teachers’ Classroom Immersion increased slightly, while the level of Student Access and Use declined. Implementation quality varied across schools and classrooms, with a quarter or less of schools and core-content classrooms reaching substantial implementation. Using hierarchical linear modeling, we found that teacher-level implementation components (Immersion Support, Classroom Immersion) were inconsistent and mostly not statistically significant predictors of student achievement, whereas students’ use of laptops outside of school for homework and learning games was the strongest implementation mediator of achievement.
The Journal of Technology, Learning and Assessment
2010-01-03 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1609
The Journal of Technology, Learning and Assessment; Vol. 9 No. 4 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1610
2011-05-09T22:30:33Z
jtla:ART
nmb a2200000Iu 4500
"100303 2010 eng "
1540-2525
dc
Laptops and Fourth Grade Literacy: Assisting the Jump over the Fourth-Grade Slump
Suhr, Kurt A.
Newport Mesa Unified School District
Hernandez, David A.
Newport Mesa Unified School District
Grimes, Doug
University of California, Irvine
Warschauer, Mark
University of California, Irvine
School districts throughout the country are considering how to best integrate technology into instruction. There has been a movement in many districts toward one-to-one laptop instruction, in which all students are provided a laptop computer, but there is concern that these programs may not yield sufficiently improved learning outcomes to justify their substantial cost. And while there has been a great deal of research on the use of laptops in schools, there is little quantitative research systematically investigating the impact of laptop use on test outcomes, and none among students at the fourth-to-fifth grade levels. This study investigated whether a one-to-one laptop program could help improve English language arts (ELA) test scores of upper elementary students, a group that often faces a slowdown of literacy development during the transition from learning to read to reading to learn known as the fourth-grade slump. We explore these questions by comparing changes in the ELA test scores of a group of students who entered a one-to-one laptop program in the fourth grade to a similar group of students in a traditional program in the same school district. After two years’ participation in the program, laptop students outperformed non-laptop students on changes in the ELA total score and in the three subtests that correspond most closely to frequent laptop use: writing strategies, literary response and analysis, and reading comprehension.
The Journal of Technology, Learning and Assessment
2010-03-03 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1610
The Journal of Technology, Learning and Assessment; Vol. 9 No. 5 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1611
2011-05-09T22:30:33Z
jtla:ART
nmb a2200000Iu 4500
"100204 2010 eng "
1540-2525
dc
The End of Techno-Critique: The Naked Truth about 1:1 Laptop Initiatives and Educational Change
Weston, Mark E.
University of Colorado at Denver and Health Sciences Center
Bain, Alan
Charles Sturt University
This article responds to a generation of techno-criticism in education. It contains a review of the key themes of that criticism. The context of previous efforts to reform education reframes that criticism. Within that context, the article asks what schools need to look like, and be like, in order to take advantage of laptop computers and other technology. In doing so, the article presents a vision for self-organizing schools.
The Journal of Technology, Learning and Assessment
2010-02-04 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1611
The Journal of Technology, Learning and Assessment; Vol. 9 No. 6 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1620
2011-05-09T22:57:42Z
jtla:ART
nmb a2200000Iu 4500
"090812 2009 eng "
1540-2525
dc
Measuring Conditions Conducive to Knowledge Development
Adams, Nan B
Southeastern Louisiana University
DeVaney, Thomas
Southeastern Louisiana University
Sawyer, Susan G
Southeastern Louisiana University
The design of virtual learning environments for post-secondary instruction is rapidly increasing among public and private universities. While the quantity of online courses over the past 10 years has increased exponentially, the quality of these courses has not. As universities increase their online teaching activities, real concern about the best design for these online learning opportunities underscores the need to create effective and responsive virtual learning environments. Adams (2007) developed the Recursive Model for Knowledge Development in Virtual Environments. The premise of this model is the belief that good teaching and engaged learning should not be determined by the use of certain instructional tools but by the guiding principle that learning is an active and recursive process, where knowledge must be contextualized to be relevant to the learner. To this end, this article describes the initial development in the ongoing process of designing a valid and reliable assessment tool, the Virtual Learning Environment Survey (VLES), for exploring the degree to which the Recursive Model for Knowledge Development relates to effective design of online learning environments. This student self-report survey will seek to provide guidance for the assessment of online learning environments through collection of student perceptions of teaching strategies, knowledge approach, and knowledge ownership in online classrooms.
The Journal of Technology, Learning and Assessment
2009-08-12 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1620
The Journal of Technology, Learning and Assessment; Vol. 8 No. 1 (2009)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1621
2011-05-09T22:57:42Z
jtla:ART
nmb a2200000Iu 4500
"100107 2010 eng "
1540-2525
dc
On the Roles of External Knowledge Representations in Assessment Design
Mislevy, Robert J
University of Maryland, College Park
Behrens, John T
Cisco Systems, Inc.
Bennett, Randy E
ETS
Demark, Sarah F
Cisco Systems, Inc.
Frezzo, Dennis C
Cisco Systems, Inc.
Levy, Roy
Arizona State University
Robinson, Daniel H
University of Texas at Austin
Wise Rutstein, Daisy
University of Maryland, College Park
Shute, Valerie J
Florida State University
Stanley, Ken
Cisco Systems, Inc.
Winters, Fielding I
University of Maryland, College Park
People use external knowledge representations (EKRs) to identify, depict, transform, store, share, and archive information. Learning how to work with EKRs is central to becoming proficient in virtually every discipline. As such, EKRs play central roles in curriculum, instruction, and assessment. Five key roles of EKRs in educational assessment are described: (1) An assessment is itself an EKR, which makes explicit the knowledge that is valued, ways it is used, and standards of good work. (2) The analysis of any domain in which learning is to be assessed must include the identification and analysis of the EKRs in that domain. (3) Assessment tasks can be structured around the knowledge, relationships, and uses of domain EKRs. (4) "Design EKRs" can be created to organize knowledge about a domain in forms that support the design of assessment. (5) EKRs in the discipline of assessment design can guide and structure the domain analyses (#2), task construction (#3), and the creation and use of design EKRs (#4). The third and fourth roles are discussed and illustrated in greater detail, through the perspective of an "evidence-centered" assessment design framework that reflects the fifth role. Connections with automated task construction and scoring are highlighted. Ideas are illustrated with two examples: "generate examples" tasks and simulation-based tasks for assessing computer network design and troubleshooting skills.
The Journal of Technology, Learning and Assessment
2010-01-07 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1621
The Journal of Technology, Learning and Assessment; Vol. 8 No. 2 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1622
2011-05-09T22:57:42Z
jtla:ART
nmb a2200000Iu 4500
"100128 2010 eng "
1540-2525
dc
Controlling Test Overlap Rate in Automated Assembly of Multiple Equivalent Test Forms
Lin, Chuan-Ju
National University of Tainan, Taiwan
Assembling equivalent test forms with minimal test overlap across forms is important in ensuring test security. Chen and Lei (2009) suggested an exposure control technique, ordered item pooling, to control test overlap on the fly, based on the fact that the ordered-item-pooling test overlap rate for the first t examinees is a function of the rate for the previous (t-1) examinees. This procedure appears well suited to controlling the test overlap rate for tests assembled sequentially. To develop a better understanding of how well the ordered-item-pooling control method functions in automated assembly of multiple forms with the WDM heuristic, this study evaluated its performance under different conditions of test length and test-content outline by comparing the outcomes to those from the corresponding baseline automated-test-assembly (ATA) conditions, where test overlap controls were not considered. The evaluation criteria included (i) the conformity to the test-assembly constraints, (ii) test parallelism in terms of the resultant psychometric properties, (iii) average test overlap rate, and (iv) distribution of item exposure rate. The results showed that the ordered-item-pooling control procedure was effective in most experimental conditions, achieving an acceptable average test overlap rate across multiple forms without compromising conformity to the test-assembly constraints or the equity of the assembled forms. Moreover, test security might be ensured in less supportive contexts for ATA by imposing item exposure control together with test overlap control in ways that would be less likely to compromise test quality. More research is needed to verify this expectation.
The Journal of Technology, Learning and Assessment
2010-01-28 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1622
The Journal of Technology, Learning and Assessment; Vol. 8 No. 3 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1623
2011-05-09T22:57:42Z
jtla:ART
nmb a2200000Iu 4500
"100131 2010 eng "
1540-2525
dc
Evidence-centered Design of Epistemic Games: Measurement Principles for Complex Learning Environments
Rupp, André A.
University of Maryland, College Park
Gushta, Matthew
American Institutes for Research
Mislevy, Robert J
University of Maryland, College Park
Shaffer, David Williamson
University of Wisconsin at Madison
We are currently at an exciting juncture in developing effective means for assessing so-called 21st-century skills in an innovative yet reliable fashion. One of these avenues leads through the world of epistemic games (Shaffer, 2006a), which are games designed to give learners the rich experience of professional practica within a discipline. They serve to develop domain-specific expertise based on principles of collaborative learning, distributed expertise, and complex problem-solving. In this paper, we describe a comprehensive research program for investigating the methodological challenges that await rigorous inquiry within the epistemic games context. We specifically demonstrate how the evidence-centered design framework (Mislevy, Almond, & Steinberg, 2003) as well as current conceptualizations of reliability and validity theory can be used to structure the development of epistemic games as well as empirical research into their functioning. Using the epistemic game Urban Science (Bagley & Shaffer, 2009), we illustrate the numerous decisions that need to be made during game development and their implications for amassing qualitative and quantitative evidence about learners’ developing expertise within epistemic games.
The Journal of Technology, Learning and Assessment
2010-01-31 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1623
The Journal of Technology, Learning and Assessment; Vol. 8 No. 4 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1624
2011-05-09T22:57:43Z
jtla:ART
nmb a2200000Iu 4500
"100208 2010 eng "
1540-2525
dc
Specificity of Structural Assessment of Knowledge
Trumpower, David L
University of Ottawa
Sharara, Harold
University of Ottawa
Goldsmith, Timothy E
University of New Mexico, Main Campus
This study examines the specificity of information provided by structural assessment of knowledge (SAK). SAK is a technique which uses the Pathfinder scaling algorithm to transform ratings of concept relatedness into network representations (PFnets) of individuals’ knowledge. Inferences about individuals’ overall domain knowledge based on the similarity between their PFnets and a referent PFnet have been shown to be valid. We investigate a more fine grained evaluation of specific links in individuals’ PFnets for identifying particular strengths and weaknesses. Thirty-five undergraduates learned about a computer programming language and were then tested on their knowledge of the language with SAK and a problem solving task. The presence of two subsets of links in participants’ PFnets differentially predicted performance on two types of problems, thereby providing evidence of the specificity of SAK. Implications for the formative use of SAK in the classroom and in computer-based environments are discussed.
The Journal of Technology, Learning and Assessment
2010-02-08 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1624
The Journal of Technology, Learning and Assessment; Vol. 8 No. 5 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1625
2011-05-09T22:57:43Z
jtla:ART
nmb a2200000Iu 4500
"100302 2010 eng "
1540-2525
dc
Utility in a Fallible Tool: A Multi-Site Case Study of Automated Writing Evaluation
Grimes, Douglas
University of California, Irvine
Warschauer, Mark
University of California, Irvine
Automated writing evaluation (AWE) software uses artificial intelligence (AI) to score student essays and support revision. We studied how an AWE program called MY Access!® was used in eight middle schools in Southern California over a three-year period. Although many teachers and students considered automated scoring unreliable, and teachers’ use of AWE was limited by the desire to use conventional writing methods, use of the software still brought important benefits. Observations, interviews, and a survey indicated that using AWE simplified classroom management and increased students’ motivation to write and revise.
The Journal of Technology, Learning and Assessment
2010-03-02 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1625
The Journal of Technology, Learning and Assessment; Vol. 8 No. 6 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1626
2011-05-09T22:57:43Z
jtla:ART
nmb a2200000Iu 4500
"100405 2010 eng "
1540-2525
dc
Automated Scoring of an Interactive Geometry Item: A Proof-of-Concept
Masters, Jessica
Boston College
An online interactive geometry item was developed to explore students’ abilities to create prototypical and “tilted” rectangles out of line segments. The item was administered to 1,002 students. The responses to the item were hand-coded as correct, incorrect, or incorrect with possible evidence of a misconception. A variation of the nearest neighbor algorithm was used to automatically predict one of these categories for each of the student responses. The predicted category was compared to the hand-coded category. The algorithm accurately predicted the category for 94.6% of responses.
The Journal of Technology, Learning and Assessment
2010-04-05 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1626
The Journal of Technology, Learning and Assessment; Vol. 8 No. 7 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1627
2011-05-09T22:57:43Z
jtla:ART
nmb a2200000Iu 4500
"100609 2010 eng "
1540-2525
dc
Measuring Problem Solving with Technology: A Demonstration Study for NAEP
Bennett, Randy E
ETS
Persky, Hilary
ETS
Weiss, Andy
ETS
Jenkins, Frank
Westat
This paper describes a study intended to demonstrate how an emerging skill, problem solving with technology, might be measured in the National Assessment of Educational Progress (NAEP). Two computer-delivered assessment scenarios were designed, one on solving science-related problems through electronic information search and the other on solving science-related problems by conducting simulated experiments. The assessment scenarios were administered in 2003 to nationally representative samples of 8th-grade students in over 200 schools. Results are reported on the psychometric functioning of the scenarios and the performance of population groups. Implications are offered for using online performance assessment to measure emerging skills in NAEP and other large-scale testing programs.
The Journal of Technology, Learning and Assessment
2010-06-09 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1627
The Journal of Technology, Learning and Assessment; Vol. 8 No. 8 (2010)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1628
2011-05-10T17:42:40Z
jtla:ART
nmb a2200000Iu 4500
"080905 2008 eng "
1540-2525
dc
Students’ Experiences with an Automated Essay Scorer
Scharber, Cassandra
University of Minnesota
Dexter, Sara
University of Virginia
Riedel, Eric
Walden University
The purpose of this research is to analyze preservice teachers’ use of and reactions to an automated essay scorer used within an online, case-based learning environment called ETIPS. Data analyzed include post-assignment surveys, a user log of students’ actions within the cases, instructor-assigned scores on final essays, and interviews with four selected students. These in-depth data about students’ reactions to and opinions of the ETIPS automated essay scorer help inform the field about users’ perceptions of automated scoring.
The Journal of Technology, Learning and Assessment
2008-09-05 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1628
The Journal of Technology, Learning and Assessment; Vol. 7 No. 1 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1629
2011-05-10T17:42:41Z
jtla:ART
nmb a2200000Iu 4500
"081209 2008 eng "
1540-2525
dc
Developing a Taxonomy of Item Model Types to Promote Assessment Engineering
Gierl, Mark J.
University of Alberta
Zhou, Jiawen
University of Alberta
Alves, Cecilia
University of Alberta
An item model serves as an explicit representation of the variables in an assessment task. An item model includes the stem, options, and auxiliary information. The stem is the part of an item which formulates context, content, and/or the question the examinee is required to answer. The options contain the alternative answers with one correct option and one or more incorrect options or distractors. The auxiliary information includes any additional material, in either the stem or option, required to generate an item, including texts, images, tables, and/or diagrams. In this study, we first present a taxonomy for item model development where variables in the stem are crossed with variables in the options to create a matrix of possible item model types. We then provide examples of each stem-by-option combination. Finally, we develop a software engine and apply the software to each item model type to generate multiple instances for each model.
The Journal of Technology, Learning and Assessment
2008-12-09 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1629
The Journal of Technology, Learning and Assessment; Vol. 7 No. 2 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1630
2011-05-10T17:42:41Z
jtla:ART
nmb a2200000Iu 4500
"081218 2008 eng "
1540-2525
dc
Online Courses for Math Teachers: Comparing Self-Paced and Facilitated Cohort Approaches
Carey, Rebecca
Education Development Center (EDC)
Kleiman, Glenn
North Carolina State University at Raleigh
Russell, Michael
Boston College
Douglas Venable, Joanne
Louie, Josephine
EDC
This study investigated whether two different versions of an online professional development course produced different impacts on the intended outcomes of the course. Variations of an online course for middle school algebra teachers were created for two experimental conditions. One was an actively facilitated course with asynchronous peer interactions among participants. The second was a self-paced condition, in which neither active facilitation nor peer interactions were available. Both conditions showed significant impact on teachers’ mathematical understanding, pedagogical beliefs, and instructional practices. Surprisingly, the positive outcomes were comparable for both conditions. Further research is needed to determine whether this finding is limited to self-selected teachers, the specifics of this online course, or other factors that limit generalizability.
The Journal of Technology, Learning and Assessment
2008-12-18 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1630
The Journal of Technology, Learning and Assessment; Vol. 7 No. 3 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1631
2011-05-10T18:23:03Z
jtla:ART
nmb a2200000Iu 4500
"070801 2007 eng "
1540-2525
dc
Toward More Substantively Meaningful Automated Essay Scoring
Ben-Simon, Anat
National Institute for Testing & Evaluation, Israel
Bennett, Randy Elliot
ETS
This study evaluated a “substantively driven” method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater®, to compare the performance of three approaches to automated essay scoring: a brute-empirical approach in which variables are selected and weighted solely according to statistical criteria, a hybrid approach in which a fixed set of variables more closely tied to the characteristics of good writing was used but the weights were still statistically determined, and a substantively driven approach in which a fixed set of variables was weighted according to the judgments of two independent committees of writing experts. The research questions concerned (1) the reproducibility of weights across writing experts, (2) the comparison of scores generated by the three automated approaches, and (3) the extent to which models developed for scoring one NAEP prompt generalize to other NAEP
prompts of the same genre. Data came from the 2002 NAEP Writing Online study and from the main NAEP 2002 writing assessment. Results showed that, in carrying out the substantively driven approach, experts initially assigned weights to writing dimensions that were highly similar across committees but that diverged from one another after committee 1 was shown the empirical weights for possible use in its judgments and committee 2 was not shown those weights. The substantively driven approach based on the judgments of committee 1 generally did not operate in a markedly different way from the brute empirical or hybrid approaches in most of the analyses conducted. In contrast, many consistent differences with those approaches were observed for the substantively driven approach based on the judgments of committee 2. This study suggests that empirical weights might provide a useful starting point for expert committees, with the understanding that the
weights be moderated only somewhat to bring them more into line with substantive considerations. Under such circumstances, the results may turn out to be reasonable, though not necessarily as highly related to human ratings as statistically optimal approaches would produce.
The Journal of Technology, Learning and Assessment
2007-08-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1631
The Journal of Technology, Learning and Assessment; Vol. 6 No. 1 (2007)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1632
2011-05-10T18:23:03Z
jtla:ART
nmb a2200000Iu 4500
"071018 2007 eng "
1540-2525
dc
Automated Essay Scoring Versus Human Scoring: A Comparative Study
Wang, Jinhao
South Texas College
Brown, Michelle Stallone
Texas A & M University -- Kingsville
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by AES and human raters. Data collection included two standardized writing tests – WritePlacer Plus and the Texas Higher Education Assessment (THEA) writing test. The research sample of 107 participants was drawn from a Hispanic-serving institution in South Texas. A one-way repeated-measures ANOVA and follow-up paired-samples t tests were conducted to examine the group mean differences. Results of the tests indicated that the mean score assigned by IntelliMetric™ was significantly higher than the faculty human raters’ mean score on the WritePlacer Plus test, and the IntelliMetric™ mean score was also significantly higher than the THEA mean score assigned by human raters from National Evaluation Systems. A statistically significant difference also existed between the human raters’ mean score on WritePlacer Plus and the human raters’ mean score on THEA. These findings did not corroborate previous studies that reported non-significant mean score differences between AES and human scoring.
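The paired-samples t test named in the abstract compares two sets of scores for the same essays. A minimal sketch of the statistic, using hypothetical scores rather than the study's data:

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired-samples t statistic: mean of the pairwise differences
    divided by the standard error of those differences."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se

# Hypothetical scores for the same six essays from two scoring sources
aes_scores = [8, 7, 9, 8, 7, 8]
human_scores = [7, 6, 8, 7, 6, 8]
t = paired_t(aes_scores, human_scores)  # positive t: AES scored higher
```

A large positive t (compared against a t distribution with n − 1 degrees of freedom) corresponds to the study's finding that the automated scores ran significantly higher than the human means.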
The Journal of Technology, Learning and Assessment
2007-10-18 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1632
The Journal of Technology, Learning and Assessment; Vol. 6 No. 2 (2007)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1633
2011-05-10T18:23:04Z
jtla:ART
nmb a2200000Iu 4500
"071120 2007 eng "
1540-2525
dc
Examining Differences in Examinee Performance in Paper and Pencil and Computerized Testing
Puhan, Gautam
ETS
Boughton, Keith A
CTB
Kim, Sooyeon
ETS
The study evaluated the comparability of two versions of a certification test: a paper-and-pencil test (PPT) and computer-based test (CBT). An effect size measure known as Cohen’s d and differential item functioning (DIF) analyses were used as measures of comparability at the test and item levels, respectively. Results indicated that the effect sizes were small (d < 0.20) and not statistically significant (p > 0.05), suggesting no substantial difference between the two test versions. Moreover, DIF analysis revealed that reading and mathematics items were comparable for both versions. However, three writing items were flagged for DIF. Substantive reviews failed to identify format differences that could explain the performance differences, so the causes of DIF could not be identified.
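Cohen's d, the effect size measure used above, is the standardized difference between two group means. A small sketch with hypothetical PPT and CBT score lists (not the study's data), using the pooled-standard-deviation form:

```python
import statistics

def cohens_d(group1, group2):
    """Cohen's d: mean difference scaled by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = statistics.stdev(group1), statistics.stdev(group2)
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled

ppt = [70, 72, 68, 71, 69]   # hypothetical paper-and-pencil scores
cbt = [70, 71, 69, 70, 71]   # hypothetical computer-based scores
d = cohens_d(ppt, cbt)
print(abs(d) < 0.20)  # True: below the study's "small effect" threshold
```

The study's |d| < 0.20 criterion corresponds to a conventionally small effect, supporting comparability at the test level.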
The Journal of Technology, Learning and Assessment
2007-11-20 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1633
The Journal of Technology, Learning and Assessment; Vol. 6 No. 3 (2007)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1634
2011-05-10T18:23:04Z
jtla:ART
nmb a2200000Iu 4500
"071221 2007 eng "
1540-2525
dc
Comparability of Computer and Paper-and-Pencil Versions of Algebra and Biology Assessments
Kim, Do-Hong
University of North Carolina, Charlotte
Huynh, Huynh
University of South Carolina
This study examined comparability of student scores obtained from computerized and paper-and-pencil formats of the large-scale statewide end-of-course (EOC) examinations in the two subject areas of Algebra and Biology. Evidence in support of comparability of computerized and paper-based tests was sought by examining scale scores, item parameter estimates, test characteristic curves, test information functions, Rasch ability estimates at the content domain level, and the equivalence of the construct. Overall, the results support the comparability of computerized and paper-based tests at the item-level, subtest-level and whole test-level in both subject areas. For both subject areas, no evidence was found to suggest that the administration mode changed the construct being measured.
The Journal of Technology, Learning and Assessment
2007-12-21 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1634
The Journal of Technology, Learning and Assessment; Vol. 6 No. 4 (2007)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1635
2011-05-10T18:23:04Z
jtla:ART
nmb a2200000Iu 4500
"080118 2008 eng "
1540-2525
dc
Examining the Relationship between Students’ Mathematics Test Scores and Computer Use at Home and at School
O'Dwyer, Laura
Boston College
Russell, Michael
Boston College
Bebell, Damian
Boston College
Tucker-Seeley, Kevon R.
Northeast & Islands Regional Educational Laboratory (NEIREL) at the Educational Development Center (EDC)
Over the past decade, standardized test results have become the primary tool used to judge the effectiveness of schools and educational programs, and today, standardized testing serves as the keystone for educational policy at the state and federal levels. This paper examines the relationship between fourth grade mathematics achievement and technology use at home and at school. Using item level achievement data, individual student’s state test scores on the Massachusetts Comprehensive Assessment System (MCAS), and student and teacher responses to detailed technology-use surveys, this study examines the relationship between technology-use and mathematics performance among 986 regular students, from 55 intact fourth grade classrooms in 25 schools across 9 school districts in Massachusetts. The findings from this study suggest that various uses of technology are differentially related to student outcomes and that in general, student and teacher technology uses are weakly related to mathematics achievement on the MCAS. Implications for improving methods for examining the relationship between technology use and standardized test scores are presented.
The Journal of Technology, Learning and Assessment
2008-01-18 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1635
The Journal of Technology, Learning and Assessment; Vol. 6 No. 5 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1636
2011-05-10T18:23:04Z
jtla:ART
nmb a2200000Iu 4500
"080212 2008 eng "
1540-2525
dc
Using the Attribute Hierarchy Method to Make Diagnostic Inferences about Examinees' Cognitive Skills in Algebra on the SAT
Gierl, Mark J.
University of Alberta
Wang, Changjiang
University of Alberta
Zhou, Jiawen
University of Alberta
The purpose of this study is to apply the attribute hierarchy method (AHM) to a subset of SAT algebra items administered in March 2005 to promote cognitive diagnostic inferences about examinees. The AHM is a psychometric method for classifying examinees’ test item responses into a set of structured attribute patterns associated with different components from a cognitive model of task performance. An attribute is a description of the procedural or declarative knowledge needed to perform a task. These attributes form a hierarchy of cognitive skills that represent a cognitive model of task performance. The study was conducted in two steps. In step 1, a cognitive model was developed by having content specialists, first, review the SAT algebra items, identify their salient attributes, and order the item-based attributes into a hierarchy. Then, the cognitive model was validated by having a sample of students think aloud as they solved each item. In step 2, psychometric analyses were conducted on the SAT algebra cognitive model by evaluating the model-data fit between the expected response patterns generated by the cognitive model and the observed response patterns produced from a random sample of 5000 examinees who wrote the items. Attribute probabilities were also computed for this random sample of examinees so diagnostic inferences about their attribute-level performances could be made. We conclude the study by describing key limitations, highlighting challenges inherent to the development and analysis of cognitive diagnostic assessments, and proposing directions for future research. This article contains embedded media (video and audio files) and may take a few minutes to download. You will need Flash Player 9.0 (available from www.adobe.com) to play the files. An alternate, smaller version of this article, that does not contain media files is available below under the Alternate Version heading.
The Journal of Technology, Learning and Assessment
2008-02-12 00:00:00
application/pdf
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1636
The Journal of Technology, Learning and Assessment; Vol. 6 No. 6 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1637
2011-05-10T18:23:04Z
jtla:ART
nmb a2200000Iu 4500
"080411 2008 eng "
1540-2525
dc
Does Survey Medium Affect Responses? An Exploration of Electronic and Paper Surveying in British Columbia Schools
Walt, Nancy
British Columbia Ministry of Education
Atwood, Kristin
Agency Research Consultants
Mann, Alex
British Columbia Ministry of Education
The purpose of this study was to determine whether or not survey medium (electronic versus paper format) has a significant effect on the results achieved. To compare survey media, responses from elementary students to British Columbia’s Satisfaction Survey were analyzed. Although this study was not experimental in design, the data set served as a rich source with which to investigate the research question. The methods included comparisons of reliability, item means, factor structure, response rates, and response completeness across survey media. From the analyses, the differences between electronic and paper media in this study appear to be minor and do not seem to have a significant effect on overall results. In conclusion, the medium does not appear to substantially affect response patterns and does not pose any threat to the validity or reliability of survey results.
The Journal of Technology, Learning and Assessment
2008-04-11 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1637
The Journal of Technology, Learning and Assessment; Vol. 6 No. 7 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1638
2011-05-10T18:23:05Z
jtla:ART
nmb a2200000Iu 4500
"080429 2008 eng "
1540-2525
dc
Comparisons between Classical Test Theory and Item Response Theory in Automated Assembly of Parallel Test Forms
Lin, Chuan-Ju
National University of Tainan, Taiwan
The automated assembly of alternate test forms for online delivery provides an alternative to computer-administered fixed test forms or computerized-adaptive tests when a testing program migrates from paper/pencil testing to computer-based testing. The weighted deviations model (WDM) heuristic is particularly promising for automated test assembly (ATA) because it is computationally straightforward and produces tests with desired properties under realistic testing conditions. Unfortunately, research into the WDM heuristic has focused exclusively on Item Response Theory (IRT) methods, even though there are situations in which Classical Test Theory (CTT) item statistics are the only data available to test developers. The purpose of this study was to investigate the degree of parallelism of test forms assembled with the WDM heuristic using both CTT and IRT methods. Alternate forms of a 60-item test were assembled from a pool of 600 items. One CTT and two IRT approaches were used to generate content and psychometric constraints. The three methods were compared in terms of conformity to the test-assembly constraints, average test overlap rate, content parallelism, and statistical parallelism. The results led to a primary conclusion that the CTT approach performed at least as well as the IRT approaches. Possible reasons for the comparability of the three test-assembly approaches are discussed, and suggestions for future ATA applications are provided.
The Journal of Technology, Learning and Assessment
2008-04-29 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1638
The Journal of Technology, Learning and Assessment; Vol. 6 No. 8 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1639
2011-05-10T18:23:05Z
jtla:ART
nmb a2200000Iu 4500
"080617 2008 eng "
1540-2525
dc
Does it Matter if I Take My Mathematics Test on Computer? A Second Empirical Study of Mode Effects in NAEP
Bennett, Randy Elliot
ETS
Braswell, James
AIR
Oranje, Andreas
ETS
Sandene, Brent
ETS
Kaplan, Bruce
ETS
Yan, Fred
ETS
This article describes selected results from the Math Online (MOL) study, one of three field investigations sponsored by the National Center for Education Statistics (NCES) to explore the use of new technology in NAEP. Of particular interest in the MOL study was the comparability of scores from paper- and computer-based tests. A nationally representative sample of eighth-grade students was administered a computer-based mathematics test and a test of computer facility, among other measures. In addition, a randomly parallel group of students was administered a paper-based test containing the same math items as the computer-based test. Results showed that the computer-based mathematics test was statistically significantly harder than the paper-based test. In addition, computer facility predicted online mathematics test performance after controlling for performance on a paper-based mathematics test, suggesting that degree of familiarity with computers may matter when taking a computer-based mathematics test in NAEP.
The Journal of Technology, Learning and Assessment
2008-06-17 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1639
The Journal of Technology, Learning and Assessment; Vol. 6 No. 9 (2008)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1640
2011-05-10T20:10:28Z
jtla:ART
nmb a2200000Iu 4500
"060816 2006 eng "
1540-2525
dc
An Overview of Automated Scoring of Essays
Dikli, Semire
Florida State University
Automated Essay Scoring (AES) is defined as computer technology that evaluates and scores written prose (Shermis & Barrera, 2002; Shermis & Burstein, 2003; Shermis, Raymat, & Barrera, 2003). AES systems are mainly used to overcome time, cost, reliability, and generalizability issues in writing assessment (Bereiter, 2003; Burstein, 2003; Chung & O’Neil, 1997; Hamp-Lyons, 2001; Myers, 2003; Page, 2003; Rudner & Gagne, 2001; Rudner & Liang, 2002; Sireci & Rizavi, 1999; http://people.emich.edu). AES continues to attract the attention of public schools, universities, testing companies, researchers, and educators (Burstein, Kukich, Wolff, Lu, & Chodorow, 1998; Shermis & Burstein, 2003; Sireci & Rizavi, 1999). The main purpose of this article is to provide an overview of current approaches to AES. After describing the most widely used AES systems (i.e., Project Essay Grader (PEG), Intelligent Essay Assessor (IEA), E-rater and Criterion, IntelliMetric and MY Access, and the Bayesian Essay Test Scoring System (BETSY)), the article discusses the main characteristics of these systems and current issues regarding their use in both low-stakes assessment (in classrooms) and high-stakes assessment (as standardized tests).
The Journal of Technology, Learning and Assessment
2006-08-16 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1640
The Journal of Technology, Learning and Assessment; Vol. 5 No. 1 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1641
2011-05-10T20:10:28Z
jtla:ART
nmb a2200000Iu 4500
"061122 2006 eng "
1540-2525
dc
Does it Matter if I take My Writing Test on Computer? An Empirical Study of Mode Effects in NAEP
Horkay, Nancy
Bennett, Randy Elliot
ETS
Allen, Nancy
Kaplan, Bruce A.
ETS
Yan, Fred
ETS
This study investigated the comparability of scores for paper and computer versions of a writing test administered to eighth grade students. Two essay prompts were given on paper to a nationally representative sample as part of the 2002 main NAEP writing assessment. The same two essay prompts were subsequently administered on computer to a second sample also selected to be nationally representative. Analyses looked at overall differences in performance between the delivery modes, interactions of delivery mode with group membership, differences in performance between those taking the computer test on different types of equipment (i.e., school machines vs. NAEP-supplied laptops), and whether computer familiarity was associated with online writing test performance. Results generally showed no significant mean score differences between paper and computer delivery. However, computer familiarity significantly predicted online writing test performance after controlling for paper writing skill. These results suggest that, for any given individual, a computer-based writing assessment may produce different results than a paper one, depending upon that individual’s level of computer familiarity. Further, for purposes of estimating population performance, as long as substantial numbers of students write better on computer than on paper (or better on paper than on computer), conducting a writing assessment in either mode alone may underestimate the performance that would have resulted if students had been tested using the mode in which they wrote best.
The Journal of Technology, Learning and Assessment
2006-11-22 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1641
The Journal of Technology, Learning and Assessment; Vol. 5 No. 2 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1642
2011-05-10T20:10:29Z
jtla:ART
nmb a2200000Iu 4500
"061104 2006 eng "
1540-2525
dc
Individualizing Learning Using Intelligent Technology and Universally Designed Curriculum
Abell, Michael
University of Louisville
The American education system and its rigorous accountability and performance standards continually force educators to explore new ways to increase student achievement. Improvements in computer technology and intelligent computing systems may offer new tools for student learning and higher academic achievement. These systems have the potential to meet individual students’ learning needs using universally designed curricula and assessments. The purpose of this paper is to present a conceptual framework that harnesses the potential of intelligent learning systems, machine learning models, and universal design for learning principles to help formulate next-generation instructional materials. By using intelligent and interactive curricula, educators could begin to shift from being disseminators of information to facilitators of the learning experience.
The Journal of Technology, Learning and Assessment
2006-11-04 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1642
The Journal of Technology, Learning and Assessment; Vol. 5 No. 3 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1643
2011-05-10T20:10:29Z
jtla:ART
nmb a2200000Iu 4500
"061211 2006 eng "
1540-2525
dc
Differential Item Functioning of GRE Mathematics Items Across Computerized and Paper-and-Pencil Testing Media
Gu, Lixiong
Michigan State University
Drake, Samuel
Michigan State University
Wolfe, Edward W.
Virginia Tech
This study seeks to determine whether item features are related to observed differential item functioning (DIF) between computer- and paper-based test delivery media. Examinees responded to 60 quantitative items similar to those found on the GRE general test in either a computer-based or paper-based medium. Thirty-eight percent of the items were flagged for cross-medium DIF, and post hoc content analyses were performed focusing on page formatting, mathematical notation, and mathematical content of the items. Although findings suggest that differences in page formatting and response processes across the delivery media contribute little to the observed cross-medium DIF, differences in the mathematical notation contained in the item text as well as differences in the mathematical content of the items provided the strongest apparent relationships with cross-medium DIF.
The Journal of Technology, Learning and Assessment
2006-12-11 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1643
The Journal of Technology, Learning and Assessment; Vol. 5 No. 4 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1644
2011-05-10T20:10:29Z
jtla:ART
nmb a2200000Iu 4500
"061211 2006 eng "
1540-2525
dc
Comprehensive Assessment of a Software Development Project for Engineering Instruction
Hall, Richard H
University of Missouri - Rolla
Philpot, Timothy A.
University of Missouri - Rolla
Hubing, Nancy
University of Missouri - Rolla
This paper reviews a series of formative assessment studies that were conducted to inform and evaluate a large-scale instructional software development project at the University of Missouri–Rolla (UMR). The three-year project, entitled “Taking the Next Step in Engineering Education: Integrating Educational Software and Active Learning,” was funded by the U.S. Department of Education Fund for the Improvement of Postsecondary Education (FIPSE). The assessment was carried out under the auspices of UMR’s Laboratory for Information Technology Evaluation (LITE) and guided by the LITE model for evaluation of learning technologies. The fundamental premise of the model is that evaluation should consist of the triangulation of multiple research methodologies and measurement tools. Five representative evaluation studies, consisting of eight experiments, are presented here. The studies range from basic experimentation and usability testing to applied research conducted within the classroom as well as a multi-national cross-cultural applied dissemination survey conducted during the last semester of the project. This paper demonstrates that the LITE model can be an effective tool for guiding a comprehensive evaluation program. In addition, the research findings provide evidence that the instructional multimedia developed in this project can have a substantial positive impact on enhancing fundamental engineering classes.
The Journal of Technology, Learning and Assessment
2006-12-11 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1644
The Journal of Technology, Learning and Assessment; Vol. 5 No. 5 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1645
2011-05-10T20:10:29Z
jtla:ART
nmb a2200000Iu 4500
"061213 2006 eng "
1540-2525
dc
An Exploratory Study of a Novel Online Formative Assessment and Instructional Tool to Promote Students’ Circuit Problem Solving
Chung, Gregory K. W. K.
UCLA/CRESST
Shel, Tammy
UCLA/CRESST
Kaiser, William J
UCLA/HSSEAS
We examined a novel formative assessment and instructional approach with 89 students in three electrical engineering classes in special computer-based discussion sections. The technique involved students individually solving circuit problems online, with their real-time responses observed by the instructor. While exploratory, survey and interview responses from 26 students suggest the technique offers important instructional and assessment advantages: Compared to typical discussion sessions, a large majority of respondents reported being more engaged, learning more, and interacting more with the instructor. Students reported the anonymous mode allowed them to ask “dumb” questions. The instructor was able to address student problems and questions immediately, and the amount of formative assessment information from the interaction far exceeded what was available in typical settings.
The Journal of Technology, Learning and Assessment
2006-12-13 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1645
The Journal of Technology, Learning and Assessment; Vol. 5 No. 6 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1646
2011-05-10T20:10:29Z
jtla:ART
nmb a2200000Iu 4500
"070330 2007 eng "
1540-2525
dc
The Effect of Using Item Parameters Calibrated from Paper Administrations in Computer Adaptive Test Administrations
Pommerich, Mary
Defense Manpower Data Center
Computer administered tests are becoming increasingly prevalent as computer technology becomes more readily available on a large scale. For testing programs that utilize both computer and paper administrations, mode effects are problematic in that they can result in examinee scores that are artificially inflated or deflated. As such, researchers have engaged in extensive studies of whether scores differ across paper and computer presentations of the same tests. The research generally seems to indicate that the more complicated it is to present or take a test on computer, the greater the possibility of mode effects. In a computer adaptive test, mode effects may be a particular concern if items are calibrated using item responses obtained from one administration mode (i.e., paper), and those parameters are then used operationally in a different administration mode (i.e., computer). This paper studies the suitability of using parameters calibrated from a paper administration for item selection and scoring in a computer adaptive administration, for two tests with lengthy passages that required navigation in the computer administration. The results showed that the use of paper-calibrated parameters versus computer-calibrated parameters in computer adaptive administrations had small to moderate effects on the reliability of examinee scores at fairly short test lengths. This effect generally diminished at longer test lengths. However, the results suggest that in some cases some loss in reliability might be inevitable if paper-calibrated parameters are used in computer adaptive administrations.
The Journal of Technology, Learning and Assessment
2007-03-30 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1646
The Journal of Technology, Learning and Assessment; Vol. 5 No. 7 (2007)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1647
2011-05-10T20:10:30Z
jtla:ART
nmb a2200000Iu 4500
"070508 2007 eng "
1540-2525
dc
A Review of Item Exposure Control Strategies for Computerized Adaptive Testing Developed from 1983 to 2005
Georgiadou, Elissavet G
Hellenic Open University, Greece
Triantafillou, Evangelos
Aristotle University of Thessaloniki, Greece
Economides, Anastasios A
University of Macedonia, Greece
Since researchers acknowledged the several advantages of computerized adaptive testing (CAT) over traditional linear test administration, the issue of item exposure control has received increased attention. Due to CAT’s underlying philosophy, particular items in the item pool may be presented too often and become overexposed, while other items are rarely selected by the CAT algorithm and thus become underexposed. Several item exposure control strategies have been presented in the literature that aim to prevent overexposure of some items and to increase the use rate of rarely or never selected items. This paper reviews such strategies that appeared in the relevant literature from 1983 to 2005. The focus is on studies that have been conducted to evaluate the effectiveness of item exposure control strategies for dichotomous scoring, polytomous scoring, and testlet-based CAT systems. In addition, the paper discusses the strengths and weaknesses of each strategy group using examples from simulation studies. No new research is presented; rather, a compendium of models is reviewed with the overall objective of providing researchers in this field, especially newcomers, with a wide view of item exposure control strategies.
The Journal of Technology, Learning and Assessment
2007-05-08 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1647
The Journal of Technology, Learning and Assessment; Vol. 5 No. 8 (2007)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1648
2011-05-10T22:03:09Z
jtla:ART
nmb a2200000Iu 4500
"051001 2005 eng "
1540-2525
dc
The Effects of Online Formative and Summative Assessment on Test Anxiety and Performance
Cassady, Jerrell C.
Ball State University
Gridley, Betty E.
Ball State University
This study analyzed the effects of online formative and summative assessment materials on undergraduates' experiences with attention to learners' testing behaviors (e.g., performance, study habits) and beliefs (e.g., test anxiety, perceived test threat). The results revealed no detriment to students' perceptions of tests or performances on tests when comparing online to paper-pencil summative assessments. In fact, students taking tests online reported lower levels of perceived test threat. Regarding formative assessment, findings indicate a small benefit for using online practice tests prior to graded course exams. This effect appears to be in part due to the reduction of the deleterious effects of negative test perceptions afforded in conditions where practice tests were available. The results support the integration of online practice tests to help students prepare for course exams and also reveal that secure web-based testing can aid undergraduate instruction through improved student confidence and increased instructional time.
The Journal of Technology, Learning and Assessment
2005-10-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1648
The Journal of Technology, Learning and Assessment; Vol. 4 No. 1 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1649
2011-05-10T21:38:39Z
jtla:ART
nmb a2200000Iu 4500
"051101 2005 eng "
1540-2525
dc
Knowing What All Students Know: Procedures for Developing Universal Design for Assessment
Ketterlin-Geller, Leanne R.
University of Oregon
Universal design for assessment (UDA) is intended to increase participation of students with disabilities and English-language learners in general education assessments by addressing student needs through customized testing platforms. Computer-based testing provides an optimal format for creating individually-tailored tests. However, although a theoretical basis for universal design is well established, little practical information is available to assist test developers in creating and implementing universally designed tests. This article discusses the application of universal design to assessment and describes how these principles are applied to a test of 3rd grade mathematics ability. I present the steps involved in conceptualizing, constructing, and implementing a universally designed test in anticipation that test developers, state department assessment coordinators, and other researchers will benefit from this application. Recommendations for future research and development efforts to create accessible computer-based learning environments for all students are explored.
The Journal of Technology, Learning and Assessment
2005-11-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1649
The Journal of Technology, Learning and Assessment; Vol. 4 No. 2 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1650
2011-05-10T21:38:39Z
jtla:ART
nmb a2200000Iu 4500
"060201 2006 eng "
1540-2525
dc
Automated Essay Scoring With e-rater® V.2
Attali, Yigal
Educational Testing Service
Burstein, Jill
Educational Testing Service
E-rater has been used by the Educational Testing Service for automated essay scoring since 1999. This paper describes a new version of e-rater (V.2) that differs from other automated essay scoring systems in several important respects. The main innovations of e-rater V.2 are a small, intuitive, and meaningful set of features used for scoring; a single scoring model and set of standards that can be used across all prompts of an assessment; and modeling procedures that are transparent and flexible, and that can be based entirely on expert judgment. The paper describes this new system and presents evidence on the validity and reliability of its scores.
The Journal of Technology, Learning and Assessment
2006-02-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1650
The Journal of Technology, Learning and Assessment; Vol. 4 No. 3 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1651
2011-05-10T21:38:39Z
jtla:ART
nmb a2200000Iu 4500
"060329 2006 eng "
1540-2525
dc
An Evaluation of IntelliMetric™ Essay Scoring System
Rudner, Lawrence M.
GMAC
Garcia, Veronica
GMAC
Welch, Catherine
Assessment Innovations at ACT, Inc.
This report provides a two-part evaluation of the IntelliMetric™ automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test™ (GMAT™). The IntelliMetric system’s performance is first compared to that of individual human raters, a Bayesian system employing simple word counts, and a weighted probability model using more than 750 responses to each of six prompts. The second, larger evaluation compares the IntelliMetric system’s ratings to those of human raters using approximately 500 responses to each of 101 prompts. Results from both evaluations suggest the IntelliMetric system is a consistent, reliable system for scoring AWA essays, with perfect-plus-adjacent agreement on 96% to 98% and 92% to 100% of instances in Evaluations 1 and 2, respectively. The Pearson r correlations between human raters and the IntelliMetric system averaged .83 in both evaluations.
The Journal of Technology, Learning and Assessment
2006-03-29 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1651
The Journal of Technology, Learning and Assessment; Vol. 4 No. 4 (2006)
eng
Copyright (c)
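The IntelliMetric evaluation above is summarized with two standard human–machine agreement statistics: perfect-plus-adjacent agreement and the Pearson r correlation. As a minimal sketch of how such statistics are computed (not the study's actual code; the score lists below are invented for illustration):

```python
# Sketch of the agreement statistics commonly reported in automated-essay-
# scoring evaluations: exact (perfect) agreement, perfect-plus-adjacent
# agreement, and Pearson r. The example scores are invented.
from statistics import mean

def agreement_stats(human, machine):
    """Return (exact, perfect_plus_adjacent, pearson_r) for two score lists."""
    n = len(human)
    # Fraction of essays where the two scores match exactly.
    exact = sum(h == m for h, m in zip(human, machine)) / n
    # Fraction where the scores differ by at most one point.
    adjacent = sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / n
    # Pearson product-moment correlation.
    mh, mm = mean(human), mean(machine)
    cov = sum((h - mh) * (m - mm) for h, m in zip(human, machine))
    var_h = sum((h - mh) ** 2 for h in human)
    var_m = sum((m - mm) ** 2 for m in machine)
    r = cov / (var_h * var_m) ** 0.5
    return exact, adjacent, r

# Invented 1-6 scale essay scores from a human rater and an automated system:
human = [4, 5, 3, 6, 4, 5, 2, 4]
machine = [4, 5, 4, 6, 3, 5, 2, 5]
exact, adj, r = agreement_stats(human, machine)
```

On these invented scores the exact agreement is 5/8, every pair is within one point, and r is roughly .87, in the same range as the agreement the report describes.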
oai:ejournals.bc.edu:article/1652
2011-05-10T21:38:39Z
jtla:ART
nmb a2200000Iu 4500
"060329 2006 eng "
1540-2525
dc
On-line Mathematics Assessment: The Impact of Mode on Performance and Question Answering Strategies
Johnson, Martin
University of Cambridge Local Examinations Syndicate
Green, Sylvia
University of Cambridge Local Examinations Syndicate
The transition from paper-based to computer-based assessment raises a number of important issues about how mode might affect children’s performance and question answering strategies. In this project 104 eleven-year-olds were given two sets of matched mathematics questions, one set on-line and the other on paper. Facility values were analyzed to explore the impact of the mode on performance. Errors were coded and this allowed further investigation of the differences between questions in the different modes. The study also investigated children’s affective responses to working on computer, attempting to gain an insight into the effect of motivational factors. This was made possible by observing and interviewing a sub-sample of children. Findings suggested that although there were no statistically significant differences between overall performances on paper and computer, there were enough differences at the individual question-level to warrant further investigation. Close analysis of the data suggests that it is possible that the question type, the way it is asked, and the numbers involved, might interact with mode to affect students’ willingness to show working methods. The findings also suggest that certain types of questions in certain domains might have different impacts according to mode. The study concludes that there is scope for more research to probe further any links that may exist between children’s thinking, behavior and assessment mode in order to satisfy concerns about the relative reliability and validity of computer-based and paper-based testing.
The Journal of Technology, Learning and Assessment
2006-03-29 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1652
The Journal of Technology, Learning and Assessment; Vol. 4 No. 5 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1653
2011-05-10T21:38:40Z
jtla:ART
nmb a2200000Iu 4500
"060103 2006 eng "
1540-2525
dc
Computer-Based Assessment in E-Learning: A Framework for Constructing "Intermediate Constraint" Questions and Tasks for Technology Platforms
Scalise, Kathleen
University of Oregon
Gifford, Bernard
UC Berkeley
Technology today offers many new opportunities for innovation in educational assessment through rich new assessment tasks and potentially powerful scoring, reporting and real-time feedback mechanisms. One potential limitation for realizing the benefits of computer-based assessment in both instructional assessment and large scale testing comes in designing questions and tasks with which computers can effectively interface (i.e., for scoring and score reporting purposes) while still gathering meaningful measurement evidence. This paper introduces a taxonomy or categorization of 28 innovative item types that may be useful in computer-based assessment. Organized along the degree of constraint on the respondent’s options for answering or interacting with the assessment item or task, the proposed taxonomy describes a set of iconic item types termed “intermediate constraint” items. These item types have responses that fall somewhere between fully constrained responses (i.e., the conventional multiple-choice question), which can be far too limiting to tap much of the potential of new information technologies, and fully constructed responses (i.e. the traditional essay), which can be a challenge for computers to meaningfully analyze even with today’s sophisticated tools. The 28 example types discussed in this paper are based on 7 categories of ordering involving successively decreasing response constraints from fully selected to fully constructed. Each category of constraint includes four iconic examples. The intended purpose of the proposed taxonomy is to provide a practical resource for assessment developers as well as a useful framework for the discussion of innovative assessment formats and uses in computer-based settings.
The Journal of Technology, Learning and Assessment
2006-01-03 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1653
The Journal of Technology, Learning and Assessment; Vol. 4 No. 6 (2006)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1654
2011-05-10T22:36:54Z
jtla:ART
nmb a2200000Iu 4500
"040501 2004 eng "
1540-2525
dc
Telementoring as a Collaborative Agent for Change
Friedman, Audrey A.
Boston College
Zibit, Melanie
Boston College
Coote, Meca
This case study explored the effectiveness of telementoring as a vehicle for preservice teachers to hone skills in the teaching of writing, to establish a mentoring relationship with urban high school students, and to help struggling writers improve writing skills necessary for student achievement. Inherent in this research was the goal to develop a collaborative model between the university and the high school for using technology to improve “at-risk” urban students’ skills in writing. Additionally, the research allowed preservice teachers to learn about themselves as evolving teachers as they broached some of the difficulties of teaching writing to academically diverse students and learned about the scarcity of resources and difficult realities that exist for urban students.
The Journal of Technology, Learning and Assessment
2004-05-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1654
The Journal of Technology, Learning and Assessment; Vol. 3 No. 1 (2004)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1655
2011-05-10T22:36:55Z
jtla:ART
nmb a2200000Iu 4500
"050101 2005 eng "
1540-2525
dc
Learning With Technology: The Impact of Laptop Use on Student Achievement
Cengiz Gulek, James
Demirtas, Hakan
Rapid technological advances in the last decade have sparked educational practitioners’ interest in utilizing laptops as an instructional tool to improve student learning. There is substantial evidence that using technology as an instructional tool enhances student learning and educational outcomes. Past research suggests that compared to their non-laptop counterparts, students in classrooms that provide all students with their own laptops spend more time involved in collaborative work, participate in more project-based instruction, produce writing of higher quality and greater length, gain increased access to information, improve research analysis skills, and spend more time doing homework on computers. Research has also shown that these students direct their own learning, report a greater reliance on active learning strategies, readily engage in problem solving and critical thinking, and consistently show deeper and more flexible uses of technology than students without individual laptops. The study presented here examined the impact of participation in a laptop program on student achievement. A total of 259 middle school students were followed via cohorts. The data collection measures included students’ overall cumulative grade point averages (GPAs), end-of-course grades, writing test scores, and state-mandated norm- and criterion-referenced standardized test scores. The baseline data for all measures showed that there was no statistically significant difference in English language arts, mathematics, writing, and overall grade point average achievement between laptop and non-laptop students prior to enrollment in the program. However, laptop students showed significantly higher achievement in nearly all measures after one year in the program. Cross-sectional analyses in Year 2 and Year 3 concurred with the results from Year 1. Longitudinal analysis also provided independent verification of the substantial impact of laptop use on student learning outcomes.
The Journal of Technology, Learning and Assessment
2005-01-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1655
The Journal of Technology, Learning and Assessment; Vol. 3 No. 2 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1656
2011-05-10T22:36:55Z
jtla:ART
nmb a2200000Iu 4500
"050101 2005 eng "
1540-2525
dc
Examining the Relationship Between Home and School Computer Use and Students’ English/Language Arts Test Scores
O'Dwyer, Laura
University of Massachusetts-Lowell
Russell, Michael
Boston College
Bebell, Damian
Boston College
Tucker-Seeley, Kevon R.
Boston College
With increased emphasis on test-based accountability measures has come increased interest in examining the impact of technology use on students’ academic performance. However, few empirical investigations exist that address this issue. This paper (1) examines previous research on the relationship between student achievement and technology use, (2) discusses the methodological and psychometric issues that arise when investigating such issues, and (3) presents a multilevel regression analysis of the relationship between a variety of student and teacher technology uses and fourth grade test scores on the Massachusetts Comprehensive Assessment System (MCAS) English/Language Arts test. In total, 986 fourth grade students from 55 intact classrooms in nine school districts in Massachusetts were included in this study. This study found that, after controlling for both prior achievement and socioeconomic status, students who reported greater frequency of technology use at school to edit papers were likely to have higher total English/language arts test scores and higher writing scores. Use of technology at school to prepare presentations was associated with lower English/language arts outcome measures. Teachers’ use of technology for a variety of purposes was not a significant predictor of student achievement, and students’ recreational use of technology at home was negatively associated with the learning outcomes.
The Journal of Technology, Learning and Assessment
2005-01-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1656
The Journal of Technology, Learning and Assessment; Vol. 3 No. 3 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1657
2011-05-10T22:36:55Z
jtla:ART
nmb a2200000Iu 4500
"050101 2005 eng "
1540-2525
dc
Examining the Effect of Computer-Based Passage Presentation on Reading Test Performance
Higgins, Jennifer
Boston College
Russell, Michael
Boston College
Hoffmann, Thomas
Boston College
To examine the impact of transitioning 4th grade reading comprehension assessments to the computer, 219 fourth graders were randomly assigned to take a one-hour reading comprehension assessment on paper, on a computer using scrolling text to navigate through passages, or on a computer using paging text to navigate through passages. This study examined whether presentation form affected student test scores. Students also completed a computer skills performance assessment, a paper-based computer literacy assessment, and a computer use survey. Results from the reading comprehension assessment and the three computer instruments were used to examine differences in students’ test scores while taking into account their computer skills. ANOVA and regression analyses provide evidence of the following findings: 1. There were no significant differences in reading comprehension scores across testing modes. On average, students in the paper group (n=75) answered 58.1% of the items correctly, students in the scrolling group (n=70) answered 52.2% of the items correctly, and students in the whole-page group (n=74) answered 56.9% of the items correctly. The almost 6-percentage-point difference in scores between the paper and scrolling groups was not significant at the p
The Journal of Technology, Learning and Assessment
2005-01-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1657
The Journal of Technology, Learning and Assessment; Vol. 3 No. 4 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1658
2011-05-10T22:44:34Z
jtla:ART
nmb a2200000Iu 4500
"050201 2005 eng "
1540-2525
dc
Designing Handheld Software to Support Classroom Assessment: Analysis of Conditions for Teacher Adoption
Penuel, William R.
SRI International
Yarnall, Louise
Since 2002, Project WHIRL (Wireless Handhelds In Reflection on Learning) has investigated potential uses of handheld computers in K–12 science classrooms using a teacher-involved process of software development and field trials. The project is a three-year research and development grant from the National Science Foundation, and it is a partnership between SRI International and a medium-sized district in South Carolina, Beaufort County School District. In contrast to many recent handheld development projects aimed at developing curricular materials, Project WHIRL focused on the development of assessment materials. In Project WHIRL, teachers were asked to apply their own curricular materials, content understanding, and pedagogical content knowledge to the project. Teachers and SRI researchers, software developers, and assessment specialists worked together to design software and activities that could be used across a variety of topic areas in science and in multiple phases of instruction to improve classroom assessment. This design process revealed to the research team teachers’ beliefs and assumptions about assessment, as well as a wide range of practices, both informal and formal, that they used to find out what their students know and can do. In this paper, we focus on how teachers’ initial teaching and assessment practices influenced the design of handheld software and the ways in which these designs have been used across a variety of teachers’ classrooms. In addition, this paper provides some preliminary answers to two of the key research questions we outlined at the outset of our project:
• What kinds of software designs can be feasibly implemented in classrooms that support effective assessment practice?
• What are the conditions under which teachers can adopt handheld tools to support classroom assessment?
The Journal of Technology, Learning and Assessment
2005-02-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1658
The Journal of Technology, Learning and Assessment; Vol. 3 No. 5 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1659
2011-05-10T22:36:56Z
jtla:ART
nmb a2200000Iu 4500
"050201 2005 eng "
1540-2525
dc
A Comparative Evaluation of Score Results from Computerized and Paper & Pencil Mathematics Testing in a Large Scale State Assessment Program
Poggio, John
University of Kansas
Glasnapp, Douglas R.
University of Kansas
Yang, Xiangdong
University of Kansas
Poggio, Andrew J.
University of Kansas
The present study reports results from a quasi-controlled empirical investigation addressing the impact on student test scores when using fixed form computer based testing (CBT) versus paper and pencil (P&P) testing as the delivery mode to assess student mathematics achievement in a state's large scale assessment program. Grade 7 students served as the target population. On a voluntary basis, participation resulted in 644 students being "double" tested: once with a randomly assigned CBT test form, and once with another randomly assigned and equated P&P test form. Both the equivalency of total test scores across different student groupings and the differential impact on individual items were examined. Descriptively, there was very little difference in performance between the CBT and P&P scores obtained (less than 1 percentage point). The results make clear that there were no meaningful statistical differences in the composite test scores attained by the same students on a computerized fixed form assessment and an equated form of that assessment taken in a traditional paper and pencil format. While a few items (9 of 204) were found to behave differently based on mode, close review and inspection of these items did not identify factors accounting for the differences.
The Journal of Technology, Learning and Assessment
2005-02-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1659
The Journal of Technology, Learning and Assessment; Vol. 3 No. 6 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1660
2011-05-10T22:36:56Z
jtla:ART
nmb a2200000Iu 4500
"050201 2005 eng "
1540-2525
dc
Applying Principles of Universal Design to Test Delivery: The Effect of Computer-based Read-aloud on Test Performance of High School Students with Learning Disabilities
Dolan, Robert
CAST
Hall, Tracey E.
CAST
Banerjee, Manju
University of Connecticut
Chun, Euljung
University of Illinois, Urbana-Champaign
Strangman, Nicole
CAST
Standards-based reform efforts are highly dependent on accurate assessment of all students, including those with disabilities. The accuracy of current large-scale assessments is undermined by construct-irrelevant factors including access barriers, a particular problem for students with disabilities. Testing accommodations such as the read-aloud have led to improvement, but research findings suggest the need for a more flexible, individualized approach to accommodations. The current pilot study applies principles of Universal Design for Learning to the creation of a prototype computer-based test delivery tool that provides students with a flexible, customizable testing environment with the option for read-aloud of test content. Two contrasting methods were used to deliver two equivalent forms of a National Assessment of Educational Progress United States history and civics test to ten high school students with learning disabilities. In a counterbalanced design, students were administered one form via traditional paper-and-pencil (PPT) and the other via a computer-based system with optional text-to-speech (CBT-TTS). Test scores were calculated, and student surveys, structured interviews, field observations, and usage tracking were conducted to derive information about student preferences and patterns of use. Results indicate a significant increase in scores on the CBT-TTS versus PPT administration for questions with reading passages greater than 100 words in length. Qualitative findings also support the effectiveness of CBT-TTS, which students generally preferred over PPT. The results of this pilot study provide preliminary support for the potential benefits and usability of digital technologies in creating universally designed assessments that more fairly and accurately test students with disabilities.
The Journal of Technology, Learning and Assessment
2005-02-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1660
The Journal of Technology, Learning and Assessment; Vol. 3 No. 7 (2005)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1661
2011-05-11T17:28:55Z
jtla:ART
nmb a2200000Iu 4500
"030201 2003 eng "
1540-2525
dc
The Effect of Computers on Student Writing: A Meta-analysis of Studies from 1992 to 2002
Goldberg, Amie
Russell, Michael
Cook, Abigail
Meta-analyses were performed on 26 studies conducted between 1992 and 2002 that compared K–12 students writing with computers vs. paper-and-pencil. Significant mean effect sizes in favor of computers were found for quantity of writing (d=.50, n=14) and quality of writing (d=.41, n=15). Studies focused on revision behaviors between these two writing conditions (n=6) revealed mixed results. Other studies collected for the meta-analysis that did not meet the statistical criteria were also reviewed briefly. These articles (n=35) indicate that the writing process is more collaborative, iterative, and social in computer classrooms than in paper-and-pencil environments. For educational leaders questioning whether computers should be used to help students develop writing skills, the results of the meta-analyses suggest that on average students who use computers when learning to write are not only more engaged and motivated in their writing, but also produce written work of greater length and higher quality.
The Journal of Technology, Learning and Assessment
2003-02-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1661
The Journal of Technology, Learning and Assessment; Vol. 2 No. 1 (2003)
eng
Copyright (c)
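The meta-analysis above aggregates standardized mean differences such as d=.50 for quantity of writing. As a minimal sketch of the underlying effect-size computation (Cohen's d with a pooled standard deviation; the group scores below are invented, and the meta-analysis's actual per-study weighting is not reproduced here):

```python
# Sketch of a standardized mean difference (Cohen's d, pooled SD) of the
# kind aggregated in a meta-analysis. Sample data are invented.
from statistics import mean

def cohens_d(treatment, control):
    """Cohen's d: (mean difference) / pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    m1, m2 = mean(treatment), mean(control)
    # Sums of squared deviations for each group.
    ss1 = sum((x - m1) ** 2 for x in treatment)
    ss2 = sum((x - m2) ** 2 for x in control)
    # Pooled SD uses the combined degrees of freedom n1 + n2 - 2.
    pooled_sd = ((ss1 + ss2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Invented essay lengths (words) for computer and paper-and-pencil groups:
computer = [310, 295, 330, 340, 305, 320]
paper = [280, 290, 300, 285, 295, 275]
d = cohens_d(computer, paper)
```

A positive d indicates the computer group's mean exceeds the paper group's, in units of the pooled standard deviation; the invented samples here are deliberately small and yield a much larger d than the meta-analytic averages reported above.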
oai:ejournals.bc.edu:article/1662
2011-05-11T17:28:55Z
jtla:ART
nmb a2200000Iu 4500
"030801 2003 eng "
1540-2525
dc
An Exploratory Study to Examine the Feasibility of Measuring Problem-Solving Processes Using a Click-Through Interface
Chung, Gregory K.W.K.
National Center for Research on Evaluation, Standards, and Student Testing
Baker, Eva L.
National Center for Research on Evaluation, Standards, and Student Testing
In this study we investigated the feasibility of a novel user interface to support the measurement of problem-solving processes. Our research questions addressed the use of a "click-through" interface to measure the "generate-and-test" problem-solving process for a design problem. A click-through interface requires the user to explicitly perform an online action (e.g., to view time, the user has to click on a "time" icon). This interface allowed us to measure participants' intentional acts. Freshman college students were given the task of modifying a given, computer-interactive bicycle pump to satisfy performance requirements. The simulation interface provided participants with point-and-click access to controls to modify pump parameters, to run the simulation, to view important information, and to attempt to solve the task. Lag sequential analyses of participants' problem-solving processes over time showed cyclical behavior consistent with the generate-and-test strategy of modifying the pump design, running the simulation, viewing the information, and then either modifying the design or attempting to solve the problem and then modifying the design again. This behavior set was remarkably stable, with most lag 1 associations greater than .80. Our approach to measuring problem-solving processes appears feasible and promising, but more work is needed to gather additional validity evidence.
The Journal of Technology, Learning and Assessment
2003-08-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1662
The Journal of Technology, Learning and Assessment; Vol. 2 No. 2 (2003)
eng
Copyright (c)
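The lag sequential analysis described in the abstract above boils down to computing, for each coded action, the proportion of times each other action immediately follows it (a lag-1 transition proportion). The sketch below is a toy illustration of that computation, not the authors' code; the action codes and sequence are invented.

```python
from collections import Counter

def lag1_proportions(actions):
    """Proportion of each action -> next-action transition at lag 1."""
    pairs = Counter(zip(actions, actions[1:]))
    totals = Counter(actions[:-1])  # times each action occurs with a successor
    return {(a, b): n / totals[a] for (a, b), n in pairs.items()}

# Hypothetical coded actions: M=modify pump, R=run simulation, V=view info, S=solve attempt
seq = list("MRVMRVMRVSM")
props = lag1_proportions(seq)
print(props[("M", "R")])  # every "modify" here is followed by "run" -> 1.0
```

A cyclical generate-and-test pattern like the one the authors report would show up as high proportions on the modify→run, run→view, and view→modify transitions.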
oai:ejournals.bc.edu:article/1663
2011-05-11T17:28:56Z
jtla:ART
nmb a2200000Iu 4500
"031101 2003 eng "
1540-2525
dc
A Feasibility Study of On-the-Fly Item Generation in Adaptive Testing
Bejar, Isaac I.
Lawless, René R.
Morley, Mary E.
Wagner, Michael E.
Bennett, Randy E.
Revuelta, Javier
The goal of this study was to assess the feasibility of an approach to adaptive testing using item models based on the quantitative section of the Graduate Record Examination (GRE) test. An item model is a means of generating items that are isomorphic, that is, equivalent in content and equivalent psychometrically. Item models, like items, are calibrated by fitting an IRT response model. The resulting set of parameter estimates is imputed to all the items generated by the model. An on-the-fly adaptive test tailors the test to examinees and presents instances of an item model rather than independently developed items. A simulation study was designed to explore the effect an on-the-fly test design would have on score precision and bias as a function of the level of item model isomorphicity. In addition, two types of experimental tests were administered – an experimental, on-the-fly, adaptive quantitative-reasoning test as well as an experimental quantitative-reasoning linear test consisting of items based on item models. Results of the simulation study showed that under different levels of isomorphicity, there was no bias, but precision of measurement was eroded at some level. However, the comparison of experimental, on-the-fly adaptive test scores with the GRE test scores closely matched the test-retest correlation observed under operational conditions. Analyses of item functioning on the experimental linear test forms suggested that a high level of isomorphicity across items within models was achieved. The current study provides a promising first step toward significant cost reduction and theoretical improvement in test creation methodology for educational assessment.
The Journal of Technology, Learning and Assessment
2003-11-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1663
The Journal of Technology, Learning and Assessment; Vol. 2 No. 3 (2003)
eng
Copyright (c)
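The item-model idea in the abstract above — calibrate a model once, then impute its IRT parameters to every generated isomorph — can be sketched minimally as follows. The stem template, parameter values, and 3PL form are illustrative assumptions, not the GRE item models from the study.

```python
import math
import random

def p_correct(theta, a, b, c):
    """3PL IRT probability of a correct response at ability theta."""
    return c + (1 - c) / (1 + math.exp(-1.7 * a * (theta - b)))

# One calibrated item model; every generated instance inherits its parameters.
model_params = {"a": 1.2, "b": 0.4, "c": 0.2}

def generate_instance(rng):
    """Generate an isomorphic instance by varying surface features only."""
    x, y = rng.randint(2, 9), rng.randint(2, 9)
    stem = f"A train travels {x} km in {y} minutes. What is its speed in km/min?"
    return {"stem": stem, "key": x / y, **model_params}

rng = random.Random(0)
item = generate_instance(rng)
# At theta == b, the 3PL collapses to c + (1 - c)/2
print(round(p_correct(theta=0.4, **model_params), 3))  # -> 0.6
```

The feasibility question the simulation study addresses is precisely what happens to score precision when the imputed parameters fit the generated instances imperfectly (imperfect isomorphicity).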
oai:ejournals.bc.edu:article/1664
2011-05-11T17:28:56Z
jtla:ART
nmb a2200000Iu 4500
"031201 2003 eng "
1540-2525
dc
Examinee Characteristics Associated With Choice of Composition Medium on the TOEFL Writing Section
Wolfe, Edward W.
The Test of English as a Foreign Language (TOEFL) contains a direct writing assessment, and examinees are given the option of composing their responses at a computer terminal using a keyboard or composing their responses in handwriting. This study sought to determine whether examinees from different demographic groups choose handwriting versus word-processing composition media with equal likelihood. The relationship between several demographic characteristics of examinees and their composition medium choice on the TOEFL writing assessment is examined using logistic regression. Females, speakers of languages based on non-Roman/Cyrillic character systems, examinees from Africa and the Middle East, and examinees with less proficient English skills were more likely to choose handwriting. Although there were only small differences between age groups with respect to composition medium choice in most geographic regions, younger examinees from Europe and older examinees from Asia were more likely to choose handwriting than their regional counterparts.
The Journal of Technology, Learning and Assessment
2003-12-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1664
The Journal of Technology, Learning and Assessment; Vol. 2 No. 4 (2003)
eng
Copyright (c)
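The analysis in the abstract above models a binary choice (handwriting vs. word processing) with logistic regression. A minimal self-contained sketch of that model class, fit by stochastic gradient ascent on the log-likelihood with invented data, is shown below; the predictor, data, and fitting details are illustrative assumptions, not the study's analysis.

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            b0 += lr * (y - p)
            b1 += lr * (y - p) * x
    return b0, b1

# Hypothetical data: x = 1 if female, y = 1 if the examinee chose handwriting
xs = [1, 1, 1, 1, 0, 0, 0, 0]
ys = [1, 1, 1, 0, 1, 0, 0, 0]
b0, b1 = fit_logistic(xs, ys)
print(b1 > 0)  # positive coefficient: group coded 1 more likely to choose handwriting
```

In the full study the model would include several demographic predictors at once, with each coefficient read as a log-odds shift in the probability of choosing handwriting.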
oai:ejournals.bc.edu:article/1665
2011-05-11T17:28:56Z
jtla:ART
nmb a2200000Iu 4500
"031201 2003 eng "
1540-2525
dc
Computerized Adaptive Testing: A Comparison of Three Content Balancing Methods
Leung, Chi-Keung
Chang, Hua-Hua
Hau, Kit-Tai
Content balancing is often a practical consideration in the design of computerized adaptive testing (CAT). This study compared three content balancing methods, namely, the constrained CAT (CCAT), the modified constrained CAT (MCCAT), and the modified multinomial model (MMM), under various conditions of test length and target maximum exposure rate. Results of a series of simulation studies indicate that there is no systematic effect of content balancing method in measurement efficiency and pool utilization. However, among the three methods, the MMM appears to consistently over-expose fewer items.
The Journal of Technology, Learning and Assessment
2003-12-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1665
The Journal of Technology, Learning and Assessment; Vol. 2 No. 5 (2003)
eng
Copyright (c)
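The multinomial flavor of content balancing compared in the abstract above can be illustrated with a rough sketch: draw the next item's content area from a multinomial distribution weighted by each area's remaining quota, so no area can exceed its target. This is a simplified stand-in for the MMM, not the exact algorithm from the paper; the content areas and quotas are invented.

```python
import random

def next_content_area(targets, administered, rng):
    """Draw the next content area, weighting by each area's remaining quota.

    targets: {area: required item count}; administered: {area: count so far}.
    """
    remaining = {a: targets[a] - administered.get(a, 0) for a in targets}
    areas = [a for a, r in remaining.items() if r > 0]
    weights = [remaining[a] for a in areas]
    return rng.choices(areas, weights=weights, k=1)[0]

rng = random.Random(1)
targets = {"algebra": 10, "geometry": 6, "statistics": 4}
administered = {"algebra": 9, "geometry": 6, "statistics": 0}
# geometry's quota is already met, so it can no longer be drawn
print(next_content_area(targets, administered, rng))
```

The CAT's item-selection rule then picks the most informative eligible item within the drawn area, which is where the measurement-efficiency and exposure-rate comparisons in the study come in.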
oai:ejournals.bc.edu:article/1666
2011-05-11T17:28:56Z
jtla:ART
nmb a2200000Iu 4500
"040201 2004 eng "
1540-2525
dc
Developing Computerized Versions of Paper-and-Pencil Tests: Mode Effects for Passage-Based Tests
Pommerich, Mary
Boston College
As testing moves from paper-and-pencil administration toward computerized administration, how to present tests on a computer screen becomes an important concern. Of particular concern are tests that contain necessary information that cannot be displayed on screen all at once for an item. Ideally, the method of presentation should not interfere with examinee performance on the test. Examinees should perform similarly on an item regardless of the mode of administration. This paper discusses the development of a computer interface for passage-based, multiple-choice tests. Findings are presented from two studies that compared performance across computer and paper administrations of several fixed-form tests. The effect of computer interface changes made between the two studies is discussed. The results of both studies showed some performance differences across modes. Evaluations of individual items suggested a variety of factors that could have contributed to mode effects. Although the observed mode effects were in general small, overall the findings suggest that it would be beneficial to develop an understanding of factors that can influence examinee behavior and to design a computer interface accordingly, to ensure that examinees are responding to test content rather than features inherent in presenting the test on computer.
The Journal of Technology, Learning and Assessment
2004-02-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1666
The Journal of Technology, Learning and Assessment; Vol. 2 No. 6 (2004)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1667
2011-05-11T18:06:54Z
jtla:ART
nmb a2200000Iu 4500
"020601 2002 eng "
1540-2525
dc
Inexorable and Inevitable: The Continuing Story of Technology and Assessment
Bennett, Randy Elliot
Educational Testing Service
This paper argues that the inexorable advance of technology will force fundamental changes in the format and content of assessment. Technology is infusing the workplace, leading to widespread requirements for workers skilled in the use of computers. Technology is also finding a key place in education. This is occurring not only because technology skill has become a workplace requirement. It is also happening because technology provides information resources central to the pursuit of knowledge and because the medium allows for the delivery of instruction to individuals who couldn’t otherwise obtain it. As technology becomes more central to schooling, assessing students in a medium different from the one in which they typically learn will become increasingly untenable. Education leaders in several states and numerous school districts are acting on that implication, implementing technology-based tests for low- and high-stakes decisions in elementary and secondary schools and across all key content areas. While some of these examinations are already being administered statewide, others will take several years to bring to fully operational status. These groundbreaking efforts will undoubtedly encounter significant difficulties that may include cost, measurement, technological-dependability, and security issues. But most importantly, state efforts will need to go beyond the initial achievement of computerizing traditional multiple-choice tests to create assessments that facilitate learning and instruction in ways that paper measures cannot.
The Journal of Technology, Learning and Assessment
2002-06-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1667
The Journal of Technology, Learning and Assessment; Vol. 1 No. 1 (2002)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1668
2011-05-11T18:06:55Z
jtla:ART
nmb a2200000Iu 4500
"020601 2002 eng "
1540-2525
dc
Automated Essay Scoring Using Bayes' Theorem
Rudner, Lawrence M.
Liang, Tahung
Two Bayesian models for text classification from the information science field were extended and applied to student-produced essays. Both models were calibrated using 462 essays with two score points. The calibrated systems were applied to 80 new, pre-scored essays with 40 essays in each score group. Manipulated variables included the two models; the use of words, phrases, and arguments; two approaches to trimming; stemming; and the use of stopwords. While the text classification literature suggests the need to calibrate on thousands of cases per score group, accuracy of over 80% was achieved with the sparse dataset used in this study.
The Journal of Technology, Learning and Assessment
2002-06-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1668
The Journal of Technology, Learning and Assessment; Vol. 1 No. 2 (2002)
eng
Copyright (c)
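The Bayesian text classification approach in the abstract above can be illustrated with a minimal naive Bayes scorer over word features: estimate smoothed word likelihoods per score group from calibration essays, then assign new essays to the group with the highest log-posterior. This is a generic naive Bayes sketch with invented toy essays, not the specific models Rudner and Liang extended.

```python
import math
from collections import Counter

def train(essays_by_score):
    """Fit per-score-group word likelihoods with add-one smoothing."""
    vocab = {w for essays in essays_by_score.values() for e in essays for w in e.split()}
    model = {}
    for score, essays in essays_by_score.items():
        counts = Counter(w for e in essays for w in e.split())
        total = sum(counts.values())
        model[score] = {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}
    return model

def classify(model, essay):
    """Assign the score group with the highest log-posterior (uniform prior)."""
    logpost = {}
    for score, likes in model.items():
        logpost[score] = sum(math.log(likes[w]) for w in essay.split() if w in likes)
    return max(logpost, key=logpost.get)

model = train({
    1: ["the dog ran", "a dog sat"],
    2: ["the argument is well structured", "a clear thesis supports the argument"],
})
print(classify(model, "the argument is clear"))  # -> 2
```

The paper's manipulated variables map onto choices in this pipeline: which features to count (words, phrases, arguments), whether to stem, and whether to drop stopwords before training.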
oai:ejournals.bc.edu:article/1669
2011-05-11T18:06:55Z
jtla:ART
nmb a2200000Iu 4500
"020601 2002 eng "
1540-2525
dc
Assessing Student Problem-Solving Skills With Complex Computer-Based Tasks
Vendlinski, Terry
Stevens, Ron
Valid formative assessment is an essential element in improving both student learning and the professional development of educators. Various shortcomings in common assessment modalities, however, hinder our ability to make and evaluate such formative decisions. The diffusion of computer technology into American classrooms offers new opportunities to evaluate student learning and a rich, new source of data upon which to make inferences about the formative interventions that will improve learning. The path from data to inference, however, requires appropriate methodologies that can fully exploit the data without discarding or oversimplifying the behavioral complexity of student activity. This study used IMMEX™, a computerized simulation and problem-solving tool, along with artificial neural networks as pattern recognizers to identify the common types of strategies high school chemistry students used to solve qualitative chemistry problems. Then, based on the calculated probabilities that students would transition between these strategy types over time, Markov hidden chain analysis allowed us to develop a model of the capacity of the current curriculum to produce students able to apply chemistry content to a real-world problem.
The Journal of Technology, Learning and Assessment
2002-06-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1669
The Journal of Technology, Learning and Assessment; Vol. 1 No. 3 (2002)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1670
2011-05-11T18:06:55Z
jtla:ART
nmb a2200000Iu 4500
"020801 2002 eng "
1540-2525
dc
Investigating Children's Emerging Digital Literacies
Ba, Harouna
Tally, William
Tsikalas, Kallen
Departing from the view that the digital divide is a technical issue, the EDC Center for Children and Technology (CCT) and Computers for Youth (CFY) have completed a 1-year comparative study of children’s use of computers in low- and middle-income homes. To assess emerging digital literacy skills at home, we define digital literacy as a set of habits through which children use computer technology for learning, work, socializing, and fun. Our findings indicate that both groups of children used the computer to do schoolwork. Many children with leisure time at home also spent 2 to 3 hours a day communicating with peers, playing games, and pursuing creative hobbies. When solving technical problems, the children from low-income homes relied more on formal help providers such as CFY and schoolteachers, while the children from middle-income homes turned to themselves, their families, and their peers. All the children developed basic literacy with word processing, email, and the Web. Not surprisingly, those children who spent considerably more time online developed more robust skills in online communication and authoring. The results also show that children’s digital literacy skills are emerging in ways that reflect local circumstances, such as the length of time children had a computer at home; the family’s ability to purchase stable Internet connectivity; the number of computers in the home and where they are located (bedroom or public area); parents’ attitudes toward computer use; parents’ own experience and skills with computers; children’s leisure time at home; the computing habits of children’s peers; the technical expertise of friends, relatives, and neighbors; homework assignments; and the direct instruction provided by teachers in the classroom. The findings highlight issues impacting social, school, and assessment policy and practice. 
Specifically, these results have implications for local educational systems interested in developing digital literacy assessment instruments that demonstrate progress as well as specific areas that need improvement. The digital literacy analysis model developed in this study affords teachers opportunities to start to construct activities based on 5 central digital literacy components: computing for a range of purposes, understanding the function of and ability to use common tools, communication literacy, Web literacy, and troubleshooting skills. These activities can help teachers scaffold for their students and themselves the range of digital literacy proficiency skills, that is, their proficiency in using common tools as well as their use of different communications and Web tools. However, when it comes to large-scale assessments of digital literacy of teachers and students at the national and federal levels, the use of the digital literacy analysis model outlined in this study would be operationally and financially impractical. The field urgently needs to develop valid methods and instruments of assessment that help aggregate state and federal data as schools and districts at the local level acquire more and more technology. These methods and measurement instruments are likely to include surveys, e-readiness assessment tools, multiple-choice tests, pre- and post-tests, etc., that can measure individual as well as group progress in digital literacy.
The Journal of Technology, Learning and Assessment
2002-08-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1670
The Journal of Technology, Learning and Assessment; Vol. 1 No. 4 (2002)
eng
Copyright (c)
oai:ejournals.bc.edu:article/1671
2011-05-11T18:06:56Z
jtla:ART
nmb a2200000Iu 4500
"021001 2002 eng "
1540-2525
dc
Enhancing the Design and Delivery of Assessment Systems: A Four-Process Architecture
Almond, Russell
Steinberg, Linda
Mislevy, Robert
Persistent elements and relationships underlie the design and delivery of educational assessments, despite their widely varying purposes, contexts, and data types. One starting point for analyzing these relationships is the assessment as experienced by the examinee: 'What kinds of questions are on the test?,' 'Can I do them in any order?,' 'Which ones did I get wrong?,' and 'What's my score?' These questions, asked by people of all ages and backgrounds, reveal an awareness that an assessment generally entails the selection and presentation of tasks, the scoring of responses, and the accumulation of these response evaluations into some kind of summary score. A four-process architecture is presented for the delivery of assessments: Activity Selection, Presentation, Response Processing, and Summary Scoring. The roles and the interactions among these processes, and how they arise from an assessment design model, are discussed. The ideas are illustrated with hypothetical examples. The complementary modular structures of the delivery processes and the design framework are seen to encourage coherence among assessment purpose, design, and delivery, as well as to promote efficiency through the reuse of design objects and delivery processes.
The Journal of Technology, Learning and Assessment
2002-10-01 00:00:00
application/pdf
https://ejournals.bc.edu/index.php/jtla/article/view/1671
The Journal of Technology, Learning and Assessment; Vol. 1 No. 5 (2002)
eng
Copyright (c)
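The four-process delivery architecture in the abstract above — Activity Selection, Presentation, Response Processing, and Summary Scoring — can be sketched as four cooperating components passing a task, a work product, an evaluation, and a running summary. The class and field names below are illustrative, not from the paper.

```python
# Minimal sketch of the four delivery processes as cooperating components.
class ActivitySelection:
    def __init__(self, tasks):
        self.tasks = list(tasks)
    def next_task(self, summary):
        # A real selector would use the summary score; here it just spirals in order.
        return self.tasks.pop(0) if self.tasks else None

class Presentation:
    def present(self, task):
        # A real system renders the task to the examinee; here it captures a canned response.
        return {"task": task["id"], "response": task["correct"]}

class ResponseProcessing:
    def evaluate(self, work_product):
        return 1 if work_product["response"] else 0

class SummaryScoring:
    def __init__(self):
        self.total = 0
    def accumulate(self, evaluation):
        self.total += evaluation
        return self.total

tasks = [{"id": 1, "correct": True}, {"id": 2, "correct": False}]
selector, presenter = ActivitySelection(tasks), Presentation()
scorer, summarizer = ResponseProcessing(), SummaryScoring()
while (task := selector.next_task(summarizer.total)) is not None:
    summarizer.accumulate(scorer.evaluate(presenter.present(task)))
print(summarizer.total)  # -> 1
```

The modularity argument in the paper is visible even at this scale: an adaptive selector, a richer renderer, or a different scoring rule could each be swapped in without touching the other three processes.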