2024-03-29T00:21:03Z
https://ejournals.bc.edu/index.php/jtla/oai
oai:ejournals.bc.edu:article/1601
2011-05-09T22:14:07Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1601
2011-05-09T22:14:07Z
The Journal of Technology, Learning and Assessment
Vol. 10 No. 1 (2010)
The Effectiveness and Efficiency of Distributed Online, Regional Online, and Regional Face-to-Face Training for Writing Assessment Raters
Wolfe, Edward W.; Pearson
Matthews, Staci; Pearson
Vickers, Daisy; Pearson
2010-07-19
url:https://ejournals.bc.edu/index.php/jtla/article/view/1601
large-scale assessment
scorer training
reliability
scoring
validity
en_US
This study examined the influence of rater training and scoring context on training time, scoring time, qualifying rate, quality of ratings, and rater perceptions. One hundred twenty raters participated in the study and experienced one of three training contexts: (a) online training in a distributed scoring context, (b) online training in a regional scoring context, and (c) stand-up training in a regional context. After training, raters assigned scores to qualification sets, scored 400 student essays, and responded to a questionnaire that measured their perceptions of the effectiveness of, and satisfaction with, the training and scoring process, materials, and staff. The results suggest that the only clear difference in outcomes for these three groups of raters concerned training time—online training was considerably faster. There were no clear differences between groups concerning qualification rate, rating quality, or rater perceptions.
oai:ejournals.bc.edu:article/1602
2011-05-09T22:44:43Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1602
2011-05-09T22:44:43Z
The Journal of Technology, Learning and Assessment
Vol. 10 No. 2 (2010)
Examining the Feasibility and Effect of Transitioning GED Tests to Computer
Higgins, Jennifer; Nimble Assessment Systems
Patterson, Margaret Becker; American Council on Education, GED Testing Service
Bozman, Martha; American Council on Education, GED Testing Service
Katz, Michael; Nimble Assessment Systems
2010-08-21
url:https://ejournals.bc.edu/index.php/jtla/article/view/1602
feasibility
GED
GED Tests
Computer
assessment
en_US
This study examined the feasibility of administering GED tests using a computer-based testing system with embedded accessibility tools, and the impact on test scores and test-taker experience when GED tests are transitioned from paper to computer. Nineteen test centers across five states successfully installed the computer-based testing program, followed the research protocol, and transmitted testing data with minimal issues, providing evidence of the feasibility of administering GED tests on computer. Two hundred sixteen GED candidates participated in the research by completing two GED mathematics practice test forms and a survey. Participants completed the first form on paper and were randomly assigned to take the second form on computer or paper. The survey asked students to report demographic information, information about their use of computers, and their preference for using a computer to take tests. Regression analyses showed that participants were neither advantaged nor disadvantaged by taking the GED mathematics test on computer. This finding held true even after accounting for students’ reported computer use and preference for taking tests on computer.
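A minimal sketch of the kind of mode-effect regression the abstract describes, in Python with statsmodels; the file name and column names (form2_score, mode, form1_score, computer_use, prefers_computer) are illustrative assumptions, not the study's actual variables:

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per candidate, with scores on both forms,
# the assigned mode for the second form, and survey responses.
df = pd.read_csv("ged_practice_scores.csv")

# Does mode predict the second-form score once the paper baseline and
# reported computer use/preference are controlled for? A non-significant
# 'mode' coefficient would mirror the finding reported above.
model = smf.ols(
    "form2_score ~ mode + form1_score + computer_use + prefers_computer",
    data=df,
).fit()
print(model.summary())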
oai:ejournals.bc.edu:article/1603
2011-05-09T22:14:07Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1603
2011-05-09T22:14:07Z
The Journal of Technology, Learning and Assessment
Vol. 10 No. 3 (2010)
Performance of a Generic Approach in Automated Essay Scoring
Attali, Yigal; Educational Testing Service
Bridgeman, Brent; Educational Testing Service
Trapani, Catherine; Educational Testing Service
2010-08-25
url:https://ejournals.bc.edu/index.php/jtla/article/view/1603
essay writing assessment
automated scoring
en_US
A generic approach in automated essay scoring produces scores that have the same meaning across all prompts, existing or new, of a writing assessment. This is accomplished by using a single set of linguistic indicators (or features), a consistent way of combining and weighting these features into essay scores, and a focus on features that are not based on prompt-specific information or vocabulary. This approach has both logistical and validity-related advantages. This paper evaluates the performance of generic scores in the context of the e-rater® automated essay scoring system. Generic scores were compared with prompt-specific scores and scores that included prompt-specific vocabulary features. These comparisons were performed with large samples of essays written to three writing assessments: The GRE General Test argument and issue tasks and the TOEFL independent task. Criteria for evaluation included level of agreement with human scores, discrepancy from human scores across prompts, and correlations with other available scores. Results showed small differences between generic and prompt-specific scores and adequate performance of both types of scores compared to human performance.
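e-rater's actual features and weights are proprietary and not given here; the following is only a sketch of the generic-scoring idea the abstract describes, i.e., one fixed set of standardized, prompt-independent features combined with one fixed weight vector, with all names and values assumed:

import numpy as np

def generic_essay_score(features, weights, means, sds):
    # Standardize each linguistic feature with pooled (not prompt-specific)
    # statistics, then apply the single fixed weight vector, so scores
    # carry the same meaning across prompts.
    z = (np.asarray(features, dtype=float) - means) / sds
    return float(np.dot(weights, z))

# Hypothetical values: three features with pooled means and SDs.
print(generic_essay_score([4.0, 2.5, 7.0],
                          weights=[0.5, 0.3, 0.2],
                          means=np.array([3.0, 2.0, 6.0]),
                          sds=np.array([1.0, 1.0, 2.0])))  # 0.75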
oai:ejournals.bc.edu:article/1604
2011-05-09T22:14:08Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1604
2011-05-09T22:14:08Z
The Journal of Technology, Learning and Assessment
Vol. 10 No. 4 (2010)
Measuring Cognition of Students with Disabilities Using Technology-Enabled Assessments: Recommendations for a National Research Agenda
Bechard, Sue; Measured Progress
Sheinker, Jan
Abell, Rosemary; ROSE Consulting
Barton, Karen; CTB/McGraw-Hill
Burling, Kelly; Pearson
Camacho, Christopher; LanguageMate
Cameto, Renée; SRI International
Haertel, Geneva; SRI International
Hansen, Eric; Educational Testing Service
Johnstone, Chris; National Center on Educational Outcomes
Kingston, Neal; University of Kansas
Murray, Elizabeth (Boo); Center for Applied Special Technology
Parker, Caroline E; Education Development Center, Inc.
Redfield, Doris; Edvantia, Inc.
Tucker, Bill; Education Sector
2010-11-08
url:https://ejournals.bc.edu/index.php/jtla/article/view/1604
technology
assessment
disability
cognition
en_US
This paper represents one outcome from the Invitational Research Symposium on Technology-Enabled and Universally Designed Assessments, which examined technology-enabled assessments (TEA) and universal design (UD) as they relate to students with disabilities (SWD). It was developed to stimulate research into TEAs designed to better understand the pathways to achievement for the full range of the student population through enhanced measurement capabilities offered by TEA. This paper presents important questions in four critical areas that need to be addressed by research efforts to enhance the measurement of cognition for students with disabilities: (a) better measurement of achievement for students with unique cognitive pathways to learning, (b) how interactive-dynamic assessments can assist investigations into learning progressions, (c) improvement of the validity of assessments for students previously in the margins, and (d) the potential consequences of TEA for students with disabilities. The current efforts for educational reform provide a unique window for action, and test designers are encouraged to take advantage of new opportunities to use TEA in ways that were not possible with paper and pencil tests. Symposium participants describe how technology-enabled assessments have the potential to provide more diagnostic information, from various assessment sources, about students’ progress toward learning targets; generate better information to guide instruction and identify areas of focus for professional development; and create assessments that are more inclusive and measure achievement with improved validity for all students, especially students with disabilities.
oai:ejournals.bc.edu:article/1605
2011-05-09T22:14:08Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1605
2011-05-09T22:14:08Z
The Journal of Technology, Learning and Assessment
Vol. 10 No. 5 (2010)
Technology-Enabled and Universally Designed Assessment: Considering Access in Measuring the Achievement of Students with Disabilities—A Foundation for Research
Almond, Patricia; University of Oregon
Winter, Phoebe; Pacific Metrics Corporation
Cameto, Renée; SRI International
Russell, Michael; Boston College
Sato, Edynn; WestEd
Clarke-Midura, Jody; Harvard Graduate School of Education
Torres, Chloe; Measured Progress
Haertel, Geneva; SRI International
Dolan, Robert; Pearson
Beddow, Peter; Vanderbilt University
Lazarus, Sheryl; University of Minnesota
2010-11-08
url:https://ejournals.bc.edu/index.php/jtla/article/view/1605
assessment
technology
disabilities
accessibility
assistive technology
en_US
This paper represents one outcome from the Invitational Research Symposium on Technology-Enabled and Universally Designed Assessments, which examined technology-enabled assessments (TEA) and universal design (UD) as they relate to students with disabilities (SWD). It was developed to stimulate research into TEAs designed to make tests appropriate for the full range of the student population through enhanced accessibility. Four themes are explored: (a) a construct-centered approach to developing accessible assessments; (b) how technology and UD can provide access to targeted knowledge, skills, and abilities by embedding access and interactive features directly into systems that deliver TEAs; (c) the possibility of incorporating scaffolding directly into innovative assessment items; and (d) the importance of investigating the validity of inferences from TEAs that incorporate accessibility features designed to maximize validity. Throughout the paper, symposium participants and contributing authors share their understanding of issues and offer insights to researchers who conduct studies on the design, development, and validation of technology-enabled and universally designed assessments that include SWD. The paper proposes a focused research agenda and makes it clear that a principled program of research is needed to properly develop and use technology-enabled and universally designed educational assessments that encourage the inclusion of SWD. As research progresses, TEAs need to improve how they assess students’ understanding of complex academic content and how they provide equitable access to all students, including SWD.
oai:ejournals.bc.edu:article/1606
2011-05-09T22:30:32Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1606
2011-05-09T22:30:32Z
The Journal of Technology, Learning and Assessment
Vol. 9 No. 1 (2010)
Educational Outcomes and Research from 1:1 Computing Settings
Bebell, Damian; Boston College
O'Dwyer, Laura; Boston College
2010-01-03
url:https://ejournals.bc.edu/index.php/jtla/article/view/1606
1
one to one
computing settings
assessment
testing
special edition
en_US
Despite the growing interest in 1:1 computing initiatives, relatively little empirical research has focused on the outcomes of these investments. The current special edition of the Journal of Technology, Learning and Assessment presents four empirical studies of K–12 1:1 computing programs and one review of key themes in the conversation about 1:1 computing among advocates and critics. In this introduction to our 1:1 special edition, we synthesize across the studies and discuss the emergent themes. Looking specifically across these studies, we summarize evidence that participation in the 1:1 programs was associated with increased student and teacher technology use, increased student engagement and interest level, and modest increases in student achievement.
oai:ejournals.bc.edu:article/1607
2011-05-09T22:30:32Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1607
2011-05-09T22:30:32Z
The Journal of Technology, Learning and Assessment
Vol. 9 No. 2 (2010)
One to One Computing: A Summary of the Quantitative Results from the Berkshire Wireless Learning Initiative
Bebell, Damian; Boston College
Kay, Rachel; Boston College
2010-03-02
url:https://ejournals.bc.edu/index.php/jtla/article/view/1607
1 to 1 computing
middle school
laptop initiative
Berkshire
BWLI
student achievement
engagement
en_US
This paper examines the educational impacts of the Berkshire Wireless Learning Initiative (BWLI), a pilot program that provided 1:1 technology access to all students and teachers across five public and private middle schools in western Massachusetts. Using a pre/post comparative study design, the current study explores a wide range of program impacts over the three years of the project’s implementation. Specifically, the current document provides an overview of the project background, implementation, research design and methodology, and a summary of the quantitative results. The study details how teaching and learning practices changed when students and teachers were provided with laptops, wireless learning environments, and additional technology resources. The results showed that both the implementation and the outcomes of the program varied across the five 1:1 settings and over the three years of the student laptop implementation. Despite these differences, there was evidence that the types of educational access and opportunities afforded by 1:1 computing through the pilot program led to measurable changes in teacher practices, student achievement, student engagement, and students’ research skills.
oai:ejournals.bc.edu:article/1608
2011-05-09T22:30:32Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1608
2011-05-09T22:30:32Z
The Journal of Technology, Learning and Assessment
Vol. 9 No. 3 (2010)
After Installation: Ubiquitous Computing and High School Science in Three Experienced, High-Technology Schools
Drayton, Brian; TERC, Inc.
Falk, Joni K.; TERC, Inc.
Stroud, Rena; TERC, Inc.
Hobbs, Kathryn; TERC, Inc.
Hammerman, James; TERC, Inc.
2010-01-03
url:https://ejournals.bc.edu/index.php/jtla/article/view/1608
ubiquitous computing
high school
science education
en_US
There are few studies of the impact of ubiquitous computing on high school science, and the majority of studies of ubiquitous computing report on the first period of implementation. The present study presents data on three high schools with carefully elaborated ubiquitous computing systems that have gone through at least one "obsolescence cycle" and are therefore several years past first implementation. The data show how the affordances of a 1:1, wireless environment are being deployed in these science classrooms, and the effects of the environment on science content, data analysis, labs and other uses for visualizations, and classroom interaction. While some positive effects are clearly seen in these classrooms, even 5 years or more into the innovation, problems remain, and school cultural factors seem to play an important role in teacher uptake and integration of the technology. Implications for teacher learning are discussed.
oai:ejournals.bc.edu:article/1609
2011-05-09T22:30:33Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1609
2011-05-09T22:30:33Z
The Journal of Technology, Learning and Assessment
Vol. 9 No. 4 (2010)
Evaluating the Implementation Fidelity of Technology Immersion and its Relationship with Student Achievement
Shapley, Kelly S.; Shapley Research Associates
Sheehan, Daniel; Texas Center for Educational Research
Maloney, Catherine; Texas Center for Educational Research
Caranikas-Walker, Fanny; Texas Center for Educational Research
2010-01-03
url:https://ejournals.bc.edu/index.php/jtla/article/view/1609
Technology
Evaluation
Middle Schools
Implementation Fidelity
Achievement
Educational Reform
en_US
In a pilot study of the Technology Immersion model, high-need middle schools were “immersed” in technology by providing a laptop for each student and teacher, wireless Internet access, curricular and assessment resources, professional development, and technical and pedagogical support. This article examines the fidelity of model implementation and associations between implementation indicators and student achievement. Results across three years for 21 immersion schools show that the average levels of school support for Technology Immersion and teachers’ Classroom Immersion increased slightly, while the level of Student Access and Use declined. Implementation quality varied across schools and classrooms, with a quarter or less of schools and core-content classrooms reaching substantial implementation. Using hierarchical linear modeling, we found that teacher-level implementation components (Immersion Support, Classroom Immersion) were inconsistent and mostly not statistically significant predictors of student achievement, whereas students’ use of laptops outside of school for homework and learning games was the strongest implementation mediator of achievement.
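A sketch of the kind of two-level model (students nested in schools) the abstract's hierarchical linear modeling implies, using statsmodels' MixedLM; the data file and variable names are hypothetical stand-ins for the study's measures:

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("immersion_students.csv")  # hypothetical data set

# Predictor: students' laptop use outside school (the strongest mediator
# reported above), plus a classroom-level implementation component;
# grouping factor: school (the level-2 unit).
m = smf.mixedlm("achievement ~ home_laptop_use + classroom_immersion",
                data=df, groups=df["school_id"]).fit()
print(m.summary())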
oai:ejournals.bc.edu:article/1610
2011-05-09T22:30:33Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1610
2011-05-09T22:30:33Z
The Journal of Technology, Learning and Assessment
Vol. 9 No. 5 (2010)
Laptops and Fourth Grade Literacy: Assisting the Jump over the Fourth-Grade Slump
Suhr, Kurt A.; Newport Mesa Unified School District
Hernandez, David A.; Newport Mesa Unified School District
Grimes, Doug; University of California, Irvine
Warschauer, Mark; University of California, Irvine
2010-03-03
url:https://ejournals.bc.edu/index.php/jtla/article/view/1610
education technology
upper elementary reading
fourth grade slump
one-to-one
laptop learning
en_US
School districts throughout the country are considering how to best integrate technology into instruction. There has been a movement in many districts toward one-to-one laptop instruction, in which all students are provided a laptop computer, but there is concern that these programs may not yield sufficiently improved learning outcomes to justify their substantial cost. And while there has been a great deal of research on the use of laptops in schools, there is little quantitative research systematically investigating the impact of laptop use on test outcomes, and none among students at the fourth-to-fifth grade levels. This study investigated whether a one-to-one laptop program could help improve English language arts (ELA) test scores of upper elementary students, a group that often faces a slowdown of literacy development during the transition from learning to read to reading to learn known as the fourth-grade slump. We explore these questions by comparing changes in the ELA test scores of a group of students who entered a one-to-one laptop program in the fourth grade to a similar group of students in a traditional program in the same school district. After two years’ participation in the program, laptop students outperformed non-laptop students on changes in the ELA total score and in the three subtests that correspond most closely to frequent laptop use: writing strategies, literary response and analysis, and reading comprehension.
oai:ejournals.bc.edu:article/1611
2011-05-09T22:30:33Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1611
2011-05-09T22:30:33Z
The Journal of Technology, Learning and Assessment
Vol. 9 No. 6 (2010)
The End of Techno-Critique: The Naked Truth about 1:1 Laptop Initiatives and Educational Change
Weston, Mark E.; University of Colorado at Denver and Health Sciences Center
Bain, Alan; Charles Sturt University
2010-02-04
url:https://ejournals.bc.edu/index.php/jtla/article/view/1611
Education change
education reform
educational technology
one-to-one computing
en_US
This article responds to a generation of techno-criticism in education. It contains a review of the key themes of that criticism. The context of previous efforts to reform education reframes that criticism. Within that context, the question is raised about what schools need to look and be like in order to take advantage of laptop computers and other technology. In doing so, the article presents a vision for self-organizing schools.
oai:ejournals.bc.edu:article/1620
2011-05-09T22:57:42Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1620
2011-05-09T22:57:42Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 1 (2009)
Measuring Conditions Conducive to Knowledge Development
Adams, Nan B; Southeastern Louisiana University
DeVaney, Thomas; Southeastern Louisiana University
Sawyer, Susan G; Southeastern Louisiana University
2009-08-12
url:https://ejournals.bc.edu/index.php/jtla/article/view/1620
virtual learning environments
knowledge development
assessment
technology
learning
en_US
The design of virtual learning environments for post-secondary instruction is rapidly increasing among public and private universities. While the number of online courses has increased exponentially over the past 10 years, the quality of these courses has not. As universities increase their online teaching activities, real concern about the best design for these online learning opportunities underscores the need to create effective and responsive virtual learning environments. Adams (2007) developed the Recursive Model for Knowledge Development in Virtual Environments. The premise of this model is the belief that good teaching and engaged learning should not be determined by the use of certain instructional tools but by the guiding principle that learning is an active and recursive process, in which knowledge must be contextualized to be relevant to the learner. To this end, this article describes the initial development in the ongoing process of designing a valid and reliable assessment tool, the Virtual Learning Environment Survey (VLES), for exploring the degree to which the Recursive Model for Knowledge Development relates to effective design of online learning environments. This student self-report survey seeks to provide guidance for the assessment of online learning environments through collection of student perceptions of teaching strategies, knowledge approach, and knowledge ownership in online classrooms.
oai:ejournals.bc.edu:article/1621
2011-05-09T22:57:42Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1621
2011-05-09T22:57:42Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 2 (2010)
On the Roles of External Knowledge Representations in Assessment Design
Mislevy, Robert J; University of Maryland, College Park
Behrens, John T; Cisco Systems, Inc.
Bennett, Randy E; ETS
Demark, Sarah F; Cisco Systems, Inc.
Frezzo, Dennis C; Cisco Systems, Inc.
Levy, Roy; Arizona State University
Robinson, Daniel H; University of Texas at Austin
Wise Rutstein, Daisy; University of Maryland, College Park
Shute, Valerie J; Florida State University
Stanley, Ken; Cisco Systems, Inc.
Winters, Fielding I; University of Maryland, College Park
2010-01-07
url:https://ejournals.bc.edu/index.php/jtla/article/view/1621
Design patterns
evidence-centered assessment design
knowledge representation
simulation tasks
en_US
People use external knowledge representations (EKRs) to identify, depict, transform, store, share, and archive information. Learning how to work with EKRs is central to becoming proficient in virtually every discipline. As such, EKRs play central roles in curriculum, instruction, and assessment. Five key roles of EKRs in educational assessment are described: (1) An assessment is itself an EKR, which makes explicit the knowledge that is valued, ways it is used, and standards of good work. (2) The analysis of any domain in which learning is to be assessed must include the identification and analysis of the EKRs in that domain. (3) Assessment tasks can be structured around the knowledge, relationships, and uses of domain EKRs. (4) "Design EKRs" can be created to organize knowledge about a domain in forms that support the design of assessment. (5) EKRs in the discipline of assessment design can guide and structure the domain analyses (#2), task construction (#3), and the creation and use of design EKRs (#4). The third and fourth roles are discussed and illustrated in greater detail, through the perspective of an "evidence-centered" assessment design framework that reflects the fifth role. Connections with automated task construction and scoring are highlighted. Ideas are illustrated with two examples: "generate examples" tasks and simulation-based tasks for assessing computer network design and troubleshooting skills.
oai:ejournals.bc.edu:article/1622
2011-05-09T22:57:42Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1622
2011-05-09T22:57:42Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 3 (2010)
Controlling Test Overlap Rate in Automated Assembly of Multiple Equivalent Test Forms
Lin, Chuan-Ju; National University of Tainan, Taiwan
2010-01-28
url:https://ejournals.bc.edu/index.php/jtla/article/view/1622
automated test assembly
test overlap control
ordered item pooling
test security
expected baseline test-overlap rate
en_US
Assembling equivalent test forms with minimal test overlap across forms is important in ensuring test security. Chen and Lei (2009) suggested an exposure-control technique, ordered item pooling, for controlling test overlap on the fly, based on the observation that the test overlap rate under ordered item pooling for the first t examinees is a function of the test overlap rate for the previous (t−1) examinees. This procedure appears to meet the need to control the test overlap rate for tests assembled sequentially. To develop a better understanding of how well the ordered-item-pooling control method functions in automated assembly of multiple forms with the WDM heuristic, this study evaluated its performance under different conditions of test length and test-content outline by comparing the outcomes to those from corresponding baseline automated-test-assembly (ATA) conditions in which test overlap controls were not applied. The evaluation criteria included (i) conformity to the test-assembly constraints, (ii) test parallelism in terms of the resultant psychometric properties, (iii) average test overlap rate, and (iv) the distribution of item exposure rates. The results showed that the ordered-item-pooling control procedure was effective in most experimental conditions, achieving an acceptable average test overlap rate across multiple forms without compromising conformity to the test-assembly constraints or the equity of the assembled forms. Moreover, test security might be ensured in less supportive contexts for ATA by imposing item exposure control together with test overlap control in ways that are less likely to compromise test quality. More research is needed to verify this expectation.
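One common definition of the average test overlap rate for a set of equal-length forms is the mean proportion of shared items over all form pairs; a minimal sketch under that assumption (item IDs and forms hypothetical):

from itertools import combinations

def average_overlap_rate(forms):
    # Each form is a set of item IDs; all forms have the same length.
    test_length = len(forms[0])
    pairs = list(combinations(forms, 2))
    return sum(len(a & b) / test_length for a, b in pairs) / len(pairs)

forms = [{1, 2, 3, 4}, {3, 4, 5, 6}, {1, 4, 7, 8}]
print(average_overlap_rate(forms))  # (2/4 + 2/4 + 1/4) / 3 ≈ 0.417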
oai:ejournals.bc.edu:article/1623
2011-05-09T22:57:42Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1623
2011-05-09T22:57:42Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 4 (2010)
Evidence-centered Design of Epistemic Games: Measurement Principles for Complex Learning Environments
Rupp, André A.; University of Maryland, College Park
Gushta, Matthew; American Institutes for Research
Mislevy, Robert J; University of Maryland, College Park
Shaffer, David Williamson; University of Wisconsin at Madison
2010-01-31
url:https://ejournals.bc.edu/index.php/jtla/article/view/1623
evidence-centered design
epistemic games
measurement
en_US
We are currently at an exciting juncture in developing effective means for assessing so-called 21st-century skills in an innovative yet reliable fashion. One of these avenues leads through the world of epistemic games (Shaffer, 2006a), which are games designed to give learners the rich experience of professional practica within a discipline. They serve to develop domain-specific expertise based on principles of collaborative learning, distributed expertise, and complex problem-solving. In this paper, we describe a comprehensive research program for investigating the methodological challenges that await rigorous inquiry within the epistemic games context. We specifically demonstrate how the evidence-centered design framework (Mislevy, Almond, & Steinberg, 2003) as well as current conceptualizations of reliability and validity theory can be used to structure the development of epistemic games as well as empirical research into their functioning. Using the epistemic game Urban Science (Bagley & Shaffer, 2009), we illustrate the numerous decisions that need to be made during game development and their implications for amassing qualitative and quantitative evidence about learners’ developing expertise within epistemic games.
oai:ejournals.bc.edu:article/1624
2011-05-09T22:57:43Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1624
2011-05-09T22:57:43Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 5 (2010)
Specificity of Structural Assessment of Knowledge
Trumpower, David L; University of Ottawa
Sharara, Harold; University of Ottawa
Goldsmith, Timothy E; University of New Mexico, Main Campus
2010-02-08
url:https://ejournals.bc.edu/index.php/jtla/article/view/1624
assessment
structural knowledge
en_US
This study examines the specificity of information provided by structural assessment of knowledge (SAK). SAK is a technique which uses the Pathfinder scaling algorithm to transform ratings of concept relatedness into network representations (PFnets) of individuals’ knowledge. Inferences about individuals’ overall domain knowledge based on the similarity between their PFnets and a referent PFnet have been shown to be valid. We investigate a more fine grained evaluation of specific links in individuals’ PFnets for identifying particular strengths and weaknesses. Thirty-five undergraduates learned about a computer programming language and were then tested on their knowledge of the language with SAK and a problem solving task. The presence of two subsets of links in participants’ PFnets differentially predicted performance on two types of problems, thereby providing evidence of the specificity of SAK. Implications for the formative use of SAK in the classroom and in computer-based environments are discussed.
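The Pathfinder scaling algorithm itself is not reproduced here; the sketch below only illustrates the link-level comparison the abstract describes, treating a PFnet as a set of undirected concept links and scoring the presence of a referent subset of links in a student's network (all concept names hypothetical):

def link_score(student_links, referent_links):
    # Proportion of referent links present in the student's PFnet;
    # checking a specific subset of links is what yields the finer-grained
    # diagnostic information discussed above.
    return len(student_links & referent_links) / len(referent_links)

referent = {frozenset(p) for p in [("loop", "index"), ("loop", "array")]}
student  = {frozenset(p) for p in [("loop", "index"), ("index", "array")]}
print(link_score(student, referent))  # 0.5: one of two referent links present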
oai:ejournals.bc.edu:article/1625
2011-05-09T22:57:43Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1625
2011-05-09T22:57:43Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 6 (2010)
Utility in a Fallible Tool: A Multi-Site Case Study of Automated Writing Evaluation
Grimes, Douglas; University of California, Irvine
Warschauer, Mark; University of California, Irvine
2010-03-02
url:https://ejournals.bc.edu/index.php/jtla/article/view/1625
automated essay scoring
automated writing evaluation
artificial intelligence
writing
composition
en_US
Automated writing evaluation (AWE) software uses artificial intelligence (AI) to score student essays and support revision. We studied how an AWE program called MY Access!® was used in eight middle schools in Southern California over a three-year period. Although many teachers and students considered automated scoring unreliable, and teachers’ use of AWE was limited by the desire to use conventional writing methods, use of the software still brought important benefits. Observations, interviews, and a survey indicated that using AWE simplified classroom management and increased students’ motivation to write and revise.
oai:ejournals.bc.edu:article/1626
2011-05-09T22:57:43Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1626
2011-05-09T22:57:43Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 7 (2010)
Automated Scoring of an Interactive Geometry Item: A Proof-of-Concept
Masters, Jessica; Boston College
2010-04-05
url:https://ejournals.bc.edu/index.php/jtla/article/view/1626
automated scoring
innovative items
geometry
artificial intelligence
en_US
An online interactive geometry item was developed to explore students’ abilities to create prototypical and “tilted” rectangles out of line segments. The item was administered to 1,002 students. The responses to the item were hand-coded as correct, incorrect, or incorrect with possible evidence of a misconception. A variation of the nearest neighbor algorithm was used to automatically predict one of these categories for each of the student responses. The predicted category was compared to the hand-coded category. The algorithm accurately predicted the category for 94.6% of responses.
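The paper's actual feature encoding of responses is not specified in the abstract; the sketch below assumes each response is reduced to a numeric vector (e.g., segment endpoint coordinates) and assigns the hand-coded category of the nearest labeled response, the core of a nearest-neighbor classifier:

import numpy as np

def nearest_neighbor_category(response, labeled_responses):
    # labeled_responses: list of (feature_vector, category) pairs.
    vecs = np.array([v for v, _ in labeled_responses], dtype=float)
    labels = [c for _, c in labeled_responses]
    dists = np.linalg.norm(vecs - np.asarray(response, dtype=float), axis=1)
    return labels[int(np.argmin(dists))]

train = [([0, 0, 4, 0, 4, 2, 0, 2], "correct"),
         ([0, 0, 3, 1, 4, 2, 1, 3], "misconception")]
print(nearest_neighbor_category([0, 0, 4, 0, 4, 3, 0, 3], train))  # correct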
oai:ejournals.bc.edu:article/1627
2011-05-09T22:57:43Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1627
2011-05-09T22:57:43Z
The Journal of Technology, Learning and Assessment
Vol. 8 No. 8 (2010)
Measuring Problem Solving with Technology: A Demonstration Study for NAEP
Bennett, Randy E; ETS
Persky, Hilary; ETS
Weiss, Andy; ETS
Jenkins, Frank; Westat
2010-06-09
url:https://ejournals.bc.edu/index.php/jtla/article/view/1627
performance assessment
constructed response
simulation
NAEP
en_US
This paper describes a study intended to demonstrate how an emerging skill, problem solving with technology, might be measured in the National Assessment of Educational Progress (NAEP). Two computer-delivered assessment scenarios were designed, one on solving science-related problems through electronic information search and the other on solving science-related problems by conducting simulated experiments. The assessment scenarios were administered in 2003 to nationally representative samples of 8th-grade students in over 200 schools. Results are reported on the psychometric functioning of the scenarios and the performance of population groups. Implications are offered for using online performance assessment to measure emerging skills in NAEP and other large-scale testing programs.
oai:ejournals.bc.edu:article/1628
2011-05-10T17:42:40Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1628
2011-05-10T17:42:40Z
The Journal of Technology, Learning and Assessment
Vol. 7 No. 1 (2008)
Students’ Experiences with an Automated Essay Scorer
Scharber, Cassandra; University of Minnesota
Dexter, Sara; University of Virginia
Riedel, Eric; Walden University
2008-09-05
url:https://ejournals.bc.edu/index.php/jtla/article/view/1628
automated essay scoring
formative assessment
computer-based assessment
software
preservice teacher education
en_US
The purpose of this research is to analyze preservice teachers’ use of and reactions to an automated essay scorer used within an online, case-based learning environment called ETIPS. Data analyzed include post-assignment surveys, a user log of students’ actions within the cases, instructor-assigned scores on final essays, and interviews with four selected students. These in-depth data about students’ reactions to and opinions of the ETIPS automated essay scorer help inform the field about users’ perceptions of automated scoring.
oai:ejournals.bc.edu:article/1629
2011-05-10T17:42:41Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1629
2011-05-10T17:42:41Z
The Journal of Technology, Learning and Assessment
Vol. 7 No. 2 (2008)
Developing a Taxonomy of Item Model Types to Promote Assessment Engineering
Gierl, Mark J.; University of Alberta
Zhou, Jiawen; University of Alberta
Alves, Cecilia; University of Alberta
2008-12-09
url:https://ejournals.bc.edu/index.php/jtla/article/view/1629
Item Modeling
Automatic Item Generation
Test Development
assessment
technology
education
en_US
An item model serves as an explicit representation of the variables in an assessment task. An item model includes the stem, options, and auxiliary information. The stem is the part of an item which formulates context, content, and/or the question the examinee is required to answer. The options contain the alternative answers with one correct option and one or more incorrect options or distractors. The auxiliary information includes any additional material, in either the stem or option, required to generate an item, including texts, images, tables, and/or diagrams. In this study, we first present a taxonomy for item model development where variables in the stem are crossed with variables in the options to create a matrix of possible item model types. We then provide examples of each stem-by-option combination. Finally, we develop a software engine and apply the software to each item model type to generate multiple instances for each model.
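A toy illustration of the item-model idea: variables in the stem are crossed to generate multiple item instances, with the key and distractors computed per instance. The stem, variable values, and distractor rules here are invented for illustration, not taken from the authors' software engine:

import itertools

stem = "If x = {a} and y = {b}, what is x + y?"

def generate_items(a_values, b_values):
    # Cross the stem variables and instantiate the model for each cell
    # of the resulting matrix of item instances.
    for a, b in itertools.product(a_values, b_values):
        key = a + b
        distractors = [key - 1, key + 1, a * b]
        yield {"stem": stem.format(a=a, b=b), "options": [key] + distractors}

for item in generate_items([2, 3], [5]):
    print(item["stem"], item["options"])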
oai:ejournals.bc.edu:article/1630
2011-05-10T17:42:41Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1630
2011-05-10T17:42:41Z
The Journal of Technology, Learning and Assessment
Vol. 7 No. 3 (2008)
Online Courses for Math Teachers: Comparing Self-Paced and Facilitated Cohort Approaches
Carey, Rebecca; Education Development Center (EDC)
Kleiman, Glenn; North Carolina State University at Raleigh
Russell, Michael; Boston College
Douglas Venable, Joanne
Louie, Josephine; EDC
2008-12-18
url:https://ejournals.bc.edu/index.php/jtla/article/view/1630
Online professional development
distance learning
eLearning
Online facilitation
self-paced learning
test
testing
assessment
computer
en_US
This study investigated whether two different versions of an online professional development course produced different impacts on the intended outcomes of the course. Variations of an online course for middle school algebra teachers were created for two experimental conditions. One was an actively facilitated course with asynchronous peer interactions among participants. The second was a self-paced condition, in which neither active facilitation nor peer interactions were available. Both conditions showed significant impact on teachers’ mathematical understanding, pedagogical beliefs, and instructional practices. Surprisingly, the positive outcomes were comparable for both conditions. Further research is needed to determine whether this finding is limited to self-selected teachers, the specifics of this online course, or other factors that limit generalizability.
oai:ejournals.bc.edu:article/1631
2011-05-10T18:23:03Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1631
2011-05-10T18:23:03Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 1 (2007)
Toward More Substantively Meaningful Automated Essay Scoring
Ben-Simon, Anat; National Institute for Testing & Evaluation, Israel
Bennett, Randy Elliot; ETS
2007-08-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1631
automated essay scoring
writing assessment
writing scoring models
en_US
This study evaluated a “substantively driven” method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater®, to compare the performance of three approaches to automated essay scoring: a brute-empirical approach in which variables are selected and weighted solely according to statistical criteria, a hybrid approach in which a fixed set of variables more closely tied to the characteristics of good writing was used but the weights were still statistically determined, and a substantively driven approach in which a fixed set of variables was weighted according to the judgments of two independent committees of writing experts. The research questions concerned (1) the reproducibility of weights across writing experts, (2) the comparison of scores generated by the three automated approaches, and (3) the extent to which models developed for scoring one NAEP prompt generalize to other NAEP prompts of the same genre. Data came from the 2002 NAEP Writing Online study and from the main NAEP 2002 writing assessment. Results showed that, in carrying out the substantively driven approach, experts initially assigned weights to writing dimensions that were highly similar across committees but that diverged from one another after committee 1 was shown the empirical weights for possible use in its judgments and committee 2 was not shown those weights. The substantively driven approach based on the judgments of committee 1 generally did not operate in a markedly different way from the brute-empirical or hybrid approaches in most of the analyses conducted. In contrast, many consistent differences with those approaches were observed for the substantively driven approach based on the judgments of committee 2. This study suggests that empirical weights might provide a useful starting point for expert committees, with the understanding that the weights be moderated only somewhat to bring them more into line with substantive considerations. Under such circumstances, the results may turn out to be reasonable, though not necessarily as highly related to human ratings as statistically optimal approaches would produce.
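A sketch of the "brute-empirical" weighting described above: weights chosen solely by statistical criteria, here a least-squares regression of human scores on a feature matrix (all data synthetic); a substantively driven approach would instead fix the weights from expert judgment:

import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 6))                  # hypothetical essay features
true_w = np.array([0.5, 0.3, 0.1, 0.05, 0.03, 0.02])  # pretend human-score signal
human = features @ true_w + rng.normal(0.0, 0.5, 500)

# Empirically derived weights: whatever best predicts the human scores.
weights, *_ = np.linalg.lstsq(features, human, rcond=None)
print(np.round(weights, 2))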
oai:ejournals.bc.edu:article/1632
2011-05-10T18:23:03Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1632
2011-05-10T18:23:03Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 2 (2007)
Automated Essay Scoring Versus Human Scoring: A Comparative Study
Wang, Jinhao; South Texas College
Brown, Michelle Stallone; Texas A & M University -- Kingsville
2007-10-18
url:https://ejournals.bc.edu/index.php/jtla/article/view/1632
Automated essay scoring
human raters
group mean scores
WritePlacer
Texas Higher Education Assessment
One-Way Repeated Measures ANOVA
Paired Samples t test
technology
computer
en_US
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by AES and human raters. Data collection included two standardized writing tests – WritePlacer Plus and the Texas Higher Education Assessment (THEA) writing test. The research sample of 107 participants was drawn from a Hispanic-serving institution in South Texas. A one-way repeated-measures ANOVA and follow-up paired-samples t tests were conducted to examine the group mean differences. Results indicated that the mean score assigned by IntelliMetric™ was significantly higher than the faculty human raters’ mean score on the WritePlacer Plus test, and the IntelliMetric™ mean score was also significantly higher than the THEA mean score assigned by human raters from National Evaluation Systems. A statistically significant difference also existed between the human raters’ mean score on WritePlacer Plus and the human raters’ mean score on THEA. These findings did not corroborate previous studies that reported non-significant mean score differences between AES and human scoring.
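A minimal sketch of the paired-samples comparison the abstract reports, using SciPy; the score vectors are fabricated for illustration:

import numpy as np
from scipy import stats

aes_scores   = np.array([8, 7, 9, 6, 8, 7])  # hypothetical AES scores
human_scores = np.array([7, 6, 8, 6, 7, 6])  # same essays, human raters

t, p = stats.ttest_rel(aes_scores, human_scores)
print(f"t = {t:.2f}, p = {p:.3f}")  # a significant p with a higher AES mean
                                    # would mirror the pattern reported above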
oai:ejournals.bc.edu:article/1633
2011-05-10T18:23:04Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1633
2011-05-10T18:23:04Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 3 (2007)
Examining Differences in Examinee Performance in Paper and Pencil and Computerized Testing
Puhan, Gautam; ETS
Boughton, Keith A; CTB
Kim, Sooyeon; ETS
2007-11-20
url:https://ejournals.bc.edu/index.php/jtla/article/view/1633
PPT
CBT
differential item functioning
item impact
standardized mean difference
paper
pencil
computer
assessment
test
testing
item
technology
en_US
The study evaluated the comparability of two versions of a certification test: a paper-and-pencil test (PPT) and computer-based test (CBT). An effect size measure known as Cohen’s d and differential item functioning (DIF) analyses were used as measures of comparability at the test and item levels, respectively. Results indicated that the effect sizes were small (d < 0.20) and not statistically significant (p > 0.05), suggesting no substantial difference between the two test versions. Moreover, DIF analysis revealed that reading and mathematics items were comparable for both versions. However, three writing items were flagged for DIF. Substantive reviews failed to identify format differences that could explain the performance differences, so the causes of DIF could not be identified.
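Cohen's d as used above is the standardized mean difference with a pooled standard deviation; a small worked sketch on synthetic PPT/CBT scores:

import numpy as np

def cohens_d(x, y):
    # Pooled-SD standardized mean difference.
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

rng = np.random.default_rng(0)
ppt = rng.normal(70, 10, 200)  # hypothetical paper-and-pencil scores
cbt = rng.normal(71, 10, 200)  # hypothetical computer-based scores
print(cohens_d(cbt, ppt))      # |d| < 0.20 falls in the "small" region cited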
oai:ejournals.bc.edu:article/1634
2011-05-10T18:23:04Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1634
2011-05-10T18:23:04Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 4 (2007)
Comparability of Computer and Paper-and-Pencil Versions of Algebra and Biology Assessments
Kim, Do-Hong; University of North Carolina, Charlotte
Huynh, Huynh; University of South Carolina
2007-12-21
url:https://ejournals.bc.edu/index.php/jtla/article/view/1634
Computer-based testing
online testing
statewide assessment
Algebra
Biology
End-of-Course examinations
technology
assessment
pencil
PPT
CBT
en_US
This study examined comparability of student scores obtained from computerized and paper-and-pencil formats of the large-scale statewide end-of-course (EOC) examinations in the two subject areas of Algebra and Biology. Evidence in support of comparability of computerized and paper-based tests was sought by examining scale scores, item parameter estimates, test characteristic curves, test information functions, Rasch ability estimates at the content domain level, and the equivalence of the construct. Overall, the results support the comparability of computerized and paper-based tests at the item-level, subtest-level and whole test-level in both subject areas. For both subject areas, no evidence was found to suggest that the administration mode changed the construct being measured.
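For reference, two of the Rasch quantities the comparability analyses rely on: the probability of a correct response and the test information function (a sum of item informations P(1−P)); a minimal sketch with made-up item difficulties:

import numpy as np

def rasch_p(theta, b):
    # Rasch model: P(correct | ability theta, item difficulty b).
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def test_information(theta, difficulties):
    p = rasch_p(theta, np.asarray(difficulties, dtype=float))
    return float(np.sum(p * (1.0 - p)))

print(test_information(0.0, [-1.0, 0.0, 1.0]))  # ≈ 0.64 at theta = 0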
oai:ejournals.bc.edu:article/1635
2011-05-10T18:23:04Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1635
2011-05-10T18:23:04Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 5 (2008)
Examining the Relationship between Students’ Mathematics Test Scores and Computer Use at Home and at School
O'Dwyer, Laura; Boston College
Russell, Michael; Boston College
Bebell, Damian; Boston College
Tucker-Seeley, Kevon R.; Northeast & Islands Regional Educational Laboratory (NEIREL) at the Educational Development Center (EDC)
2008-01-18
url:https://ejournals.bc.edu/index.php/jtla/article/view/1635
Technology use
mathematics
MCAS
HLM
standardized test scores
assessment
learning
computer
en_US
Over the past decade, standardized test results have become the primary tool used to judge the effectiveness of schools and educational programs, and today, standardized testing serves as the keystone for educational policy at the state and federal levels. This paper examines the relationship between fourth grade mathematics achievement and technology use at home and at school. Using item level achievement data, individual student’s state test scores on the Massachusetts Comprehensive Assessment System (MCAS), and student and teacher responses to detailed technology-use surveys, this study examines the relationship between technology-use and mathematics performance among 986 regular students, from 55 intact fourth grade classrooms in 25 schools across 9 school districts in Massachusetts. The findings from this study suggest that various uses of technology are differentially related to student outcomes and that in general, student and teacher technology uses are weakly related to mathematics achievement on the MCAS. Implications for improving methods for examining the relationship between technology use and standardized test scores are presented.
oai:ejournals.bc.edu:article/1636
2011-05-10T18:23:04Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1636
2011-05-10T18:23:04Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 6 (2008)
Using the Attribute Hierarchy Method to Make Diagnostic Inferences about Examinees' Cognitive Skills in Algebra on the SAT
Gierl, Mark J.; University of Alberta
Wang, Changjiang; University of Alberta
Zhou, Jiawen; University of Alberta
2008-02-12
url:https://ejournals.bc.edu/index.php/jtla/article/view/1636
cognitive diagnostic assessment
cognition
assessment
testing
computer-based
technology
AHM
attribute Hierarchy method
en_US
The purpose of this study is to apply the attribute hierarchy method (AHM) to a subset of SAT algebra items administered in March 2005 to promote cognitive diagnostic inferences about examinees. The AHM is a psychometric method for classifying examinees’ test item responses into a set of structured attribute patterns associated with different components from a cognitive model of task performance. An attribute is a description of the procedural or declarative knowledge needed to perform a task. These attributes form a hierarchy of cognitive skills that represent a cognitive model of task performance. The study was conducted in two steps. In step 1, a cognitive model was developed by having content specialists, first, review the SAT algebra items, identify their salient attributes, and order the item-based attributes into a hierarchy. Then, the cognitive model was validated by having a sample of students think aloud as they solved each item. In step 2, psychometric analyses were conducted on the SAT algebra cognitive model by evaluating the model-data fit between the expected response patterns generated by the cognitive model and the observed response patterns produced from a random sample of 5000 examinees who wrote the items. Attribute probabilities were also computed for this random sample of examinees so diagnostic inferences about their attribute-level performances could be made. We conclude the study by describing key limitations, highlighting challenges inherent to the development and analysis of cognitive diagnostic assessments, and proposing directions for future research.
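One concrete piece of AHM machinery is deriving the reachability matrix of the attribute hierarchy from its adjacency matrix (attribute j is reachable from i if a directed path i → … → j exists); the sketch below uses Boolean powers of (A + I) on a hypothetical three-attribute linear hierarchy, and is only a fragment of the method, which also builds incidence and expected-response matrices not shown here:

import numpy as np

def reachability(adjacency):
    n = adjacency.shape[0]
    r = ((adjacency + np.eye(n, dtype=int)) > 0).astype(int)
    for _ in range(n):
        r = ((r @ r) > 0).astype(int)  # Boolean matrix "squaring"
    return r

A = np.array([[0, 1, 0],   # A1 -> A2
              [0, 0, 1],   # A2 -> A3
              [0, 0, 0]])
print(reachability(A))     # upper-triangular ones: A1 reaches A2 and A3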
oai:ejournals.bc.edu:article/1637
2011-05-10T18:23:04Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1637
2011-05-10T18:23:04Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 7 (2008)
Does Survey Medium Affect Responses? An Exploration of Electronic and Paper Surveying in British Columbia Schools
Walt, Nancy; British Columbia Ministry of Education
Atwood, Kristin; Agency Research Consultants
Mann, Alex; British Columbia Ministry of Education
2008-04-11
url:https://ejournals.bc.edu/index.php/jtla/article/view/1637
electronic surveys
validity
survey medium
surveying children
assessment
electronic versus paper format
learning
en_US
The purpose of this study was to determine whether or not survey medium (electronic versus paper format) has a significant effect on the results achieved. To compare survey media, responses from elementary students to British Columbia’s Satisfaction Survey were analyzed. Although this study was not experimental in design, the data set served as a rich source with which to investigate the research question. The methods included comparisons of reliability, item means, factor structure, response rates, and response completeness across survey media. From the analyses, the differences between electronic and paper media in this study appear to be minor and do not seem to have a significant effect on overall results. In conclusion, the medium does not appear to substantially affect response patterns and does not pose any threats to the validity or reliability of survey results.
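The abstract does not name the reliability coefficient used; assuming an internal-consistency index such as Cronbach's alpha, a minimal sketch on a hypothetical respondents-by-items score matrix:

import numpy as np

def cronbach_alpha(scores):
    # scores: (n_respondents x n_items) matrix of item responses.
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var_sum = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

print(cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]))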
oai:ejournals.bc.edu:article/1638
2011-05-10T18:23:05Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1638
2011-05-10T18:23:05Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 8 (2008)
Comparisons between Classical Test Theory and Item Response Theory in Automated Assembly of Parallel Test Forms
Lin, Chuan-Ju; National University of Tainan, Taiwan
2008-04-29
url:https://ejournals.bc.edu/index.php/jtla/article/view/1638
Automated Test Assembly
Weighted Deviations Model Heuristic
Reference Test
Automated-Test-Assembly Constraints
Content Parallelism
Statistical Parallelism
Item Characteristic Curve Parallelism
Assessment
Paper and pencil
technology
learning
en_US
The automated assembly of alternate test forms for online delivery provides an alternative to computer-administered, fixed test forms, or computerized-adaptive tests when a testing program migrates from paper/pencil testing to computer-based testing. The weighted deviations model (WDM) heuristic is particularly promising for automated test assembly (ATA) because it is computationally straightforward and produces tests with desired properties under realistic testing conditions. Unfortunately, research into the WDM heuristic has focused exclusively on Item Response Theory (IRT) methods, even though there are situations in which Classical Test Theory (CTT) item statistics are the only data available to test developers. The purpose of this study was to investigate the degree of parallelism of test forms assembled with the WDM heuristic using both CTT and IRT methods. Alternate forms of a 60-item test were assembled from a pool of 600 items. One CTT and two IRT approaches were used to generate content and psychometric constraints. The three methods were compared in terms of conformity to the test-assembly constraints, average test overlap rate, content parallelism, and statistical parallelism. The results led to a primary conclusion that the CTT approach performed at least as well as the IRT approaches. Possible reasons for the comparability of the three test-assembly approaches are discussed, and suggestions for future ATA applications are provided.
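A simplified, single-step sketch of a weighted-deviations-style selection rule: among remaining pool items, pick the one whose addition minimizes the weighted sum of absolute deviations from the content targets. The actual WDM heuristic handles many constraint types and sequencing details not shown; all data here are hypothetical:

def wdm_pick(pool, selected, targets, weights):
    def deviation(item):
        # Weighted deviation from each content target if 'item' is added.
        total = 0.0
        for constraint, target in targets.items():
            count = sum(i[constraint] for i in selected) + item[constraint]
            total += weights[constraint] * abs(target - count)
        return total
    return min((i for i in pool if i not in selected), key=deviation)

pool = [{"algebra": 1, "geometry": 0},
        {"algebra": 0, "geometry": 1},
        {"algebra": 1, "geometry": 1}]
print(wdm_pick(pool, [], {"algebra": 1, "geometry": 1},
               {"algebra": 1.0, "geometry": 1.0}))  # the item covering both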
oai:ejournals.bc.edu:article/1639
2011-05-10T18:23:05Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1639
2011-05-10T18:23:05Z
The Journal of Technology, Learning and Assessment
Vol. 6 No. 9 (2008)
Does it Matter if I Take My Mathematics Test on Computer? A Second Empirical Study of Mode Effects in NAEP
Bennett, Randy Elliot; ETS
Braswell, James; AIR
Oranje, Andreas; ETS
Sandene, Brent; ETS
Kaplan, Bruce; ETS
Yan, Fred; ETS
2008-06-17
url:https://ejournals.bc.edu/index.php/jtla/article/view/1639
computer-based testing
NAEP
constructed-response items
computer delivery
assessment
en_US
This article describes selected results from the Math Online (MOL) study, one of three field investigations sponsored by the National Center for Education Statistics (NCES) to explore the use of new technology in NAEP. Of particular interest in the MOL study was the comparability of scores from paper- and computer-based tests. A nationally representative sample of eighth-grade students was administered a computer-based mathematics test and a test of computer facility, among other measures. In addition, a randomly parallel group of students was administered a paper-based test containing the same math items as the computer-based test. Results showed that the computer-based mathematics test was statistically significantly harder than the paper-based test. In addition, computer facility predicted online mathematics test performance after controlling for performance on a paper-based mathematics test, suggesting that degree of familiarity with computers may matter when taking a computer-based mathematics test in NAEP.
oai:ejournals.bc.edu:article/1640
2011-05-10T20:10:28Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1640
2011-05-10T20:10:28Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 1 (2006)
An Overview of Automated Scoring of Essays
Dikli, Semire; Florida State University
2006-08-16
url:https://ejournals.bc.edu/index.php/jtla/article/view/1640
Computerized writing
assessment
technology
writing
automated
scoring
essays
AES
essay
computer
automated scoring
en_US
Automated Essay Scoring (AES) is defined as computer technology that evaluates and scores written prose (Shermis & Barrera, 2002; Shermis & Burstein, 2003; Shermis, Raymat, & Barrera, 2003). AES systems are mainly used to overcome time, cost, reliability, and generalizability issues in writing assessment (Bereiter, 2003; Burstein, 2003; Chung & O’Neil, 1997; Hamp-Lyons, 2001; Myers, 2003; Page, 2003; Rudner & Gagne, 2001; Rudner & Liang, 2002; Sireci & Rizavi, 1999; http://people.emich.edu). AES continues to attract the attention of public schools, universities, testing companies, researchers, and educators (Burstein, Kukich, Wolff, Lu, & Chodorow, 1998; Shermis & Burstein, 2003; Sireci & Rizavi, 1999). The main purpose of this article is to provide an overview of current approaches to AES. After describing the most widely used AES systems (i.e., Project Essay Grader (PEG), Intelligent Essay Assessor (IEA), E-rater and Criterion, IntelliMetric and MY Access, and the Bayesian Essay Test Scoring System (BETSY)), the article discusses the main characteristics of these systems and current issues regarding their use in both low-stakes assessment (in classrooms) and high-stakes assessment (as standardized tests).
oai:ejournals.bc.edu:article/1641
2011-05-10T20:10:28Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1641
2011-05-10T20:10:28Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 2 (2006)
Does it Matter if I Take My Writing Test on Computer? An Empirical Study of Mode Effects in NAEP
Horkay, Nancy
Bennett, Randy Elliot; ETS
Allen, Nancy
Kaplan, Bruce A.; ETS
Yan, Fred; ETS
2006-11-22
url:https://ejournals.bc.edu/index.php/jtla/article/view/1641
computer-based testing
online assessment
writing assessment
NAEP
computer
assessment
WOL
en_US
This study investigated the comparability of scores for paper and computer versions of a writing test administered to eighth grade students. Two essay prompts were given on paper to a nationally representative sample as part of the 2002 main NAEP writing assessment. The same two essay prompts were subsequently administered on computer to a second sample also selected to be nationally representative. Analyses looked at overall differences in performance between the delivery modes, interactions of delivery mode with group membership, differences in performance between those taking the computer test on different types of equipment (i.e., school machines vs. NAEP-supplied laptops), and whether computer familiarity was associated with online writing test performance. Results generally showed no significant mean score differences between paper and computer delivery. However, computer familiarity significantly predicted online writing test performance after controlling for paper writing skill. These results suggest that, for any given individual, a computer-based writing assessment may produce different results than a paper one, depending upon that individual’s level of computer familiarity. Further, for purposes of estimating population performance, as long as substantial numbers of students write better on computer than on paper (or better on paper than on computer), conducting a writing assessment in either mode alone may underestimate the performance that would have resulted if students had been tested using the mode in which they wrote best.
oai:ejournals.bc.edu:article/1642
2011-05-10T20:10:29Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1642
2011-05-10T20:10:29Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 3 (2006)
Individualizing Learning Using Intelligent Technology and Universally Designed Curriculum
Abell, Michael; University of Louisville
2006-11-04
url:https://ejournals.bc.edu/index.php/jtla/article/view/1642
universal design for learning
intelligent computing
machine learning models
accessible assessment
universally designed assessment
universal
computer
testing
en_US
The American education system and its rigorous accountability and performance standards continually force educators to explore new ways to increase student achievement. The improvement in computer technology and intelligent computing systems may offer new tools for student learning and higher academic achievement. These systems have the potential to meet individual student learning needs using universally designed curricula and assessments. The purpose of this paper is to present a conceptual framework that harnesses the potential of intelligent learning systems, machine learning models, and universal design for learning principles to help formulate next generation instructional materials. By using intelligent and interactive curricula, educators could begin to move away from the role of information disseminator toward that of facilitator of the learning experience.
oai:ejournals.bc.edu:article/1643
2011-05-10T20:10:29Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1643
2011-05-10T20:10:29Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 4 (2006)
Differential Item Functioning of GRE Mathematics Items Across Computerized and Paper-and-Pencil Testing Media
Gu, Lixiong; Michigan State University
Drake, Samuel; Michigan State University
Wolfe, Edward W.; Virginia Tech
2006-12-11
url:https://ejournals.bc.edu/index.php/jtla/article/view/1643
DIF
Differential
item difficulty
GRE Mathematics
GRE
Computerized
Computer
Testing
Paper-and-Pencil Testing Media
Assessment
Technology
Computer-based
pencil
paper
paper-based
computer-based
test delivery
en_US
This study seeks to determine whether item features are related to observed differential item functioning (DIF), i.e., differences in item difficulty, between computer- and paper-based test delivery media. Examinees responded to 60 quantitative items similar to those found on the GRE general test in either a computer-based or paper-based medium. Thirty-eight percent of the items were flagged for cross-medium DIF, and post hoc content analyses were performed focusing on page formatting, mathematical notation, and mathematical content of the items. Although findings suggest that differences in page formatting and response processes across the delivery media contribute little to the observed cross-medium DIF, differences in the mathematical notation contained in the item text as well as differences in the mathematical content of the items provided the strongest apparent relationships with cross-medium DIF.
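The abstract does not say which DIF statistic was used to flag items, so purely as an illustration, here is the Mantel-Haenszel common odds ratio, one standard way to screen for DIF after matching the two delivery-medium groups on total score. The counts and the rough flagging threshold below are hypothetical.

import math

def mh_odds_ratio(strata):
    """strata: one (a, b, c, d) 2x2 table per matched score level, where
    a = reference-group correct, b = reference-group incorrect,
    c = focal-group correct,     d = focal-group incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Invented counts for one item across three ability strata:
strata = [(40, 10, 35, 15), (30, 20, 22, 28), (15, 35, 10, 40)]
alpha = mh_odds_ratio(strata)
delta = -2.35 * math.log(alpha)   # ETS delta scale; |delta| >= 1.5 is often
print(f"MH odds ratio = {alpha:.2f}, MH delta = {delta:.2f}")  # treated as large DIF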
oai:ejournals.bc.edu:article/1644
2011-05-10T20:10:29Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1644
2011-05-10T20:10:29Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 5 (2006)
Comprehensive Assessment of a Software Development Project for Engineering Instruction
Hall, Richard H; University of Missouri - Rolla
Philpot, Timothy A.; University of Missouri - Rolla
Hubing, Nancy; University of Missouri - Rolla
2006-12-11
url:https://ejournals.bc.edu/index.php/jtla/article/view/1644
comprehensive assessment
engineering education
multiple-methodologies
assessment
testing
pencil and paper
computer-based testing
computer
en_US
This paper reviews a series of formative assessment studies that were conducted to inform and evaluate a large-scale instructional software development project at the University of Missouri–Rolla (UMR). The three-year project, entitled “Taking the Next Step in Engineering Education: Integrating Educational Software and Active Learning,” was funded by the U.S. Department of Education Fund for the Improvement of Postsecondary Education (FIPSE). The assessment was carried out under the auspices of UMR’s Laboratory for Information Technology Evaluation (LITE) and guided by the LITE model for evaluation of learning technologies. The fundamental premise of the model is that evaluation should consist of the triangulation of multiple research methodologies and measurement tools. Five representative evaluation studies, consisting of eight experiments, are presented here. The studies range from basic experimentation and usability testing to applied research conducted within the classroom as well as a multi-national cross-cultural applied dissemination survey conducted during the last semester of the project. This paper demonstrates that the LITE model can be an effective tool for guiding a comprehensive evaluation program. In addition, the research findings provide evidence that the instructional multimedia developed in this project can have a substantial positive impact in enhancing fundamental engineering classes.
oai:ejournals.bc.edu:article/1645
2011-05-10T20:10:29Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1645
2011-05-10T20:10:29Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 6 (2006)
An Exploratory Study of a Novel Online Formative Assessment and Instructional Tool to Promote Students’ Circuit Problem Solving
Chung, Gregory K. W. K.; UCLA/CRESST
Shel, Tammy; UCLA/CRESST
Kaiser, William J; UCLA/HSSEAS
2006-12-13
url:https://ejournals.bc.edu/index.php/jtla/article/view/1645
Formative assessment
adaptive instruction
personal response system
technology
assessment
testing
paper and pencil
en_US
We examined a novel formative assessment and instructional approach with 89 students in three electrical engineering classes in special computer-based discussion sections. The technique involved students individually solving circuit problems online, with their real-time responses observed by the instructor. While exploratory, survey and interview responses from 26 students suggest the technique offers important instructional and assessment advantages: Compared to typical discussion sessions, a large majority of respondents reported being more engaged, learning more, and interacting more with the instructor. Students reported the anonymous mode allowed them to ask “dumb” questions. The instructor was able to address student problems and questions immediately, and the amount of formative assessment information from the interaction far exceeded what was available in typical settings.
oai:ejournals.bc.edu:article/1646
2011-05-10T20:10:29Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1646
2011-05-10T20:10:29Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 7 (2007)
The Effect of Using Item Parameters Calibrated from Paper Administrations in Computer Adaptive Test Administrations
Pommerich, Mary; Defense Manpower Data Center
2007-03-30
url:https://ejournals.bc.edu/index.php/jtla/article/view/1646
computerized testing
CAT
mode effects
item parameters
technology
assessment
testing
en_US
Computer administered tests are becoming increasingly prevalent as computer technology becomes more readily available on a large scale. For testing programs that utilize both computer and paper administrations, mode effects are problematic in that they can result in examinee scores that are artificially inflated or deflated. As such, researchers have engaged in extensive studies of whether scores differ across paper and computer presentations of the same tests. The research generally seems to indicate that the more complicated it is to present or take a test on computer, the greater the possibility of mode effects. In a computer adaptive test, mode effects may be a particular concern if items are calibrated using item responses obtained from one administration mode (i.e., paper), and those parameters are then used operationally in a different administration mode (i.e., computer). This paper studies the suitability of using parameters calibrated from a paper administration for item selection and scoring in a computer adaptive administration, for two tests with lengthy passages that required navigation in the computer administration. The results showed that the use of paper calibrated parameters versus computer calibrated parameters in computer adaptive administrations had small to moderate effects on the reliability of examinee scores, at fairly short test lengths. This effect was generally diminished for longer test lengths. However, the results suggest that in some cases, some loss in reliability might be inevitable if paper-calibrated parameters are used in computer adaptive administrations.
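The mechanism under study, previously calibrated parameters driving both item selection and scoring in an adaptive administration, can be sketched compactly. The 2PL model, the invented parameter values, and the fixed 20-item length below are all illustrative assumptions, not the study's design.

import math, random

def p_correct(theta, a, b):                      # 2PL response probability
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def information(theta, a, b):                    # 2PL item information
    p = p_correct(theta, a, b)
    return a * a * p * (1 - p)

def eap(responses, items):
    """Expected a posteriori theta under a standard normal prior (grid method)."""
    grid = [g / 10 for g in range(-40, 41)]
    def like(theta):
        L = math.exp(-theta * theta / 2)         # unnormalized prior density
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            L *= p if u else (1 - p)
        return L
    w = [like(t) for t in grid]
    return sum(t * x for t, x in zip(grid, w)) / sum(w)

# "Paper-calibrated" (a, b) parameters, here just invented numbers:
random.seed(1)
pool = [(random.uniform(0.8, 2.0), random.uniform(-2, 2)) for _ in range(200)]
true_theta, theta, used, resp = 0.5, 0.0, [], []
for _ in range(20):                              # fixed-length 20-item CAT
    nxt = max((i for i in range(len(pool)) if i not in used),
              key=lambda i: information(theta, *pool[i]))
    used.append(nxt)
    resp.append(random.random() < p_correct(true_theta, *pool[nxt]))
    theta = eap(resp, [pool[i] for i in used])   # rescore after each item
print(f"estimated theta = {theta:.2f} (true theta = {true_theta})")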
oai:ejournals.bc.edu:article/1647
2011-05-10T20:10:30Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1647
2011-05-10T20:10:30Z
The Journal of Technology, Learning and Assessment
Vol. 5 No. 8 (2007)
A Review of Item Exposure Control Strategies for Computerized Adaptive Testing Developed from 1983 to 2005
Georgiadou, Elissavet G; Hellenic Open University, Greece
Triantafillou, Evangelos; Aristotle University of Thessaloniki, Greece
Economides, Anastasios A; University of Macedonia, Greece
2007-05-08
url:https://ejournals.bc.edu/index.php/jtla/article/view/1647
item exposure control strategies
CAT
randomization strategies
conditional selection strategies
stratified strategies
multiple stage adaptive test designs
testing
technology
assessment
computerized
adaptive
en_US
Since researchers acknowledged the several advantages of computerized adaptive testing (CAT) over traditional linear test administration, the issue of item exposure control has received increased attention. Due to CAT’s underlying philosophy, particular items in the item pool may be presented too often and become overexposed, while other items are rarely selected by the CAT algorithm and thus become underexposed. Several item exposure control strategies have been presented in the literature aiming to prevent overexposure of some items and to increase the use rate of rarely or never selected items. This paper reviews such strategies that appeared in the relevant literature from 1983 to 2005. The focus of this paper is on studies that have been conducted in order to evaluate the effectiveness of item exposure control strategies for dichotomous scoring, polytomous scoring and testlet-based CAT systems. In addition, the paper discusses the strengths and weaknesses of each strategy group using examples from simulation studies. No new research is presented; rather, a compendium of models is reviewed with the overall objective of providing researchers in this field, especially newcomers, with a wide view of item exposure control strategies.
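One family of reviewed strategies, randomization, is simple enough to sketch. The following is a hypothetical illustration in the spirit of the randomesque approach of Kingsbury and Zara (within the 1983 to 2005 window the review covers): draw the next item at random from the k most informative candidates rather than always taking the single best, which spreads exposure across near-equivalent items.

import math, random

def information(theta, a, b):                    # 2PL item information
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1 - p)

def randomesque_pick(pool, administered, theta, k=5):
    """Choose at random among the k most informative unused items."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    candidates.sort(key=lambda i: information(theta, *pool[i]), reverse=True)
    return random.choice(candidates[:k])

# How this caps exposure: simulate 1,000 first-item selections at theta = 0
# from an invented 100-item pool.
pool = [(random.uniform(0.8, 2.0), random.uniform(-2, 2)) for _ in range(100)]
counts = [0] * len(pool)
for _ in range(1000):
    counts[randomesque_pick(pool, set(), 0.0)] += 1
print("max exposure rate:", max(counts) / 1000)  # near 1/k, vs. 1.0 for greedy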
oai:ejournals.bc.edu:article/1648
2011-05-10T22:03:09Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1648
2011-05-10T22:03:09Z
The Journal of Technology, Learning and Assessment
Vol. 4 No. 1 (2005)
The Effects of Online Formative and Summative Assessment on Test Anxiety and Performance
Cassady, Jerrell C.; Ball State University
Gridley, Betty E.; Ball State University
2005-10-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1648
en_US
This study analyzed the effects of online formative and summative assessment materials on undergraduates' experiences with attention to learners' testing behaviors (e.g., performance, study habits) and beliefs (e.g., test anxiety, perceived test threat). The results revealed no detriment to students' perceptions of tests or performances on tests when comparing online to paper-pencil summative assessments. In fact, students taking tests online reported lower levels of perceived test threat. Regarding formative assessment, findings indicate a small benefit for using online practice tests prior to graded course exams. This effect appears to be in part due to the reduction of the deleterious effects of negative test perceptions afforded in conditions where practice tests were available. The results support the integration of online practice tests to help students prepare for course exams and also reveal that secure web-based testing can aid undergraduate instruction through improved student confidence and increased instructional time.
oai:ejournals.bc.edu:article/1649
2011-05-10T21:38:39Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1649
2011-05-10T21:38:39Z
The Journal of Technology, Learning and Assessment
Vol. 4 No. 2 (2005)
Knowing What All Students Know: Procedures for Developing Universal Design for Assessment
Ketterlin-Geller, Leanne R.; University of Oregon
2005-11-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1649
en_US
Universal design for assessment (UDA) is intended to increase participation of students with disabilities and English-language learners in general education assessments by addressing student needs through customized testing platforms. Computer-based testing provides an optimal format for creating individually-tailored tests. However, although a theoretical basis for universal design is well established, little practical information is available to assist test developers in creating and implementing universally designed tests. This article discusses the application of universal design to assessment and describes how these principles are applied to a test of 3rd grade mathematics ability. I present the steps involved in conceptualizing, constructing, and implementing a universally designed test in anticipation that test developers, state department assessment coordinators, and other researchers will benefit from this application. Recommendations for future research and development efforts to create accessible computer-based learning environments for all students are explored.
oai:ejournals.bc.edu:article/1650
2011-05-10T21:38:39Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1650
2011-05-10T21:38:39Z
The Journal of Technology, Learning and Assessment
Vol. 4 No. 3 (2006)
Automated Essay Scoring With e-rater® V.2
Attali, Yigal; Educational Testing Service
Burstein, Jill; Educational Testing Service
2006-02-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1650
Automated essay scoring
e-rater
en_US
E-rater has been used by the Educational Testing Service for automated essay scoring since 1999. This paper describes a new version of e-rater (V.2) that is different from other automated essay scoring systems in several important respects. The main innovations of e-rater V.2 are a small, intuitive, and meaningful set of features used for scoring; a single scoring model and standards that can be used across all prompts of an assessment; and modeling procedures that are transparent and flexible, and can be based entirely on expert judgment. The paper describes this new system and presents evidence on the validity and reliability of its scores.
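The headline claim, a small interpretable feature set combined in a transparent model, can be illustrated schematically. The features and weights below are invented stand-ins, not e-rater's; the point is only the shape of such a model: a few meaningful features, a linear combination, and weights that could come from regression or from expert judgment.

def extract_features(essay: str) -> dict:
    words = essay.split()
    return {
        "length": min(len(words) / 300.0, 1.0),              # fluency/development proxy
        "avg_word_len": sum(map(len, words)) / max(len(words), 1) / 10.0,
        "paragraphs": min(essay.count("\n\n") + 1, 5) / 5.0,  # organization proxy
    }

WEIGHTS = {"length": 3.0, "avg_word_len": 2.0, "paragraphs": 1.0}  # expert-set

def score(essay: str, lo: int = 1, hi: int = 6) -> int:
    f = extract_features(essay)
    raw = sum(WEIGHTS[k] * f[k] for k in WEIGHTS)   # linear combination
    return max(lo, min(hi, round(lo + raw)))        # clamp to the score scale

print(score("A short response."))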
oai:ejournals.bc.edu:article/1651
2011-05-10T21:38:39Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1651
2011-05-10T21:38:39Z
The Journal of Technology, Learning and Assessment
Vol. 4 No. 4 (2006)
An Evaluation of IntelliMetric™ Essay Scoring System
Rudner, Lawrence M.; GMAC
Garcia, Veronica; GMAC
Welch, Catherine; Assessment Innovations at ACT, Inc.
2006-03-29
url:https://ejournals.bc.edu/index.php/jtla/article/view/1651
Automated Essay Scoring
Raters
Rater consistency
en_US
This report provides a two-part evaluation of the IntelliMetric™ automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test™ (GMAT™). The IntelliMetric system performance is first compared to that of individual human raters, a Bayesian system employing simple word counts, and a weighted probability model using more than 750 responses to each of six prompts. The second, larger evaluation compares the IntelliMetric system ratings to those of human raters using approximately 500 responses to each of 101 prompts. Results from both evaluations suggest the IntelliMetric system is a consistent, reliable system for scoring AWA essays, with perfect plus adjacent agreement on 96% to 98% and 92% to 100% of instances in evaluations 1 and 2, respectively. The Pearson r correlations of agreement between human raters and the IntelliMetric system averaged .83 in both evaluations.
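Both evaluation statistics named here are straightforward to compute: perfect plus adjacent agreement counts score pairs differing by at most one point, and Pearson r is the usual correlation. The scores below are fabricated for illustration.

def perfect_plus_adjacent(human, machine):
    """Proportion of score pairs within one point of each other."""
    return sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / len(human)

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

human   = [4, 5, 3, 6, 2, 4, 5, 3]
machine = [4, 4, 3, 5, 2, 5, 5, 2]
print(perfect_plus_adjacent(human, machine))   # 1.0: every pair within one point
print(round(pearson_r(human, machine), 2))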
oai:ejournals.bc.edu:article/1652
2011-05-10T21:38:39Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1652
2011-05-10T21:38:39Z
The Journal of Technology, Learning and Assessment
Vol. 4 No. 5 (2006)
On-line Mathematics Assessment: The Impact of Mode on Performance and Question Answering Strategies
Johnson, Martin; University of Cambridge Local Examinations Syndicate
Green, Sylvia; University of Cambridge Local Examinations Syndicate
2006-03-29
url:https://ejournals.bc.edu/index.php/jtla/article/view/1652
On-line
assessment
mode
mathematics
en_US
The transition from paper-based to computer-based assessment raises a number of important issues about how mode might affect children’s performance and question answering strategies. In this project 104 eleven-year-olds were given two sets of matched mathematics questions, one set on-line and the other on paper. Facility values were analyzed to explore the impact of the mode on performance. Errors were coded and this allowed further investigation of the differences between questions in the different modes. The study also investigated children’s affective responses to working on computer, attempting to gain an insight into the effect of motivational factors. This was made possible by observing and interviewing a sub-sample of children. Findings suggested that although there were no statistically significant differences between overall performances on paper and computer, there were enough differences at the individual question-level to warrant further investigation. Close analysis of the data suggests that it is possible that the question type, the way it is asked, and the numbers involved, might interact with mode to affect students’ willingness to show working methods. The findings also suggest that certain types of questions in certain domains might have different impacts according to mode. The study concludes that there is scope for more research to probe further any links that may exist between children’s thinking, behavior and assessment mode in order to satisfy concerns about the relative reliability and validity of computer-based and paper-based testing.
oai:ejournals.bc.edu:article/1653
2011-05-10T21:38:40Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1653
2011-05-10T21:38:40Z
The Journal of Technology, Learning and Assessment
Vol. 4 No. 6 (2006)
Computer-Based Assessment in E-Learning: A Framework for Constructing "Intermediate Constraint" Questions and Tasks for Technology Platforms
Scalise, Kathleen; University of Oregon
Gifford, Bernard; UC Berkeley
2006-01-03
url:https://ejournals.bc.edu/index.php/jtla/article/view/1653
technology
assessment
instructional design
assessment design
tasks
questions
formats
e-learning
embedded assessment
intermediate constraint
constraint
en_US
Technology today offers many new opportunities for innovation in educational assessment through rich new assessment tasks and potentially powerful scoring, reporting and real-time feedback mechanisms. One potential limitation for realizing the benefits of computer-based assessment in both instructional assessment and large scale testing comes in designing questions and tasks with which computers can effectively interface (i.e., for scoring and score reporting purposes) while still gathering meaningful measurement evidence. This paper introduces a taxonomy or categorization of 28 innovative item types that may be useful in computer-based assessment. Organized along the degree of constraint on the respondent’s options for answering or interacting with the assessment item or task, the proposed taxonomy describes a set of iconic item types termed “intermediate constraint” items. These item types have responses that fall somewhere between fully constrained responses (i.e., the conventional multiple-choice question), which can be far too limiting to tap much of the potential of new information technologies, and fully constructed responses (i.e. the traditional essay), which can be a challenge for computers to meaningfully analyze even with today’s sophisticated tools. The 28 example types discussed in this paper are based on 7 categories of ordering involving successively decreasing response constraints from fully selected to fully constructed. Each category of constraint includes four iconic examples. The intended purpose of the proposed taxonomy is to provide a practical resource for assessment developers as well as a useful framework for the discussion of innovative assessment formats and uses in computer-based settings.
oai:ejournals.bc.edu:article/1654
2011-05-10T22:36:54Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1654
2011-05-10T22:36:54Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 1 (2004)
Telementoring as a Collaborative Agent for Change
Friedman, Audrey A.; Boston College
Zibit, Melanie; Boston College
Coote, Meca
2004-05-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1654
en_US
This case study explored the effectiveness of telementoring as a vehicle for preservice teachers to hone skills in the teaching of writing, to establish a mentoring relationship with urban high school students, and to help struggling writers improve writing skills necessary for student achievement. Inherent in this research was the goal to develop a collaborative model between the university and the high school for using technology to improve “at-risk” urban students’ skills in writing. Additionally, the research allowed preservice teachers to learn about themselves as evolving teachers as they broached some of the difficulties of teaching writing to academically diverse students and learned about the scarcity of resources and difficult realities that exist for urban students.
oai:ejournals.bc.edu:article/1655
2011-05-10T22:36:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1655
2011-05-10T22:36:55Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 2 (2005)
Learning With Technology: The Impact of Laptop Use on Student Achievement
Cengiz Gulek, James
Demirtas, Hakan
2005-01-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1655
en_US
Rapid technological advances in the last decade have sparked educational practitioners’ interest in utilizing laptops as an instructional tool to improve student learning. There is substantial evidence that using technology as an instructional tool enhances student learning and educational outcomes. Past research suggests that compared to their non-laptop counterparts, students in classrooms that provide all students with their own laptops spend more time involved in collaborative work, participate in more project-based instruction, produce writing of higher quality and greater length, gain increased access to information, improve research analysis skills, and spend more time doing homework on computers. Research has also shown that these students direct their own learning, report a greater reliance on active learning strategies, readily engage in problem solving and critical thinking, and consistently show deeper and more flexible uses of technology than students without individual laptops. The study presented here examined the impact of participation in a laptop program on student achievement. A total of 259 middle school students were followed via cohorts. The data collection measures included students’ overall cumulative grade point averages (GPAs), end-of-course grades, writing test scores, and state-mandated norm- and criterion-referenced standardized test scores. The baseline data for all measures showed that there was no statistically significant difference in English language arts, mathematics, writing, and overall grade point average achievement between laptop and non-laptop students prior to enrollment in the program. However, laptop students showed significantly higher achievement in nearly all measures after one year in the program. Cross-sectional analyses in Year 2 and Year 3 concurred with the results from Year 1. Longitudinal analysis also provided independent verification of the substantial impact of laptop use on student learning outcomes.
oai:ejournals.bc.edu:article/1656
2011-05-10T22:36:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1656
2011-05-10T22:36:55Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 3 (2005)
Examining the Relationship Between Home and School Computer Use and Students’ English/Language Arts Test Scores
O'Dwyer, Laura; University of Massachusetts-Lowell
Russell, Michael; Boston College
Bebell, Damian; Boston College
Tucker-Seeley, Kevon R.; Boston College
2005-01-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1656
en_US
With increased emphasis on test-based accountability measures has come increased interest in examining the impact of technology use on students’ academic performance. However, few empirical investigations exist that address this issue. This paper (1) examines previous research on the relationship between student achievement and technology use, (2) discusses the methodological and psychometric issues that arise when investigating such issues, and (3) presents a multilevel regression analysis of the relationship between a variety of student and teacher technology uses and fourth grade test scores on the Massachusetts Comprehensive Assessment System (MCAS) English/Language Arts test. In total, 986 fourth grade students from 55 intact classrooms in nine school districts in Massachusetts were included in this study. This study found that, while controlling for both prior achievement and socioeconomic status, students who reported greater frequency of technology use at school to edit papers were likely to have higher total English/language arts test scores and higher writing scores. Use of technology at school to prepare presentations was associated with lower English/language arts outcome measures. Teachers’ use of technology for a variety of purposes was not a significant predictor of student achievement, and students’ recreational use of technology at home was negatively associated with the learning outcomes.
oai:ejournals.bc.edu:article/1657
2011-05-10T22:36:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1657
2011-05-10T22:36:55Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 4 (2005)
Examining the Effect of Computer-Based Passage Presentation on Reading Test Performance
Higgins, Jennifer; Boston College
Russell, Michael; Boston College
Hoffmann, Thomas; Boston College
2005-01-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1657
en_US
To examine the impact of transitioning 4th grade reading comprehension assessments to the computer, 219 fourth graders were randomly assigned to take a one-hour reading comprehension assessment on paper, on a computer using scrolling text to navigate through passages, or on a computer using paging text to navigate through passages. This study examined whether presentation form affected student test scores. Students also completed a computer skills performance assessment, a paper-based computer literacy assessment, and a computer use survey. Results from the reading comprehension assessment and the three computer instruments were used to examine differences in students’ test scores while taking into account their computer skills. ANOVA and regression analyses provide evidence of the following findings: 1. There were no significant differences in reading comprehension scores across testing modes. On average, students in the paper group (n=75) answered 58.1% of the items correctly, students in the scrolling group (n=70) answered 52.2% of the items correctly, and students in the whole page group (n=74) answered 56.9% of the items correctly. The almost 6-percentage-point difference in scores between the paper and scrolling groups was not significant at the p
oai:ejournals.bc.edu:article/1658
2011-05-10T22:44:34Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1658
2011-05-10T22:44:34Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 5 (2005)
Designing Handheld Software to Support Classroom Assessment: Analysis of Conditions for Teacher Adoption
Penuel, William R.; SRI International
Yarnall, Louise
2005-02-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1658
en_US
Since 2002, Project WHIRL (Wireless Handhelds In Reflection on Learning) has investigated potential uses of handheld computers in K–12 science classrooms using a teacher-involved process of software development and field trials. The project is a three-year research and development grant from the National Science Foundation, and it is a partnership between SRI International and a medium-sized district in South Carolina, Beaufort County School District. In contrast to many recent handheld development projects aimed at developing curricular materials, Project WHIRL focused on the development of assessment materials. In Project WHIRL, teachers were asked to apply their own curricular materials, content understanding, and pedagogical content knowledge to the project. Teachers and SRI researchers, software developers, and assessment specialists worked together to design software and activities that could be used across a variety of science topic areas and in multiple phases of instruction to improve classroom assessment. This design process revealed to the research team the teachers’ beliefs and assumptions about assessment, as well as the wide range of practices, both informal and formal, they used to find out what their students know and can do. In this paper, we focus on how teachers’ initial teaching and assessment practices influenced the design of handheld software and the ways in which these designs have been used across a variety of teachers’ classrooms. In addition, this paper provides some preliminary answers to two of the key research questions we outlined at the outset of our project:
• What kinds of software designs that support effective assessment practice can feasibly be implemented in classrooms?
• What are the conditions under which teachers can adopt handheld tools to support classroom assessment?
oai:ejournals.bc.edu:article/1659
2011-05-10T22:36:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1659
2011-05-10T22:36:56Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 6 (2005)
A Comparative Evaluation of Score Results from Computerized and Paper & Pencil Mathematics Testing in a Large Scale State Assessment Program
Poggio, John; University of Kansas
Glasnapp, Douglas R.; University of Kansas
Yang, Xiangdong; University of Kansas
Poggio, Andrew J.; University of Kansas
2005-02-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1659
en_US
The present study reports results from a quasi-controlled empirical investigation addressing the impact on student test scores when using fixed form computer based testing (CBT) versus paper and pencil (P&P) testing as the delivery mode to assess student mathematics achievement in a state's large scale assessment program. Grade 7 students served as the target population. On a voluntary basis, participation resulted in 644 students being "double" tested: once with a randomly assigned CBT test form, and once with another randomly assigned and equated P&P test form. Both the equivalency of total test scores across different student groupings and the differential impact on individual items were examined. Descriptively, there was very little difference in performance between the CBT and P&P scores obtained (less than 1 percentage point). Results make clear that there were no meaningful statistical differences in the composite test scores attained by the same students on a computerized fixed form assessment and an equated form of that assessment when taken in a traditional paper and pencil format. While a few items (9 of 204) were found to behave differently based on mode, close review and inspection of these items did not identify factors accounting for the differences.
oai:ejournals.bc.edu:article/1660
2011-05-10T22:36:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1660
2011-05-10T22:36:56Z
The Journal of Technology, Learning and Assessment
Vol. 3 No. 7 (2005)
Applying Principles of Universal Design to Test Delivery: The Effect of Computer-based Read-aloud on Test Performance of High School Students with Learning Disabilities
Dolan, Robert; CAST
Hall, Tracey E.; CAST
Banerjee, Manju; University of Connecticut
Chun, Euljung; University of Illinois, Urbana-Champaign
Strangman, Nicole; CAST
2005-02-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1660
en_US
Standards-based reform efforts are highly dependent on accurate assessment of all students, including those with disabilities. The accuracy of current large-scale assessments is undermined by construct-irrelevant factors including access barriers, a particular problem for students with disabilities. Testing accommodations such as the read-aloud have led to improvement, but research findings suggest the need for a more flexible, individualized approach to accommodations. The current pilot study applies principles of Universal Design for Learning to the creation of a prototype computer-based test delivery tool that provides students with a flexible, customizable testing environment with the option for read-aloud of test content. Two contrasting methods were used to deliver two equivalent forms of a National Assessment of Educational Progress United States history and civics test to ten high school students with learning disabilities. In a counterbalanced design, students were administered one form via traditional paper-and-pencil (PPT) and the other via a computer-based system with optional text-to-speech (CBT-TTS). Test scores were calculated, and student surveys, structured interviews, field observations, and usage tracking were conducted to derive information about student preferences and patterns of use. Results indicate a significant increase in scores on the CBT-TTS versus PPT administration for questions with reading passages greater than 100 words in length. Qualitative findings also support the effectiveness of CBT-TTS, which students generally preferred over PPT. The results of this pilot study provide preliminary support for the potential benefits and usability of digital technologies in creating universally designed assessments that more fairly and accurately test students with disabilities.
oai:ejournals.bc.edu:article/1661
2011-05-11T17:28:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1661
2011-05-11T17:28:55Z
The Journal of Technology, Learning and Assessment
Vol. 2 No. 1 (2003)
The Effect of Computers on Student Writing: A Meta-analysis of Studies from 1992 to 2002
Goldberg, Amie
Russell, Michael
Cook, Abigail
2003-02-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1661
en_US
Meta-analyses were performed on 26 studies conducted between 1992 and 2002 comparing the writing of K–12 students using computers versus paper-and-pencil. Significant mean effect sizes in favor of computers were found for quantity of writing (d=.50, n=14) and quality of writing (d=.41, n=15). Studies focused on revision behaviors between these two writing conditions (n=6) revealed mixed results. Other studies collected for the meta-analysis that did not meet the statistical criteria were also reviewed briefly. These articles (n=35) indicate that the writing process is more collaborative, iterative, and social in computer classrooms as compared with paper-and-pencil environments. For educational leaders questioning whether computers should be used to help students develop writing skills, the results of the meta-analyses suggest that on average students who use computers when learning to write are not only more engaged and motivated in their writing, but they produce written work that is of greater length and higher quality.
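As a reminder of the metric being averaged, Cohen's d is the standardized mean difference between the two conditions. The numbers in this worked example are invented, not drawn from the 26 studies.

import math

def cohens_d(mean_tx, mean_ctrl, sd_tx, sd_ctrl, n_tx, n_ctrl):
    # Pooled standard deviation, then standardized mean difference.
    pooled = math.sqrt(((n_tx - 1) * sd_tx ** 2 + (n_ctrl - 1) * sd_ctrl ** 2)
                       / (n_tx + n_ctrl - 2))
    return (mean_tx - mean_ctrl) / pooled

# e.g., computer writers averaging 350 words vs. 300 on paper, SD near 100:
print(round(cohens_d(350, 300, 100, 100, 50, 50), 2))   # 0.5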
oai:ejournals.bc.edu:article/1662
2011-05-11T17:28:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1662
2011-05-11T17:28:55Z
The Journal of Technology, Learning and Assessment
Vol. 2 No. 2 (2003)
An Exploratory Study to Examine the Feasibility of Measuring Problem-Solving Processes Using a Click-Through Interface
Chung, Gregory K.W.K.; National Center for Research on Evaluation, Standards, and Student Testing
Baker, Eva L.; National Center for Research on Evaluation, Standards, and Student Testing
2003-08-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1662
en_US
In this study we investigated the feasibility of a novel user interface to support the measurement of problem-solving processes. Our research questions addressed the use of a "click-through" interface to measure the "generate-and-test" problem-solving process for a design problem. A click-through interface requires the user to explicitly perform an online action (e.g., to view time, the user has to click on a "time" icon). This interface allowed us to measure participants' intentional acts. Freshman college students were given the task of modifying a given, computer-interactive bicycle pump to satisfy performance requirements. The simulation interface provided participants with point-and-click access to controls to modify pump parameters, to run the simulation, to view important information, and to attempt to solve the task. Lag sequential analyses of participants' problem-solving processes over time showed cyclical behavior consistent with the generate-and-test strategy of modifying the pump design, running the simulation, viewing the information, and then either modifying the design or attempting to solve the problem and then modifying the design again. This behavior set was remarkably stable, with most lag 1 associations greater than .80. Our approach to measuring problem-solving processes appears feasible and promising, but more work is needed to gather additional validity evidence.
oai:ejournals.bc.edu:article/1663
2011-05-11T17:28:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1663
2011-05-11T17:28:56Z
The Journal of Technology, Learning and Assessment
Vol. 2 No. 3 (2003)
A Feasibility Study of On-the-Fly Item Generation in Adaptive Testing
Bejar, Isaac I.
Lawless, René R.
Morley, Mary E.
Wagner, Michael E.
Bennett, Randy E.
Revuelta, Javier
2003-11-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1663
en_US
The goal of this study was to assess the feasibility of an approach to adaptive testing using item models based on the quantitative section of the Graduate Record Examination (GRE) test. An item model is a means of generating items that are isomorphic, that is, equivalent in content and equivalent psychometrically. Item models, like items, are calibrated by fitting an IRT response model. The resulting set of parameter estimates is imputed to all the items generated by the model. An on-the-fly adaptive test tailors the test to examinees and presents instances of an item model rather than independently developed items. A simulation study was designed to explore the effect an on-the-fly test design would have on score precision and bias as a function of the level of item model isomorphicity. In addition, two types of experimental tests were administered – an experimental, on-the-fly, adaptive quantitative-reasoning test as well as an experimental quantitative-reasoning linear test consisting of items based on item models. Results of the simulation study showed that under different levels of isomorphicity, there was no bias, but precision of measurement was eroded at some level. However, the comparison of experimental, on-the-fly adaptive test scores with the GRE test scores closely matched the test-retest correlation observed under operational conditions. Analyses of item functioning on the experimental linear test forms suggested that a high level of isomorphicity across items within models was achieved. The current study provides a promising first step toward significant cost reduction and theoretical improvement in test creation methodology for educational assessment.
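The item-model idea is essentially templating, which a toy example makes concrete: a parameterized stem generates many isomorphic instances, and the one set of calibrated IRT parameters is imputed to every instance. The template, parameter values, and content below are invented for illustration.

import random

def instance(a=1.2, b=0.3):
    """One generated item plus the model-level IRT parameters imputed to it."""
    x, y = random.randint(2, 9), random.randint(2, 9)
    stem = f"If a train travels {x} miles per hour for {y} hours, how far does it go?"
    return {"stem": stem, "key": x * y, "irt": {"a": a, "b": b}}

print(instance())   # each call yields a different isomorph with the same parameters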
oai:ejournals.bc.edu:article/1664
2011-05-11T17:28:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1664
2011-05-11T17:28:56Z
The Journal of Technology, Learning and Assessment
Vol. 2 No. 4 (2003)
Examinee Characteristics Associated With Choice of Composition Medium on the TOEFL Writing Section
Wolfe, Edward W.
2003-12-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1664
en_US
The Test of English as a Foreign Language (TOEFL) contains a direct writing assessment, and examinees are given the option of composing their responses at a computer terminal using a keyboard or composing their responses in handwriting. This study sought to determine whether examinees from different demographic groups choose handwriting versus word-processing composition media with equal likelihood. The relationship between several demographic characteristics of examinees and their composition medium choice on the TOEFL writing assessment is examined using logistic regression. Females, speakers of languages based on non-Roman/Cyrillic character systems, examinees from Africa and the Middle East, and examinees with less proficient English skills were more likely to choose handwriting. Although there were only small differences between age groups with respect to composition medium choice in most geographic regions, younger examinees from Europe and older examinees from Asia were more likely to choose handwriting than their regional counterparts.
oai:ejournals.bc.edu:article/1665
2011-05-11T17:28:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1665
2011-05-11T17:28:56Z
The Journal of Technology, Learning and Assessment
Vol. 2 No. 5 (2003)
Computerized Adaptive Testing: A Comparison of Three Content Balancing Methods
Leung, Chi-Keung
Chang, Hua-Hua
Hau, Kit-Tai
2003-12-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1665
en_US
Content balancing is often a practical consideration in the design of computerized adaptive testing (CAT). This study compared three content balancing methods, namely, the constrained CAT (CCAT), the modified constrained CAT (MCCAT), and the modified multinomial model (MMM), under various conditions of test length and target maximum exposure rate. Results of a series of simulation studies indicate that there is no systematic effect of content balancing method on measurement efficiency or pool utilization. However, among the three methods, the MMM appears to consistently over-expose fewer items.
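The multinomial idea behind the MMM can be sketched as follows: before each item, draw the content area from a distribution that favors areas still short of their target counts, then select an item within that area. This is a schematic reading of the method for illustration, not the authors' exact algorithm, and the targets below are invented.

import random

def pick_area(targets, given):
    """targets/given: dicts mapping content area -> target/administered counts.
    Sample an area with probability proportional to its remaining need."""
    need = {a: max(targets[a] - given.get(a, 0), 0) for a in targets}
    total = sum(need.values())
    r, acc = random.uniform(0, total), 0.0
    for area, n in need.items():
        acc += n
        if r <= acc:
            return area

targets, given = {"algebra": 10, "geometry": 6, "data": 4}, {"algebra": 3}
print(pick_area(targets, given))   # areas further from target are drawn more often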
oai:ejournals.bc.edu:article/1666
2011-05-11T17:28:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1666
2011-05-11T17:28:56Z
The Journal of Technology, Learning and Assessment
Vol. 2 No. 6 (2004)
Developing Computerized Versions of Paper-and-Pencil Tests: Mode Effects for Passage-Based Tests
Pommerich, Mary; Boston College
2004-02-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1666
en_US
As testing moves from paper-and-pencil administration toward computerized administration, how to present tests on a computer screen becomes an important concern. Of particular concern are tests that contain necessary information that cannot be displayed on screen all at once for an item. Ideally, the method of presentation should not interfere with examinee performance on the test. Examinees should perform similarly on an item regardless of the mode of administration. This paper discusses the development of a computer interface for passage-based, multiple-choice tests. Findings are presented from two studies that compared performance across computer and paper administrations of several fixed-form tests. The effect of computer interface changes made between the two studies is discussed. The results of both studies showed some performance differences across modes. Evaluations of individual items suggested a variety of factors that could have contributed to mode effects. Although the observed mode effects were in general small, overall the findings suggest that it would be beneficial to develop an understanding of factors that can influence examinee behavior and to design a computer interface accordingly, to ensure that examinees are responding to test content rather than features inherent in presenting the test on computer.
oai:ejournals.bc.edu:article/1667
2011-05-11T18:06:54Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1667
2011-05-11T18:06:54Z
The Journal of Technology, Learning and Assessment
Vol. 1 No. 1 (2002)
Inexorable and Inevitable: The Continuing Story of Technology and Assessment
Bennett, Randy Elliot; Educational Testing Service
2002-06-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1667
en_US
This paper argues that the inexorable advance of technology will force fundamental changes in the format and content of assessment. Technology is infusing the workplace, leading to widespread requirements for workers skilled in the use of computers. Technology is also finding a key place in education. This is occurring not only because technology skill has become a workplace requirement. It is also happening because technology provides information resources central to the pursuit of knowledge and because the medium allows for the delivery of instruction to individuals who couldn’t otherwise obtain it. As technology becomes more central to schooling, assessing students in a medium different from the one in which they typically learn will become increasingly untenable. Education leaders in several states and numerous school districts are acting on that implication, implementing technology-based tests for low- and high-stakes decisions in elementary and secondary schools and across all key content areas. While some of these examinations are already being administered statewide, others will take several years to bring to fully operational status. These groundbreaking efforts will undoubtedly encounter significant difficulties that may include cost, measurement, technological-dependability, and security issues. But most importantly, state efforts will need to go beyond the initial achievement of computerizing traditional multiple-choice tests to create assessments that facilitate learning and instruction in ways that paper measures cannot.
oai:ejournals.bc.edu:article/1668
2011-05-11T18:06:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1668
2011-05-11T18:06:55Z
The Journal of Technology, Learning and Assessment
Vol. 1 No. 2 (2002)
Automated Essay Scoring Using Bayes' Theorem
Rudner, Lawrence M.
Liang, Tahung
2002-06-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1668
en_US
Two Bayesian models for text classification from the information science field were extended and applied to student-produced essays. Both models were calibrated using 462 essays with two score points. The calibrated systems were applied to 80 new, pre-scored essays with 40 essays in each score group. Manipulated variables included the two models; the use of words, phrases and arguments; two approaches to trimming; stemming; and the use of stopwords. While the text classification literature suggests the need to calibrate on thousands of cases per score group, accuracy of over 80% was achieved with the sparse dataset used in this study.
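A naive-Bayes classifier over words is the simplest instance of the kind of Bayesian text classification the abstract describes: calibrate word likelihoods on scored essays, then assign a new essay to the score group with the higher posterior. The training essays, smoothing choice, and two-point scale below are toy stand-ins, not the paper's calibration data.

import math
from collections import Counter

def train(essays_by_score):
    """essays_by_score: {score: [essay, ...]} -> per-score word log-likelihoods
    with add-one smoothing over the shared vocabulary."""
    counts = {s: Counter(w for e in es for w in e.lower().split())
              for s, es in essays_by_score.items()}
    vocab = set(w for c in counts.values() for w in c)
    totals = {s: sum(c.values()) for s, c in counts.items()}
    return {s: {w: math.log((counts[s][w] + 1) / (totals[s] + len(vocab)))
                for w in vocab} for s in counts}

def classify(essay, loglikes):
    def posterior(s):
        table = loglikes[s]
        floor = min(table.values())             # out-of-vocabulary words get the floor
        return sum(table.get(w, floor) for w in essay.lower().split())
    return max(loglikes, key=posterior)

model = train({1: ["the cat sat", "bad short essay"],
               2: ["a thoughtful nuanced argument", "clear coherent prose"]})
print(classify("a clear argument", model))      # -> 2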
oai:ejournals.bc.edu:article/1669
2011-05-11T18:06:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1669
2011-05-11T18:06:55Z
The Journal of Technology, Learning and Assessment
Vol. 1 No. 3 (2002)
Assessing Student Problem-Solving Skills With Complex Computer-Based Tasks
Vendlinski, Terry
Stevens, Ron
2002-06-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1669
en_US
Valid formative assessment is an essential element in improving both student learning and the professional development of educators. Various shortcomings in common assessment modalities, however, hinder our ability to make and evaluate such formative decisions. The diffusion of computer technology into American classrooms offers new opportunities to evaluate student learning and a rich, new source of data upon which to make inferences about the formative interventions that will improve learning. The path from data to inference, however, requires appropriate methodologies that can fully exploit the data without discarding or oversimplifying the behavioral complexity of student activity. This study used IMMEX™, a computerized simulation and problem-solving tool, along with artificial neural networks as pattern recognizers to identify the common types of strategies high school chemistry students used to solve qualitative chemistry problems. Then, based on the calculated probabilities that students would transition between these strategy types over time, hidden Markov chain analysis allowed us to develop a model of the capacity of the current curriculum to produce students able to apply chemistry content to a real-world problem.
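The strategy-transition step can be made concrete: given each student's sequence of strategy labels over successive problems, estimate the matrix of probabilities of moving from one strategy type to another. The labels and sequences below are invented, not the study's data.

from collections import Counter, defaultdict

def transition_matrix(sequences):
    """Estimate P(next strategy | current strategy) from labeled sequences."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[prev][nxt] += 1
    return {s: {t: n / sum(c.values()) for t, n in c.items()}
            for s, c in pair_counts.items()}

students = [["guess", "guess", "systematic"],
            ["guess", "systematic", "systematic"],
            ["systematic", "systematic", "systematic"]]
print(transition_matrix(students))
# e.g., P(guess -> systematic) = 2/3; P(systematic -> systematic) = 1.0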
oai:ejournals.bc.edu:article/1670
2011-05-11T18:06:55Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1670
2011-05-11T18:06:55Z
The Journal of Technology, Learning and Assessment
Vol. 1 No. 4 (2002)
Investigating Children's Emerging Digital Literacies
Ba, Harouna
Tally, William
Tsikalas, Kallen
2002-08-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1670
en_US
Departing from the view that the digital divide is a technical issue, the EDC Center for Children and Technology (CCT) and Computers for Youth (CFY) have completed a 1-year comparative study of children’s use of computers in low- and middle-income homes. To assess emerging digital literacy skills at home, we define digital literacy as a set of habits through which children use computer technology for learning, work, socializing, and fun. Our findings indicate that both groups of children used the computer to do schoolwork. Many children with leisure time at home also spent 2 to 3 hours a day communicating with peers, playing games, and pursuing creative hobbies. When solving technical problems, the children from low-income homes relied more on formal help providers such as CFY and schoolteachers, while the children from middle-income homes turned to themselves, their families, and their peers. All the children developed basic literacy with word processing, email, and the Web. Not surprisingly, those children who spent considerably more time online developed more robust skills in online communication and authoring. The results also show that children’s digital literacy skills are emerging in ways that reflect local circumstances, such as the length of time children had a computer at home; the family’s ability to purchase stable Internet connectivity; the number of computers in the home and where they are located (bedroom or public area); parents’ attitudes toward computer use; parents’ own experience and skills with computers; children’s leisure time at home; the computing habits of children’s peers; the technical expertise of friends, relatives, and neighbors; homework assignments; and the direct instruction provided by teachers in the classroom. The findings highlight issues impacting social, school, and assessment policy and practice. Specifically, these results have implications for local educational systems interested in developing digital literacy assessment instruments that demonstrate progress as well as specific areas that need improvement. The digital literacy analysis model developed in this study affords teachers opportunities to start to construct activities based on 5 central digital literacy components: computing for a range of purposes, understanding the function of and ability to use common tools, communication literacy, Web literacy, and troubleshooting skills. These activities can help teachers scaffold, for their students and for themselves, the range of digital literacy skills, that is, proficiency in using common tools as well as in using different communication and Web tools. However, when it comes to large-scale assessments of digital literacy of teachers and students at the national and federal levels, the use of the digital literacy analysis model outlined in this study would be operationally and financially impractical. The field urgently needs to develop valid methods and instruments of assessment that help aggregate state and federal data as schools and districts at the local level acquire more and more technology. These methods and measurement instruments are likely to include surveys, e-readiness assessment tools, multiple-choice tests, pre- and post-tests, etc., that can measure individual as well as group progress in digital literacy.
oai:ejournals.bc.edu:article/1671
2011-05-11T18:06:56Z
jtla:ART
v2
https://ejournals.bc.edu/index.php/jtla/article/view/1671
2011-05-11T18:06:56Z
The Journal of Technology, Learning and Assessment
Vol. 1 No. 5 (2002)
Enhancing the Design and Delivery of Assessment Systems: A Four-Process Architecture
Almond, Russell
Steinberg, Linda
Mislevy, Robert
2002-10-01
url:https://ejournals.bc.edu/index.php/jtla/article/view/1671
en_US
Persistent elements and relationships underlie the design and delivery of educational assessments, despite their widely varying purposes, contexts, and data types. One starting point for analyzing these relationships is the assessment as experienced by the examinee: 'What kinds of questions are on the test?,' 'Can I do them in any order?,' 'Which ones did I get wrong?,' and 'What's my score?' These questions, asked by people of all ages and backgrounds, reveal an awareness that an assessment generally entails the selection and presentation of tasks, the scoring of responses, and the accumulation of these response evaluations into some kind of summary score. A four-process architecture is presented for the delivery of assessments: Activity Selection, Presentation, Response Processing, and Summary Scoring. The roles and the interactions among these processes, and how they arise from an assessment design model, are discussed. The ideas are illustrated with hypothetical examples. The complementary modular structures of the delivery processes and the design framework are seen to encourage coherence among assessment purpose, design, and delivery, as well as to promote efficiency through the reuse of design objects and delivery processes.
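The four processes and their message flow can be sketched as plain interfaces. Everything concrete below (the class and method names, the work-product and observation dictionaries, the additive scoring rule) is invented to show the cycle, not the authors' specification.

class ActivitySelection:
    """Picks the next task, possibly conditioned on the scoring record."""
    def __init__(self, tasks): self.tasks = list(tasks)
    def next_task(self, record): return self.tasks.pop(0) if self.tasks else None

class Presentation:
    """Stand-in for rendering a task and capturing the examinee's response."""
    def administer(self, task):
        return {"task": task, "raw": f"response to {task}"}

class ResponseProcessing:
    """Stand-in for evaluating a work product into scored observations."""
    def evaluate(self, work_product):
        return {"task": work_product["task"], "observation": 1}

class SummaryScoring:
    """Accumulates observations into a summary scoring record."""
    def __init__(self): self.record = {"score": 0}
    def accumulate(self, obs):
        self.record["score"] += obs["observation"]
        return self.record

selector, presenter = ActivitySelection(["t1", "t2"]), Presentation()
scorer, summarizer = ResponseProcessing(), SummaryScoring()
task = selector.next_task(summarizer.record)
while task is not None:                          # the four-process cycle
    obs = scorer.evaluate(presenter.administer(task))
    record = summarizer.accumulate(obs)
    task = selector.next_task(record)
print(record)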