Director of the Endeavor Foundation Center for Faculty Development Rollins College, United States
The role of artificial intelligence (AI) in assessment is a subject of active debate. It remains unclear whether large language models can produce reliable evaluations that agree with those given by human raters. In this interactive research presentation, we will share results from a pilot program comparing human and AI assessment of Information Literacy (using the AAC&U VALUE rubric) on student artifacts collected from courses across our multidisciplinary general education program. Our results show that while it is possible to build an AI system that aligns with expert faculty assessors, outcomes are sensitive to the choice of model and prompt, and careful benchmarking is critical. Beyond these results, our study foregrounds the necessity of aligning assessment practices with institutional values and weighing the trade-offs inherent in diminishing direct faculty participation in the process. Working with attendees, we will identify the goals an assessment program might serve and ask whether AI supports or undermines those goals.