Welcome back to the Ask Dr. Masha blog series, where we explore how the Duolingo English Test (DET) measures the skills that matter most for academic success. Last time, we looked at how the DET assesses writing. Now, we’re turning our attention to speaking, another complex skill that’s essential for success in English-medium environments, academic or not!
From holding a conversation to giving a presentation, speaking is arguably the most cognitively demanding skill for English learners. It requires the real-time integration of multiple language systems: rapidly retrieving vocabulary and grammatical structures, organizing ideas, and articulating sounds. This integration needs to happen on the fly while test takers also manage the social and pragmatic aspects of communication.
With cutting-edge AI-powered task development, the DET is designed to measure this complexity through a variety of carefully constructed speaking tasks that tap into different aspects of oral proficiency, challenging test takers to demonstrate mastery of speaking sub-constructs.
But how exactly do we do that?
Authentic speaking tasks
The DET’s speaking section isn’t just one task; it’s a series of four tasks: Picture Description, Extended Speaking, Speaking Sample, and Interactive Speaking. Each one targets different speaking genres, ensuring substantive construct coverage and simulating authentic English use, especially in academic and professional settings, where speakers must shift between genres and rhetorical purposes.
Picture Description, Extended Speaking, and the Speaking Sample require more than just verbal output—they challenge test takers to organize and express ideas across different discourse types. These tasks elicit a range of rhetorical purposes, such as describing, narrating, explaining, and arguing, which mirror common academic speaking demands.
The Interactive Speaking task, the newest task on the DET, simulates a live conversation with an avatar. It prompts spontaneous responses across a series of questions, measuring fluency, coherence, and the ability to handle topic shifts, all crucial skills for real-time interaction in classrooms, interviews, and discussions.
Together, these tasks cover a wide range of genres, meaning the DET is able to capture how well test takers can communicate in different situations, for different purposes. This genre-based variety allows us to take a multidimensional approach to speaking, assessing not just isolated skills but how language is used across a variety of communicative purposes.
AI scoring for construct coverage
What truly sets the DET speaking tasks apart is how we score test takers’ responses. Instead of relying on human raters, we use machine learning models to evaluate numerous linguistic features of each spoken response.
These features are closely aligned with well-established dimensions of speaking proficiency, including Content (e.g., relevance and development of ideas), Discourse Coherence (e.g., logical flow and clarity), Vocabulary and Grammar (e.g., lexical diversity, range of grammatical structures, sophistication, and accuracy), Fluency (e.g., speed and chunking), and Pronunciation (e.g., intelligibility, lexical stress, intonation, and rhythm).
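To make those dimensions a little more concrete, here’s a minimal sketch (in Python) of how two such features, lexical diversity and speaking rate, might be computed from a transcribed response. It’s purely illustrative: the function names and inputs are hypothetical, and this is not the DET’s actual feature extraction.

```python
# Illustrative only: toy versions of two of the feature types described above
# (lexical diversity and fluency). The DET's actual feature set and models are
# far more sophisticated; the function names and inputs here are hypothetical.

def type_token_ratio(transcript: str) -> float:
    """Rough lexical-diversity proxy: unique words divided by total words."""
    words = transcript.lower().split()
    return len(set(words)) / len(words) if words else 0.0

def speech_rate(transcript: str, duration_seconds: float) -> float:
    """Rough fluency proxy: words spoken per second of response time."""
    words = transcript.split()
    return len(words) / duration_seconds if duration_seconds > 0 else 0.0

# A made-up 20-second response transcript
response = "I think the picture shows a busy market and people in the market are buying fruit"
print(type_token_ratio(response))   # 0.875 — 14 unique words out of 16
print(speech_rate(response, 20.0))  # 0.8 words per second
```

In practice, many such features are extracted from both the audio signal and the transcript, which is what allows the scoring models to look at a response from several angles at once.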
This multi-faceted approach ensures that no single feature dominates the score. Instead, the test reflects the complex, integrative nature of real-life speaking. Moreover, unlike human raters, who may be influenced by implicit biases, fatigue, or subjective preferences (e.g., Winke et al., 2013), DET automated scoring ensures a high degree of fairness and consistency. Every response is evaluated using the same criteria, regardless of a test taker’s accent, appearance, or background.
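To illustrate what it means for no single feature to dominate, here’s a toy example of combining several dimension scores into one overall score. Again, this is only a sketch with made-up numbers; the DET’s scoring models are statistical and far more nuanced than a fixed weighted sum.

```python
# Illustrative only: combining several (hypothetical) dimension scores so that
# no single one can dominate the result. The actual DET models are statistical,
# not a fixed weighted sum; this just shows the idea of balanced contributions.

feature_scores = {          # each assumed to be scaled to a 0–1 range
    "content": 0.82,
    "discourse_coherence": 0.74,
    "vocabulary_grammar": 0.69,
    "fluency": 0.88,
    "pronunciation": 0.79,
}

weights = {name: 0.2 for name in feature_scores}  # equal, capped contributions

overall = sum(weights[name] * score for name, score in feature_scores.items())
print(round(overall, 3))  # 0.784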
Relevance for academic English
Beyond general language ability, the DET is designed with academic English demands in mind. Tasks require critical thinking, idea organization, and the ability to speak at length on complex topics—all essential for thriving in English-medium universities.
Because the speaking tasks elicit responses across a range of CEFR levels, including C1 and C2, the DET is well-suited for high-stakes decisions like university admissions. Moreover, test-taker responses to the Speaking Sample (along with the Writing Sample) are recorded and shared with institutions, adding a layer of transparency to the assessment.
Speaking in a second language is complex, but measuring it doesn’t have to be. By leveraging AI, adaptive design, and linguistically rich tasks, the DET offers a robust, scalable way to assess spoken English.