ARXIV:2603.06066 · AUTOMATED ESSAY SCORING · SUBMITTED 19 MAR · 18:48 UTC · FRESHNESS STALE

Verified · Source: PDF linked
Partial · PaperPack: 3 of 4 citation fields filled
Missing · Missing fields: authors
Partial · Proof: unverified proof status

Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring

arXiv

Evaluation of open-weight LLMs for automated essay scoring of Austrian A-level German texts: agreement with the human rater peaked at 40.6% on rubric sub-dimensions, and only 32.8% of final grades matched.

Blocked on Code · Score 5.0 · Evidence unverified

Opportunity summary

Pain Automated essay scoring tool for Austrian A-level German texts using open-weight LLMs, showing moderate agreement with human raters.

Evidence 0 refs | 0 sources | 33% coverage

Blocker Evidence unverified

PROBLEM

Automated essay scoring tool for Austrian A-level German texts using open-weight LLMs, showing moderate agreement with human raters. While early systems relied on handcrafted features and statistical models, recent advances in Large Language Models…

METHOD

Full abstract

Automated Essay Scoring (AES) has been explored for decades with the goal of supporting teachers by reducing grading workload and mitigating subjective biases. While early systems relied on handcrafted features and statistical models, recent advances in Large Language Models (LLMs) have made it possible to evaluate student writing with unprecedented flexibility. This paper investigates the application of state-of-the-art open-weight LLMs for the grading of Austrian A-level German texts, with a particular focus on rubric-based evaluation. A dataset of 101 anonymised student exams across three text types was processed and evaluated. Four LLMs, DeepSeek-R1 32b, Qwen3 30b, Mixtral 8x7b and Llama 3.3 70b, were evaluated with different contexts and prompting strategies. The LLMs were able to reach a maximum of 40.6% agreement with the human rater in the rubric-provided sub-dimensions, and only 32.8% of final grades matched the ones given by a human expert. The results indicate that even though smaller models are able to use standardised rubrics for German essay grading, they are not accurate enough to be used in a real-world grading environment.
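
The abstract reports agreement as a percentage but does not say how it was computed. As a minimal sketch, assuming it means the share of essays where the LLM and the human rater gave exactly the same score, the calculation could look like the following. The function name `exact_agreement`, the 1-5 scale, and the sample grades are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def exact_agreement(human: Sequence[int], model: Sequence[int]) -> float:
    """Return the share of items (0.0 to 1.0) where both raters gave the same score."""
    if len(human) != len(model):
        raise ValueError("score lists must have the same length")
    if not human:
        raise ValueError("score lists must not be empty")
    # Count positions where the human and model scores are identical.
    matches = sum(1 for h, m in zip(human, model) if h == m)
    return matches / len(human)


# Hypothetical grades on a 1-5 scale for eight essays (not data from the paper).
human_grades = [1, 2, 2, 3, 3, 4, 5, 2]
model_grades = [1, 3, 2, 2, 3, 3, 5, 1]

rate = exact_agreement(human_grades, model_grades)
print(f"exact agreement: {rate:.1%}")
```

Exact-match agreement treats a one-grade miss the same as a four-grade miss, so it is a strict measure; whether the paper also reports chance-corrected or distance-weighted statistics is not stated in the abstract.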

RESULT

ScienceToStartup currently rates this 5.0/10 on the public viability pass. In the paper, the LLMs reached at most 40.6% agreement with the human rater on the rubric sub-dimensions, and only 32.8% of final grades matched those given by a human expert.

WHY NOW

Automated Essay Scoring moved forward this cycle; last verified April 2026. Public score 5.0/10.

Opportunity summary

Score 5.0

Pain Automated essay scoring tool for Austrian A-level German texts using open-weight LLMs, showing moderate agreement with human raters.

Evidence 0 refs | 0 sources | 33% coverage

Blocker missing authors

Competitive landscape

Segment: Automated Essay Scoring
Adoption evidence: No public code link in the paper record yet
Commercial read: 5.0/10 public viability
Direct: not classified
Adjacent: not classified
Substitute: not classified
Unknown: not classified

Source: https://arxiv.org/abs/2603.06066 · Published: 2026-03-06 09:21:51 UTC · Page: https://sciencetostartup.com/paper/evaluating-austrian-a-level-german-essays-with-large-language-models-for-automated-essay-scoring
