ARXIV:2601.21895 · AI DETECTION · SUBMITTED 19 MAR · 18:48 UTC · FRESHNESS STALE

Source: PDF linked (verified) · PaperPack: 3 of 4 citation fields filled (partial) · Missing fields: authors · Proof: unverified proof status (partial)

Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text

arXiv

A novel algorithm to reliably detect LLM-generated text, outperforming current baselines.

Blocked on Code › Score 7.0 · Evidence unverified

Opportunity summary

Pain: A novel algorithm to reliably detect LLM-generated text, outperforming current baselines.

Evidence: 0 refs | 0 sources | 33% coverage

Blocker: Evidence unverified


PROBLEM

Modern LLMs can produce highly human-like text, which raises serious concerns about misinformation and academic integrity and creates an urgent need for reliable…

METHOD

Full abstract

Modern large language models (LLMs) such as GPT, Claude, and Gemini have transformed the way we learn, work, and communicate. Yet, their ability to produce highly human-like text raises serious concerns about misinformation and academic integrity, making it an urgent need for reliable algorithms to detect LLM-generated content. In this paper, we start by presenting a geometric approach to demystify rewrite-based detection algorithms, revealing their underlying rationale and demonstrating their generalization ability. Building on this insight, we introduce a novel rewrite-based detection algorithm that adaptively learns the distance between the original and rewritten text. Theoretically, we demonstrate that employing an adaptively learned distance function is more effective for detection than using a fixed distance. Empirically, we conduct extensive experiments with over 100 settings, and find that our approach demonstrates superior performance over baseline algorithms in the majority of scenarios. In particular, it achieves relative improvements from 57.8% to 80.6% over the strongest baseline across different target LLMs (e.g., GPT, Claude, and Gemini).
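To make the contrast between a fixed and a learned distance concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes the usual rationale of rewrite-based detectors (text that changes little when rewritten looks more LLM-like), replaces the LLM rewriter with a hypothetical `toy_rewrite` stub, and models the "learned distance" as a logistic regression over three hand-picked discrepancy features. All function names, features, and training texts are illustrative assumptions.

```python
import difflib
import math

# Hypothetical stand-in for an LLM rewrite call (an assumption, not the paper's
# rewriter): it only normalizes a few informal tokens, so formal text is unchanged.
SWAPS = {"gonna": "going to", "kinda": "somewhat", "stuff": "material"}

def toy_rewrite(text):
    return " ".join(SWAPS.get(word, word) for word in text.split())

def fixed_distance(original, rewritten):
    """Fixed distance: 1 - difflib similarity ratio (0.0 means identical)."""
    return 1.0 - difflib.SequenceMatcher(None, original, rewritten).ratio()

def discrepancy_features(original, rewritten):
    """Three simple discrepancies that a learned distance can weight."""
    wo, wr = original.split(), rewritten.split()
    char_gap = fixed_distance(original, rewritten)
    word_gap = 1.0 - difflib.SequenceMatcher(None, wo, wr).ratio()
    length_gap = abs(len(wo) - len(wr)) / max(len(wo), len(wr), 1)
    return [char_gap, word_gap, length_gap]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_learned_distance(texts, labels, lr=0.5, epochs=2000):
    """Logistic regression by batch gradient descent; label 1 = LLM-generated."""
    feats = [discrepancy_features(t, toy_rewrite(t)) for t in texts]
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for f, y in zip(feats, labels):
            err = sigmoid(sum(w * x for w, x in zip(weights, f)) + bias) - y
            grad_w = [g + err * x for g, x in zip(grad_w, f)]
            grad_b += err
        weights = [w - lr * g / len(feats) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(feats)
    return weights, bias

def llm_probability(text, weights, bias):
    """Score a text: a small weighted discrepancy gives a high LLM probability."""
    f = discrepancy_features(text, toy_rewrite(text))
    return sigmoid(sum(w * x for w, x in zip(weights, f)) + bias)

# Toy training set (invented for illustration): informal "human" vs formal "LLM".
train_texts = [
    "I am gonna fix this stuff later",
    "that was kinda weird stuff",
    "The results demonstrate a consistent improvement",
    "This approach offers several advantages",
]
train_labels = [0, 0, 1, 1]
weights, bias = fit_learned_distance(train_texts, train_labels)

p_llm = llm_probability("The method generalizes across several domains", weights, bias)
p_human = llm_probability("we are gonna need more stuff", weights, bias)
print(f"P(LLM | formal text)   = {p_llm:.3f}")
print(f"P(LLM | informal text) = {p_human:.3f}")
```

A fixed-distance detector would threshold `fixed_distance` directly. The learned variant lets training data decide how much each kind of discrepancy should count, which is the intuition behind the abstract's claim that an adaptively learned distance is more effective than a fixed one.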

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Theoretically, we demonstrate that employing an adaptively learned distance function is more effective for detection than using a fixed distance.

WHY NOW

AI Detection moved forward this cycle; last verified April 2026. Public score 7.0/10.





AUTHORS

Hongyi Zhou (Tsinghua University), Jin Zhu (University of Birmingham), Erhan Xu (London School of Economics and Political Science), Kai Ye (London School of Economics and Political Science), Ying Yang (Tsinghua University), Chengchun Shi (London School of Economics and Political Science)

Published 2026-01-29 · arXiv: https://arxiv.org/abs/2601.21895 · Viability score 7 · Research domain: AI Detection

FAQ

What is the startup potential of "Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text"?
Adaptive rewrite-based algorithm to detect LLM-generated text surpasses existing methods by up to 80.6%.

What products could be built from this research?
The technology can be productized into an online service that provides AI-generated text detection for enterprises, educational institutions, and news platforms.

What are the practical use cases?
Develop a browser extension or cloud API service that detects AI-generated text in emails, documents, or social media posts.

What industries could this research disrupt?
This method replaces manual detection processes and less effective traditional algorithms that fail against advanced AI text generators.