ARXIV:2606.06715 · LLM EVALUATION · SUBMITTED 08 JUN · 20:21 UTC · FRESHNESS FRESH

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: unverified proof status (Partial)

Does Topic Sentiment Cause Perceived Ideology? Comparing Human and LLM Annotations in Political News Articles

Upasana Chatterjee · arXiv

Fine-tuned LLMs exhibit spurious sentiment-ideology couplings invisible to standard evaluation, impacting their use as proxies for human judgment.

Ship in 2-4 weeks · Score 4.0 · Evidence unverified

Opportunity summary

Pain Fine-tuned LLMs exhibit spurious sentiment-ideology couplings invisible to standard evaluation, impacting their use as proxies for human judgment.

Evidence 0 refs | 4 sources | 67% coverage

Blocker Evidence unverified


PROBLEM

Fine-tuned LLMs exhibit spurious sentiment-ideology couplings invisible to standard evaluation, impacting their use as proxies for human judgment. Using articles from AllSides, paired with shared sentiment annotations from Llama-3.3-70b-versatile, we compare ideology labels from…

METHOD

Full abstract

We ask whether topic sentiment has a causal effect on perceived political ideology, and whether the answer depends on who assigns the ideology label. Using articles from AllSides, paired with shared sentiment annotations from Llama-3.3-70b-versatile, we compare ideology labels from expert human annotators, GPT-4o-mini (baseline and finetuned), and Llama-3.3-70B. We apply Double Machine Learning (DML) and community-level mediation analysis across all four annotation paradigms. Human annotations yield no significant causal effects at the community level. Fine-tuned GPT-4o-mini achieves the highest classification accuracy (F1=72.48) and is the only annotator paradigm that produces significant community-level treatment effects and significant natural direct effects (NDEs) in mediation. We interpret this as evidence of shortcut learning: fine-tuning on ideology-labeled data causes the model to internalise a spurious sentiment-ideology coupling not operative in human judgment for this task. This coupling is structurally invisible to F1-based evaluation, with implications for the use of LLM annotations as silver labels and as proxies for human judgment in downstream causal analyses.
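The abstract names Double Machine Learning but does not spell out the estimator, so the following is only a minimal sketch of the general partialling-out idea with cross-fitting, not the paper's pipeline. Everything in it is an assumption made for illustration: the data are synthetic, `x` stands in for a single confounder, `t` for a sentiment score, `y` for a numeric ideology score, and the nuisance models are simple linear fits where a real analysis would likely use flexible learners.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def dml_effect(x, t, y, n_folds=5):
    """Cross-fitted partialling-out estimate of theta in y = theta * t + g(x) + noise."""
    n = len(x)
    t_res = [0.0] * n
    y_res = [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        x_train = [x[i] for i in train]
        # Nuisance models are fit on the other folds only (cross-fitting).
        a_t, b_t = fit_line(x_train, [t[i] for i in train])
        a_y, b_y = fit_line(x_train, [y[i] for i in train])
        for i in test:
            t_res[i] = t[i] - (a_t + b_t * x[i])
            y_res[i] = y[i] - (a_y + b_y * x[i])
    # Final stage: regress outcome residuals on treatment residuals.
    num = sum(tr * yr for tr, yr in zip(t_res, y_res))
    den = sum(tr * tr for tr in t_res)
    return num / den


# Synthetic data with a known effect (hypothetical, not from the paper).
rng = random.Random(0)
TRUE_THETA = 0.5
x = [rng.gauss(0, 1) for _ in range(2000)]
t = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [TRUE_THETA * ti + 1.2 * xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

theta_hat = dml_effect(x, t, y)
_, naive_slope = fit_line(t, y)  # ignores the confounder entirely
print(f"DML estimate: {theta_hat:.3f}, naive slope: {naive_slope:.3f}")
```

In this toy setup the naive regression of `y` on `t` absorbs the confounder's influence and overstates the effect, while the residual-on-residual estimate should land near the simulated value.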

RESULT

ScienceToStartup currently rates this 4.0/10 on the public viability pass. Fine-tuned GPT-4o-mini achieves the highest classification accuracy (F1=72.48) and is the only annotator paradigm that produces significant community-level treatment effects and significant natural direct…
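The abstract's claim that the coupling is invisible to F1-based evaluation can be illustrated with a toy case. The sketch below is hypothetical and uses made-up labels, not data from the paper: two annotators make the same number and type of errors, so their binary F1 is identical, but only one of them places its errors in line with sentiment. The `sentiment_gap` helper is an illustrative descriptive statistic, not the paper's causal estimand.

```python
def f1_score(gold, pred, positive=1):
    """Binary F1 for the given positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sentiment_gap(pred, sentiment, positive=1):
    """Mean sentiment of items predicted positive minus mean of the rest."""
    pos = [s for p, s in zip(pred, sentiment) if p == positive]
    neg = [s for p, s in zip(pred, sentiment) if p != positive]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Hypothetical toy data: 1 = one ideology label, 0 = the other.
gold      = [1, 1, 1, 1, 0, 0, 0, 0]
sentiment = [1, 1, -1, -1, 1, 1, -1, -1]  # balanced within each gold class

# Annotator A: one miss and one false alarm, both on positive-sentiment items.
pred_a = [0, 1, 1, 1, 1, 0, 0, 0]
# Annotator B: same error counts, but errors follow sentiment
# (negative sentiment pushed to 0, positive sentiment pushed to 1).
pred_b = [1, 1, 0, 1, 1, 0, 0, 0]

f1_a, f1_b = f1_score(gold, pred_a), f1_score(gold, pred_b)
gap_a, gap_b = sentiment_gap(pred_a, sentiment), sentiment_gap(pred_b, sentiment)
print(f"A: F1={f1_a:.2f}, sentiment gap={gap_a:.2f}")
print(f"B: F1={f1_b:.2f}, sentiment gap={gap_b:.2f}")
```

Both annotators score the same F1, yet annotator B's labels track sentiment while annotator A's do not. An aggregate accuracy metric cannot separate the two, which is why a separate check on how labels relate to sentiment is needed.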

WHY NOW

LLM Evaluation moved forward this cycle; last verified June 2026. Public score 4.0/10. Implementation evidence is present through a linked repository.




Competitive landscape

Segment: LLM Evaluation

Adoption evidence: Public code linked for build inspection

Commercial read: 4.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

Source: https://arxiv.org/abs/2606.06715

Published: 2026-06-04

Code repository: https://github.com/upasanachatterjee/causal-inference-on-text

