ARXIV:2605.13538 · ON-DEVICE PII SUBSTITUTION · SUBMITTED 14 MAY · 20:10 UTC · FRESHNESS FRESH

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models

Anuj Sadani · Deepak Kumar · arXiv

An on-device pipeline for PII substitution using small language models that mitigates demonstration regurgitation with locale-conditioned few-shot prompting.

Ship in 2-4 weeks · Score 6.0 · Evidence unverified

Opportunity summary

Pain An on-device pipeline for PII substitution using small language models that mitigates demonstration regurgitation with locale-conditioned few-shot prompting.

Evidence 0 refs | 0 sources | 0% coverage

Blocker Evidence unverified

PROBLEM

An on-device pipeline for PII substitution using small language models that mitigates demonstration regurgitation with locale-conditioned few-shot prompting. We propose a fully on-device pipeline that substitutes PII with consistent, type-preserving fake values: a 1.5…
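The "consistent, type-preserving" requirement can be illustrated with a minimal sketch. It assumes spans arrive as non-overlapping `(start, end, entity_type)` tuples and that surrogates come from small fixed pools; `SURROGATES`, `ConsistentSubstituter`, and the SHA-256 pick are hypothetical stand-ins for illustration, not the paper's SLM or faker stages.

```python
import hashlib

# Hypothetical surrogate pools, one per entity type (type-preserving).
# A real pipeline would get these values from an SLM or a rule-based generator.
SURROGATES = {
    "PERSON": ["Alex Morgan", "Priya Nair", "Tomas Keller"],
    "CITY": ["Springfield", "Riverton", "Lakewood"],
}


class ConsistentSubstituter:
    """Maps each (type, original value) pair to one stable surrogate."""

    def __init__(self):
        self._cache = {}

    def surrogate(self, entity_type, value):
        # Case-fold so "Maria" and "MARIA" share a surrogate (an assumption).
        key = (entity_type, value.casefold())
        if key not in self._cache:
            pool = SURROGATES[entity_type]
            digest = hashlib.sha256(f"{key[0]}:{key[1]}".encode("utf-8")).digest()
            # Deterministic pick; distinct originals may collide in a small pool.
            self._cache[key] = pool[int.from_bytes(digest[:4], "big") % len(pool)]
        return self._cache[key]

    def substitute(self, text, spans):
        # Rebuild the text left to right, swapping each detected span.
        out, cursor = [], 0
        for start, end, entity_type in sorted(spans):
            out.append(text[cursor:start])
            out.append(self.surrogate(entity_type, text[start:end]))
            cursor = end
        out.append(text[cursor:])
        return "".join(out)
```

Calling `ConsistentSubstituter().substitute("Maria met Maria in Paris.", [(0, 5, "PERSON"), (10, 15, "PERSON"), (19, 24, "CITY")])` replaces both mentions of the name with the same fake person and the city with a fake city, so entity types and co-reference survive where a `[PERSON]` placeholder would erase them.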

METHOD

Full abstract

Personally Identifiable Information (PII) redaction usually replaces detected entities with placeholder tokens such as [PERSON], destroying the downstream utility of the redacted text for retrieval and Named Entity Recognition (NER) training. We propose a fully on-device pipeline that substitutes PII with consistent, type-preserving fake values: a 1.5 B mixture-of-experts token classifier (openai/privacy-filter) detects spans, a 1-bit Bonsai-1.7B Small Language Model (SLM) proposes contextual surrogates for names, addresses, and dates, and a rule-based generator (faker) handles patterned fields. We report a prompting finding more important than the quantization choice: with naive fixed three-shot demonstrations, the 1-bit SLM regurgitates demonstration outputs verbatim regardless of input; 1.58-bit Ternary-Bonsai-1.7B reproduces byte-identical failures, ruling out quantization as the cause. We fix this with locale-conditioned rotating few-shot demonstrations: a character-range heuristic picks a locale-pure pool and a per-input MD5 hash samples three demonstrations. With the fix, 482/482 unique Bonsai-1.7B calls succeed (no echoes) and produce locale-correct surrogates, although the SLM still copies from a small same-locale demonstration pool - a residual narrowness we quantify. On a 2000-document multilingual corpus, hybrid perplexity (PPL) beats faker in all six locales under a multilingual evaluator (XGLM-564M); length preservation is best-of-three in 4 of 6 locales. On downstream NER (400 train / 100 test, English), redact yields F1=0.000, faker 0.656, original 0.960; on a matched 160/40 subset including hybrid, faker (0.506) outperforms hybrid (0.346) at p < 0.001. We report this as an honest negative finding: SLM surrogates produce more natural text but a less varied training distribution, and downstream NER benefits more from variety than from naturalness.
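The locale-conditioned rotating demonstrations described in the abstract might look roughly like the sketch below. The abstract states only that a character-range heuristic picks a locale-pure pool and that a per-input MD5 hash samples three demonstrations. Everything else here is an assumption made for illustration: the Unicode ranges in `SCRIPT_RANGES`, the contents of `DEMO_POOLS`, seeding a random generator from the hash, and the exact-match echo check `is_echo`.

```python
import hashlib
import random

# Hypothetical script ranges; the paper's actual ranges are not given here.
SCRIPT_RANGES = {
    "ja": [(0x3040, 0x30FF)],  # Hiragana and Katakana
    "hi": [(0x0900, 0x097F)],  # Devanagari
}

# Hypothetical locale-pure pools of (original, surrogate) demonstration pairs.
DEMO_POOLS = {
    "en": [("John Smith", "Alex Morgan"), ("Mary Jones", "Dana Price"),
           ("Peter Brown", "Colin Hayes"), ("Susan Clark", "Nora Wells"),
           ("David Lee", "Owen Frost")],
    "ja": [("山田花子", "鈴木美咲"), ("田中太郎", "佐藤健"),
           ("高橋一郎", "伊藤大輔"), ("渡辺直美", "中村優子")],
    "hi": [("राहुल शर्मा", "अमित वर्मा"), ("प्रिया सिंह", "नेहा गुप्ता"),
           ("सुनील कुमार", "विकास यादव"), ("अंजलि मेहता", "पूजा जोशी")],
}


def detect_locale(text, default="en"):
    """Return the locale whose script range covers the most characters."""
    counts = {}
    for ch in text:
        cp = ord(ch)
        for locale, ranges in SCRIPT_RANGES.items():
            if any(lo <= cp <= hi for lo, hi in ranges):
                counts[locale] = counts.get(locale, 0) + 1
    return max(counts, key=counts.get) if counts else default


def pick_demonstrations(text, k=3):
    """Sample k demonstrations from the input's locale pool, keyed by MD5."""
    pool = DEMO_POOLS[detect_locale(text)]
    # Assumption: the MD5 digest of the input seeds the sampler, so the same
    # input always gets the same demonstrations and different inputs rotate.
    seed = int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)
    return random.Random(seed).sample(pool, min(k, len(pool)))


def is_echo(output, demos):
    """Flag an output that repeats a demonstration surrogate verbatim."""
    return output.strip() in {fake for _, fake in demos}
```

Because the sample depends only on the input text, repeated calls are reproducible without stored state. As the abstract notes, rotation within a small same-locale pool reduces verbatim echoes but does not by itself widen the variety of surrogates.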

RESULT

ScienceToStartup currently rates this 6.0/10 on the public viability pass. We report this as an honest negative finding: SLM surrogates produce more natural text but a less varied training distribution, and downstream NER benefits…

WHY NOW

On-Device PII Substitution moved forward this cycle; last verified May 2026. Public score 6.0/10. Production flags indicate code availability.

Opportunity summary

Score 6.0

Pain An on-device pipeline for PII substitution using small language models that mitigates demonstration regurgitation with locale-conditioned few-shot prompting.

Evidence 0 refs | 0 sources | 0% coverage

Blocker no shell-level blocker reported

Analysis summary

An on-device pipeline for PII substitution using small language models that mitigates demonstration regurgitation with locale-conditioned few-shot prompting.

Competitive landscape

An on-device pipeline for PII substitution using small language models that mitigates demonstration regurgitation with locale-conditioned few-shot prompting.

Segment

On-Device PII Substitution

Adoption evidence

No public code link in the paper record yet

Commercial read

6.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

Published 2026-05-13 13:47 UTC · arXiv:2605.13538 · https://arxiv.org/abs/2605.13538
