ARXIV:2605.14205 · E-COMMERCE AGENTS · SUBMITTED 15 MAY · 20:13 UTC · FRESHNESS FRESH

Verified Source: PDF linked | Verified PaperPack: citation fields available | Partial Proof: unverified proof status

SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents

Zahra Zanjani Foumani · Alberto Castelo · Shuang Xie · Ted Chaiwachirasak · Han Li · Lingyun Wang · arXiv

A framework that learns discrete buyer personas from clickstream data to enable LLM agents to navigate e-commerce storefronts more effectively and personalize shopping experiences.

Blocked on Code | Score 5.0 | Evidence unverified

Opportunity summary

Pain A framework that learns discrete buyer personas from clickstream data to enable LLM agents to navigate e-commerce storefronts more effectively and personalize shopping experiences.

Evidence 0 refs | 0 sources | 0% coverage

Blocker Evidence unverified


PROBLEM

A framework that learns discrete buyer personas from clickstream data to enable LLM agents to navigate e-commerce storefronts more effectively and personalize shopping experiences. Existing personalization methods rely on hand-crafted prompt-based personas that are…

METHOD

Full abstract

LLM-based web agents can navigate live storefronts, yet they often collapse to a single "average buyer" policy, failing to capture the heterogeneous and distributional nature of real buyer populations. Existing personalization methods rely on hand-crafted prompt-based personas that are brittle, difficult to scale, context-inefficient, and unable to faithfully represent population-level behavior. We introduce SimPersona, a novel framework that learns discrete buyer types from historical traffic and exposes them to LLM-based web agents as compact persona tokens. Given raw clickstreams, a behavior-aware VQ-VAE induces a discrete buyer-type space that captures the statistical structure of real buyer behavior and merchant-specific buyer population distributions. To provide behavior-specific guidance to LLM-based web agents, SimPersona maps each learned buyer type to a dedicated persona token in the LLM agent vocabulary and fine-tunes the agent with these tokens on real browsing traces. At inference, each synthetic buyer is assigned to a learned buyer type with a single encoder forward pass, requiring no retraining or store-specific prompt engineering. For population-level simulation, SimPersona samples buyer types from each merchant's empirical distribution over the learned VQ-VAE codebook and instantiates agents with the corresponding persona tokens, preserving merchant-specific buyer population distributions. Evaluated on 8.37M buyers across 42 held-out live storefronts, SimPersona achieves 78% conversion-rate alignment with real buyers, exhibits interpretable behavioral variation across buyer types, and outperforms a baseline with 8× more parameters on goal-oriented shopping tasks. We further release an open-source data pipeline that converts raw e-commerce event logs into buyer representations and agent-training traces.
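The assignment and population-sampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes buyer embeddings have already been produced by some encoder, uses plain nearest-neighbor lookup (the standard VQ-VAE quantization rule) to pick a buyer type, and invents the helper names and the `<persona_k>` token format purely for illustration.

```python
import random
from collections import Counter

def assign_buyer_type(embedding, codebook):
    """Return the index of the codebook vector nearest to `embedding` (squared L2).

    Assumption: `embedding` stands in for the output of an encoder forward pass.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(embedding, codebook[k]))

def merchant_distribution(buyer_types, num_types):
    """Empirical distribution of one merchant's buyers over the codebook entries."""
    counts = Counter(buyer_types)
    total = len(buyer_types)
    return [counts.get(k, 0) / total for k in range(num_types)]

def sample_persona_tokens(distribution, n, rng):
    """Draw n buyer types from the distribution and map each to a persona token.

    The "<persona_k>" format is hypothetical, not taken from the paper.
    """
    types = rng.choices(range(len(distribution)), weights=distribution, k=n)
    return [f"<persona_{k}>" for k in types]

# Toy data: a 3-entry codebook and four buyer embeddings for one merchant.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
embeddings = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [1.1, 0.8]]

types = [assign_buyer_type(e, codebook) for e in embeddings]
dist = merchant_distribution(types, len(codebook))
tokens = sample_persona_tokens(dist, 5, random.Random(0))

print(types)   # [0, 1, 2, 1]
print(dist)    # [0.25, 0.5, 0.25]
print(tokens)  # five tokens drawn in proportion to dist
```

Sampling from the per-merchant distribution, rather than uniformly over the codebook, is what keeps a simulated population's mix of buyer types close to the merchant's observed mix.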

RESULT

ScienceToStartup currently rates this 5.0/10 on the public viability pass. Evaluated on 8.37M buyers across 42 held-out live storefronts, SimPersona achieves 78% conversion-rate alignment with real buyers, exhibits interpretable behavioral variation across buyer types,…

WHY NOW

E-commerce Agents moved forward this cycle; last verified May 2026. Public score 5.0/10.


Opportunity summary

Score 5.0

Pain A framework that learns discrete buyer personas from clickstream data to enable LLM agents to navigate e-commerce storefronts more effectively and personalize shopping experiences.

Evidence 0 refs | 0 sources | 0% coverage

Blocker no shell-level blocker reported

Analysis summary

A framework that learns discrete buyer personas from clickstream data to enable LLM agents to navigate e-commerce storefronts more effectively and personalize shopping experiences.


Competitive landscape

A framework that learns discrete buyer personas from clickstream data to enable LLM agents to navigate e-commerce storefronts more effectively and personalize shopping experiences.

Segment

E-commerce Agents

Adoption evidence

No public code link in the paper record yet

Commercial read

5.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified


Published 2026-05-14 · arXiv: https://arxiv.org/abs/2605.14205



