ARXIV:2605.15417 · GENERATIVE MODEL TRAINING · SUBMITTED 18 MAY · 20:31 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

$f$-Trajectory Balance: A Loss Family for Tuning GFlowNets, Generative Models, and LLMs with Off- and On-Policy Data

Jake Fawkes · Jason Hartford · arXiv

A family of surrogate loss functions for training generative models, including GFlowNets, LLMs, and variational inference, that can utilize both off- and on-policy data.

Ship in 2-4 weeks · Score 5.0 · Evidence unverified

Opportunity summary

Pain: A family of surrogate loss functions for training generative models, including GFlowNets, LLMs, and variational inference, that can utilize both off- and on-policy data.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified

PROBLEM

A family of surrogate loss functions for training generative models, including GFlowNets, LLMs, and variational inference, that can utilize both off- and on-policy data. This loss has the property that when evaluated on-policy its…

METHOD

Full abstract

In GFlowNets and variational inference, it has been shown that the mean squared error between target and model log probabilities is an effective, low-variance surrogate loss for training generative models. This loss has the property that when evaluated on-policy its gradients correspond to those of the KL divergence, while off-policy it remains a valid loss with the same global minimizer. In this work, we demonstrate that this construction can be extended to the whole family of $f$-divergences, leading to a family of losses whose on-policy gradients are those of the corresponding $f$-divergence, but which retain the same global minimizer off-policy. Specifically, we show that the on-policy gradients lead to a one-to-one correspondence between translation-invariant loss functions on the target and model log probabilities, and $f$-divergences. This equivalence allows us to design new surrogate loss functions for tuning a wide class of generative models that inherit the properties of the corresponding $f$-divergence, such as being more mode covering, whilst being applicable to off-policy data. We apply our losses to a range of tasks, including classic synthetic examples, SynFlowNets for molecule discovery, and asynchronous large language model (LLM) tuning, demonstrating that our models retain their predicted properties on- and off-policy in a wide class of generative models.
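The KL special case described in the abstract can be checked numerically on a toy problem. The sketch below is an illustration and not the authors' code. It assumes a four-state categorical model `q = softmax(logits)` and a normalized target `p`, and every name in it (`surrogate_grad`, `kl_grad_fd`, `shift`, the example numbers) is hypothetical. It compares the gradient of the squared log-probability gap, with the sampling distribution held fixed, against a finite-difference gradient of KL(q || p). It also checks that adding a constant `shift` to the gap leaves the on-policy gradient unchanged, and that the off-policy gradient vanishes at `q = p`.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(logits, log_p):
    """KL(q || p) for q = softmax(logits) and a normalized target p."""
    q = softmax(logits)
    return sum(qi * (math.log(qi) - lp) for qi, lp in zip(q, log_p))

def surrogate_grad(logits, log_p, mu, shift=0.0):
    """Gradient w.r.t. logits of 0.5 * E_{x~mu}[(log q(x) - log p(x) + shift)^2].

    mu is treated as a fixed sampling distribution (no gradient flows through
    sampling). `shift` is a hypothetical constant offset, e.g. an unknown
    log-normalizer.
    """
    q = softmax(logits)
    n = len(logits)
    grad = [0.0] * n
    for x in range(n):
        delta = math.log(q[x]) - log_p[x] + shift
        for j in range(n):
            # d log q(x) / d logits[j] = 1[x == j] - q[j]
            grad[j] += mu[x] * delta * ((1.0 if x == j else 0.0) - q[j])
    return grad

def kl_grad_fd(logits, log_p, eps=1e-6):
    """Central finite-difference gradient of KL(q || p) w.r.t. logits."""
    grad = []
    for j in range(len(logits)):
        up = list(logits)
        dn = list(logits)
        up[j] += eps
        dn[j] -= eps
        grad.append((kl(up, log_p) - kl(dn, log_p)) / (2 * eps))
    return grad

# Toy numbers chosen only for illustration.
logits = [0.3, -1.2, 0.8, 0.1]
p = [0.1, 0.2, 0.3, 0.4]
log_p = [math.log(v) for v in p]
q = softmax(logits)

on_policy = surrogate_grad(logits, log_p, mu=q)                     # samples from q
on_policy_shifted = surrogate_grad(logits, log_p, mu=q, shift=2.5)  # constant offset
reference = kl_grad_fd(logits, log_p)                               # gradient of KL(q || p)

# Off-policy: uniform sampling distribution, model already equal to the target.
uniform = [0.25] * 4
off_policy_at_optimum = surrogate_grad(log_p, log_p, mu=uniform)

print("on-policy surrogate gradient:", on_policy)
print("finite-difference KL gradient:", reference)
print("off-policy gradient at q = p:", off_policy_at_optimum)
```

The on-policy agreement follows because the expected score under `q` is zero, so only the gap-weighted score term survives and any constant offset drops out. The $f$-divergence losses introduced in the paper are not reproduced here.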

RESULT

ScienceToStartup currently rates this 5.0/10 on the public viability pass. In this work, we demonstrate that this construction can be extended to the whole family of $f$-divergences, leading to a family of losses whose…

WHY NOW

Generative Model Training moved forward this cycle; last verified May 2026. Public score 5.0/10. Production flags indicate code availability.

Opportunity summary

Score: 5.0

Pain: A family of surrogate loss functions for training generative models, including GFlowNets, LLMs, and variational inference, that can utilize both off- and on-policy data.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: no shell-level blocker reported

Competitive landscape

Segment: Generative Model Training

Adoption evidence: No public code link in the paper record yet

Commercial read: 5.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

arXiv record: https://arxiv.org/abs/2605.15417 · Published 2026-05-14 21:02:07 UTC · Free to access
