Dataset distillation compresses a large dataset into a much smaller synthetic one while aiming to preserve the performance of models trained on it. Recent work optimizes both the compactness and the precision of the distilled set, and addresses problems such as redundancy and class imbalance. Techniques such as Quantization-aware Dataset Distillation and Fine-Grained Dataset Distillation make training more efficient by generating high-quality samples that reflect the underlying data distribution. This matters because training on very large datasets is resource-intensive: a distilled set with better quality and diversity can shorten training and improve the resulting models, which is useful for developers who want to streamline their training workflows.
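As a rough illustration of the idea, the sketch below distills a toy dataset by learning a few synthetic points per class whose mean matches the mean of the real points in that class. This is a deliberately simplified stand-in, not any of the methods named on this page: it matches only first moments in raw input space, whereas published methods match richer signals such as learned features, gradients, or training trajectories. The function names `class_means` and `distill` and all parameter values are hypothetical.

```python
import random


def class_means(points, labels):
    """Return {label: mean vector} computed over the real data."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def distill(points, labels, per_class=2, steps=200, lr=0.1, seed=0):
    """Learn `per_class` synthetic points per label by mean matching.

    Toy objective (an assumption for illustration): for each class,
    minimize ||mean(synthetic) - mean(real)||^2 with gradient descent.
    """
    rng = random.Random(seed)
    targets = class_means(points, labels)
    dim = len(points[0])
    # Start from random noise, as many distillation methods do.
    synthetic = {
        y: [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(per_class)]
        for y in targets
    }
    for _ in range(steps):
        for y, target in targets.items():
            samples = synthetic[y]
            mean = [sum(s[i] for s in samples) / per_class for i in range(dim)]
            # d/ds_i of ||mean - target||^2 is 2 * (mean_i - target_i) / per_class.
            for s in samples:
                for i in range(dim):
                    s[i] -= lr * 2.0 * (mean[i] - target[i]) / per_class
    return synthetic


if __name__ == "__main__":
    real_points = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]]
    real_labels = [0, 0, 1, 1]
    for label, samples in distill(real_points, real_labels).items():
        print(label, samples)
```

The synthetic points themselves stay different from one another; only their per-class average is pulled toward the real class mean, which is the sense in which the small set "summarizes" the large one in this toy setting.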
Dataset Distillation (DD) compresses large datasets into compact synthetic ones that maintain training performance. However, current methods mainly target sample reduction, with limited consideration ...
Dataset distillation (DD) compresses a large training set into a small synthetic set, reducing storage and training cost, and has shown strong results on general benchmarks. Decoupled DD further impro...
Spatio-temporal time series are widely used in real-world applications, including traffic prediction and weather forecasting. They are sequences of observations over extensive periods and multiple loc...
Dataset distillation (DD) aims to synthesize compact training sets that enable models to achieve high accuracy with significantly fewer samples. Recent diffusion-based DD methods commonly introduce se...
Dataset distillation often prioritizes global semantic proximity when creating small surrogate datasets for original large-scale ones. However, object semantics are inherently hierarchical. For exampl...
Training machine learning models on massive datasets is expensive and time-consuming. Dataset distillation addresses this by creating a small synthetic dataset that achieves the same performance as th...
Large-scale dataset distillation requires storing auxiliary soft labels that can be 30-40x larger on ImageNet-1K and 200x larger on ImageNet-21K than the condensed images, undermining the goal of data...
In this paper, we propose difficulty-guided sampling (DGS) to bridge the target gap between the distillation objective and the downstream task, therefore improving the performance of dataset distillat...
Dataset distillation (DD) aims to compress large-scale datasets into compact synthetic counterparts for efficient model training. However, existing DD methods exhibit substantial performance degradati...
Canonical route: /topics
Agent Handoff
Canonical ID dataset-distillation | Route /topic/dataset-distillation
REST example
curl https://sciencetostartup.com/api/v1/agent-handoff/topic/dataset-distillation
MCP example
{
  "tool": "search_papers",
  "arguments": {
    "query": "Dataset Distillation",
    "cluster": "Dataset Distillation"
  }
}
source_context
{
  "surface": "topic",
  "mode": "topic",
  "query": "Dataset Distillation",
  "normalized_query": "dataset-distillation",
  "route": "/topic/dataset-distillation",
  "paper_ref": null,
  "topic_slug": "dataset-distillation",
  "benchmark_ref": null,
  "dataset_ref": null
}
Use This Via API or MCP
Topic pages bundle paper counts, viability trends, author concentration, and top questions into one canonical surface your agents can reference before they open Signal Canvas or create a workspace.
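The REST and MCP examples above could be wrapped in Python roughly as follows. This is a minimal sketch using only the standard library: the endpoint path and the `search_papers` payload shape are taken from the examples on this page, while the helper names (`handoff_url`, `fetch_handoff`, `search_papers_call`) are hypothetical. The response format is not documented here, so the sketch returns the raw body as text rather than assuming a schema.

```python
import urllib.parse
import urllib.request

# Base path copied from the curl example on this page.
BASE_URL = "https://sciencetostartup.com/api/v1/agent-handoff/topic"


def handoff_url(topic_slug):
    """Build the agent-handoff URL for a topic slug."""
    return f"{BASE_URL}/{urllib.parse.quote(topic_slug, safe='')}"


def fetch_handoff(topic_slug, timeout=10):
    """GET the handoff resource and return the body as text.

    Assumption: the endpoint answers a plain GET without authentication,
    as the curl example suggests. The body is returned undecoded beyond
    UTF-8 because its schema is not specified on this page.
    """
    with urllib.request.urlopen(handoff_url(topic_slug), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def search_papers_call(query, cluster):
    """Build the MCP tool-call payload shown in the MCP example."""
    return {"tool": "search_papers", "arguments": {"query": query, "cluster": cluster}}


if __name__ == "__main__":
    print(handoff_url("dataset-distillation"))
    print(search_papers_call("Dataset Distillation", "Dataset Distillation"))
```

Calling `fetch_handoff("dataset-distillation")` performs the actual network request; the two builder functions are side-effect free and can be used to check the URL and payload before sending anything.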