Content moderation research is shifting toward detecting illicit activity and harmful behavior across diverse platforms. Recent work applies machine learning techniques such as in-context learning and vision-language models to improve detection while reducing the need for large labeled datasets. These approaches aim to generalize better to new threats and to identify harmful content more accurately in real time. As online environments change quickly, this matters to builders creating safer digital spaces, because it enables proactive rather than reactive moderation that can adapt to changing user behavior and platform policies.
Illicit online promotion is a persistent threat that evolves to evade detection. Existing moderation systems remain tethered to platform-specific supervision and static taxonomies, a reactive paradigm...
Social Virtual Reality (VR) platforms provide immersive social experiences but also expose users to serious risks of online harassment. Existing safety measures are largely reactive, while proactive s...
Online marketplaces, while revolutionizing global commerce, have inadvertently facilitated the proliferation of illicit activities, including drug trafficking, counterfeit sales, and cybercrimes. Trad...
Internet memes have become pervasive carriers of digital culture on social platforms. However, their heavy reliance on metaphors and sociocultural context also makes them subtle vehicles for harmful c...
Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a fi...
Online content moderation is essential for maintaining a healthy digital environment, and reliance on AI for this task continues to grow. Consider a user comment using national stereotypes to insult a...
Framing theory posits that how information is presented shapes audience responses, but computational work has largely ignored audience reactions. While recent work showed that article framing systemat...
We present KidsNanny, a two-stage multimodal content moderation architecture for child safety. Stage 1 combines a vision transformer (ViT) with an object detector for visual screening (11.7 ms); outpu...
Detecting hate speech in memes is challenging due to their multimodal nature and subtle, culturally grounded cues such as sarcasm and context. While recent vision-language models (VLMs) enable joint r...
Canonical route: /topics
Agent Handoff
Canonical ID content-moderation | Route /topic/content-moderation
REST example
curl https://sciencetostartup.com/api/v1/agent-handoff/topic/content-moderation
MCP example
{
  "tool": "search_papers",
  "arguments": {
    "query": "Content Moderation",
    "cluster": "Content Moderation"
  }
}
source_context
{
  "surface": "topic",
  "mode": "topic",
  "query": "Content Moderation",
  "normalized_query": "content-moderation",
  "route": "/topic/content-moderation",
  "paper_ref": null,
  "topic_slug": "content-moderation",
  "benchmark_ref": null,
  "dataset_ref": null
}
Use This Via API or MCP
Topic pages bundle paper counts, viability trends, author concentration, and top questions into one canonical surface your agents can reference before they open Signal Canvas or create a workspace.
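For agents that build these requests programmatically, the following is a minimal Python sketch of how the REST URL, the MCP tool call, and the source_context object from the Agent Handoff section might be assembled from a topic name. The helper names (slugify, handoff_url, mcp_search_call, source_context) are hypothetical, and the slug rule (lowercase, hyphen-separated) is an assumption inferred from the single "Content Moderation" to "content-moderation" example on this page. The response format of the endpoint is not documented here, so the sketch stops at building the requests.

```python
import json
import re

# Base path copied from the REST example on this page.
BASE_URL = "https://sciencetostartup.com/api/v1/agent-handoff"


def slugify(query: str) -> str:
    """Assumed normalization: lowercase, runs of non-alphanumerics become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", query.lower()).strip("-")


def handoff_url(query: str) -> str:
    """Build the topic handoff URL in the shape shown in the REST example."""
    return f"{BASE_URL}/topic/{slugify(query)}"


def mcp_search_call(query: str) -> dict:
    """Build the MCP payload in the shape shown in the MCP example.

    Reusing the query as the cluster mirrors the example; it is an assumption
    that this holds for other topics.
    """
    return {"tool": "search_papers", "arguments": {"query": query, "cluster": query}}


def source_context(query: str) -> dict:
    """Build a source_context object with the same fields as the one shown."""
    slug = slugify(query)
    return {
        "surface": "topic",
        "mode": "topic",
        "query": query,
        "normalized_query": slug,
        "route": f"/topic/{slug}",
        "paper_ref": None,
        "topic_slug": slug,
        "benchmark_ref": None,
        "dataset_ref": None,
    }


if __name__ == "__main__":
    topic = "Content Moderation"
    print(handoff_url(topic))
    print(json.dumps(mcp_search_call(topic), indent=2))
    print(json.dumps(source_context(topic), indent=2))
```

Keeping the builders as pure functions means the request shapes can be checked without network access; an actual HTTP call or MCP client invocation would be layered on top.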