A similarity-guided active learning framework strategically queries an oracle to refine decision boundaries efficiently, particularly for rare and diverse anomalies in imbalanced datasets. It combines novel strategies such as normal-like expansion and anomaly-like prioritization with a specialized similarity measure that exploits the geometric structure of the feature space.
This framework helps AI systems find rare and unusual events, such as cyberattacks, in very large datasets, even when those events make up only a tiny fraction of the data. It does this by choosing which data points to ask an expert to label, using the similarity between data points to guide those choices, which reduces the time and effort spent on labeling.
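As a rough illustration of the selection idea, the sketch below shows one possible reading of the two strategies named above. It is not the framework's actual algorithm: cosine similarity stands in for the specialized similarity measure, and the function `select_queries`, its `budget` and `normal_threshold` parameters, and the scoring rule are all hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_sim(x, refs):
    """Highest similarity between x and any reference point (0.0 if refs is empty)."""
    return max((cosine(x, r) for r in refs), default=0.0)


def select_queries(pool, normals, anomalies, budget=1, normal_threshold=0.99):
    """Split an unlabeled pool into oracle queries and auto-accepted normals.

    Assumed interpretation, not taken from the source:
    - normal-like expansion: points nearly identical to a labeled normal are
      treated as normal without spending an oracle query;
    - anomaly-like prioritization: the remaining points are ranked by how much
      closer they are to labeled anomalies than to labeled normals.
    """
    auto_normal = []
    candidates = []
    for i, x in enumerate(pool):
        s_norm = max_sim(x, normals)
        s_anom = max_sim(x, anomalies)
        if s_norm >= normal_threshold and s_norm > s_anom:
            auto_normal.append(i)
        else:
            candidates.append((s_anom - s_norm, i))
    # Highest anomaly-likeness first; ties broken by pool index.
    candidates.sort(key=lambda t: (-t[0], t[1]))
    queries = [i for _, i in candidates[:budget]]
    return queries, auto_normal


if __name__ == "__main__":
    normals = [[1.0, 0.0]]      # labeled normal examples
    anomalies = [[0.0, 1.0]]    # labeled anomaly examples
    pool = [[2.0, 0.0], [0.1, 1.0], [1.0, 1.0]]  # unlabeled points
    queries, auto_normal = select_queries(pool, normals, anomalies, budget=1)
    print("ask the oracle about:", queries)
    print("treated as normal without a query:", auto_normal)
```

In a full loop, the oracle's answers would be appended to `normals` or `anomalies` and the selection repeated until the labeling budget runs out. The threshold trades labeling cost against the risk of mislabeling an anomaly that happens to resemble normal data.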