Prototype for novelty detection in a news stream, based on data parsed from Event Registry. The prototype uses QMiner's anomaly detection stream aggregate, which is based on the k-nearest neighbors algorithm.
Import all relevant libraries.
let qm = require('qminer');
let fs = require('fs');
let readline = require('readline');
let eachline = require('eachline');
Define the storage schema. We define one store called 'articles' ...
We use a 'fake' time series tick stream aggregator on the 'articles' store to trigger the feature space aggregator. This is why the 'articles' store has a 'fake' numeric field 'Number'.
let aggrT = {
    name: "tickAggr",
    type: "timeSeriesTick",
    store: "articles",
    timestamp: "Time",
    value: "Number"
};
// Create the tick aggregator
let tickAggr = articles.addStreamAggr(aggrT);
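The role of the 'fake' field can be illustrated with a small sketch of how a record might be prepared before it is pushed to the store. The helper `withTickFields` is hypothetical and not part of QMiner. The date format written to 'Time' is an assumption, and the constant 1 is an arbitrary placeholder.

```javascript
// Hypothetical helper (not a QMiner API): returns a copy of `article` with the
// two fields that the tick aggregator reads, 'Time' and 'Number'.
function withTickFields(article, publishedAt) {
    return Object.assign({}, article, {
        // Assumed date format: ISO 8601 without milliseconds or time zone suffix.
        Time: publishedAt.toISOString().slice(0, 19),
        // Constant placeholder; the value carries no meaning, it only gives
        // the tick aggregator a numeric field to read.
        Number: 1
    });
}
```

The input object is left unmodified, so the same parsed article can be reused elsewhere without the extra fields.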
Novelty detection is implemented using the nearest neighbors anomaly detector aggregator. The aggregator is attached to the 'articles' store and takes timestamped feature vectors as input. The timestamp is provided by the tick aggregator, while the feature vector is provided by the feature space aggregator.
let aggrAD = {
    name: 'AnomalyDetectorAggr',
    type: 'nnAnomalyDetector',
    inAggrSpV: 'ftrSpaceAggr',
    inAggrTm: 'tickAggr',
    rate: [0.2, 0.05, 0.01],
    windowSize: 200
};
// Create the anomaly detection aggregator
let anomaly = articles.addStreamAggr(aggrAD);
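To give an intuition for what the 'rate' and 'windowSize' parameters control, here is a minimal plain JavaScript sketch of a windowed nearest neighbor detector. It is not QMiner's implementation. The class `WindowedNnDetector`, the use of dense vectors with squared Euclidean distance, and the reading of each rate as the expected fraction of records flagged at that severity level are all assumptions made for illustration.

```javascript
// Illustrative sketch only; this is NOT how QMiner implements 'nnAnomalyDetector'.

// Squared Euclidean distance between two dense vectors of equal length.
function squaredDistance(a, b) {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
        const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical detector that keeps the most recent `windowSize` vectors.
class WindowedNnDetector {
    constructor(rates, windowSize) {
        this.rates = rates;            // e.g. [0.2, 0.05, 0.01], from most to least permissive
        this.windowSize = windowSize;  // number of recent vectors kept
        this.window = [];              // recent feature vectors
        this.nnDistances = [];         // nearest neighbor distance recorded for each of them
    }

    // Distance from `vector` to its nearest neighbor in the window.
    nearestDistance(vector) {
        let best = Infinity;
        for (const other of this.window) {
            best = Math.min(best, squaredDistance(vector, other));
        }
        return best;
    }

    // Distance above which roughly a `rate` fraction of recorded distances fall.
    threshold(rate) {
        const sorted = this.nnDistances.filter(Number.isFinite).sort((x, y) => x - y);
        if (sorted.length === 0) { return Infinity; }
        const idx = Math.min(sorted.length - 1, Math.floor((1 - rate) * sorted.length));
        return sorted[idx];
    }

    // Returns a severity from 0 (not novel) to rates.length (most novel),
    // then adds the vector to the window and drops the oldest if needed.
    score(vector) {
        const dist = this.nearestDistance(vector);
        let severity = 0;
        if (Number.isFinite(dist)) {
            for (let i = 0; i < this.rates.length; i++) {
                if (dist > this.threshold(this.rates[i])) { severity = i + 1; }
            }
        }
        this.window.push(vector);
        this.nnDistances.push(dist);
        if (this.window.length > this.windowSize) {
            this.window.shift();
            this.nnDistances.shift();
        }
        return severity;
    }
}
```

With rates [0.2, 0.05, 0.01], an article whose nearest neighbor is farther away than almost all recently observed nearest neighbor distances receives the highest severity, while an article close to something already in the window receives 0. A larger window makes the detector remember older articles for longer, so a recurring topic stays "known" for more records.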