Please use this identifier to cite or link to this item:
https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/2025
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Sisodia, Deepti | - |
dc.contributor.author | Sisodia, Dilip Singh | - |
dc.date.accessioned | 2023-11-09T09:12:07Z | - |
dc.date.available | 2023-11-09T09:12:07Z | - |
dc.date.issued | 2023-05-29 | - |
dc.identifier.issn | 1882-7055 | - |
dc.identifier.issn | 0288-3635 | - |
dc.identifier.uri | https://doi.org/10.1007/s00354-023-00218-1 | - |
dc.identifier.uri | http://gnanaganga.inflibnet.ac.in:8080/jspui/handle/123456789/2025 | - |
dc.description.abstract | In online advertising, a publisher whose actual status label changes with every generated click is exhibiting suspicious behaviour. Furthermore, only a small proportion of the clicks generated by publishers are invalid, which skews the class distribution in the dataset and poses a challenge for conventional classification methods, as they become biased towards the majority class. This suspicious publisher behaviour, combined with the uneven class distribution, adversely affects classifier performance and increases model complexity. Developing machine-learning methods capable of producing effective predictive models for detecting fraudulent publishers is therefore pivotal. This paper proposes a novel stacked generalization framework comprising two stacked generalization architectures, one for resampling and one for classification. The framework uses generalizers to improve the learning model’s performance in two steps. First, it reduces the error rate of the algorithms in order to reduce bias in the learning set. Second, the outputs of the level-0 generalizers are fed as input to the level-1 generalizer, which combines their predictions to improve predictive performance. Extensive experiments are conducted on the FDMA 2012 user-click dataset using ten-fold cross-validation. The generality of the proposed architecture is assessed through experiments on eight other highly imbalanced benchmark datasets, with performance measured using average precision, recall, and F1-score. The results empirically demonstrate the superiority of the proposed architecture in predicting publisher behaviour and classifying publishers as legitimate or illegitimate. | en_US |
dc.language.iso | en | en_US |
dc.publisher | New Generation Computing | en_US |
dc.subject | Click fraud | en_US |
dc.subject | Class imbalance | en_US |
dc.subject | Data sampling algorithms | en_US |
dc.subject | Base models | en_US |
dc.subject | Meta-model | en_US |
dc.subject | Stacked generalization | en_US |
dc.title | Stacked Generalization Architecture for Predicting Publisher Behaviour from Highly Imbalanced User-Click Data Set for Click Fraud Detection | en_US |
dc.type | Article | en_US |
Appears in Collections: | Journal Articles |
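
The abstract reports precision, recall, and F1-score rather than plain accuracy because the click data is highly imbalanced. The sketch below illustrates why, under stated assumptions: the helper `precision_recall_f1`, the 95/5 class split, and the toy predictions are all hypothetical and are not taken from the paper or from the FDMA 2012 data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the `positive` class.

    Hypothetical helper for illustration; a ratio with a zero denominator
    is reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Assumed toy labels: 95 legitimate publishers (0) and 5 fraudulent ones (1).
y_true = [0] * 95 + [1] * 5

# A classifier biased towards the majority class predicts "legitimate" for all.
y_majority = [0] * 100
accuracy_majority = sum(t == p for t, p in zip(y_true, y_majority)) / len(y_true)
scores_majority = precision_recall_f1(y_true, y_majority)

# A classifier that flags 4 of the 5 fraudulent publishers plus 2 false alarms.
y_better = [1, 1] + [0] * 93 + [1, 1, 1, 1, 0]
scores_better = precision_recall_f1(y_true, y_better)

print("majority-only accuracy:", accuracy_majority)
print("majority-only precision/recall/F1:", scores_majority)
print("second classifier precision/recall/F1:", scores_better)
```

In this toy setting the majority-only classifier looks accurate yet never identifies a fraudulent publisher, which is the failure mode that per-class precision, recall, and F1 expose.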