Please use this identifier to cite or link to this item: https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/2025
Full metadata record
DC Field | Value | Language
dc.contributor.author | Sisodia, Deepti | -
dc.contributor.author | Sisodia, Dilip Singh | -
dc.date.accessioned | 2023-11-09T09:12:07Z | -
dc.date.available | 2023-11-09T09:12:07Z | -
dc.date.issued | 2023-05-29 | -
dc.identifier.issn | 1882-7055 | -
dc.identifier.issn | 0288-3635 | -
dc.identifier.uri | https://doi.org/10.1007/s00354-023-00218-1 | -
dc.identifier.uri | http://gnanaganga.inflibnet.ac.in:8080/jspui/handle/123456789/2025 | -
dc.description.abstract | In online advertising, a change in a publisher’s actual status label with every generated click indicates suspicious behaviour by that publisher. Furthermore, only a small proportion of the clicks generated by publishers are invalid, which skews the class distribution in the dataset and poses a challenge for conventional classification methods, as they become biased towards the majority class. This suspicious publisher behaviour, combined with the uneven class distribution, adversely affects classifier performance and increases model complexity. Developing machine-learning methods that produce effective predictive models for detecting fraudulent publishers is therefore pivotal. This paper proposes a novel framework comprising two stacked generalization architectures, one for resampling and one for classification. The framework uses generalizers to improve the learning model’s performance in two steps. First, the error rate of the algorithms is reduced in order to reduce the bias in the learning set. Second, the results obtained from the level-0 generalizers are fed as input to the level-1 generalizer, whose stacked integrated output combines the predictions to improve predictive performance. Extensive experiments are conducted on the FDMA 2012 user-click dataset using ten-fold cross-validation. The generality of the proposed architecture is assessed through experiments on eight other highly imbalanced benchmark datasets, with performance measured using average precision, recall, and F1-score. The results empirically show the superiority of the proposed architecture in predicting publisher behaviour and classifying publishers as legitimate or illegitimate. | en_US
dc.language.iso | en | en_US
dc.publisher | New Generation Computing | en_US
dc.subject | Click fraud | en_US
dc.subject | Class imbalance | en_US
dc.subject | Data sampling algorithms | en_US
dc.subject | Base models | en_US
dc.subject | Meta-model | en_US
dc.subject | Stacked generalization | en_US
dc.title | Stacked Generalization Architecture for Predicting Publisher Behaviour from Highly Imbalanced User-Click Data Set for Click Fraud Detection | en_US
dc.type | Article | en_US
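
The abstract reports precision, recall, and F1-score rather than plain accuracy. The minimal Python sketch below illustrates why that choice matters for skewed click data. It is not the authors' code: the label encoding (0 = legitimate, 1 = fraudulent), the 95/5 class split, and both sets of predictions are hypothetical values chosen only for illustration, and it computes scores for a single class rather than the averaged scores the abstract mentions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (e.g. no positive predictions at all).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical skewed sample: 95 legitimate publishers (0), 5 fraudulent (1).
y_true = [0] * 95 + [1] * 5

# Baseline that labels every publisher as legitimate.
always_legit = [0] * 100

# Hypothetical detector: flags 2 legitimate publishers by mistake
# and catches 4 of the 5 fraudulent ones.
detector = [0] * 93 + [1] * 2 + [1] * 4 + [0] * 1

for name, y_pred in [("always_legit", always_legit), ("detector", detector)]:
    p, r, f = precision_recall_f1(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this made-up sample the all-legitimate baseline reaches high accuracy while its recall on the fraudulent class is zero, which is the bias towards the larger class that the abstract describes.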
Appears in Collections: Journal Articles

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.