Please use this identifier to cite or link to this item: https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/2542
Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Singh, Lokesh | |
| dc.contributor.author | Sisodia, Deepti | |
| dc.contributor.author | Shashvat, Kumar | |
| dc.contributor.author | Kaur, Arshpreet | |
| dc.contributor.author | Sharma, Prakash Chandra | |
| dc.date.accessioned | 2023-12-18T09:45:34Z | |
| dc.date.available | 2023-12-18T09:45:34Z | |
| dc.date.issued | 2023 | |
| dc.identifier.citation | Chapter 13; pp. 221-254 | en_US |
| dc.identifier.isbn | 9781000917918 | |
| dc.identifier.isbn | 9781032392769 | |
| dc.identifier.uri | https://doi.org/10.1201/9781003415466-13 | |
| dc.identifier.uri | http://gnanaganga.inflibnet.ac.in:8080/jspui/handle/123456789/2542 | |
| dc.description.abstract | In the pay-per-click (PPC) model of online advertising, an advertiser pays the publisher for every click generated on a published advertisement, which creates an incentive for click fraud. Click fraud is deliberate clicking by a publisher on the advert. The highly skewed class distribution of the dataset makes the identification of fraudsters more challenging for current machine learning methods. This work therefore proposes a reliable click-fraud detection (CFD) system for the efficient investigation of fraudulent publishers. The proposed CFD system has several novel features. First, the problem of class imbalance is addressed using the synthetic minority oversampling technique (SMOTE) and random under-sampling with boosting (RUSBoost). Second, a novel Hybrid-Manifold Feature Subset Selection (H-MFSS) is proposed to obtain optimal informative features. Third, the gradient tree boosting (GTB) model addresses the challenges encountered in investigating and classifying the behavior of fraudsters from balanced and optimally selected user-click data. Experiments are conducted on FDMA2012 mobile advertising user-click data in dual mode: with all features (original data and data sampled through data sampling methods), and with selected features (original data and data sampled through data sampling methods). Classification bias towards the majority class is avoided by evaluating the models using average precision (AP), recall, i.e. sensitivity (SE), specificity (SP), and G-mean (GM) rather than accuracy. The efficacy of the proposed GTB model is further evaluated by comparing its performance with 12 other conventional machine learning models. The empirical results indicate that GTB generalizes well, achieving AP scores of 64.86% without sampling, 65.25% with RUSBoost, and 66.78% with SMOTE when using the significant selected features. The sampling methods and the selected optimal features yield a significant improvement in classification performance. © 2023 selection and editorial matter, Sulabh Bansal, Prakash Chandra Sharma, Abhishek Sharma and Jieh-Ren Chang; individual chapters, the contributors. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | CRC Press | en_US |
| dc.subject | Pay-per-click (PPC) | en_US |
| dc.subject | Online advertising | en_US |
| dc.subject | Fraudulent publishers | en_US |
| dc.subject | Click-fraud detection (CFD) | en_US |
| dc.subject | Gradient tree boosting (GTB) | en_US |
| dc.title | A Reliable Click-Fraud Detection System for the Investigation of Fraudulent Publishers in Online Advertising | en_US |
| dc.type | Book chapter | en_US |
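
The abstract in this record states that models are evaluated with sensitivity (SE), specificity (SP), and G-mean (GM) rather than accuracy because the classes are highly skewed. The following is a minimal sketch of why that matters, not the chapter's own code: it assumes the common definition of G-mean as the square root of SE times SP, and the toy labels, the function names `confusion_counts` and `imbalance_metrics`, and the 95/5 class split are all hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = fraud, 0 = legitimate)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def imbalance_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (SE), specificity (SP), and G-mean (GM)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(y_true)
    # Assumed definition: geometric mean of SE and SP.
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "SE": sensitivity,
        "SP": specificity,
        "GM": g_mean,
    }


# Hypothetical toy data: 95 legitimate publishers, 5 fraudulent ones.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that labels every publisher as legitimate.
all_legit = [0] * 100

metrics = imbalance_metrics(y_true, all_legit)
print(metrics)
```

On this toy data the degenerate classifier scores high accuracy while its sensitivity and G-mean are zero, which is the kind of majority-class bias the abstract says these metrics are meant to expose.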
Appears in Collections: Book / Book Chapters
