A Reliable Click-Fraud Detection System For The Investigation of Fraudulent Publishers In Online Advertising

Singh, Lokesh; Sisodia, Deepti; Shashvat, Kumar; Kaur, Arshpreet; Sharma, Prakash Chandra

Please use this identifier to cite or link to this item: https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/2542

Full metadata record

DC Field	Value	Language
dc.contributor.author	Singh, Lokesh	-
dc.contributor.author	Sisodia, Deepti	-
dc.contributor.author	Shashvat, Kumar	-
dc.contributor.author	Kaur, Arshpreet	-
dc.contributor.author	Sharma, Prakash Chandra	-
dc.date.accessioned	2023-12-18T09:45:34Z	-
dc.date.available	2023-12-18T09:45:34Z	-
dc.date.issued	2023	-
dc.identifier.citation	Chapter 13; pp. 221-254	en_US
dc.identifier.isbn	9781000917918	-
dc.identifier.isbn	9781032392769	-
dc.identifier.uri	https://doi.org/10.1201/9781003415466-13	-
dc.identifier.uri	http://gnanaganga.inflibnet.ac.in:8080/jspui/handle/123456789/2542	-
dc.description.abstract	In the pay-per-click (PPC) model of online advertising, an advertiser pays an amount to the publishers for every click generated on the published advertisement, which results in click fraud. Click fraud is deliberate clicking by a publisher on the advert. The highly skewed class distribution of the dataset makes the identification of fraudsters more challenging for current machine learning methods. This work thus proposes a reliable click-fraud detection (CFD) system for the efficient investigation of fraudulent publishers. The proposed CFD system has many novel features. First, the problem of class imbalance is overcome using the synthetic minority oversampling technique (SMOTE) and random under-sampling (RUSBoost). Second, a novel Hybrid-Manifold Feature Subset Selection (H-MFSS) is proposed to obtain optimal informative features. Third, the gradient tree boosting (GTB) model addresses the challenges encountered in investigating and classifying the behavior of fraudsters from balanced and optimally selected user-click data. Experiments are conducted on FDMA2012 mobile advertising user-click data in dual mode: with all features (original data and data sampled through data sampling methods); and with selected features (original data and data sampled through data sampling methods). Classification bias towards the majority class is avoided by evaluating the performance of the models using the average precision (AP), recall (SE), specificity (SP), and G-mean (GM) metrics rather than accuracy. The efficacy of the proposed GTB model is further evaluated by comparing the performance with 12 other conventional machine learning models. The empirical results prove that GTB generalizes well with an achieved AP score of 64.86% without sampling, 65.25% with RUSBoost and 66.78% with SMOTE using significant selected features. A significant improvement in the classification performance is achieved with the impact of sampling methods and selected optimal features. © 2023 selection and editorial matter, Sulabh Bansal, Prakash Chandra Sharma, Abhishek Sharma and Jieh-Ren Chang; individual chapters, the contributors.	en_US
dc.language.iso	en	en_US
dc.publisher	CRC Press	en_US
dc.subject	Pay-per-click (PPC)	en_US
dc.subject	Online advertising	en_US
dc.subject	Fraudulent publishers	en_US
dc.subject	Click-fraud detection (CFD)	en_US
dc.subject	Gradient tree boosting (GTB)	en_US
dc.title	A Reliable Click-Fraud Detection System For The Investigation of Fraudulent Publishers In Online Advertising	en_US
dc.type	Book chapter	en_US
Appears in Collections:	Book/ Book Chapters

Files in This Item:

There are no files associated with this item.

Alliance University, Bengaluru

Institutional Repository