
Using Data Mining for Screening Tax Returns


References
1. Roung-Shiunn Wu, C.S. Ou, Hui-ying Lin, She-I Chang, David C. Yen, Using
data mining technique to enhance tax evasion detection performance,
Expert Systems with Applications, Volume 39, Issue 10, August 2012, Pages
8769-8777, ISSN 0957-4174, http://dx.doi.org/10.1016/j.eswa.2012.01.204.
2. Keith Blackburn, Niloy Bose, Salvatore Capasso, Tax evasion, the
underground economy and financial development, Journal of Economic
Behavior & Organization, Volume 83, Issue 2, July 2012, Pages 243-253,
ISSN 0167-2681, http://dx.doi.org/10.1016/j.jebo.2012.05.019.
3. Show-Jane Yen, Yue-Shi Lee, An efficient data mining approach for
discovering interesting knowledge from customer transactions, Expert
Systems with Applications, Volume 30, Issue 4, May 2006, Pages 650-657,
ISSN 0957-4174, http://dx.doi.org/10.1016/j.eswa.2005.07.035.
Problems
Many individuals try to evade tax
Large corporations as well as smaller ones do the same [3]
Sources of fraud
Unreported income
Abuse of tax shelters
Several solutions have been proposed and used to
detect fraudulent tax activity
Some are manual and others use data mining [2]
This presentation covers the data mining solution of [1]
Abusive tax shelters
Non-declaration of income
A lot has been done about this

Abusive Tax Shelters
Taxpayer makes some huge gain
Tax advisor (promoter) helps to exploit a
loophole in the tax law
They set up a partnership together
Taxpayer buys call options and transfers them
to the partnership
Call option is sold by the taxpayer
Ignores liability
Sale results in the taxpayer claiming a loss of the
same amount
Loss offsets the original gains
[Diagram: Partnership, S Corporation, Taxpayer]
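The offsetting step above can be written out as simple arithmetic. This is a minimal sketch with made-up figures; the function name `net_taxable_gain` and the amounts are assumptions for illustration, not values from the IRS data.

```python
def net_taxable_gain(original_gain, claimed_loss):
    """Gain left to tax after the claimed loss is netted against it."""
    return original_gain - claimed_loss

# Hypothetical figures, not from the source data.
gain = 10_000_000
claimed_loss = 10_000_000  # the shelter claims a loss of the same amount

reported = net_taxable_gain(gain, claimed_loss)  # what the return shows
actual = net_taxable_gain(gain, 0)               # if the loss is disallowed
print(reported, actual)
```

With these figures the return reports no net gain, while the full gain would be taxable if the claimed loss were disallowed.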
Data Set
Source: Internal Revenue Service
Data Entities

Entity Name                  Instances in 2003 (mil)   Number of Attributes
High-income taxpayer         1.9                       1000
Trust                        3.5                       200
Partnership                  2.5                       100
Sub-chapter S Corporation    3.4                       100
Solution
Built a single-class model
using a Support Vector Machine (SVM)
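To illustrate the idea of a single-class model, here is a simplified stand-in, not the SVM used in [1]. It scores a candidate by its mean RBF-kernel similarity to examples of one class, which is the form a one-class SVM decision function takes if every training example gets equal weight. The feature vectors, `gamma`, and all names are hypothetical; the slides do not state which class the model was trained on, so whether a high or low score marks a return as suspicious is left open.

```python
import math

def rbf(a, b, gamma=0.5):
    """RBF kernel between two equal-length feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

def similarity_score(train, x, gamma=0.5):
    """Mean kernel similarity of x to the single training class."""
    return sum(rbf(t, x, gamma) for t in train) / len(train)

# Hypothetical feature vectors for the one class of training examples.
train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.0)]

# Hypothetical candidates to be scored and ranked.
candidates = {"A": (1.0, 1.05), "B": (4.0, 3.5), "C": (1.5, 1.4)}

scores = {name: similarity_score(train, x) for name, x in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # most similar first
print(ranked)
```

Ranking by a continuous score, rather than a yes/no label, is what lets an analyst work down a list from the most to the least similar cases.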
Results
Successfully identified and ranked some transactions as fraudulent.
Revealed $200 mil of previously undetected tax shelter losses
Although 90% accuracy was achieved,
some transactions were missed
An improved model was built by relaxing the target criteria
Based on expert domain information
Resulted in the Shelter Risk Function (SRF)
Improved identification of further losses.
Problem 2
What about groups of high-income individuals working
together through other promoters?
[Diagram: high-income individuals; entities (partnerships, organizations); promoter selling the tax shelter fraud to individuals]
Solution
Modify SRF to have groups of SRV
New model: Promoter Risk Function
To speed up operations,
irrelevant links in the mined relationships were pruned
Filtering and merging of groups
Based on promoters' levels of support in the group
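The pruning and merging steps above might look roughly like the sketch below. The slides do not give the actual rules, so everything here is an assumption: the link weights stand in for a promoter's level of support, `MIN_WEIGHT` is an invented threshold, and groups are merged into meta-groups when they share an individual.

```python
from collections import defaultdict

# Hypothetical mined links: (individual, promoter, support weight).
links = [
    ("ind1", "promA", 0.9),
    ("ind2", "promA", 0.8),
    ("ind3", "promB", 0.7),
    ("ind3", "promA", 0.1),   # weak link
    ("ind4", "promB", 0.6),
    ("ind5", "promC", 0.05),  # weak link
]
MIN_WEIGHT = 0.2  # assumed pruning threshold

def prune(links, min_weight):
    """Drop links whose support weight is below the threshold."""
    return [link for link in links if link[2] >= min_weight]

def group_by_promoter(links):
    """Collect the individuals linked to each promoter."""
    groups = defaultdict(set)
    for individual, promoter, _ in links:
        groups[promoter].add(individual)
    return dict(groups)

def merge_overlapping(groups):
    """Merge promoter groups that share at least one individual."""
    merged = []  # pairwise-disjoint (promoters, individuals) pairs
    for promoter, members in groups.items():
        promoters, individuals = {promoter}, set(members)
        rest = []
        for p, m in merged:
            if m & individuals:
                promoters |= p
                individuals |= m
            else:
                rest.append((p, m))
        merged = rest + [(promoters, individuals)]
    return merged

groups = group_by_promoter(prune(links, MIN_WEIGHT))
meta_groups = merge_overlapping(groups)
print(meta_groups)
```

In this toy data, pruning the weak `ind3`-`promA` link keeps the two promoter groups separate; without pruning they would be merged into one meta-group, which shows why irrelevant links matter for more than speed.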
Overall Results
Found 500 meta-groups of potential promoters and
individuals (SSNs) involved in the tax shelter fraud
Savings of $5 bil of sheltered income
50% of the amount was associated with the top 20% of the
groups and meta-groups identified
The process is automated and not as laborious as the
other manual processes
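The concentration figure above (a share of the total amount held by the top fraction of groups) can be computed as in the following sketch. The per-group amounts are invented so that the result matches the 50% / 20% pattern; they are not the study's data, and `top_share` is a hypothetical helper name.

```python
import math

def top_share(amounts, top_fraction=0.2):
    """Share of the total held by the largest top_fraction of groups."""
    ranked = sorted(amounts, reverse=True)
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical sheltered amounts per group (arbitrary units).
amounts = [40, 10, 10, 10, 10, 5, 5, 5, 3, 2]

share = top_share(amounts)
print(share)
```

A high share for a small top fraction suggests that reviewing only the highest-ranked groups captures a large part of the sheltered amount.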
Questions?
