Leveraging Data Analytics for Fraud Prevention and Detection

by Melissa A. Dardani, CPA, MAcc, MD Advisory – September 24, 2021

With the emergence of machine learning and artificial intelligence in recent years, data analytics has taken on new roles in business. When preventative controls utilize data analytics, patterns indicative of error or fraud can be detected earlier, causing less harm than if they were uncovered during an annual audit or investigation. This is supported by the U.S. Department of Justice’s 2020 update to the “Evaluation of Corporate Compliance Programs” guidance, which is relied upon by U.S. prosecutors and emphasizes a company’s access to and use of continuous operational data and information. The challenge for fraud prevention and detection professionals then becomes finding ways to streamline useful analytical procedures and to interpret and communicate results to key stakeholders.

Planning and Preparation

In the world of fraud prevention and detection, the underlying goal is to discover patterns or anomalies indicative of fraudulent activity or areas of high risk. Understanding the fraud risk profile of the organization is key to ensuring attention is appropriately allocated to areas within the data infrastructure. This data infrastructure — which defines how information flows through the organization: where the underlying data originates, where it lives and how it is used — must also be understood. When undergoing the planning stage of data analysis, consider the organization in the following layers:

[Figure: “Leveraging Data Analytics for Fraud Prevention” — graphic depicting the organization’s layers]

The top layer must take the two underlying layers into consideration when determining potential schemes that could occur within the organization.

The next step involves identifying and obtaining the best data to support or refute that fraud is occurring or has occurred in the past. To do so, the analyst should design a series of questions based on the plan and fraud risk profile, and then determine the data necessary to address these queries. For example, consider the relevant risk for ghost employees on the company’s payroll. The preparation questions may include, “Are we paying any person who doesn’t currently work for the company?” or “Are we paying any person who does not exist?” Relevant data sources to answer these questions may include human resource employee records, payroll registers, employee time sheets or other reports, and canceled payroll check data. If access to third-party (payroll company) payroll registers is made available, totals should be reconciled to the disbursements contained in the company’s accounting records.
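The ghost-employee question above can be sketched as a simple cross-reference between the payroll register and active HR records. This is a minimal illustration, not a prescribed procedure; the field names (emp_id, status) are illustrative assumptions, and real systems will require mapping to their own schemas.

```python
# Hypothetical sketch: flag payroll entries with no matching active HR record.
# Field names (emp_id, status, net_pay) are illustrative assumptions.

def find_ghost_employees(payroll_register, hr_records):
    """Return payroll entries whose employee ID has no 'active' HR record."""
    active_ids = {r["emp_id"] for r in hr_records if r["status"] == "active"}
    return [p for p in payroll_register if p["emp_id"] not in active_ids]

payroll = [
    {"emp_id": "E001", "name": "A. Smith", "net_pay": 2100.00},
    {"emp_id": "E999", "name": "B. Ghost", "net_pay": 1850.00},
]
hr = [
    {"emp_id": "E001", "status": "active"},
    {"emp_id": "E002", "status": "terminated"},
]

flagged = find_ghost_employees(payroll, hr)
for entry in flagged:
    print(entry["emp_id"], entry["name"])  # E999 B. Ghost
```

Any flagged entry is a lead to investigate, not proof of fraud — terminated employees with legitimate final paychecks, for example, would need to be ruled out.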

Obtaining good data is half the battle, especially if employees are trying to cover up potential bad acts. For this reason, the analyst must understand where data comes from, how it gets there, how it is used and who has control and authority over it. The analyst should either obtain the appropriate access to the data directly or supervise the process of extracting the data from company systems. Data veracity checks should be established, anchored to control totals that can confirm the information is consistent across the data infrastructure.
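A control-total check can be as simple as tying the total of a third-party payroll register to the disbursements recorded in the company's own books. The sketch below is an illustrative assumption of how such a check might be structured; the tolerance value and amounts are fabricated for demonstration.

```python
# Hypothetical control-total check: the third-party payroll register total
# should tie to the disbursements in the company's accounting records.
# The 0.01 tolerance is an illustrative assumption for rounding differences.

def reconcile_totals(register_total, gl_total, tolerance=0.01):
    """Return (reconciled?, absolute variance) between two control totals."""
    diff = round(abs(register_total - gl_total), 2)
    return diff <= tolerance, diff

register_total = sum([2100.00, 1850.00, 3200.50])  # from payroll register
gl_total = 7150.50                                  # from accounting records

ok, diff = reconcile_totals(register_total, gl_total)
print("reconciled" if ok else f"variance of {diff}")  # reconciled
```

A variance that exceeds tolerance does not identify the culprit transaction, but it establishes that the two data sources are out of sync before any deeper testing is performed.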

Canceled payroll checks from the bank records should also be reviewed to ensure the data to be analyzed is complete and accurate. It is generally a best practice to rely on third-party verification for control totals whenever possible. The time sheets of hourly employees can be aggregated to ensure totals agree with those on the payroll registers. Employees on the register can be cross-referenced to human resource data, which should be analyzed for suspicious entries such as employees who are missing key data fields or different employees who have the same address or bank account number. In practice, the areas designated for analysis may be more complex in nature. The key to success in determining data veracity is maintaining the professional skepticism of an accountant or auditor.
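The shared-address or shared-bank-account scan described above can be sketched as a grouping exercise: collect employee IDs by the value of a given field and flag any value shared by more than one employee. Field names here are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical anomaly scan: flag distinct employees who share the same
# bank account or mailing address. Field names are illustrative assumptions.

def shared_field_groups(records, field):
    """Map each field value shared by 2+ employees to the employee IDs."""
    groups = defaultdict(list)
    for r in records:
        groups[r[field]].append(r["emp_id"])
    return {value: ids for value, ids in groups.items() if len(ids) > 1}

hr = [
    {"emp_id": "E001", "address": "12 Oak St", "bank_acct": "111"},
    {"emp_id": "E002", "address": "9 Elm Ave", "bank_acct": "222"},
    {"emp_id": "E003", "address": "12 Oak St", "bank_acct": "111"},
]

print(shared_field_groups(hr, "bank_acct"))  # {'111': ['E001', 'E003']}
```

The same function can be run against the address field, or any other field where duplicate values across distinct employees would be suspicious.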

Testing and Interpreting the Data

Prior to testing, the analyst should have already developed a list of potential fraud schemes based on the organization’s fraud risk profile, designed a series of questions to obtain either inculpatory or exculpatory evidence of the given schemes, and obtained the evidence necessary to answer those questions. During testing, new questions may arise that should be evaluated to determine if tests should be expanded. It is also worth noting that issues frequently arise while cleansing the data in preparation for the analysis. The analyst should be sure to follow up on these during testing, as they could provide unexpected insights.

Identifying financial anomalies and patterns involves methodically slicing and dicing the data. The two primary measurement perspectives are vertical and horizontal. A horizontal view observes the period-over-period change of a particular data subset, while the vertical perspective looks at data as a percentage of a subset total. For example, an analyst may vertically test expenses paid to vendors for marketing by looking at the spend per vendor as a percentage of the total advertising spend for a given period. Similarly, this test run horizontally would measure the period-over-period change for amounts paid to individual vendors for marketing.

Another good practice when determining how to isolate data subsets is to identify the different lenses through which the information can be observed. Depending on the objectives and potential fraud schemes, the analyst may focus on a particular account or class of accounts, or on a particular vendor or customer. They may also look as broadly as an entire department, business unit or organization.
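Observing the same data through different lenses amounts to rolling the same transactions up along different dimensions. The sketch below shows one fabricated transaction set aggregated two ways; field names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical sketch: the same transaction data rolled up through two
# different lenses (by vendor vs. by department). Data is fabricated.

transactions = [
    {"vendor": "V1", "dept": "Marketing", "amount": 500.0},
    {"vendor": "V2", "dept": "Marketing", "amount": 300.0},
    {"vendor": "V1", "dept": "Sales", "amount": 200.0},
]

def rollup(rows, key):
    """Total transaction amounts grouped by the chosen dimension."""
    totals = Counter()
    for r in rows:
        totals[r[key]] += r["amount"]
    return dict(totals)

print(rollup(transactions, "vendor"))  # {'V1': 700.0, 'V2': 300.0}
print(rollup(transactions, "dept"))    # {'Marketing': 800.0, 'Sales': 200.0}
```

An anomaly invisible at the department level may stand out at the vendor level, and vice versa, which is why choosing the lens is part of test design rather than an afterthought.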


Data analysis is both an art and a science. When conducting these procedures for the purpose of fraud prevention or detection, it is important to stick to the development and subsequent testing of theories based on risk to avoid becoming overwhelmed in the data. Upfront communications are key to understanding the risks of the organization and the concerns of those tasked with fraud risk management. Choosing technology to build analytical procedures will depend on the organization’s goals for the analysis. Considerations here stem from the replicability of the procedures and the need to distribute or communicate the findings to stakeholders. In many cases, a good old-fashioned Excel spreadsheet can work, but the analyst may be limited by the size of their dataset and may find it difficult to automate their procedures. In developing analytics as an internal control or for fraud examinations, affordable, accessible and far more powerful technologies exist that should be considered.


This article appeared in the Fall 2021 issue of New Jersey CPA magazine.