Turning Accountants into Data Scientists
With data more abundant than ever, being comfortable around numbers is a competitive advantage. But to stay competitive, accountants need to be able to obtain and analyze reliable, relevant data; in essence, they need to become data scientists.
Data scientists collect, organize, validate, mine and analyze data. Most CPAs are already using these skills, such as when an auditor performs a search for unrecorded liabilities or creates a materiality sample or when an accountant analyzes accounts receivable or cash flows.
Data science fundamentally combines computer programming and math skills. Either of those alone is enough to strike fear into most people, and combined they turn off an even bigger audience. Don’t stop reading here; accountants are especially primed to learn data science. Further, once the basics are learned, a number of software packages will do the heavy lifting.
What to Learn
Those who aren’t very familiar with Microsoft Excel should start there. Excel has many built-in tools that can help sort and analyze data. Take the time to learn more advanced features such as the VLOOKUP function and pivot tables before moving on to other areas. Even those who use Excel on a daily basis may not be aware of its full potential. The figure below shows a sample of the power of a pivot table in Excel.
This example includes a worksheet containing raw data of billable hours for clients in a hypothetical CPA firm. The data has been put into a pivot table to quickly total the number of hours by client. Depending on the filters used, the data could also have been totaled by employee or month. Now imagine being able to do this with two or three thousand rows of data in a matter of minutes. The output gives you a simple, accurate visual to use. Before moving on to more advanced data science, accountants should be comfortable with the following in Excel:
- Pivot tables and charts
- Statistical functions: t-test, chi-square test, variance
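For readers curious how the pivot-table idea carries over to a programming language, here is a minimal Python sketch that totals billable hours by client, employee or month using only the standard library. The records, the names in them and the helper `total_hours_by` are invented for illustration; they are not the data behind the Excel example.

```python
from collections import defaultdict

# Hypothetical billable-hours records: (client, employee, month, hours).
# Every name and figure here is made up for illustration.
records = [
    ("Acme Co", "Lee", "Jan", 6.5),
    ("Acme Co", "Patel", "Jan", 3.0),
    ("Birch LLC", "Lee", "Feb", 4.0),
    ("Acme Co", "Lee", "Feb", 2.5),
    ("Birch LLC", "Patel", "Feb", 5.5),
]

# Position of each grouping field within a record.
FIELDS = {"client": 0, "employee": 1, "month": 2}


def total_hours_by(rows, field):
    """Sum hours grouped by one field, like a one-dimensional pivot table."""
    idx = FIELDS[field]
    totals = defaultdict(float)
    for row in rows:
        totals[row[idx]] += row[3]  # hours are the fourth item in each record
    return dict(totals)


hours_by_client = total_hours_by(records, "client")
hours_by_employee = total_hours_by(records, "employee")
hours_by_month = total_hours_by(records, "month")

for client, hours in sorted(hours_by_client.items()):
    print(f"{client}: {hours:.1f}")
```

With this sample data, the loop prints 12.0 hours for Acme Co and 9.5 for Birch LLC. Changing the grouping field plays the same role as changing the row field in an Excel pivot table.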
In addition to Excel, there are other software resources that can help make meaning of reams of data. While learning to use them, accountants will also learn the programming language that pairs with them. Two of the most popular languages are R and Python. Both are supported by nonprofit foundations and are free to use. There is much debate over whether R is better than Python or vice versa. In general, R is seen as having more advanced statistical capabilities, while Python is seen as faster at processing. Since both are free, it is easy for accountants to try each and see which one they feel more comfortable with. For those with more demanding data processing and visualization needs, popular programs include SAS and Tableau. These programs are sold commercially and can cost anywhere from a few hundred dollars a year to a few thousand.
Quite simply, all of these programs are either databases or pull from databases of information. Despite their sophistication, these programs still require human interaction. The user must know what they want to research and what they hope to glean from the data. To help determine what information is needed and how to analyze it, data scientists use their knowledge of statistics. Basic knowledge of statistics such as mean, median and mode will get you up and running, but to expand their data analysis capabilities, accountants will need to know how to interpret the results of statistical analysis.
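As a small illustration of these basic statistics and why interpretation matters, the following Python sketch computes the mean, median and mode with the standard library's `statistics` module. The invoice amounts are hypothetical and chosen so that one unusually large value pulls the mean well above the median.

```python
import statistics

# Hypothetical invoice amounts in dollars, invented for illustration.
# The last value is deliberately much larger than the rest.
invoices = [1200, 1500, 1500, 1800, 2100, 9000]

mean_amount = statistics.mean(invoices)      # arithmetic average
median_amount = statistics.median(invoices)  # middle value of the sorted list
mode_amount = statistics.mode(invoices)      # most frequently occurring value

print(f"mean={mean_amount}, median={median_amount}, mode={mode_amount}")

# A mean far above the median hints that a few large items dominate the total.
if mean_amount > median_amount:
    print("The mean exceeds the median; check for unusually large invoices.")
```

Here the mean is 2,850 while the median is 1,650 and the mode is 1,500, so the "average" invoice is not typical of most invoices. Noticing that gap is the kind of interpretation that goes beyond calculating the numbers.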
Where to Learn It
Thousands of free and paid courses are available online from reputable websites such as Coursera and edX, which create and curate courses from top universities and research centers around the world. These sites are recommended over watching random YouTube videos because they organize learning around competencies and skill level. The organizations that support R and Python also provide a wealth of resources for beginners.
It might seem counterintuitive to start learning about computers from books rather than online, but many do just that. For example, the “For Dummies” line of books includes editions on R programming, data science and big data. Being able to look at the book while following along on the computer can be an effective approach. When searching for texts, it is important to check the publication dates for the most recent versions.
How to Learn It
The most effective way to learn how to be a data scientist is by doing it. Working through an exercise while watching a video or reading a book increases the likelihood of retaining the knowledge. Try applying each exercise to data relevant to your own work to make learning more meaningful. If you hit a stumbling block, Google the error message or question; there is a whole community of people out there learning data science one step at a time. Lastly, integrate newfound data science skills into daily practice.
This article appeared in the March/April 2020 issue of New Jersey CPA magazine.