Finding Insights with the humble Pivot Table
The Humble Pivot Table in the age of Big Data
Big data and AI are all the rage. But the vast majority of data analysts work with much smaller data sets, and most of them do not have access to complex and expensive analytical tools. Despite these challenges, analysts are still required to conduct analyses and identify pivotal insights in their data sets, and many do this with the most commonly used analytical tool, Microsoft Excel, and specifically its pivot tables.
Many of the complex analytical tools today can trace their roots back to the humble pivot table. Aside from this interesting fact, pivot tables are great tools for descriptive analysis and even interactive reporting. More importantly, they are a great tool for learning about the data value chain.
Today's analytics tools often function like black boxes, where the user is unable to see what is going on with the data. The beauty of MS Excel and pivot tables is that you can literally see how the data is transformed and used across the data value chain. In particular, you can see:
How each formula changes the data set
If there are any errors that need fixing
If the formula didn’t work out as intended
Once the data set is processed correctly, pivot tables allow the user to conduct descriptive analysis and create data visualizations without writing any code. All you need is a mouse. In the next sections, we will introduce some of the key concepts you need to know to start your own data analysis with pivot tables.
Not every data set will work well with pivot tables. To get the most out of pivot tables, the data needs to be organized in a specific manner, where:
Each row is a unique record (e.g. one employee, one customer, one transaction).
Each column describes something different about the record (e.g. the gender or age of an employee, the date and time of the transaction).
Each column of data has a unique name, and the data definitions are clear to the user, including the data type (numerical or categorical) and the unit of measure (e.g. inches or meters).
You can have blank cells in the table, but you shouldn’t have blank columns. That means all the columns with data should be side by side.
The order of the data is not important; the key is to make sure the data is complete and accurate enough.
The above is an example of a pivot-table-friendly data format; it happens to be the transaction data from a restaurant. Note that each line is a check/order and each column describes something different about that check/order. It is very similar to what you would obtain from a query against most database platforms, although some processing may still be required to assemble the correct data set for analysis.
The table on the right is not as pivot table friendly. While you can create pivot tables with it, you will quickly discover that there are severe limits on what insight you can glean from it. This is better known as a report; something that you create from pivot tables.
The pivot-table-friendly format gives the user maximum flexibility to explore and understand the data set. Without it, working with pivot tables quickly becomes a frustrating experience.
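The layout rules in this section can also be checked mechanically before building a pivot table. The following is a minimal sketch in Python, not part of Excel itself; the function name `check_pivot_friendly` and the sample columns (`Check`, `Date`, `Guest`, `Total`) are assumptions made up for illustration.

```python
def check_pivot_friendly(header, rows):
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    names = [h.strip() for h in header]

    # Every column needs a name, and the names must be unique.
    if any(name == "" for name in names):
        problems.append("blank column name")
    if len(set(names)) != len(names):
        problems.append("duplicate column name")

    # Blank cells are tolerated, but a column with no data at all is not.
    for i, name in enumerate(names):
        if all(i >= len(row) or str(row[i]).strip() == "" for row in rows):
            problems.append(f"empty column: {name or i}")
    return problems


# Hypothetical restaurant data: one row per check/order.
header = ["Check", "Date", "Guest", "Total"]
rows = [
    ["1001", "2019-03-01", "2", "48.50"],
    ["1002", "2019-03-01", "4", "112.00"],
    ["1003", "2019-03-02", "", "27.25"],  # a single blank cell is acceptable
]

print(check_pivot_friendly(header, rows))
```

A report-style layout, with repeated or missing column names or an empty spacer column, would return one or more problems instead of an empty list.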
Using Pivot Tables to find answers in the data
There is no shortage of Excel classes that provide instruction on how to navigate and use pivot tables. While such classes cover the how and the what of pivot tables, they fail to address the why. One of the most common applications of pivot tables is to find answers to questions with the data, but Excel classes often fail to bridge the gap between the question and how to get the answer with pivot tables.
This is easily accomplished in three steps:
Data doesn’t add much value unless you can get some form of intelligence or insight from it, and that could be as simple as getting answers from the data itself.
The question could be as simple as:
“How many people did we serve at the restaurant?”
“How many dollars did we get in sales last month?”
“How much was the Singapore Government’s revenue in 2019?”
The key to finding answers in the data using pivot tables lies in the ability to translate the question into the requisite data fields.
To find the number of guests served, we can probably use the “Guest” data field from the first table.
To quantify the revenue, we can probably use the “Total” data field from the first table.
In many cases, the words used in the question may not match any of the data field names. It may take some effort to find a field that is suitable, or at least a close approximation to what the question asks.
In other cases, there is no data field in the data set that can answer the question. Then the journey begins to find the data elsewhere.
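The translation from question to data field can be sketched in a few lines of Python. This is only an illustration of the idea, not how Excel works internally: the `Guest` and `Total` field names come from the examples above, while the records and the `answer` helper are hypothetical.

```python
# Hypothetical checks from the restaurant table: one record per check/order.
checks = [
    {"Check": 1001, "Guest": 2, "Total": 48.50},
    {"Check": 1002, "Guest": 4, "Total": 112.00},
    {"Check": 1003, "Guest": 3, "Total": 27.25},
]


def answer(records, field):
    """Sum one data field across all records, like a pivot table's Sum."""
    if not all(field in r for r in records):
        # No matching field: the data to answer this has to come from elsewhere.
        raise KeyError(f"no data field named {field!r}")
    return sum(r[field] for r in records)


# "How many people did we serve?"  -> the Guest field
guests_served = answer(checks, "Guest")
# "How many dollars did we get in sales?"  -> the Total field
revenue = answer(checks, "Total")

print(guests_served, revenue)
```

Asking for a field that does not exist, such as `answer(checks, "Tips")`, raises an error, which mirrors the case where the data set simply cannot answer the question.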
Value Field Settings
By default, pivot tables will present the sum if the data is numerical or count if the data is categorical.
But there are many other options available in pivot tables. They can present the average, maximum, minimum, standard deviation, variance and more, which significantly increases the types of questions that can be answered with a pivot table.
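To make the effect of these settings concrete, here is a rough Python sketch of what a pivot table computes when you pick a row field, a value field and a summary function. The `pivot` function, the `AGGREGATIONS` table and the `Day` column are assumptions for illustration; the standard deviation and variance shown are the sample versions, which need at least two values per group.

```python
from collections import defaultdict
import statistics

# Summary functions comparable to the options under Value Field Settings.
AGGREGATIONS = {
    "sum": sum,
    "count": len,
    "average": statistics.mean,
    "max": max,
    "min": min,
    "stdev": statistics.stdev,   # sample standard deviation
    "var": statistics.variance,  # sample variance
}


def pivot(records, row_field, value_field, how="sum"):
    """Group records by row_field and summarize value_field per group."""
    groups = defaultdict(list)
    for r in records:
        groups[r[row_field]].append(r[value_field])
    return {key: AGGREGATIONS[how](values) for key, values in groups.items()}


# Hypothetical restaurant checks with a made-up "Day" column.
checks = [
    {"Day": "Fri", "Guest": 2, "Total": 40.0},
    {"Day": "Fri", "Guest": 4, "Total": 100.0},
    {"Day": "Sat", "Guest": 3, "Total": 30.0},
    {"Day": "Sat", "Guest": 5, "Total": 90.0},
]

print(pivot(checks, "Day", "Total", "sum"))      # sales per day
print(pivot(checks, "Day", "Total", "average"))  # average check per day
print(pivot(checks, "Day", "Guest", "max"))      # largest party per day
```

Changing only the `how` argument changes the question being answered, from "how much did we sell each day?" to "how big was the typical check?", without touching the underlying data.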
If you are interested in building practical pivot table skills or sharpening your existing data analytics skills, you should check out our upcoming data analytics workshops. What kinds of questions will you find answers to?