VizWar ⚔️ ~ Comparison Between Visualization Platforms Such as Excel, Tableau, Power BI, and Matplotlib/Seaborn

Comparing different data visualization platforms while analyzing the dataset.

Soni Heet
8 min read · Apr 23, 2021

Introduction

It is indisputable that, in this data-driven world, insightful data visualizations are very important. Everyone from small stores to big companies uses a preferred visualization platform to gain insight into their organization. Every platform has its advantages and disadvantages. Among the well-known platforms, Excel, Tableau, Power BI, and Matplotlib/Seaborn are very popular. In this article, I will analyze one dataset on these four platforms and conclude which platform has the edge.

Objective

  • Analyzing the dataset 🔍
  • Producing visualizations in Excel, Tableau, Power BI, and Matplotlib/Seaborn 📊
  • Outlining insights from all observations 📄
  • Comparing platforms ✔️
  • Declaring best platform 🔊

Data source

Dataset provider: Washington Post

The Washington Post’s database contains records of every fatal shooting in the United States by a police officer in the line of duty since Jan. 1, 2015.

https://github.com/washingtonpost/data-police-shootings

Analyzing the dataset

For this mini-project, I will analyze the dataset from the Washington Post. This dataset is called “fatal-police-shootings-data”.

Data description:

Column Names: ID, Name, Date, Manner of death, Armed, Age, Gender, Race, City, State, Sign of mental illness, Threat level, Flee, Body camera, Latitude & Longitude.

At first look, we can see that most columns have categorical values. However, the dataset does not contain any information about police ID or name.

Note: Some information is clearly missing from the dataset. The ID column does not contain every ID value, so we can assume that some records are missing, which could lead to false insights.
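The gap check described in the note can be sketched in pandas. This is a minimal illustration, not the method used in the article: the column name `id` and the sample values are assumptions, and the rows are made up rather than taken from the real file.

```python
import pandas as pd

# Made-up IDs for illustration only; the real CSV has thousands of rows.
df = pd.DataFrame({"id": [3, 4, 5, 8, 9, 11]})

ids = set(df["id"].tolist())

# Every integer between the smallest and largest ID that never appears.
expected = set(range(min(ids), max(ids) + 1))
missing_ids = sorted(expected - ids)

print(f"{len(missing_ids)} IDs missing: {missing_ids}")
```

A gap only shows that an ID value is unused. Whether a record was actually dropped is an assumption, so the result is best read as a warning sign rather than proof.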

This dataset is used only to compare the ease of use across different services, not to generate any insights from the data itself.

Analyze in Excel

Microsoft Excel is a helpful and powerful program for data analysis and documentation. It is a spreadsheet program with tools such as pivot tables, charts, filters, slicers, and many more. With the help of these tools, we will create our dashboards.

Dashboard: 1 Count field

In the first dashboard, we will count the occurrences of each column's values, so we can see which conditions occur most often.
Here, I divided the ages into groups of 10. We can see that most of the individuals who were shot belong to the age groups 26–35, 36–45, and 46–55.
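For readers who want to reproduce the grouping outside Excel, here is a rough pandas sketch of the same idea. The ages are made up, and the bin edges are an assumption chosen so that the groups line up with 6–15, 16–25, 26–35, and so on.

```python
import pandas as pd

# Made-up ages for illustration only.
ages = pd.Series([7, 19, 27, 31, 34, 40, 52, 68], name="age")

# Right-inclusive edges 5, 15, 25, ... give the groups 6-15, 16-25, 26-35, ...
edges = list(range(5, 106, 10))
labels = [f"{lo + 1}-{lo + 10}" for lo in edges[:-1]]

age_group = pd.cut(ages, bins=edges, labels=labels)

# Count rows per group, keeping the groups in age order.
counts = age_group.value_counts().sort_index()
print(counts[counts > 0])
```

`pd.cut` leaves ages outside the edges (or missing ages) as NaN, so they simply drop out of the counts.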

In this bar graph, we can see that the age group 26–35 has the highest count and 6–15 has the lowest count.

It is very clear that far more males than females were shot. According to the dataset, race ‘W’ has a higher count than any other race, and race ‘O’ has the lowest count.

Here, out of all the states, I counted the values for the top 10 states. Among these, shootings happened most often in California. I did the same for the cities and found that most of the shootings occurred in Los Angeles and Phoenix.

The individuals were carrying various types of arms. Out of all the types, the gun was the most common. Most of the people were shot directly rather than tasered and shot.

Out of the total count, most people did not show any sign of mental illness, and more than 50% of the people did not run when police arrived. However, police identified more individuals with high threat levels. We can clearly see that a few years ago wearing a body camera was not that common.

Dashboard: 2 Variable Relationship

We will create this dashboard with the help of pivot tables, slicers, and charts. This dashboard will visualize the relationships between various columns.

Here, I used race as the main slicer. I created four pivot tables (Age groups vs States, Gender, Top 3 arm types vs Flee type, and Manner of death vs Mental illness) and connected all of them to the main slicer.

From the first dashboard, we know that race ‘W’ had the highest count. For race ‘W’, most of the individuals were shot between the ages of 26 and 35, with California and Florida as the most common states. Most were male, and most carried a gun and did not run from the police. According to the dataset, regardless of whether the individual showed signs of mental illness, police most often shot directly without using a taser.

Likewise, if we click on any other race in the slicer, all the pivot tables' values and charts will change accordingly.
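The slicer-plus-pivot-table behaviour can be approximated in pandas by filtering on one race and then cross-tabulating two columns. The sketch below is only an analogy to the Excel dashboard: the column names (`race`, `armed`, `flee`), the helper `sliced_table`, and the rows are all hypothetical.

```python
import pandas as pd

# Made-up rows for illustration; column names are assumed.
df = pd.DataFrame({
    "race": ["W", "W", "B", "W", "H", "B"],
    "armed": ["gun", "knife", "gun", "gun", "gun", "toy weapon"],
    "flee": ["Not fleeing", "Not fleeing", "Car", "Foot",
             "Not fleeing", "Not fleeing"],
})


def sliced_table(data, race):
    """Count arm type vs flee type for one race, like clicking a slicer."""
    subset = data[data["race"] == race]
    return pd.crosstab(subset["armed"], subset["flee"])


table_w = sliced_table(df, "W")
print(table_w)
```

Calling `sliced_table(df, "B")` plays the role of clicking a different race in the slicer: the same table is rebuilt from the filtered rows.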

Analyze In Tableau

Tableau is a visual analytics platform that makes it more straightforward to produce interactive visual analytics in the form of dashboards. These dashboards make it easier for non-technical analysts and end-users to convert data into logical, interactive graphics.

Here, for this dataset, we will create four different dashboards.

The first dashboard is about crime distribution. The above dashboard shows the distribution both on a map and as a bar chart. With this, we can see which state has the most shootings and how the shootings are distributed among the cities.

The above dashboard presents the relation between crime and race count across the US. According to the data, race ‘W’ has the highest shooting count for both genders, and the state of California shows the highest percentage.

The third dashboard presents the relation between shooting reaction, mental health, and threat level. We can see that most of the people did not show signs of mental illness and that, whether they did or not, police most often shot them directly without tasering them first. Additionally, we can see that mental illness did not have any connection with the threat level.

The last dashboard presents the relation between Race, Arm type, Flee, and Manner of death.

Analyze In Power BI

Microsoft Power BI is used to manage reports and insights based on an organization’s data. Power BI can connect to a broad variety of data sets and combine the information so that it can be better understood.

Here in the first dashboard, I created the same count board as in Excel, but this time in Power BI. Power BI lets us use its drag-and-drop feature to create various types of charts.

The above dashboard shows how simply slicer-like filtering works in Power BI. There is no need to set up a separate filter or slicer: just click on any field in any chart, and Power BI will slice the data for us.

This dashboard shows the geographical crime distribution across the US.

I created a similar variable relationship dashboard in Power BI.

Analyze In Matplotlib/Seaborn

Matplotlib and Seaborn are Python libraries for data visualization; Seaborn is built on top of Matplotlib.

Here, I visualize the same insights but in Matplotlib/Seaborn.

sweetviz is a Python library that works with pandas data frames and gives us the overall count distribution of the whole dataset.

We can see that with a simple library, we can create the whole count dashboard as a report.

Now I will create all relationship charts with the help of Matplotlib/Seaborn.

Pandas has a feature called groupby, which allows us to group rows by one or more columns before plotting the result with Matplotlib/Seaborn.

I created two groupby data frames for Gender and Arm type.
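As a rough sketch of what such grouping and plotting might look like, the snippet below counts rows per gender and per arm type and draws two bar charts. It is not the article's original code: the column names `gender` and `armed` are assumptions, and the rows are made up.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration; column names are assumed.
df = pd.DataFrame({
    "gender": ["M", "M", "F", "M", "M", "F"],
    "armed": ["gun", "knife", "gun", "gun", "toy weapon", "knife"],
})

# One grouped count per column, the arm types sorted from most to least common.
by_gender = df.groupby("gender").size()
by_armed = df.groupby("armed").size().sort_values(ascending=False)

# Draw both counts side by side as bar charts.
fig, (ax_gender, ax_armed) = plt.subplots(1, 2, figsize=(8, 3))
by_gender.plot.bar(ax=ax_gender, title="Count by gender")
by_armed.plot.bar(ax=ax_armed, title="Count by arm type")
fig.tight_layout()
```

To view the result, call `fig.savefig("counts.png")` (the file name is arbitrary), or remove the `matplotlib.use("Agg")` line and call `plt.show()` in an interactive session.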

Outlining Insights From All Observations

  • The dataset has some missing values, so we can assume that it is not 100% accurate.
  • Males were shot more often than females.
  • Ages 26 to 45 are the most common ages for getting shot. Children and older people were shot far less often.
  • Most of the shootings happened in the state of California. Among cities, Los Angeles has the highest shooting count.
  • According to the dataset, race “W” has the highest shooting count and race “O” the lowest.
  • Gun, knife, and toy weapon were the most common arm types.
  • Most of the individuals did not show any sign of mental illness.
  • 65% of the people did not run from the police.
  • 65% of the individuals were identified as a high threat.
  • 95% of the time, police shot directly instead of using a taser and then shooting.
  • 85% of the time, no body camera was in use. We can assume that a few years ago, wearing a body camera was not that common among police.
  • According to the dataset, there is a connection between Race and State, and between Race and Age.
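Percentages like the ones in this list can be computed in pandas with normalized value counts. The sketch below uses made-up rows, so its numbers do not match the article's figures; the column names and the helper `share_table` are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration; column names are assumed.
df = pd.DataFrame({
    "flee": ["Not fleeing", "Car", "Not fleeing", "Foot", "Not fleeing"],
    "body_camera": [False, False, True, False, False],
})


def share_table(data, column):
    """Return each value's share of the rows in percent, rounded to 0.1."""
    return (data[column].value_counts(normalize=True) * 100).round(1)


flee_share = share_table(df, "flee")
camera_share = share_table(df, "body_camera")

print(flee_share)
print(camera_share)
```

By default `value_counts` skips missing values, so each share is relative to the rows where the column is filled in. Passing `dropna=False` would count missing values as their own category.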

Comparing Platforms

Excel

  • Sophisticated graphs and charts, a good list of chart types, easy customization
  • Great tools like pivot tables, filters, slicers, and streamlined calculations
  • No geographical visualization
  • Cannot handle big data

Tableau

  • Good visualizations, better user experience, better performance
  • Multiple data source connections
  • Great tools like beautiful graphs, filters, geo presentations, etc.
  • Can handle big data
  • High pricing, low BI capabilities

Power BI

  • Affordable, custom visualizations
  • Excel integration, dataset connectivity, easy customization
  • Interactive visualizations are a very impressive feature of Power BI. We can apply a filter or slicer just by clicking any field in a visualization.
  • Can handle 1~2 GB of data
  • Crowded interface, low configurability of visuals

Matplotlib/Seaborn

  • Built for Python, great representation of data
  • Can produce good statistical presentations of data
  • Big data support, easy customization
  • Can handle null values, outliers, and data type modification
  • Requires programming for visualizations, no interactive graphs
  • Low 3D matrix compatibility

Best Platform

In this visualization project, we tried four different visualization platforms. At the end of the task, the question that arises is “Which platform is superior?” My answer is “All of them”. In my opinion, it is not about which platform has more features and more attractive visuals. It is about selecting the right platform for a particular data scenario. It all depends on the data situation and the task requirements.

Project Source:
