Samsung Health x Google Maps project

Data cleaning | Merge | Tableau mobile dashboard
Q4 2022
Smartphone view of main fitness KPIs evolution over time
Smartphone view of places visited when working out (map of Japan)
Smartphone view of reached goals dashboard

Foreword

This article presents, in a non-technical way, the user problem I wanted to address and the dashboard solution I created. To dive deeper, I invite you to visit my GitHub repositories and read the underlying Python code.
Tech stack: Python | Tableau

GitHub | Samsung Health revamp

Project summary

I have been using the Samsung Health smartphone app since 2016. It has let me track my daily number of steps and the calories I burned when exercising (running, hiking, etc.). Still, the information presented in the app often fails to give me what I need to monitor my physical activity.

My objective in this project was to obtain the most comprehensive view of my workouts using data from the Samsung Health app (running, hiking) and Google Maps Location History (swimming and dance workouts). 

To do so, I imported, cleaned, merged and analyzed all my personal data with several Python scripts. Finally, I built an interactive Tableau dashboard as the solution to the user problems.

Data flow visualization

The problems at hand

The Samsung Health app has undergone many changes in the past. Still, I do not find the app user-friendly enough, for the following reasons:

Because of these problems, app users like me struggle to find quick insights about their physical activity levels and may be tempted to use apps or connected devices from competitors (or build their own analytics dashboards, like me!).

Below: Illustrations of some of the dashboards available in the Samsung Health app. They are scattered across the app and can only be reached after several clicks. Kind of confusing, isn't it?

Samsung Health: one of the many dashboards

User goals

My goal in this project was to construct a dashboard that would improve:

More specifically, thanks to the new dashboard, I wanted to answer the following questions: In 6 years, how many steps did I take in total? What year was I the most active? How did my exercise habits change over time in terms of frequency and intensity of workouts?

As the icing on the cake, I wanted to supplement the Samsung Health walking and running information with my Google Maps Location History data to:

Reasons for using a Python script

I could not simply build a quick Tableau dashboard right after downloading the data from the app. Indeed:

Data preparation process

Samsung Health data

Data import
Data quality checks:

Illustration of the duplicates problem in the raw CSV data (before cleaning)

Duplicates in the raw dataset
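To make the duplicates check more concrete, here is a minimal sketch that uses only the Python standard library. It assumes that duplicates are rows identical in every column; the sample data, the column names (`day`, `step_count`, `distance_m`) and the helper `drop_exact_duplicates` are hypothetical and do not reflect the actual Samsung Health export schema or my original scripts.

```python
import csv
import io

# Hypothetical sample of raw step-count rows. Column names are assumptions,
# not the real Samsung Health export schema.
RAW_CSV = """day,step_count,distance_m
2022-01-01,8500,6100
2022-01-01,8500,6100
2022-01-02,10400,7500
2022-01-03,4300,3000
2022-01-03,4300,3000
"""


def drop_exact_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        # A row is identified by all of its (column, value) pairs.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
clean_rows = drop_exact_duplicates(rows)
print(f"{len(rows) - len(clean_rows)} duplicate rows removed, {len(clean_rows)} kept")
```

If duplicates differ in some technical column (for example an update timestamp), the key would have to be restricted to the columns that define a unique record.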

Google Maps Location History data

Data import

Google Maps data were difficult to import in Python because they were:

Example below (credits: jsoncrack.com): Representation of a single data entry in one of the many JSON files

JSON structure of Google Maps Location History data
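As an illustration of what reading such nested entries can look like, the sketch below flattens one small JSON document into flat records. The sample document and its field names (`timelineObjects`, `placeVisit`, `latitudeE7`, `startTimestamp`, and so on) are assumptions loosely modeled on a location history export, and `flatten_place_visits` is a hypothetical helper, not the code from my repository.

```python
import json
from datetime import datetime

# Hypothetical entry; the field names are assumptions for illustration only.
SAMPLE_JSON = """
{
  "timelineObjects": [
    {
      "placeVisit": {
        "location": {"name": "Community pool",
                     "latitudeE7": 356762000,
                     "longitudeE7": 1396503000},
        "duration": {"startTimestamp": "2022-03-05T09:00:00Z",
                     "endTimestamp": "2022-03-05T10:15:00Z"}
      }
    },
    {"activitySegment": {"activityType": "WALKING"}}
  ]
}
"""


def parse_ts(value):
    """Parse an ISO 8601 timestamp ending in 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def flatten_place_visits(document):
    """Turn nested place visits into flat dictionaries, skipping other objects."""
    records = []
    for obj in document.get("timelineObjects", []):
        visit = obj.get("placeVisit")
        if visit is None:
            continue  # not a place visit (e.g. an activity segment)
        location = visit["location"]
        start = parse_ts(visit["duration"]["startTimestamp"])
        end = parse_ts(visit["duration"]["endTimestamp"])
        records.append({
            "place": location.get("name"),
            # E7 coordinates are degrees multiplied by 10^7.
            "latitude": location["latitudeE7"] / 1e7,
            "longitude": location["longitudeE7"] / 1e7,
            "start": start,
            "duration_min": (end - start).total_seconds() / 60,
        })
    return records


visits = flatten_place_visits(json.loads(SAMPLE_JSON))
print(visits)
```

With dozens of files, the same function could be applied to each file in turn and the resulting lists concatenated.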
Data cleaning and preparation

Data analysis in Python

Samsung Health

I created a few graphs in my Python Jupyter Notebook to produce the insights I could not easily get from the Samsung Health app. I focused mostly on 'long-term' analyses (yearly and monthly analyses).

Bar chart | Total distance walked per year
A line chart displaying the average daily number of steps, aggregated per month
Boxplot comparing physical activity in 2019 vs 2022
Heat map: Days when I reached my 10k steps / day objective
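The aggregation behind charts like these can be sketched as follows: average daily steps per month, plus the number of days on which the 10k goal was reached. The step values are made up for illustration and `monthly_summary` is a hypothetical helper; the sketch deliberately uses only the standard library and leaves the plotting out.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Made-up daily step counts, for illustration only.
DAILY_STEPS = {
    date(2022, 1, 1): 8000,
    date(2022, 1, 2): 10400,
    date(2022, 1, 3): 12200,
    date(2022, 2, 1): 4300,
    date(2022, 2, 2): 11000,
}
GOAL = 10_000  # daily step objective


def monthly_summary(daily_steps, goal=GOAL):
    """Group daily steps by (year, month) and compute average and goal days."""
    by_month = defaultdict(list)
    for day, steps in daily_steps.items():
        by_month[(day.year, day.month)].append(steps)
    return {
        month: {
            "avg_steps": mean(values),
            "goal_days": sum(v >= goal for v in values),
        }
        for month, values in sorted(by_month.items())
    }


summary = monthly_summary(DAILY_STEPS)
for (year, month), stats in summary.items():
    print(f"{year}-{month:02d}: {stats['avg_steps']:.0f} steps/day on average, "
          f"{stats['goal_days']} goal day(s)")
```

Grouping by year instead of (year, month) gives the yearly view in the same way.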

Below, I summarize some of the insights I gained from the above graphs (disclaimer: 2016 and 2022 are not complete years in the dataset).

Google Maps Location History

Python lineplot: Number of dance and swimming sessions per year
Python boxplot representing duration of swimming and dance workouts

Here are some learnings from the above graphs:

Final output: an interactive Tableau dashboard

After combining all data sources in Python, I exported them to CSV format and created an interactive dashboard in Tableau Public to reach the user goals that I had defined at the beginning of the project.
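The combine-then-export step can be sketched like this. It is only an illustration under stated assumptions: the two small datasets are made up, there is at most one workout per day, and `merge_on_date` and `to_csv_text` are hypothetical helpers rather than the functions from my scripts. The resulting CSV text is what a tool such as Tableau would read as a flat table.

```python
import csv
import io

# Made-up inputs: daily steps (Samsung Health side) and workouts detected
# from location history (Google Maps side).
steps = [
    {"date": "2022-03-05", "steps": 9200},
    {"date": "2022-03-06", "steps": 12050},
]
workouts = [
    {"date": "2022-03-05", "workout": "swimming", "duration_min": 75},
]


def merge_on_date(steps_rows, workout_rows):
    """Left-join workouts onto daily steps (assumes one workout per day at most)."""
    workouts_by_date = {row["date"]: row for row in workout_rows}
    merged = []
    for row in steps_rows:
        workout = workouts_by_date.get(row["date"], {})
        merged.append({
            "date": row["date"],
            "steps": row["steps"],
            "workout": workout.get("workout", ""),
            "duration_min": workout.get("duration_min", ""),
        })
    return merged


def to_csv_text(rows):
    """Serialize a non-empty list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


merged = merge_on_date(steps, workouts)
csv_text = to_csv_text(merged)
print(csv_text)
```

Keeping one row per day, with empty cells for days without a workout, keeps the export simple to filter by date in the dashboard.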

Link to the dashboard

Video captures of the dashboard (final product) are shown below.

Main dashboard overview

Focus: Map video

The above dashboard addresses the user problems defined at the beginning of the case study, namely:

Conclusion

Possible next steps
  • Improve the overall UI of the final deliverable (smartphone view)
  • Create a data pipeline that regularly updates the Tableau dashboard with my most recent data and can integrate other data sources (Google Maps positions or calendar events, to explain why I walked more on certain days than on others).
  • Write a Python script that better identifies the zip code and town information for the Asian countries I lived in
What I learnt in this project
  • Coding in Python to perform data cleaning, analysis and visualization (I am originally an R lover)
  • Importing and reading dozens of complex JSON files
  • Using Jupyter Notebook and Markdown format to summarize my results in a clean manner
  • Versioning my code on GitHub

