Cyclistic Bike Sharing Analysis Report

prayag purohit
Sep 8, 2022

Google Data Analytics Certificate Capstone Project

This is a case study provided by Google in its Data Analytics Certificate. The certificate introduced me to the world of data and the analytics process. I used MySQL, CMD, Tableau, and Excel in this analysis. Please feel free to reach out with suggestions and criticism.

Over six months, the certificate introduced a six-step analytics process:

Ask questions → Prepare the data and question its integrity → Process by cleaning → Analyze by transforming → Share through visualizations → Act and make decisions. I have tried my best to follow this process in this report.

Introduction to the problem

Cyclistic is a fictional Chicago-based company that follows a bike-sharing business model. The company has 5800 bicycles and more than 600 docking stations.

Cyclistic offers flexible pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who use single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members.

The company's financial analysts have found that annual members are much more profitable than casual riders. The director of the company, Moreno, believes that maximizing the number of annual members will be key to future growth. For the time being, the company will not target all-new customers and will instead focus on converting casual riders into annual members.

The goal is to design digital marketing strategies that can help with this conversion. To achieve this, it is important to understand how casual riders and members use the bikes differently.

The key problem statement:

💡 How do casual riders and annual members use Cyclistic bikes differently?

The Data

The company provided a fairly large dataset, hosted on the company's cloud server. It was divided by month, with each month stored in a separate CSV file. The data was made available by Motivate International Inc. under this license.

The data had 13 attributes and over 10 million observations.

Cyclistic data schema

The data contains no private information about the users, to avoid breaching their privacy.

Preparing the data

To analyze the historical data for 2021, I downloaded the individual zip files for each month. I combined the files using the command prompt and then uploaded the result to a MySQL server.

Command line code to compile and upload the data
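
The original command-line code is not reproduced in this text. As a rough illustration only, and not the commands actually used, the compile step could be sketched in Python as follows. The function name `merge_monthly_csvs` and the folder layout are hypothetical, and the sketch assumes every monthly file shares the same header row. Writing the header only once also avoids the repeated header rows that a plain file concatenation produces.

```python
import csv
from pathlib import Path


def merge_monthly_csvs(input_dir, output_path):
    """Concatenate every CSV in input_dir into one file, keeping one header row.

    Returns the number of data rows written (header excluded).
    """
    output_path = Path(output_path)
    # Collect the inputs first, skipping the output file if it sits in the same folder.
    paths = sorted(
        p for p in Path(input_dir).glob("*.csv")
        if p.resolve() != output_path.resolve()
    )
    header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty file, nothing to copy
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header exactly once
                elif file_header != header:
                    raise ValueError(f"{path.name} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `merge_monthly_csvs("data/2021", "rides_2021.csv")` would return the number of data rows written; both paths are placeholders.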

Cleaning and Preparation Log

I cleaned the data using SQL queries:

  1. The station ID values had inconsistent formatting, which was corrected.
  2. The compilation of the files resulted in repeating headers which had to be removed.
  3. Some rides had start times later than their end times; these rows were deleted.
  4. The ride ID (primary key) column had duplicates and nulls, which were removed from the dataset.
  5. Added a ride duration column, week index number column, and weekend/weekday column to make analysis easier.
  6. Created different tables for members and casual riders to make query processing faster.
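
The SQL queries themselves are not shown. As an illustration of steps 2 to 5 only, here is a minimal Python sketch rather than the SQL actually used. The column names `ride_id`, `started_at`, and `ended_at`, the timestamp format, and the function name `clean_rides` are assumptions. The "week index number" is read here as the day-of-week index (0 = Monday), and the first occurrence of a duplicated ride ID is kept; both are guesses about the original logic.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def clean_rides(rows):
    """Drop invalid rows and add derived columns to a list of ride dicts."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        ride_id = (row.get("ride_id") or "").strip()
        # Step 2: a repeated header row carries the column name as its value.
        if ride_id == "ride_id":
            continue
        # Step 4: drop null and duplicate primary keys (first occurrence wins).
        if not ride_id or ride_id in seen_ids:
            continue
        start = datetime.strptime(row["started_at"], TIME_FORMAT)
        end = datetime.strptime(row["ended_at"], TIME_FORMAT)
        # Step 3: drop rides that start after they end.
        if start > end:
            continue
        seen_ids.add(ride_id)
        cleaned.append({
            **row,
            "ride_id": ride_id,
            # Step 5: derived columns.
            "duration_min": (end - start).total_seconds() / 60,
            "weekday_index": start.weekday(),  # 0 = Monday ... 6 = Sunday
            "day_type": "weekend" if start.weekday() >= 5 else "weekday",
        })
    return cleaned
```

Steps 1 and 6 (station ID formatting and the separate member and casual tables) are left out because the text does not describe them in enough detail to sketch.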

Analysis — Revealing insights

After running the queries, I exported the results to Excel to reveal insights into the data.

  • In 2021, 55% of all rides were taken by members, and the remaining 45% by casual riders.
  • The average ride duration was 32 minutes for casual riders and 14 minutes for members.
  • Classic bikes were the most popular among both groups, followed by electric bikes and docked bikes.
  • 11% of member rides ended at the station where they started (circular rides), compared with 17% of casual rides.
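
The figures above came from SQL queries exported to Excel. As a hedged illustration of how such summary numbers can be derived, here is a small Python sketch. It is not the original query: the field names `member_casual`, `duration_min`, `start_station_id`, and `end_station_id` and the function name `summarize_by_rider_type` are assumptions, and a ride counts as circular only when both station IDs are present and equal.

```python
from collections import defaultdict


def summarize_by_rider_type(rides):
    """Return ride share, mean duration, and circular-ride share per rider type."""
    groups = defaultdict(list)
    for ride in rides:
        groups[ride["member_casual"]].append(ride)
    total = sum(len(group) for group in groups.values())
    summary = {}
    for rider_type, group in groups.items():
        # A circular ride starts and ends at the same (non-empty) station.
        circular = sum(
            1 for r in group
            if r.get("start_station_id")
            and r.get("start_station_id") == r.get("end_station_id")
        )
        summary[rider_type] = {
            "ride_share_pct": 100 * len(group) / total,
            "avg_duration_min": sum(r["duration_min"] for r in group) / len(group),
            "circular_pct": 100 * circular / len(group),
        }
    return summary
```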

Monthly data

We can see that members and casual riders follow similar seasonal patterns on the bike-share platform. The number of rides starts to increase in April and peaks in July, followed by a decline. To maximize the impact of marketing campaigns, the company should concentrate its resources on the late spring and summer months.

Monthly riding patterns by member type.

Weekly data

We can observe here that members use Cyclistic bike-share services mostly on working days, whereas casual riders ride the most on weekends.

Weekly riding patterns by membership type

We can hypothesize from this data that members use Cyclistic services to commute to work, whereas casual riders use the service for leisure activities.

To assess the hypothesis, we can look at the peak start times of rides. For this, I extracted the hour from the start time column and grouped the rides by weekday/weekend and subscription type.
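
The grouping described here can be sketched in a few lines of Python. This is only an illustration under assumptions, not the query used for the chart: the field names `started_at` and `member_casual`, the timestamp format, and the function name `hourly_start_counts` are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_start_counts(rides, time_format="%Y-%m-%d %H:%M:%S"):
    """Count ride starts per (rider type, weekday/weekend, hour of day)."""
    counts = Counter()
    for ride in rides:
        start = datetime.strptime(ride["started_at"], time_format)
        # Saturday (5) and Sunday (6) count as the weekend.
        day_type = "weekend" if start.weekday() >= 5 else "weekday"
        counts[(ride["member_casual"], day_type, start.hour)] += 1
    return counts
```

Looking up a key such as `("member", "weekday", 17)` in the returned counter gives the number of member rides that started between 5:00 and 6:00 PM on working days.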

We can see that members are far more active on weekdays. Their ride starts rise around 8:00–9:00 AM and peak around 5:00–6:00 PM. This pattern matches typical commuting hours, which supports the hypothesis that members use Cyclistic's services to commute to work.

The weekend data tells us that most riders start their rides between 8:00 AM and 8:00 PM. Even though this does not give us new insights, we can schedule our marketing campaigns during these hours.

Start and End station Data

Top 30 start stations (Left), and end stations (Right) by membership type. Link to Tableau viz

These charts show the top 30 start stations and end stations by subscription type. We can clearly see that casual riders tend to concentrate along the lakefront (presumably for leisure activities), whereas members are clustered around the city center, where office buildings are located. To explore the visualization, follow the Tableau link in the caption above.

Conclusion and Suggestions

Summary:

The data suggest that casual riders prefer bike-sharing services for leisure activities. Members, on the other hand, use Cyclistic mostly to commute to work. I drew this conclusion from the following observations:

  • Members are more active on weekdays, whereas casual riders are more active on weekends.
  • On weekdays, members mostly use bikes during peak commuting hours (around 8:00 AM and 5:00 PM).
  • Members use the bikes for shorter periods (14 minutes on average), whereas casual riders average more than 30 minutes per ride.
  • Members mostly start and end their rides in the city center or around busy areas, whereas casual riders prefer leisure spots such as parks and lakefront trails.

Based on this evidence, I would suggest the following to the company:

  • The company should focus its campaigns on the peak riding season, which runs from mid-April to early September.
  • The company can offer membership discounts and promotions during this season, when they will reach the most riders.
  • The company can offer new types of memberships, such as a "weekend membership" or a "commuter membership", to capture target markets.
  • The company can also partner with the local parks and beaches that casual riders visit the most. Offline marketing can also be used in these areas.
  • Casual riders ride for more than half an hour on average, so Cyclistic can offer bonuses or points for longer rides.

