Journey Into Data Analysis, A Brief Overview

Abdlazeez Ayobanji Mumeen
Jun 14, 2020

On Thursday, 28th of March, 2020, I concluded my one-year contract with the Corporate and Strategic Planning Division of the Nigerian Ports Authority in Lagos State, Nigeria. It was a great experience serving with the Authority. I was privileged to serve in the division that handles all the port data and statistics, including the ports’ cargo throughput, the turnaround time of vessels, and the port personnel strength, among others.

Back in 2018, I graduated from university with an undergraduate degree in Statistics. Even before my university years, I had always been a technology enthusiast; this paved the way for me to gain experience in several digital skills, both vocationally and through self-learning.

Though I’m certified as a Statistician, I have tried out and honed my core skills in Web Development, Data Analysis and Visualization, Data Science and Machine Learning, Business and Task Automation, and Web Scraping, among others. If you are wondering what some of these terms mean, I will take you through a brief journey.

Data analysis is a way of getting insight from data in order to make a decision or reach a conclusion. Almost anything you can think of can be represented as data: your chats and messages on Facebook, the mobile phones you use, the food you eat, the clothes you wear, etc. All of these are data that can be analyzed to gain insight.

A simple yet effective data analysis process may follow these five steps:

  1. Data Mining (gathering the data)
  2. Data Cleaning (transforming the data and removing the non-useful parts)
  3. Data Visualization (trying to understand the data better)
  4. Analysis (modeling and evaluating the data for facts)
  5. Presentation/Interpretation (explaining the outcome of the analysis)

With that covered, let me take you through another brief journey: “A practical approach to data analysis using the above steps”.

I would like to use one of my personal projects for a better explanation.

During my undergraduate years, in 2017, I created a platform, baya.com.ng, where people who want to buy items or properties can meet sellers and transact online. Businesses and individuals who want to sell have been patronizing the platform and uploading their items, such as phones, cars and clothes, for sale; likewise, buyers have been coming to transact with these sellers.

Take, for instance, a Tailor or Fashion Designer who wants to start sewing and selling ready-made clothes. You have been wondering for some time which styles of clothes fit people better and which ones people wear most, so that you can start sewing those to get more sales. As a data analyst, the first thing you will need to do is mine, or gather, your data. Thanks to platforms like my site baya.com.ng, jumia.com.ng and konga.com, you already have places where you can mine plenty of data on different styles of clothes.

The problem is that checking each item of clothing on the platform and writing down its details would be too cumbersome when you have thousands of clothes to gather and analyze. And since you are not the owner of my site or other similar sites, you don’t have access to the database or to a well-arranged set of the data in one place.

This is where you can use WEB SCRAPING for your data gathering. WEB SCRAPING is simply a way of gathering data from the web using a programming language like Python or other automation tools. For instance, web scraping can be used to collect the names and contacts of Facebook users in your location or state. So the next time you get an advertising message from an unknown person or business, they may well have scraped your number, together with other numbers, from Facebook or a similar platform where it is publicly available.
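To make the idea concrete, here is a minimal sketch of web scraping in Python using only the standard library. The HTML snippet, the class names (`item`, `style`, `price`) and the `ListingParser` helper are all made up for illustration; a real site has its own page structure, the page would be downloaded rather than typed in, and you should check a site's terms of use before scraping it.

```python
from html.parser import HTMLParser

# Hypothetical listing page; a real scraper would download this HTML instead.
SAMPLE_HTML = """
<div class="item"><span class="style">Ankara gown</span><span class="price">4500</span></div>
<div class="item"><span class="style">Kaftan</span><span class="price">6000</span></div>
<div class="item"><span class="style"></span><span class="price">3000</span></div>
"""


class ListingParser(HTMLParser):
    """Collects the style and price of every listing on the page."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per listing
        self._field = None   # the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "item":
            # A new listing starts: add an empty row to fill in.
            self.rows.append({"style": "", "price": ""})
        elif tag == "span" and css_class in ("style", "price"):
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a style or price span.
        if self._field and self.rows:
            self.rows[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(html):
    """Return a list of {'style': ..., 'price': ...} dicts found in the HTML."""
    parser = ListingParser()
    parser.feed(html)
    return parser.rows


listings = scrape(SAMPLE_HTML)
print(listings)
```

Notice that the third listing comes back with an empty style, which is exactly the kind of record the cleaning step has to deal with.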

After you have gathered your data, you may then move to step 2: cleaning your data to remove the non-useful parts, that is, those clothes for which the seller didn’t specify the style. This is necessary for a smooth analysis and more reliable findings.
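A cleaning step of this kind could look like the sketch below. The records and field names (`style`, `size`, `price`) are invented for illustration, and dropping rows with an unreadable price is an extra assumption on top of the missing-style rule described above.

```python
# Hypothetical scraped listings; the field names are assumptions.
raw_listings = [
    {"style": "Ankara gown", "size": "adult", "price": "4500"},
    {"style": "  kaftan ", "size": "Adult", "price": "6000"},
    {"style": "", "size": "child", "price": "3000"},       # style missing
    {"style": "Agbada", "size": "adult", "price": "n/a"},  # price unreadable
]


def clean(listings):
    """Drop rows without a style or a numeric price and tidy up the rest."""
    cleaned = []
    for row in listings:
        style = row["style"].strip().title()
        price = row["price"].strip()
        if not style or not price.isdigit():
            continue  # skip rows that cannot be analyzed
        cleaned.append({
            "style": style,
            "size": row["size"].strip().lower(),
            "price": int(price),
        })
    return cleaned


clean_listings = clean(raw_listings)
print(clean_listings)
```

Besides removing unusable rows, the function also makes spelling and letter case consistent, so that "kaftan" and "Kaftan" are later counted as the same style.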

The next step (step 3) is visualizing your data. You can do this by breaking the clothes down by style and buyers’ location, or by size, to see which styles people like more in different locations or whether children's sizes are in more demand than adult sizes.
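As a rough sketch of such a breakdown, the snippet below counts sales per style within each location and prints a simple text bar chart. The sales records are invented, and the helper names `count_by_location` and `text_bar_chart` are hypothetical; in practice a plotting library would usually draw a proper chart.

```python
from collections import Counter

# Hypothetical cleaned records: (style, buyer location) pairs.
sales = [
    ("Kaftan", "Lagos"), ("Kaftan", "Lagos"), ("Ankara Gown", "Lagos"),
    ("Kaftan", "Abuja"), ("Ankara Gown", "Abuja"), ("Ankara Gown", "Abuja"),
    ("Agbada", "Abuja"),
]


def count_by_location(records):
    """Return {location: Counter({style: number of sales})}."""
    table = {}
    for style, location in records:
        table.setdefault(location, Counter())[style] += 1
    return table


def text_bar_chart(counter):
    """One line per style, one '#' per sale, most popular style first."""
    return "\n".join(
        f"{style:<12} {'#' * n} ({n})" for style, n in counter.most_common()
    )


by_location = count_by_location(sales)
for location, counter in by_location.items():
    print(location)
    print(text_bar_chart(counter))
```

Even with this toy data, the chart makes it easy to see at a glance that the most popular style differs from one location to the other.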

In the fourth step, you apply STATISTICAL MODELS, such as regression analysis, to your data to find out which style of clothing has sold or sells more. A style that sells more suggests that people like and wear it more because it fits them better or meets their needs. You can then improve your tailoring business by sewing these types of clothes to meet their needs and make more sales.
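One simple way to use regression here, offered only as an illustration, is to fit a straight line to the weekly sales of each style and compare the slopes: a larger slope means sales of that style are growing faster. The weekly figures and the `fit_line` helper below are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical number of items sold per week for two styles.
weeks = [1, 2, 3, 4, 5]
weekly_sales = {
    "Kaftan": [3, 5, 7, 9, 11],
    "Agbada": [4, 4, 5, 4, 4],
}

# Slope of the fitted line = average change in sales per week.
trends = {style: fit_line(weeks, sold)[0] for style, sold in weekly_sales.items()}
fastest_growing = max(trends, key=trends.get)
print(trends)
print(fastest_growing)
```

With real data you would also want to check how well the line fits before trusting the slope, since a handful of weeks can easily be misleading.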

If you want to take things further, you may start applying MACHINE LEARNING to your data. Machine Learning is a way of making computers learn and make decisions on their own, much as human beings do. It has been applied successfully in several fields, including image and speech recognition, natural language processing, and medical diagnosis, among others.

There are three major types of machine learning:

  1. Supervised Learning (you give the machine labelled examples of the problem you want to solve, e.g. styles of clothes that sell more and those that don’t, so that the machine can learn what makes one type of clothes sell more than another)
  2. Unsupervised Learning (the machine looks for patterns or groupings in your data on its own, without being given any hints)
  3. Reinforcement Learning (the machine learns and relearns from its mistakes to produce better results; reinforcement learning is seen as part of the future of strong artificial intelligence, in applications such as driverless cars, self-navigating vacuum cleaners, etc.)

In the case of our data, a simple and suitable approach may be supervised learning, since we already know that we want the machine to determine which style of clothes sells more.
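To give a feel for what supervised learning looks like in code, here is a small sketch of a nearest-neighbour classifier written with only the Python standard library. Everything in it is an assumption made for illustration: the two features (price and number of colours offered), the labels and the training examples. A real project would normally use a dedicated machine learning library and far more data.

```python
from collections import Counter
import math

# Hypothetical labelled examples:
# (price in thousands of naira, number of colours offered) -> label
training = [
    ((3.0, 5), "sells well"),
    ((3.5, 4), "sells well"),
    ((4.0, 6), "sells well"),
    ((9.0, 1), "sells poorly"),
    ((8.5, 2), "sells poorly"),
    ((10.0, 1), "sells poorly"),
]


def predict(features, examples, k=3):
    """Label a new item by majority vote among its k nearest labelled examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict((3.2, 5), training))
print(predict((9.5, 1), training))
```

The machine is never told a rule such as "cheaper clothes sell better"; it only sees labelled examples and judges a new style by the examples it most resembles.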

In conclusion, the process of data analysis combined with machine learning and artificial intelligence can be termed DATA SCIENCE. I do hope you have gained some insight from this.
