OPENING A NEW SHOPPING MALL IN LAGOS, NIGERIA
This is my final project for the IBM Data Science Professional Course. I decided to explore the neighbourhoods of Lagos and identify a suitable location for opening a new shopping mall.
1. INTRODUCTION
Lagos is one of the major commercial cities in Nigeria and in Africa at large, with numerous business opportunities. This project aims to find a suitable site for a shopping mall business in the city. A good location is important because it helps the business operate cost-effectively.
2. DATA
For this project, the data came from the following sources:
a. Geopy: for getting the coordinates of different locations.
b. Foursquare API: for getting the list of venues and their details around a given location.
c. Wikipedia: for getting the neighbourhoods in Lagos, using this link: https://en.wikipedia.org/wiki/List_of_Lagos_State_local_government_areas_by_population
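As a rough illustration of the Foursquare step, the sketch below builds a request URL for one location. It assumes the legacy Foursquare v2 `venues/explore` endpoint, since this write-up does not say which endpoint was called. The function name `build_explore_url`, the default radius and limit, the placeholder credentials and the example coordinates are all hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the legacy Foursquare v2 "explore" endpoint.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius=2000, limit=50, version="20180605"):
    """Return a venues/explore request URL centred on (lat, lng)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,          # API version date, YYYYMMDD
        "ll": f"{lat},{lng}",  # latitude,longitude of the search centre
        "radius": radius,      # search radius in metres (assumed value)
        "limit": limit,        # maximum number of venues (assumed value)
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Placeholder coordinates and credentials, for illustration only.
url = build_explore_url(6.5, 3.35, "YOUR_ID", "YOUR_SECRET")
print(url)
```

The URL can then be fetched with the Requests library and the JSON response filtered for venues whose category is a shopping mall.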
3. METHODOLOGY AND RESULTS
The programming language used in this project is Python, with the following libraries: pandas, NumPy, Geopy, scikit-learn, folium and Requests. The code for this project is here.
First, the coordinates of the city were obtained with the geocoder. The table of neighbourhoods was taken from Wikipedia and cleaned to display just the localities or neighbourhoods. After cleaning, the data were merged into one dataframe as seen in figure 1 below.
The coordinates were thereafter plotted on a map, as shown below in figure 2.
The Foursquare API was used to explore the target city and collect its shopping malls. Code was then written to count the neighbourhoods that have shopping malls. The answer was 8, while the total number of shopping malls came to 17.
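The counting step described above can be sketched with pandas as follows. The `venues` table, its column names and its rows are hypothetical stand-ins for the real Foursquare results, so the numbers here are not the project's figures.

```python
import pandas as pd

# Hypothetical stand-in for the venues table built from the Foursquare results.
venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "B", "C", "C", "C"],
    "VenueCategory": ["Shopping Mall", "Hotel", "Shopping Mall",
                      "Shopping Mall", "Shopping Mall", "Bar"],
})

# Keep only the shopping malls, then count them per neighbourhood.
malls = venues[venues["VenueCategory"] == "Shopping Mall"]
malls_per_neighbourhood = malls.groupby("Neighbourhood").size()

# Neighbourhoods with at least one mall, and the total number of malls.
n_neighbourhoods_with_malls = malls_per_neighbourhood.shape[0]
total_malls = int(malls_per_neighbourhood.sum())

print(n_neighbourhoods_with_malls, total_malls)
```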
Machine Learning
Before clustering, the venue categories were converted to numerical values, since machine learning algorithms cannot work with categorical values directly. This conversion is known as one-hot encoding.
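A minimal sketch of one-hot encoding with pandas is given here, using a small hypothetical `venues` table. Summing the encoded columns per neighbourhood gives a count of each venue category. Aggregating by sum is an assumption, as the write-up does not say how the encoded rows were combined.

```python
import pandas as pd

# Hypothetical venues table; names and categories are illustrative only.
venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "B", "B"],
    "VenueCategory": ["Shopping Mall", "Hotel", "Hotel", "Bar"],
})

# One column per category, with 1 where the venue belongs to that category.
onehot = pd.get_dummies(venues["VenueCategory"], dtype=int)
onehot.insert(0, "Neighbourhood", venues["Neighbourhood"])

# Assumption: aggregate by sum to get per-neighbourhood category counts.
grouped = onehot.groupby("Neighbourhood").sum()
print(grouped)
```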
K-means clustering was the machine learning model used in this project. This is an unsupervised machine learning algorithm that works by grouping data into k clusters, so that data points with similar characteristics end up in the same cluster.
The clustering grouped the neighbourhoods into 3 clusters according to the number of shopping malls each neighbourhood has. This was plotted on the map below.
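The clustering step can be sketched with scikit-learn as shown here. The mall counts are hypothetical, and `n_init` and `random_state` are assumed settings rather than the ones used in the project. Note that the cluster numbers k-means assigns are arbitrary, so they need not match the cluster numbering used in this write-up.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical number of shopping malls per neighbourhood.
mall_counts = {"A": 0, "B": 0, "C": 1, "D": 2, "E": 4, "F": 4}
names = list(mall_counts)

# scikit-learn expects a 2D array: one row per neighbourhood, one feature.
X = np.array([[mall_counts[n]] for n in names], dtype=float)

# k = 3 clusters, as in the project; the other settings are assumptions.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = dict(zip(names, kmeans.labels_))

print(labels)
```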
4. DISCUSSION
The neighbourhoods in Cluster 0, displayed in red, all have either one or two shopping malls. Neighbourhoods in Cluster 1, displayed in green, have no shopping malls at all. Lastly, neighbourhoods in Cluster 2, displayed in yellow, have four shopping malls each.
5. CONCLUSION
- Clusters 0 and 1 have few and no shopping malls respectively. This suggests that opening a new shopping mall in these locations could be profitable, as there is little or no competition.
- Cluster 2 has the highest concentration of shopping malls in each neighbourhood, so competition is likely to be high. A new mall would need good business ideas and strategies to stand out in Cluster 2.
Note: The accuracy of this result depends on the data provided by Foursquare.
Click here for the jupyter notebook
Click here for the github repository