Web Scraping in Python 3

Contents:

Web Scraping and Basic Exploratory Data Analysis in Python 3 using Requests, BeautifulSoup, Pandas, Matplotlib, Seaborn

Click here to view the Jupyter notebook

In this Jupyter notebook, I document web scraping and exploratory data analysis using Python 3.

The process is as follows:

  1. use Python’s requests and bs4 libraries to scrape a webpage
  2. load the scraped data into a pandas DataFrame
  3. do some basic exploratory data analysis on the DataFrame
  4. create some basic visualizations

For this article, I scraped population data for New Zealand as of June 30, 2018 from the http://citypopulation.de website.
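As a rough illustration of the first two steps, here is a minimal sketch that fetches the page and turns an HTML table into a pandas DataFrame. The assumption that the population figures sit in the first `<table>` element, and the header/cell layout, are guesses about the page structure, so inspect the page and adjust the selectors to match what you find.

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = "http://citypopulation.de/en/newzealand/"  # page referenced in this article

# Fetch the page; a browser-like User-Agent helps avoid simple blocks.
response = requests.get(URL, headers={"User-Agent": "Mozilla/5.0"})
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Assumption: the population figures live in the first <table> on the page.
table = soup.find("table")

# Header cells (<th>) become column names, data cells (<td>) become rows.
columns = [th.get_text(strip=True) for th in table.find_all("th")]
rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

df = pd.DataFrame(rows)
if columns and len(columns) == df.shape[1]:
    df.columns = columns

print(df.head())
```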

Source:

Thomas Brinkhoff: City Population, http://www.citypopulation.de

Please note that a website may have policies and rules governing the use of its data, so before you scrape, be sure to read the site's data usage policy.

Data use policy: on the site, see DATA -> Population Data.

The data extracted in the Jupyter notebook is the population data for Oceania -> New Zealand: http://citypopulation.de/en/newzealand/

Final note

Please note that I wrote this Jupyter notebook as a simple demonstration of how useful the requests and BeautifulSoup (bs4) libraries are for scraping website data.

Feel free to do your own exploration and analysis of the scraped data.
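As a starting point, here is a small sketch of the kind of exploration and plotting described in steps 3 and 4. It assumes `df` is the DataFrame built in the earlier sketch, and the column names "Name" and "Population" (with comma-formatted numbers) are assumptions about the scraped table, so rename them to match the actual headers.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# df is the DataFrame built in the scraping sketch above.
# Assumption: it has a "Name" column and a comma-formatted "Population" column.
df["Population"] = (
    df["Population"].astype(str).str.replace(",", "", regex=False).astype(int)
)

# Basic exploratory checks: structure, summary statistics, largest places.
df.info()
print(df.describe())
print(df.nlargest(10, "Population"))

# Basic visualization: top 10 places by population.
top10 = df.nlargest(10, "Population")
plt.figure(figsize=(10, 6))
sns.barplot(x="Population", y="Name", data=top10)
plt.title("Top 10 places in New Zealand by population (June 2018)")
plt.tight_layout()
plt.show()
```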

I hope that this piece of work gives readers a basic understanding of web scraping and some intuition for the analysis that follows.
