Downloading a Dataset and Displaying an Image in Google Colab

Sybernix
3 min read · Nov 16, 2021

Google Colab is a cloud-hosted Jupyter Notebook environment for Python that Google makes available for free. Its single most useful feature is free access to GPU resources. A GPU is needed to train most deep learning networks in a reasonable amount of time. Most other ways of getting access to a GPU cost money, which can be prohibitive for students and beginners.

In this short article, we will see how we can get started with Google Colab, download a dataset in zip format from the internet and display one image from it.

Step 1 — Access Google Colab

Visit https://colab.research.google.com/ and navigate to File > New notebook.

You should see a new, empty notebook.

Step 2 — Download Dataset from Internet

We will use a dataset named Market-1501 as an example in this tutorial. This dataset is used to train networks that tackle a computer vision problem known as “person re-identification”.

First, we need to import the necessary libraries.

import urllib.request
import zipfile

Then download the zip file. For this step, you need a URL that points directly to a zip file.

market1501_url = 'http://188.138.127.15:81/Datasets/Market-1501-v15.09.15.zip'
urllib.request.urlretrieve(market1501_url, filename='market1501.zip')

Click the play icon to the left of the code cell to run the Python code we have written. Once it finishes executing, the zip file should appear in the Files panel on the left.
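Re-running the download cell fetches the whole archive again. The helper below is a minimal sketch of one way to avoid that. The function name download_if_missing and the ".part" suffix are my own choices for illustration and are not part of any library.

```python
import os
import urllib.request


def download_if_missing(url, filename):
    """Download url to filename unless the file is already present."""
    if os.path.exists(filename):
        # Skip the download when the cell is re-run in the same session.
        return filename
    # Download under a temporary name first, so an interrupted transfer
    # does not leave a partial file that later looks like a finished one.
    partial = filename + ".part"
    urllib.request.urlretrieve(url, filename=partial)
    os.replace(partial, filename)
    return filename


# In the notebook, with market1501_url defined as in the cell above:
# download_if_missing(market1501_url, "market1501.zip")
```

The check is based only on the file name, so delete market1501.zip by hand if you ever need a fresh copy.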

Step 3 — Display an Image

Next, we need to extract the downloaded zip file.

with zipfile.ZipFile('market1501.zip', 'r') as zip_ref:
    zip_ref.extractall()

After extracting, you will see a new folder named “Market-1501-v15.09.15” in the Files panel.
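If you prefer to check the result from code rather than from the Files panel, a small helper can list the sub-directories of the extracted folder and count the files in each. This is only a sketch, and count_files_per_subfolder is a hypothetical name introduced here for illustration.

```python
import os


def count_files_per_subfolder(root):
    """Map each immediate sub-directory of root to the number of files in it."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            # Count regular files only; nested directories are ignored.
            counts[entry] = sum(
                os.path.isfile(os.path.join(path, name))
                for name in os.listdir(path)
            )
    return counts


# In the notebook, after the extraction step:
# print(count_files_per_subfolder("Market-1501-v15.09.15"))
```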

Finally, we need to display one of the images in the sub-directories. For this, we will need the following libraries:

from google.colab.patches import cv2_imshow
from skimage import io

After importing the libraries, we can read an image and display it using the code below.

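img_path = "Market-1501-v15.09.15/bounding_box_test/-1_c1s1_000401_03.jpg"
image = io.imread(img_path)  # skimage returns the channels in RGB order
# cv2_imshow expects OpenCV's BGR order, so reverse the channel axis first
cv2_imshow(image[:, :, ::-1])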

The image should appear below the code cell.
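The path above hard-codes one file name. If you would rather pick an image without knowing any file names in advance, something like the following sketch could work. The helper first_image_path is hypothetical, and the "*.jpg" pattern assumes the images in the folder use that extension, as the example file does.

```python
import glob
import os


def first_image_path(folder, pattern="*.jpg"):
    """Return the alphabetically first file in folder matching pattern, or None."""
    matches = sorted(glob.glob(os.path.join(folder, pattern)))
    return matches[0] if matches else None


# In the notebook:
# img_path = first_image_path("Market-1501-v15.09.15/bounding_box_test")
```

The returned path can then be passed to io.imread in the same way as the hard-coded one.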

Now you can go on to build your neural network model and use the downloaded data. In future posts, I will explain how to build suitable neural network models with PyTorch and train them.
