{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Image Processing in Python\n", "## Tutorial for TSBB15\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting Started\n", "In python we need to import packages which provide the functionality we want.\n", "In this course we will use the numpy, scipy, and matplotlib packages a lot, lets import them!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import scipy\n", "from matplotlib import pyplot as plt\n", "plt.rcParams[\"figure.figsize\"] = (10,10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Numpy lets us work with (num)bers in (py)thon!\n", "\n", "It's great for working with vectors and matrices, lets see how to create one:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector = np.array([[1 ,2 ,3]],dtype=np.float)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Everything in numpy is an array, and arrays have:\n", "\n", "**datatype**: e.g, float, int, bool - Describes what types of numbers are in the array\n", "\n", "**shape/size**: describes how the array \"looks\" " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector.dtype" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Operations\n", "We can **add** and **subtract** easily:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector+row_vector" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector-row_vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What about **elementwise multiplication** and **matrix multiplication**?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector*row_vector #elementwise multiplication, same as .* in matlab!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "row_vector@(row_vector.T) # matrix multiplication, same as * in matlab!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 1**: What does .T do to the array?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Answer**: Transponation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can basically do any operation elementwise, by calling the proper numpy function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.linspace(0,2*np.pi,num=100)\n", "y = np.sin(x)\n", "plt.plot(x,y) # Here we use matplotlib, which was imported above\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the simplest case we want a single index:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ind = 0 # Arrays start at 0 (!)\n", "x[ind]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We may also the first 10 elements" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x[0:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is called \"slicing\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also want every second index:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x[::2] # short for x[0:-1:2], where -1 is short for the last element of x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 2**: Now we want to get every third index of x, starting at index 5, give the appropriate command below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#answer here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Boolean Indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lets say we want to take all the values of x, where $y<0$.\n", "\n", "Then we can use boolean indexing, and it's super simple!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x[y<0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.scatter(x[y<0],y[y<0])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 3**: Now we want to index all values of x,y such that $y^2 < 0.5$. Provide the appropriate code below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#answer here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Multi-dimensional indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Indexing works basically the same!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.array([[1,2,3],[4,5,6],[7,8,9]])\n", "x[0:2,0:2] # Selects a 2x2 block." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Luckily for us, images are arrays so everything written above applies for images too!\n", "\n", "We begin by showing some ways of loading images into python.\n", "\n", "We start with using **matplotlib** to load the image." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_mpl = plt.imread('/courses/TSBB15/images/capybaras.jpg')\n", "\n", "plt.imshow(image_mpl) #shows some cute capybaras\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way is with **opencv**.\n", "\n", "The image is automatically made into a numpy array, however with BGR instead of RGB color channels." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import cv2\n", "image_cv2 = cv2.imread('/courses/TSBB15/images/capybaras.jpg')\n", "plt.imshow(image_cv2) #Now the capybaras are blue, since opencv uses bgr instead of rgb\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use **PIL**\n", "\n", "The image is not automatically made into a numpy array, but we can convert it quite easily :)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from PIL import Image\n", "image_pil = Image.open('/courses/TSBB15/images/capybaras.jpg')\n", "print(image_pil)\n", "image_pil = np.array(image_pil)\n", "plt.imshow(image_pil) #Now the capybaras are blue, since opencv uses bgr instead of rgb\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 4**: What is the datatype of the images, and what is the shape? What are the maximum and minimum values?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Answer**:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making a 2d filter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Often we want to filter our images, lets make a gaussian blur filter with numpy!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kernel_size = 13;\n", "(x,y) = np.meshgrid(np.linspace(-1,1,num=kernel_size),np.linspace(-1,1,num=kernel_size))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We just created a meshgrid! You can think of this as an array containing all the coordinates." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(x)\n", "plt.colorbar()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(y)#Note that the y-axis goes down!\n", "plt.colorbar()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we want to create a gaussian on this grid!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 5**: The equation for a gaussian filter is $\\exp(\\frac{-(x^2+y^2)}{2\\sigma^2})$ \n", "\n", "Lets assume that $\\sigma=1$. How can you write this in numpy? Write the code below. \n", "\n", "*Hint*: $\\exp$ can be written np.exp() in numpy and $x^2$ can be written x**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lp_filter = #answer here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additionally, we want the filter to sum to 1, to ensure that we don't make the image super bright or dark!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lp_filter = lp_filter/np.sum(lp_filter)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(lp_filter)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convolving our filter with the image" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we (hopefully) have a working low-pass filter. \n", "\n", "Lets convolve it with the capybara image!\n", "\n", "First lets import a package for performing convolution" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.signal import convolve2d as conv2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now lets try convolving!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_lp = conv2(image_pil,lp_filter, mode='same') # uhoh..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our capybaras are in color! This means that they are technically 3D arrays. \n", "\n", "An easy way to fix this issue is to do the convolution seperately for each color channel. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 6**: Convolve the capybary image seperately for each color channel below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_lp_r = conv2(image_pil[:,:,0],lp_filter, mode='same')#here's how to do it for the red channel\n", "image_lp_g = #do it for the green channel!\n", "image_lp_b = #and for the blue" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we have multiple numpy arrays we can *stack* them together, to get a combined array.\n", "\n", "Note: Make sure to remember which dimension you stack them in! This can be selected by the *axis* parameter in the np.stack function " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_lp = np.stack((image_lp_r,image_lp_g,image_lp_b),axis=-1)\n", "image_lp.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So lets show our beautiful resulting image!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(image_lp) # uhoh...\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sadly it was not to be.\n", "\n", "You may recall that the image had values between [0,255] and was of type uint8.\n", "\n", "When we convolved it the datatype became float, and now matplotlib is telling us that our image is weird." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(image_lp/255.0) # The easiest way is to simply divide the image by 255\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(image_lp.astype(np.uint8)) # We can also convert it back to uint8!\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fourier Domain" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may have noticed that the convolution was a bit slow.\n", "\n", "Since the image is very large, and the kernel is also quite large, it is quite computationally expensive.\n", "\n", "A faster way could be to do it in the fourier domain!\n", "\n", "Remember that $f*g = \\mathcal{F}^{-1}[FG]$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Take the fourier transfrom of the image\n", "f_image_pil_r = np.fft.fft2(image_pil[:,:,0])\n", "f_image_pil_g = np.fft.fft2(image_pil[:,:,1])\n", "f_image_pil_b = np.fft.fft2(image_pil[:,:,2])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Fourier transform of filter, note that we MUST have the same shape as the image(!!)\n", "zero_padded_lp_filter = np.zeros(image_pil[:,:,0].shape,dtype=np.float)\n", "zero_padded_lp_filter[:kernel_size,:kernel_size] = lp_filter # here we have implicitly shifted the filter\n", "zero_padded_lp_filter = np.roll(zero_padded_lp_filter,shift=(-(kernel_size//2),-(kernel_size//2)),axis=(0,1))#so we need to roll it back into place!\n", "f_lp_filter = np.fft.fft2(zero_padded_lp_filter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 7**: Now multiply and do the inverse transform for each channel!\n", " \n", "*Hint*: the inverse transform is np.fft.ifft2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_lp_r_fast = #answer here\n", "image_lp_g_fast = #answer here\n", "image_lp_b_fast = #answer here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 8**: Stack the channels as you did before. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_lp_fast = #answer here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.imshow(image_lp_fast.astype(np.uint8)) # We can also convert it back to uint8!\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Check**: Does the image look basically identical to the previous ones?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That was basically it for the basics, however, there are infinitly many things to know.\n", "Here is a neat cheat-sheet:\n", "http://datacamp-community-prod.s3.amazonaws.com/ba1fe95a-8b70-4d2f-95b0-bc954e9071b0\n", "\n", "And here is some more advanced operations that Johan Edstedt put in a github gist:\n", "\n", "https://gist.github.com/Parskatt/18a7db9784a43aa598e59f21412b4ec2\n", "\n", "Below we will go through some stuff that should help you with Lab 1 and Lab 2!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Helpful stuff for lab 1 and lab 2 (Dont Miss These)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lab 1\n", "In Lab 1 you will construct a regularized derivative filter. \n", "\n", "This filter arises from the fact that:\n", "$\\frac{\\partial}{\\partial x} (f*g(\\sigma))(x) = (f*\\frac{\\partial}{\\partial x}g(\\sigma))(x) = (f*\\frac{-x}{\\sigma^2}g(\\sigma))(x)$\n", "\n", "Where $g=\\exp(-\\frac{x^2}{2\\sigma^2})$. \n", "\n", "So we use the filter: $df = \\frac{-x}{\\sigma^2}g(\\sigma)$ and convolve it with our signal.\n", "\n", "Furthermore this filter is *seperable* when its in 2d, which is great!\n", "\n", "Because since $df(x,y) = df(x)g(y)$, then $f*df=(f*df(x))*g(y)$\n", "\n", "So we can make our 2d convolution (which takes scales quadratically with kernel size) and compose it to 2 seperate convolution which only scales linearly!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kernel_size = 7\n", "sigma = 1.5\n", "x = np.linspace(-(kernel_size//2),kernel_size//2,num=kernel_size)[np.newaxis,:]\n", "g = np.exp(-x**2/(2*sigma**2))\n", "g = g/np.sum(g)\n", "df = -x*g/(sigma**2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(x[0],df[0])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is veeeeery important that our filter is **symmetric**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 9**: Explain why the filter is not symmetric if the kernel size is even!\n", "\n", "*Hint*: Try to change the kernel size above, and see what happens :)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Answer**:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we show you how to perform the convolution and the resulting derivative image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_pil = Image.open('/courses/TSBB15/images/capybaras.jpg')\n", "image_pil_gray = np.mean(np.array(image_pil),axis=-1)\n", "dx = conv2(conv2(image_pil_gray,df,mode='same'),g.T,mode='same')\n", "dy = conv2(conv2(image_pil_gray,df.T,mode='same'),g,mode='same')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure()\n", "plt.subplot(1,2,1)\n", "plt.imshow(dx)\n", "plt.subplot(1,2,2)\n", "plt.imshow(dy)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lab 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In Lab 2 you will use the gopimage color palette for visualization.\n", "\n", "This uses a complex representation of the images." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "z = dx + 1j*dy\n", "from lab2 import gopimage\n", "gopimage(z)#You should see (parts of) a colorful and noisy capybara" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While the above is quite pretty, it is quite noisy!\n", "\n", "We can instead look at a better representation, e.g. the structure tensor!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "T = np.zeros_like(image_pil)\n", "Tlp = np.zeros_like(T)\n", "T [: ,: ,0] = dx * dx\n", "T [: ,: ,1] = dx * dy\n", "T [: ,: ,2] = dy * dy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sigma = 3.0\n", "kernel_size = 25\n", "lp = np.exp(-0.5*(np.linspace(-(kernel_size//2) ,kernel_size//2,num=kernel_size)/ sigma )**2)[np.newaxis,:]\n", "lp = lp/np.sum(lp) # normalize the filter\n", "Tlp[: ,: ,0] = conv2 ( conv2 ( T [: ,: ,0] , lp , mode ='same') , lp.T , mode ='same')\n", "Tlp[: ,: ,1] = conv2 ( conv2 ( T [: ,: ,1] , lp , mode ='same') , lp.T , mode ='same')\n", "Tlp[: ,: ,2] = conv2 ( conv2 ( T [: ,: ,2] , lp , mode ='same') , lp.T , mode ='same')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "z = Tlp[:,:,0]-Tlp[:,:,2]+2j*Tlp[:,:,1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gopimage(z)#You should see (parts of) a colorful and noisy capybara" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the colors here might be slightly different from what is presented in the lecture.\n", "\n", "One difference might come from the fact that the y-axis is directed downwards." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 10**: Change the direction of the y-axis (e.g. by flipping the direction of the filter), and plot z again. What happens?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Answer**:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task 11**: What happens when you increase the value of sigma? What happens if you decrease it? Why?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Answer**:" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }