home machine learning project : identifying cutlery items

machine learning is becoming a crucial skill to add to your toolbox. this post shows a basic application of machine learning: determining whether some cutlery items are bowls, mugs or plates. it is a practical demonstration of applied theory.

machine learning is just applied maths

from start to finish, machine learning is just statistics, equations, calculations and repetition. you just code an algorithm. tensor is just a big word for a multi-dimensional array, of which a matrix is the two-dimensional case.

the project

we are identifying items from my kitchen so that, when presented with an unidentified item, we can attempt to classify it

characteristics of our target

we will be identifying plates, mugs and bowls. here is an overview of their common characteristics

[image: target data]

our collected data

here is our measurement sheet; it could just as well have been a spreadsheet. heights are in cm

[image: our data sheet]

we reformat it into cutlery.csv as follows :

D,h,type
17,5,bowl
15.5,5,bowl
22.5,1,plate
8,7.5,mug
8,9,mug
6,11,mug
6,10,mug
24,2.5,plate
26,2,plate
11,8,mug
18,5.5,bowl
14,8,bowl

reformatting for calculations : plotting

let us plot our data :

[image: cutlery Jupyter]

now, once the csv is loaded into a dataframe df (as in the code further down), we select the entire D column with :

df['D']

outputs :

0     17.0
1     15.5
2     22.5
3      8.0
4      8.0
5      6.0
6      6.0
7     24.0
8     26.0
9     11.0
10    18.0
11    14.0
Name: D, dtype: float64

the same goes for df['h']
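for readers who want to poke at the data without pandas, here is a minimal sketch using only the standard library. it assumes the csv rows above are pasted into a string; the names CSV_TEXT, D_column and mean_by_type are made up for this illustration and are not part of the original project. it pulls out the D column as a list and averages D and h per type, a rough numeric version of the "common characteristics" mentioned earlier.

```python
import csv
import io
from collections import defaultdict

# same rows as cutlery.csv, embedded so the sketch runs on its own
CSV_TEXT = """D,h,type
17,5,bowl
15.5,5,bowl
22.5,1,plate
8,7.5,mug
8,9,mug
6,11,mug
6,10,mug
24,2.5,plate
26,2,plate
11,8,mug
18,5.5,bowl
14,8,bowl
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# rough equivalent of df['D']: one column as a list of floats
D_column = [float(row['D']) for row in rows]

def mean_by_type(rows):
    """return {type: (mean D, mean h)} (hypothetical helper)"""
    groups = defaultdict(list)
    for row in rows:
        groups[row['type']].append((float(row['D']), float(row['h'])))
    result = {}
    for type_, values in groups.items():
        n = len(values)
        mean_D = sum(D for D, _ in values) / n
        mean_h = sum(h for _, h in values) / n
        result[type_] = (mean_D, mean_h)
    return result

for type_, (mean_D, mean_h) in mean_by_type(rows).items():
    print(type_, round(mean_D, 2), round(mean_h, 2))
```

with only twelve rows these averages are indicative at best, but they already suggest that plates are wide and flat while mugs are narrow and tall.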

viewing our data

please see this article on annotating data

code :

import pandas as pd
import matplotlib.pyplot as plt

# df is short for dataframe
df = pd.read_csv("<path here>/Desktop/cutlery.csv")

D = df['D']
h = df['h']

colors = {'bowl': 'red', 'mug': 'blue', 'plate': 'green'} # one colour per type

# draw each item as a point and write its type next to it
for D_, h_, type_ in zip(D, h, df['type']):
    plt.scatter(D_, h_, marker='o', color=colors[type_])
    plt.text(D_+0.3, h_+0.3, type_, fontsize=9)

plt.show()

output :

[image: view data]

our sample

now let us say that we have this :

width D : 8 and h : 7.5

[image: cup from kitchen]

without seeing the image, having only D:8 and h:7.5, how do we guess what type of cutlery it is?

guess by distance

since these are just points on a graph, we'll measure the distance from our sample, D:8 and h:7.5 i.e. (8, 7.5), to each point

we’ll use the simple distance formula :

distance = square_root(  ( X2-X1)^2  +  (Y2-Y1)^2 )
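as a quick worked example of the formula, here is a minimal sketch that measures the distance from the sample with D : 8 and h : 7.5 to one of the bowls in our data, measured at (14, 8). the function name distance is just a label chosen for this illustration.

```python
from math import sqrt

def distance(p1, p2):
    """straight-line distance between two (D, h) points"""
    x1, y1 = p1
    x2, y2 = p2
    return sqrt((x2 - x1)**2 + (y2 - y1)**2)

# sample (8, 7.5) against the bowl measured at (14, 8):
# sqrt(6^2 + 0.5^2) = sqrt(36.25)
print(round(distance((8, 7.5), (14, 8)), 2))  # 6.02
```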

code :

import pandas as pd
import matplotlib.pyplot as plt
from math import sqrt

# df is short for dataframe
df = pd.read_csv("<path here>/cutlery.csv")

D = df['D']
h = df['h']

target = (8, 7.5) # tuple
Dt = target[0] # Dt for D-target
ht = target[1] # ht for h-target

plt.figure(figsize=(14,5)) # size of plot in inches

colors = {'bowl': 'red', 'mug': 'blue', 'plate': 'green'} # one colour per type

for D_, h_, type_ in zip(D, h, df['type']): # one row per item
    dist = sqrt( (Dt-D_)**2 + (ht-h_)**2 ) # distance formula

    label = '{} \ndist:{}'.format(type_, round(dist, 2))
    plt.scatter(D_, h_, marker='o', color=colors[type_])
    plt.text(D_+0.3, h_, label, fontsize=9)

plt.scatter(Dt, ht, marker='x', color='green') # target point, drawn once
    
plt.annotate('target', 
    ha = 'center', va = 'bottom',
    xytext = (Dt-2.5, ht-2),
    xy = (Dt, ht),
    arrowprops = { 'facecolor' : 'green', 'shrink' : 0.05 }) # target arrow
plt.xlabel('diameter')
plt.ylabel('height')
plt.show()

output :

[image: distance to target]

we'll see that the target is closer to every mug sample (distances from 0 to about 4, since one mug in our data has exactly the same measurements) than it is to the nearest bowl (dist:6.02) or plate (dist:15.89)

so we can say that it is a mug / cup

of course, since we are only calculating distances, we can tweak our code to do everything without graphs
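one way this could look, as a minimal sketch rather than the definitive version: the rows of cutlery.csv are copied into a list so the snippet runs on its own, and the helpers nearest_per_type and guess are hypothetical names introduced here. it keeps the smallest distance found for each type and picks the type with the smallest one.

```python
from math import sqrt

# (D, h, type) rows copied from cutlery.csv
ITEMS = [
    (17, 5, 'bowl'), (15.5, 5, 'bowl'), (22.5, 1, 'plate'),
    (8, 7.5, 'mug'), (8, 9, 'mug'), (6, 11, 'mug'),
    (6, 10, 'mug'), (24, 2.5, 'plate'), (26, 2, 'plate'),
    (11, 8, 'mug'), (18, 5.5, 'bowl'), (14, 8, 'bowl'),
]

def nearest_per_type(target):
    """smallest distance from target to each type (hypothetical helper)"""
    Dt, ht = target
    best = {}
    for D_, h_, type_ in ITEMS:
        dist = sqrt((Dt - D_)**2 + (ht - h_)**2)  # same formula as before
        if type_ not in best or dist < best[type_]:
            best[type_] = dist
    return best

def guess(target):
    """type whose nearest sample is closest to target (hypothetical helper)"""
    best = nearest_per_type(target)
    return min(best, key=best.get)

best = nearest_per_type((8, 7.5))
print({type_: round(dist, 2) for type_, dist in best.items()})
print(guess((8, 7.5)))  # mug
```

the printed distances should match the labels on the graph: 6.02 for the nearest bowl and 15.89 for the nearest plate.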

another sample

consider the following cup with D:9 and h:14

[image: mug sample]

we set our target to

target = (9, 14) # tuple

and our output is :

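[image: sample two graph]

here the nearest point is a mug (dist:4.24), well ahead of the nearest bowl (dist:7.81) and the nearest plate (dist:18.74), so we again say that it is a mug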

machine learning concept applied

this is an application of the k-nearest neighbours idea, which falls under classification, one of the basics of machine learning.
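what we did by eye on the graph is close to looking at the single nearest neighbours. a fuller k-nearest neighbours classifier lets the k closest samples vote. the sketch below is one possible way to write that, assuming k = 3 and the same rows as cutlery.csv; the function name knn_classify is made up for this illustration, and ties between types are not handled specially.

```python
from collections import Counter
from math import sqrt

# (D, h, type) rows copied from cutlery.csv
ITEMS = [
    (17, 5, 'bowl'), (15.5, 5, 'bowl'), (22.5, 1, 'plate'),
    (8, 7.5, 'mug'), (8, 9, 'mug'), (6, 11, 'mug'),
    (6, 10, 'mug'), (24, 2.5, 'plate'), (26, 2, 'plate'),
    (11, 8, 'mug'), (18, 5.5, 'bowl'), (14, 8, 'bowl'),
]

def knn_classify(target, k=3):
    """majority vote among the k samples closest to target"""
    Dt, ht = target

    def dist(item):
        D_, h_, _ = item
        return sqrt((Dt - D_)**2 + (ht - h_)**2)

    # sort all samples by distance and keep the k closest
    neighbours = sorted(ITEMS, key=dist)[:k]
    votes = Counter(type_ for _, _, type_ in neighbours)
    return votes.most_common(1)[0][0]

print(knn_classify((9, 14)))  # mug
```

for the second sample (9, 14) the three closest points are all mugs, so the vote is unanimous.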

learning type: supervised learning

this was a demo project; normally much more data has to be collected!

about the title

i wanted to put up something that would not scare beginners off. my first title was :

identifying home cutlery items with a k-nearest neighbours inspired method 

but the uninitiated would probably not want to click on the link!

Lives in Mauritius, cruising python waters for now.