Teachable Machine By Google Creative Lab

Sharan Babu 01 Mar, 2024 • 6 min read

Introduction

In this article, let us go over how you can use Google Teachable Machine to create a Python program that uses Convolutional Neural Networks (Deep Learning) and a mouse/keyboard automation library called pyautogui to move your mouse cursor with your head pose. We will be using pyautogui because it is a simple Python library for programmatically controlling the mouse and keyboard. More details on how to use this library are given below.

First things first, we have to create a deep learning model that can classify your current head pose into 5 different categories, namely, Neutral (no direction indicated), Left, Right, Up, and Down.

This article was published as a part of the Data Science Blogathon.

What is a Teachable Machine?

A teachable machine is a tool that helps you teach a computer to recognize things. It allows you to show the computer examples of different things and tell it what they are. The computer then learns from these examples and can identify similar things in the future. It’s like training a pet or teaching a child by showing them examples and explaining what they are. With a teachable machine, you can create your own programs or applications that can understand and respond to different inputs, like images or sounds, without needing to be an expert programmer.

Easy Machine Learning Model Creation with Teachable Machine

Creating a machine learning model with Teachable Machine is straightforward and hassle-free. Here’s how:

  1. Collect Data: Utilize a webcam to gather images or videos of the items you want to classify.
  2. Train the Model: Label the gathered data and use it to train your machine learning model using Teachable Machine’s intuitive interface.
  3. Test the Model: Evaluate the accuracy of your trained model by classifying new images or videos.
  4. Refine the Model: Fine-tune your model by adjusting parameters or collecting additional data if necessary.
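
Step 3 above can also be checked outside the browser. As a minimal sketch, assuming you have collected (true label, predicted label) pairs for a few held-out images, the following computes the overall and per-class accuracy. The function name and the sample data are illustrative only and are not part of Teachable Machine.

```python
from collections import Counter

def accuracy_report(pairs):
    """Return (overall_accuracy, per_class_accuracy) for (true, predicted) pairs."""
    if not pairs:
        raise ValueError("need at least one (true, predicted) pair")
    total = Counter()
    correct = Counter()
    for true_label, predicted_label in pairs:
        total[true_label] += 1
        if true_label == predicted_label:
            correct[true_label] += 1
    overall = sum(correct.values()) / len(pairs)
    per_class = {label: correct[label] / count for label, count in total.items()}
    return overall, per_class

# Hypothetical results for four held-out images
sample = [("Left", "Left"), ("Left", "Right"), ("Up", "Up"), ("Down", "Down")]
overall, per_class = accuracy_report(sample)
print(overall)    # 0.75
print(per_class)  # {'Left': 0.5, 'Up': 1.0, 'Down': 1.0}
```

A class with noticeably lower accuracy than the others is a good candidate for collecting more images in step 4.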

Teachable Machine is a useful tool for individuals, educators, and students alike to explore the fundamentals of machine learning and computer vision. It can train models for several kinds of input, including image classification, sound classification, and pose classification.

In today’s era of advancing technology, the significance of Machine Learning and Artificial Intelligence cannot be overstated. While many organizations are leveraging AI to make a societal impact, understanding how machine learning works and creating applicable models remains a challenge for some. But fear not, as it’s now entirely feasible.

Google Teachable Machine Overview

Google Teachable Machine is Google’s free, no-code web platform for creating deep learning models. You can build models to classify images, audio, or even poses. After training, you can download the model and use it in your own applications.

You could use a framework like TensorFlow or PyTorch to build a custom Convolutional Neural Network with an architecture of your choice. If you want a simple no-code way of doing the same, you can let the Google Teachable Machine platform do it for you. It is very intuitive and does a really good job.

Choose ‘Image Project’, name the classes after the five head poses, and record photos for each class with the corresponding head pose. You can leave the default hyperparameters as they are and proceed with the training.

How to use?

Tip: Record the photos of each head pose at different distances and positions relative to the camera so that the model does not overfit the data, which would lead to poor predictions later.

Then you can use the Preview section of the same page to see how well your trained model is performing and decide whether to use it for the program or create a more robust model with a higher number of images and parameter tuning.

After the above step, download the model weights. The weights will get downloaded as ‘keras_model.h5’.
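
The download typically also contains a labels.txt file that lists the class names in the order the model outputs them. As a sketch, assuming each line has the form "index name" (an assumption about the export format, so check your own file), a small parser avoids hard-coding the label order. `parse_labels` is a hypothetical helper, not part of any library.

```python
def parse_labels(text):
    """Parse lines like '0 Left' into a list of names ordered by index."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, name = line.split(maxsplit=1)
        entries.append((int(index), name))
    return [name for _, name in sorted(entries)]

# Example content (assumed format); in practice you would read the
# downloaded file, e.g. open("labels.txt").read()
example = "0 Left\n1 Right\n2 Up\n3 Down\n4 Neutral\n"
labels = parse_labels(example)
print(labels)  # ['Left', 'Right', 'Up', 'Down', 'Neutral']
```

If the order in your file differs from the list used in the program, the predictions will be mapped to the wrong directions, so it is worth checking once.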

Program

Now, let us combine this with a program that is able to move the mouse.

Let us have a look at the code:

# Python program to control the mouse based on head position

# Import necessary modules
import numpy as np
import cv2
from time import sleep
import tensorflow as tf
import pyautogui

# Use the laptop's webcam as the source of video
cap = cv2.VideoCapture(0)

# Labels: the possible outcomes, in the order the model outputs them
labels = ['Left', 'Right', 'Up', 'Down', 'Neutral']

# Load the model we just downloaded
model = tf.keras.models.load_model('keras_model.h5')

while True:
    success, image = cap.read()
    if not success:
        break

    # Mirror the frame; necessary to avoid confusion between left and right
    image = cv2.flip(image, 1)
    cv2.imshow("Frame", image)

    # The model takes an image of dimensions (224, 224) as input,
    # so resize the frame to the same size
    img = cv2.resize(image, (224, 224))

    # Convert the image to a float array and add a batch dimension
    img = np.array(img, dtype=np.float32)
    img = np.expand_dims(img, axis=0)

    # Normalize pixel values to the range [0, 1]
    img = img / 255

    # Predict the class
    prediction = model.predict(img)

    # Map the prediction to a class name
    predicted_class = np.argmax(prediction[0], axis=-1)
    predicted_class_name = labels[predicted_class]
    print(predicted_class_name)

    # Use pyautogui to get the current position of the mouse and move it
    current_pos = pyautogui.position()
    current_x = current_pos.x
    current_y = current_pos.y

    # 'Neutral' needs no movement, so only the four directions are handled
    if predicted_class_name == 'Left':
        pyautogui.moveTo(current_x - 80, current_y, duration=1)
    elif predicted_class_name == 'Right':
        pyautogui.moveTo(current_x + 80, current_y, duration=1)
    elif predicted_class_name == 'Down':
        pyautogui.moveTo(current_x, current_y + 80, duration=1)
    elif predicted_class_name == 'Up':
        pyautogui.moveTo(current_x, current_y - 80, duration=1)

    # Pause for a second before processing the next frame
    sleep(1)

    # Exit the loop when 'q' is pressed. This check runs for every
    # prediction, including 'Neutral', so the program can always be stopped
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close all windows
cap.release()
cv2.destroyAllWindows()

You can also find the code and weights here.
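
The program above always acts on the most likely class, even when the model is unsure. One possible refinement, shown here as a sketch rather than as part of the original program, is to fall back to 'Neutral' when the top probability is below a threshold. The helper name `pick_label` and the threshold of 0.8 are assumptions chosen for illustration.

```python
def pick_label(probabilities, labels, threshold=0.8, fallback="Neutral"):
    """Return the most likely label, or `fallback` if the model is unsure."""
    if len(probabilities) != len(labels):
        raise ValueError("one probability per label is required")
    # Index of the highest probability (same idea as np.argmax)
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best_index] < threshold:
        return fallback
    return labels[best_index]

labels = ["Left", "Right", "Up", "Down", "Neutral"]
print(pick_label([0.05, 0.90, 0.02, 0.02, 0.01], labels))  # Right
print(pick_label([0.40, 0.35, 0.10, 0.10, 0.05], labels))  # Neutral
```

In the main loop this would be called as pick_label(prediction[0], labels) in place of the argmax lookup.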

Explanation for the pyautogui functions

pyautogui.moveTo(current_x-80, current_y, duration=1)

The above code moves the mouse 80 pixels to the left of its current position and takes 1 second to do so. If you do not set the duration parameter, the mouse pointer jumps to the new point instantly, and you lose the visual effect of the mouse moving.

current_pos = pyautogui.position()
current_x = current_pos.x
current_y = current_pos.y

The first line gets the current x and y coordinates of the mouse pointer, and the next two lines read them individually.
Finally, cap.release() and cv2.destroyAllWindows() release the webcam and close the preview window.
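
The coordinate arithmetic described above can be gathered into one small function. This is a sketch under stated assumptions: `OFFSETS` and `next_position` are hypothetical names, and clamping the target to the screen is an addition that the original program does not perform.

```python
STEP = 80  # pixels per move, matching the program above

# Direction name -> (dx, dy); screen y grows downward, so "Up" is negative
OFFSETS = {
    "Left": (-STEP, 0),
    "Right": (STEP, 0),
    "Up": (0, -STEP),
    "Down": (0, STEP),
}

def next_position(x, y, direction, width, height):
    """Return the target (x, y), kept inside a width x height screen."""
    dx, dy = OFFSETS.get(direction, (0, 0))  # unknown or Neutral: no move
    new_x = min(max(x + dx, 0), width - 1)
    new_y = min(max(y + dy, 0), height - 1)
    return new_x, new_y

print(next_position(100, 100, "Left", 1920, 1080))     # (20, 100)
print(next_position(30, 100, "Left", 1920, 1080))      # (0, 100)
print(next_position(100, 100, "Neutral", 1920, 1080))  # (100, 100)
```

With pyautogui, the width and height could come from pyautogui.size(), and the result could be passed to pyautogui.moveTo(new_x, new_y, duration=1).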

We have now built an end-to-end deep learning application that takes input video from the user. As the program reads the video, it classifies each frame based on your head pose and returns the corresponding prediction. Using this prediction, we take the appropriate action, which is moving the mouse in the direction our head is pointing.

You can further improve the project we have just built by adding custom functionality to click the mouse without actually touching the mouse of your computer. You could train another deep learning model for this on Google Teachable Machine itself.
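
As a rough sketch of that idea, assuming the model were retrained with an extra class named 'Click' (a hypothetical label that the model in this article does not have), the prediction could be dispatched as follows. The `FakeMouse` class only records calls so that the logic can be tried without moving the real cursor.

```python
def handle_prediction(label, mouse, step=80):
    """Call the mouse action for `label`; return True if something was done."""
    moves = {"Left": (-step, 0), "Right": (step, 0),
             "Up": (0, -step), "Down": (0, step)}
    if label in moves:
        dx, dy = moves[label]
        mouse.moveRel(dx, dy, duration=1)
        return True
    if label == "Click":  # hypothetical extra class, not in the original model
        mouse.click()
        return True
    return False  # e.g. "Neutral": do nothing

class FakeMouse:
    """Stand-in that records calls instead of controlling the cursor."""
    def __init__(self):
        self.calls = []
    def moveRel(self, dx, dy, duration=0):
        self.calls.append(("moveRel", dx, dy))
    def click(self):
        self.calls.append(("click",))

mouse = FakeMouse()
handle_prediction("Click", mouse)
handle_prediction("Up", mouse)
print(mouse.calls)  # [('click',), ('moveRel', 0, -80)]
```

Because pyautogui provides moveRel() and click() functions, passing the pyautogui module itself as the mouse argument should work in the real program.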

I hope this was a fun project to implement. Thanks for reading!

Conclusion

This article has provided a comprehensive guide on creating a Python program using Convolutional Neural Networks (Deep Learning) and pyautogui for mouse/cursor control based on head pose. Utilizing Google Teachable Machine, users can easily train models for image classification without extensive coding. The provided program showcases how to integrate the trained model with mouse control, enabling practical applications like hands-free computer interaction. By leveraging these tools and techniques, individuals can embark on exciting projects and expand their understanding of machine learning in a user-friendly manner. Thank you for exploring this innovative project with us!

Frequently Asked Questions

Q1. What does Teachable Machine do?

A. Teachable Machine is a user-friendly tool that enables people to create their own machine learning models without extensive coding knowledge. It allows users to train a computer to recognize and classify various inputs, such as images, sounds, or gestures. By providing labeled examples, users can teach the machine what different things look or sound like. The tool then uses this training data to create a model that can recognize similar inputs and make predictions. Teachable Machine empowers users to build interactive applications, prototypes, or educational projects that can understand and respond to specific inputs based on the training provided.

Q2. Is Teachable Machine free?

A. Yes, Teachable Machine is completely free to use! It’s like having a cool tool that helps you teach a computer to recognize things without needing to pay anything. You can use it to train the computer to understand different pictures, sounds, or poses by showing it examples and telling it what they are. The best part is, you don’t have to worry about any costs or fees while using it. It’s a fun and accessible way for anyone to get started with machine learning without spending any money.

Q3. What is Teachable Machine and how does it work?

A. Teachable Machine is a user-friendly tool developed by Google that enables individuals to create machine learning models without extensive coding knowledge. It allows users to train a computer to recognize and classify various inputs, such as images, sounds, or gestures. Users provide labeled examples to teach the machine what different things look or sound like.

Q4. Is there any prerequisite knowledge required to use Teachable Machine?

A. Teachable Machine is designed to be accessible to users with varying levels of experience and expertise in machine learning. While having some basic knowledge of machine learning concepts may be beneficial, it is not a strict requirement for using Teachable Machine. The platform provides user-friendly tools, tutorials, and resources to guide users through the process of training and deploying machine learning models.

Responses From Readers

Jambana Gouda HMP 06 Dec, 2020

Artificial intelligence and Machine Learning online

furas 07 Dec, 2020

I didn't test the code, but I think you can use `moveRel()` instead of `moveTo()` to make it more readable:

    pyautogui.moveRel(-80, 0, duration=1)

instead of

    pyautogui.moveTo(current_x-80,current_y,duration=1)

In pyautogui < 1.0 you can also use `move()` instead of `moveRel()`.

---

I think `continue` is useless or even harmful in your code. When the prediction is `Neutral`, `continue` skips `waitKey()` and the program can't exit when you press the `q` key. I see this as:

    # these lines are not needed
    #if predicted_class_name == 'Neutral':
    #    continue
    if predicted_class_name == 'Left':
        pyautogui.moveRel(-80, 0, duration=1)
    elif predicted_class_name == 'Right':
        pyautogui.moveRel(80, 0, duration=1)
    elif predicted_class_name == 'Down':
        pyautogui.moveRel(0, 80, duration=1)
    elif predicted_class_name == 'Up':
        pyautogui.moveRel(0, -80, duration=1)
    sleep(1)