Human Eye for the Visually Impaired.
I built this object detection module using the Google Cloud Vision API and a Raspberry Pi. It announces what it detects as audio, which could be helpful for people who are blind or have low vision.
How It Works
When the button is pressed, the Raspberry Pi captures an image of the object with the attached camera and sends it to Google Cloud. Google’s Cloud Vision API then analyzes the image and returns labels describing what the object is likely to be. The labels are converted to speech with gTTS (Google Text-to-Speech), and finally the resulting audio is played using VLC.
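The flow above can be sketched as one function that wires the three stages together. This is only an illustration, not the project's actual script: the names describe_once, capture, detect_labels and speak are hypothetical, and the stages are passed in as plain callables so the sketch runs without a camera or network.

```python
from typing import Callable, List


def describe_once(
    capture: Callable[[], bytes],
    detect_labels: Callable[[bytes], List[str]],
    speak: Callable[[str], None],
) -> str:
    """Run one capture -> label -> speak cycle and return the spoken text.

    All three callables are hypothetical stand-ins for the real stages
    (camera, Cloud Vision, text-to-speech plus playback).
    """
    image_bytes = capture()               # 1. grab a picture
    labels = detect_labels(image_bytes)   # 2. ask what is in it
    if labels:
        text = "I can see " + ", ".join(labels)
    else:
        text = "I could not recognise anything"
    speak(text)                           # 3. say the result out loud
    return text
```

On the real device, each callable would wrap the camera, the Vision API and the audio player respectively, and the function would be called once per button press.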
What You Will Need
- A Raspberry Pi (I am using a Pi 3 Model B)
- A microSD card with Raspbian installed
- A USB Webcam
- A Speaker
- A 4-Pin Button
- 2 Male to Female Jumper Wires
- A Breadboard
Setting up Google Cloud
First, you’ll need a Google Cloud account. Click here to create one using your existing Google account. Fill in the required information and proceed.
Billing needs to be enabled for the API to work, but you won’t be charged as long as your usage stays within the free limits. To be sure of that, you can set up capping on API usage. Learn more about capping here.
Creating the JSON file
- Go to Actions on Google to create a new project. Log in to your Google account and give your project a name. Click Create.
- Click Device registration, then Register Model. Fill in the product name and manufacturer name, then choose Speaker as the device type. Note down the device model ID; we will need it later.
- Head to the API Manager and enable the Google Assistant API. Once it is enabled, click Manage, then Credentials, and go to the OAuth consent screen. Choose your email address and enter a product name. Click Save.
- Now go to the Credentials tab and click Create credentials. From the drop-down, choose OAuth client ID. Select Other as the application type and click Create.
- On the Credentials page, scroll down and download the JSON file.
- Make sure you have enabled Web & App Activity, Device Information, and Voice & Audio Activity in the Activity Controls of the account you used to sign in to the Cloud Console.
Enabling the API
We will now enable the Cloud Vision API. Click here to enable it. Select the name of the project we created in the previous step.
Getting the JSON file
We will need a JSON key to authenticate our Raspberry Pi with our Google Cloud account.
On your APIs & Services dashboard, click Credentials in the panel on the left.
Then click Create credentials and choose Service account key.
Choose JSON as the key type and click Create.
Now move the downloaded JSON key to your Raspberry Pi’s home directory.
Getting the Hardware Ready
Start by connecting your USB webcam and the speaker to your Raspberry Pi.
Then connect the button to your Raspberry Pi according to the schematic below:
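Once the button is wired up, reading it from Python could look roughly like the sketch below. It rests on assumptions that are not taken from the schematic: the button is assumed to sit between a GPIO pin and ground with the Pi's internal pull-up enabled (so the pin reads 1 when released and 0 when pressed), and the pin number is whatever you pass in. wait_for_press and gpio_reader are hypothetical helper names.

```python
import time


def wait_for_press(read_level, poll_s: float = 0.01, sleep=time.sleep) -> None:
    """Block until the input goes from 1 (released) to 0 (pressed).

    Assumes pull-up wiring: the button connects the pin to ground.
    """
    previous = read_level()
    while True:
        current = read_level()
        if previous == 1 and current == 0:
            return  # falling edge: the button was just pressed
        previous = current
        sleep(poll_s)


def gpio_reader(pin: int):
    """Return a function that reads `pin` (BCM numbering) with the pull-up on.

    RPi.GPIO is imported here because it only exists on a Raspberry Pi.
    The pin number is an assumption; use the one from your own wiring.
    """
    import RPi.GPIO as GPIO

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(pin, GPIO.IN, pull_up_down=GPIO.PUD_UP)
    return lambda: GPIO.input(pin)
```

On the Pi, wait_for_press(gpio_reader(17)) would block until the button on BCM pin 17 is pressed; 17 is only an example value.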
Installing the libraries
First, we will install the Google Cloud Vision client library for Python. Run the following command in Terminal:
sudo pip install google-cloud-vision
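As a rough idea of how this library is used for label detection, here is a minimal sketch. It assumes version 2.x or later of google-cloud-vision (older releases used vision.types.Image instead of vision.Image) and that credentials are already configured. The top_labels helper, the 0.6 score threshold and the limit of three labels are my own illustrative choices, not part of the library.

```python
def top_labels(annotations, min_score: float = 0.6, limit: int = 3):
    """Return descriptions of the highest-scoring labels (hypothetical helper).

    Each annotation is expected to have `.description` and `.score`,
    like the entries in a Vision API `label_annotations` list.
    """
    good = [a for a in annotations if a.score >= min_score]
    good.sort(key=lambda a: a.score, reverse=True)
    return [a.description for a in good[:limit]]


def detect_labels(image_path: str):
    """Send one image file to Cloud Vision and return its best labels."""
    # Imported lazily so the helper above works without the library installed.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return top_labels(response.label_annotations)
```

Filtering by score keeps the spoken output short; lowering min_score gives more guesses at the cost of less reliable ones.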
Next, we will install fswebcam so that we can capture images from the camera. Again, run the following command in Terminal:
sudo apt-get install fswebcam
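fswebcam is a command-line tool, so a Python script has to launch it as a subprocess. A minimal sketch follows; the function names, the default file name capture.jpg and the 1280x720 resolution are assumptions, while -r and --no-banner are standard fswebcam options (resolution, and suppressing the timestamp banner).

```python
import subprocess


def fswebcam_command(out_path: str, resolution: str = "1280x720"):
    """Build the fswebcam argument list for a single still image."""
    return ["fswebcam", "-r", resolution, "--no-banner", out_path]


def capture(out_path: str = "capture.jpg") -> str:
    """Take one picture with the default webcam and return the file path.

    Raises subprocess.CalledProcessError if fswebcam exits with an error.
    """
    subprocess.run(fswebcam_command(out_path), check=True)
    return out_path
```

Passing the arguments as a list avoids shell quoting problems if the output path contains spaces.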
We will also install gTTS, a Python library that uses Google Translate’s text-to-speech service. We could have used something like pyttsx3, but its voice sounds metallic or robotic compared to gTTS, which sounds more human-like. To install gTTS, run:
sudo pip install gTTS
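Turning text into an MP3 file with gTTS takes only a couple of lines. The sketch below wraps them in a hypothetical synthesize function; the tts_factory parameter is my own addition so the function can be tried without network access, and by default it falls back to the real gTTS class, which needs an internet connection.

```python
from pathlib import Path


def synthesize(text: str, out_path: str, tts_factory=None, lang: str = "en") -> Path:
    """Convert `text` to speech, save it as an MP3 and return the path."""
    if not text.strip():
        raise ValueError("nothing to say")
    if tts_factory is None:
        # Imported lazily; requires `pip install gTTS` and internet access.
        from gtts import gTTS
        tts_factory = gTTS
    tts = tts_factory(text=text, lang=lang)
    tts.save(out_path)
    return Path(out_path)
```

For example, synthesize("I can see a cup", "output.mp3") would write the spoken sentence to output.mp3 in the current directory.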
We also need python-vlc, the Python bindings for VLC, which will be used for audio playback. Run the following command in Terminal:
sudo pip install python-vlc
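One thing to watch with python-vlc is that play() returns immediately, so a script has to wait for playback to finish before moving on. A minimal sketch of that waiting loop is below. play_and_wait and play_file are hypothetical names and the 30-second timeout is arbitrary. Note that python-vlc only provides the bindings, so the VLC player itself must also be installed on the Pi.

```python
import time


def play_and_wait(player, done_states, poll_s: float = 0.1, timeout_s: float = 30.0) -> bool:
    """Start playback and poll until the player reaches one of `done_states`.

    Returns True once such a state is seen, False if playback could not
    start or the timeout expired first.
    """
    if player.play() == -1:  # python-vlc returns -1 when playback cannot start
        return False
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if player.get_state() in done_states:
            return True
        time.sleep(poll_s)
    return False


def play_file(path: str) -> bool:
    """Play an audio file through VLC and block until it ends or fails."""
    import vlc  # imported lazily; needs python-vlc and VLC itself

    player = vlc.MediaPlayer(path)
    # Stop waiting when playback has ended or VLC reports an error.
    return play_and_wait(player, (vlc.State.Ended, vlc.State.Error))
```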
We will now make the JSON file available to any application we’re running. This command has to be run every time the Pi reboots.
Make sure to replace name with the actual name of the JSON file.
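Google's client libraries look for the key file through the GOOGLE_APPLICATION_CREDENTIALS environment variable. As an alternative to setting it in the shell, the variable can be set from inside the Python script before the Vision client is created. The sketch below does that and also checks that the file really is a service account key; use_service_account_key is a hypothetical helper name, and the path you pass in is your own file.

```python
import json
import os
from pathlib import Path


def use_service_account_key(key_path: str) -> str:
    """Validate a JSON key file and point Google client libraries at it."""
    path = Path(key_path).expanduser().resolve()
    with open(path) as f:
        info = json.load(f)
    # Service account keys carry "type": "service_account"; other JSON
    # downloads (for example OAuth client IDs) do not.
    if info.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return str(path)
```

Setting the variable this way only affects the running Python process, so it has to happen at the start of every run, but nothing needs to be retyped after a reboot.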
Let’s Try It!
Download the Python script from here and save it on your Pi’s Desktop.
Run it through the following commands:-