How to extract a DNI number with Google Cloud Vision

I recently started exploring artificial intelligence. Specifically, I needed a tool to extract text from images for a project I'm working on… and since Saint Google has everything, I found Google Cloud Vision.

In short, Google Cloud Vision is a service that Google provides to developers and companies for image recognition. It can recognize and extract text from images, detect inappropriate content, detect faces, and identify logos in an image.

Let's do it

To work with Python without having to configure everything from scratch, I used another service that Google provides, Colaboratory… I will write about it in another post :D.

Well… let's get to what we came for

1. Install the Google Cloud Vision package

pip install google-cloud-vision

2. Import the libraries

# Import the Google Cloud Vision API client
from google.cloud import vision
# Access operating system functionality
# (used here to authenticate with the Google API)
import os
# Regular expressions
import re

3. Authenticate

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/content/vision_key.json'

The vision_key.json file is the service account key that you generate for your project in Google Cloud.
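If the key file is missing or the path is wrong, the error only shows up later, when the client is created. A minimal sketch of a check you could run first, assuming a helper I am calling use_credentials (the name is mine, it is not part of any Google library):

```python
import os


def use_credentials(key_path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at key_path after checking it exists.

    Hypothetical helper for illustration; not part of google-cloud-vision.
    """
    full_path = os.path.abspath(key_path)
    if not os.path.isfile(full_path):
        raise FileNotFoundError(f"Service account key not found: {full_path}")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = full_path
    return full_path


# Example call, using the path from this post:
# use_credentials('/content/vision_key.json')
```

This only confirms that the file is there. It does not validate the contents of the key.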

4. Using the library

vision_client = vision.ImageAnnotatorClient()
image = vision.Image()

5. Get the image to process

IMAGE_URI = 'path-image-to-process'
image.source.image_uri = IMAGE_URI
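The image_uri field is meant for remote locations (a gs:// Cloud Storage URI or a public http/https URL). For a file stored locally, for example one uploaded to Colaboratory, the image bytes go in the content field instead. A rough sketch of how you might pick between the two, assuming a helper I am calling classify_image_location (a made-up name, not part of the library):

```python
from pathlib import Path

# Prefixes treated as remote locations in this sketch (an assumption of mine).
REMOTE_PREFIXES = ("gs://", "http://", "https://")


def classify_image_location(location):
    """Return ("uri", location) for remote images, or ("content", bytes) for local files."""
    if location.startswith(REMOTE_PREFIXES):
        return "uri", location
    return "content", Path(location).read_bytes()


# How it could be applied to the image object created in step 4:
# kind, value = classify_image_location(IMAGE_URI)
# if kind == "uri":
#     image.source.image_uri = value
# else:
#     image.content = value
```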

6. Process the image to detect text

response = vision_client.text_detection(image=image)

The response has a lot of content, but what interests us is the description…

text = response.text_annotations[0].description
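Indexing text_annotations[0] directly fails with an IndexError when no text is detected, and problems with the image are reported in response.error instead of being raised. A small sketch of a safer way to read the text, assuming a helper I am calling first_description (the name is mine):

```python
def first_description(response):
    """Return the full detected text, or '' when nothing was detected.

    Hypothetical helper; `response` is the object returned by text_detection.
    """
    # Per-image failures are reported in response.error.message.
    if response.error.message:
        raise RuntimeError(response.error.message)
    annotations = response.text_annotations
    # The first annotation holds the whole text; the rest are individual words.
    return annotations[0].description if annotations else ""


# text = first_description(response)
```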

7. Extract the number of DNI

To extract only the DNI number, we can use regular expressions.

dni = re.findall(r'[0-9]{8}', text)
print(dni)
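One thing to watch: this pattern also matches the first 8 digits of any longer number on the card. A slightly stricter sketch, assuming the DNI number appears as a standalone 8-digit token (the function name and the sample text are invented for illustration):

```python
import re

# \b requires a word boundary on both sides, so longer digit runs are skipped.
DNI_PATTERN = re.compile(r"\b[0-9]{8}\b")


def extract_dni(text):
    """Return every standalone 8-digit number found in text."""
    return DNI_PATTERN.findall(text)


# Made-up sample text, not real OCR output.
sample = "DNI 12345678\nFecha 15 01 2020\nSerie 123456789012"
print(extract_dni(sample))  # ['12345678']
```

If the number on your document is followed directly by a letter or another character without a space, the boundary check would need adjusting.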

In conclusion, this tool is extremely simple to use… at least for what I'm working on.

Co-Founder of Andheuris | Software Engineer | UI Designer