Let’s talk about Google Vision API and how to write a simple java program to extract Text from an Image

By registering for Google Cloud’s vision API you can access just that with an API key.

<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>com.textextractor</groupId>
<artifactId>textextractor</artifactId>
<version>1.0-SNAPSHOT</version>


<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>libraries-bom</artifactId>
<version>19.2.1</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>

<dependencies>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-vision</artifactId>
</dependency>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-storage</artifactId>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>


</project>
import com.google.api.gax.core.FixedCredentialsProvider;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import com.google.cloud.vision.v1.*;
import com.google.cloud.vision.v1.Feature.Type;
import com.google.common.collect.Lists;
import com.google.protobuf.ByteString;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TextExtractionDemo {
public static void main(String... args) throws Exception {

GoogleCredentials credentials = GoogleCredentials.fromStream(new FileInputStream("<PATH_TO_API_KEY_FILE>"))
.createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
Storage storage = StorageOptions.newBuilder().setCredentials(credentials).build().getService();

detectDocumentText("<PATH_TO_IMAGE_FILE>", credentials);
}

public static void detectDocumentText(String filePath, GoogleCredentials credentials) throws IOException {
List<AnnotateImageRequest> requests = new ArrayList<>();

ByteString imgBytes = ByteString.readFrom(new FileInputStream(filePath));

Image img = Image.newBuilder().setContent(imgBytes).build();
Feature feat = Feature.newBuilder().setType(Type.DOCUMENT_TEXT_DETECTION).build();
AnnotateImageRequest request =
AnnotateImageRequest.newBuilder().addFeatures(feat).setImage(img).build();
requests.add(request);

ImageAnnotatorSettings imageAnnotatorSettings =
ImageAnnotatorSettings.newBuilder()
.setCredentialsProvider(FixedCredentialsProvider.create(credentials))
.build();

// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
try (ImageAnnotatorClient client = ImageAnnotatorClient.create(imageAnnotatorSettings)) {
BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
List<AnnotateImageResponse> responses = response.getResponsesList();
client.close();

for (AnnotateImageResponse res : responses) {
if (res.hasError()) {
System.out.format("Error: %s%n", res.getError().getMessage());
return;
}

// For full list of available annotations, see http://g.co/cloud/vision/docs
TextAnnotation annotation = res.getFullTextAnnotation();
for (Page page : annotation.getPagesList()) {
String pageText = "";
for (Block block : page.getBlocksList()) {
String blockText = "";
for (Paragraph para : block.getParagraphsList()) {
String paraText = "";
for (Word word : para.getWordsList()) {
String wordText = "";
for (Symbol symbol : word.getSymbolsList()) {
wordText = wordText + symbol.getText();
}
}
blockText = blockText + paraText;
}
pageText = pageText + blockText;
}
}
System.out.println("Complete annotation:");
System.out.println(annotation.getText());
}
}
}
}
  • Face detection
  • Landmark detection
  • Label detection
  • OCR for handwritten text

--

--

--

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

How Pyroscope Saved Me Weeks of Wasted Effort

Speed up Actions on Google development workflow with local fulfillment

Comprehensive Notes For Java 8 Features Every Developer Must Have

Creating an email custom scalar to Apollo GraphQL

What is Full Stack Development?

Airflow Summit takeaways

Pulling Back The Curtain: Development And Testing For Low Latency DASH Support

What are the Best Tools for UI Testing in 2022?

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Ritesh Shergill

Ritesh Shergill

More from Medium

Learning what it takes to edit sentences.

What if your asset management system could read between the lines?

If you’re stuck in a design rut, You need a new font!

How to Maximize Your Conference Experience