sequence-sh/TesseractConnector

This repository is a mirror of https://gitlab.com/sequence/connectors/tesseract

Sequence Tesseract OCR Connector

Sequence® is a collection of libraries for
automation of cross-application e-discovery and forensic workflows.

This connector contains steps to perform optical character recognition (OCR)
on image files. It uses the open-source Tesseract library as the OCR engine.

Prerequisites

The following needs to be installed:

Examples

OCR a bitmap image

- <path> = 'MyImage.bmp'
- <imageData> = FileRead <path>
- <imageFormat> = GetImageFormat <path>
- <imageText> = TesseractOCR <imageData> <imageFormat>
- Print <imageText>

Documentation

https://sequence.sh

Download

https://sequence.sh/download

Try SCL and Core

https://sequence.sh/playground

Package Releases

Package releases can be downloaded from the Releases page.

NuGet Packages

Release NuGet packages are available from nuget.org.

Languages

C# 100.0%

Apache License 2.0
Created May 26, 2021
Updated April 10, 2023