Text Extraction from Images using Tesseract

Trupti Nitin Baraskar

Research Article

Open Access

Published in

2021

Volume: 08

Issue: 07

Pages: 295 - 231

Abstract

Important information can be found in captured images, scanned documents, magazines, newspapers, posters, and similar sources. Such images are widely available today, and they play an important role in describing, representing, and conveying information that helps people with communication, productivity, cost, analysis, and so on. The information in these image documents would be much easier to access if it were converted to text. The process of converting the text in an image to plain text is known as Text Extraction. Text Extraction is useful for editing, documenting, archiving, searching, or analyzing image text. However, variations in text size, orientation, and alignment, together with low-resolution or pixelated images and complicated, noisy backgrounds, make text extraction extremely difficult. In this project we attempt to minimize these problems using the Tesseract OCR Engine.
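
As a rough illustration of the kind of extraction the abstract describes, the sketch below calls the Tesseract command-line tool from Python and returns the recognized text. It is not the authors' implementation: the function names `build_tesseract_command` and `extract_text`, the file name `scan.png`, and the chosen defaults (English language data, page segmentation mode 3) are assumptions made for this example, and it presumes the `tesseract` executable is installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, lang="eng", psm=3):
    """Build the argument list for one Tesseract CLI call (hypothetical helper).

    "stdout" as the output base asks Tesseract to print the recognized
    text instead of writing it to a file. "-l" selects the language data
    and "--psm" the page segmentation mode (3 is fully automatic).
    """
    return ["tesseract", str(Path(image_path)), "stdout", "-l", lang, "--psm", str(psm)]


def extract_text(image_path, lang="eng", psm=3):
    """Run Tesseract on an image and return the extracted plain text."""
    if shutil.which("tesseract") is None:
        # Assumption: the engine is installed separately from Python.
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.strip()
```

A call such as `extract_text("scan.png")` would return whatever text the engine recognizes. On the noisy or low-resolution images the abstract mentions, results typically depend on preprocessing and on the segmentation mode chosen.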

About the journal

Journal	International Research Journal of Engineering and Technology (IRJET)
ISSN	p-ISSN: 2395-0072
Open Access	Yes

Authors (1)

Trupti Nitin Baraskar
- School of Computer Engineering & Technology
- Engineering and Technology
