Text Extraction from Images using Tesseract
Important information can be found in captured images, scanned documents, magazines, newspapers, posters, etc. Such images are now widely available, and they play an important role in describing, representing, and conveying information, which helps people with communication, productivity, cost, analysis, etc. The information in these image documents would be much easier to access if it were converted to text. The process by which text in an image is converted to plain text is known as Text Extraction. Text Extraction is useful for editing, documenting, archiving, searching, or analyzing image text. However, variations in text size, orientation, and alignment, low-resolution or pixelated images, and complicated, noisy backgrounds make text extraction an extremely difficult problem. In this project we attempt to minimize these problems using the Tesseract OCR Engine.
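As a rough illustration of the kind of extraction step the abstract describes, the sketch below calls the Tesseract command-line tool from Python. It is not the authors' pipeline: the helper names `build_tesseract_command` and `extract_text` are hypothetical, the default language `eng` and page segmentation mode 3 (fully automatic segmentation) are assumptions, and the `--psm` flag assumes Tesseract 4 or later installed on the PATH.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=3):
    """Build the argument list for a Tesseract CLI call.

    Passing "stdout" as the output base makes Tesseract print the
    recognized text instead of writing it to a .txt file.
    Hypothetical helper; the defaults are assumptions for illustration.
    """
    return ["tesseract", str(image_path), "stdout", "-l", lang, "--psm", str(psm)]


def extract_text(image_path, lang="eng", psm=3):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,  # collect the text Tesseract prints
        text=True,            # decode stdout/stderr as str
        check=True,           # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.strip()
```

A call such as `extract_text("scanned_page.png")` (the file name is a placeholder) would return the recognized text as a string. For images that contain a single uniform block of text, passing `psm=6` may work better than the default; in practice, noisy or low-resolution inputs usually need preprocessing before this step.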
Journal | International Research Journal of Engineering and Technology (IRJET) |
---|---|
ISSN | p-ISSN: 2395-0072 |
Open Access | Yes |