Get PDF File as Images & Extract Text from Images Using REST APIs
(Nov 14 2012 - 12:54:15 AM)
The Saaspose development team is happy to announce that you can now convert PDF files to images and recognize the text in those images using Saaspose APIs. An important aspect of Saaspose APIs is that you can combine multiple file format APIs to achieve the result you need. For example, you can get a PDF file as images using Saaspose.Pdf and then extract text from those images using Saaspose.OCR. Saaspose.Pdf is a REST API for creating and editing PDF files and converting them to other file formats. Saaspose.OCR is a REST API for optical character recognition and document scanning. Let’s have a look at how these two REST APIs work together.
Saaspose.Pdf converts PDF files to images in the cloud. You can convert the whole PDF file or only the pages you need, and you can render a page at the default size or at a size you specify. The supported image formats include JPEG, PNG, GIF, BMP and TIFF.
Once you have converted the PDF file to images, you can use the Saaspose.OCR REST API to recognize the text in those images and, for example, save it to a database. Saaspose.OCR recognizes characters in different languages, such as English, French and Spanish, and can also recognize font attributes of the extracted text, such as font type, font style and font size. Using the two REST APIs together, you can go from a PDF file to page images and then to recognized text.
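To make the two-step flow more concrete, here is a minimal Python sketch that only builds the request URLs for each step: one request per PDF page to get an image, then one request per image to recognize its text. Everything specific in it is an assumption for illustration, not the documented Saaspose API: the base URL, the `/pdf/.../pages/...` and `/ocr/.../recognize` paths, the `format`, `width`, `height` and `language` parameter names, and the way page images are named. Authentication and request signing are left out entirely. Check the Saaspose.Pdf and Saaspose.OCR documentation linked below for the real resource URIs.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; replace it with the one from the Saaspose docs.
BASE_URL = "https://api.example.com/v1.0"


def pdf_page_image_url(pdf_name, page, image_format="png", width=None, height=None):
    """Build a URL that asks for one PDF page rendered as an image.

    The path layout and parameter names are assumptions for illustration.
    """
    params = {"format": image_format}
    # Leave width and height out to request the default page size.
    if width is not None and height is not None:
        params["width"] = width
        params["height"] = height
    path = f"/pdf/{quote(pdf_name)}/pages/{page}"
    return f"{BASE_URL}{path}?{urlencode(params)}"


def ocr_recognize_url(image_name, language="english"):
    """Build a URL that asks for text recognition on a stored image.

    The path layout and the language parameter are assumptions.
    """
    path = f"/ocr/{quote(image_name)}/recognize"
    return f"{BASE_URL}{path}?{urlencode({'language': language})}"


def plan_requests(pdf_name, pages, language="english", image_format="png"):
    """Return one (convert_url, ocr_url) pair per requested page.

    The image file name used for the OCR step is a made-up convention;
    in practice you would use whatever name you saved the image under.
    """
    plan = []
    for page in pages:
        image_name = f"{pdf_name}.page{page}.{image_format}"
        convert_url = pdf_page_image_url(pdf_name, page, image_format)
        recognize_url = ocr_recognize_url(image_name, language)
        plan.append((convert_url, recognize_url))
    return plan
```

Calling `plan_requests("report.pdf", [1, 3], language="french")` would return two URL pairs, one for page 1 and one for page 3. Keeping URL construction separate from the HTTP calls makes it easy to inspect the requests first, then send them with the HTTP client of your choice once the real endpoints and credentials are filled in.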
SaaSpose is a cloud-based document generation, conversion and automation platform for developers. SaaSpose makes it easy for Web and mobile developers to work with Microsoft Word documents, Microsoft Excel spreadsheets, Microsoft PowerPoint presentations, Adobe PDFs, OpenDocument formats, and email formats and protocols in their apps. The SaaSpose REST API enables you to quickly integrate the following into your apps: Document Assembly & Mail-Merge, Reporting, Document Conversion, Text and Image Extraction, Device Targeting, Metadata Removal, Barcode Recognition, Generation & Embedding, and Email Templating & Tracking. The REST API can be called from any platform, including .NET, Java, Ruby, Salesforce and Amazon.
More about Saaspose.Pdf and Saaspose.OCR
- Homepage of Saaspose.Pdf: http://saaspose.com/api/pdf
- Homepage of Saaspose.OCR: http://saaspose.com/api/ocr
- More examples by Saaspose.Pdf for Working with Images: http://saaspose.com/docs/display/pdf/188.8.131.52+-+Working+with+Images
- Convert PDF Page to Image (PHP REST): http://saaspose.com/docs/display/pdf/Convert+PDF+Page+to+Image+%28PHP+REST%29
- Send technical questions to the Saaspose Support Team: http://saaspose.com/support/contact-us
Aspose Pty Ltd, Suite 163,
79 Longueville Road
Lane Cove, NSW, 2066