How to make a "searchable" PDF (text images)

Discussion in 'other software & services' started by Jean Marc, Sep 19, 2008.

Thread Status:
Not open for further replies.
  1. Jean Marc

    Jean Marc Registered Member

    Joined:
    Nov 21, 2005
    Posts:
    40
    Hi,

    Do you know a tool that can create a "searchable" PDF from scanned PDF images?

    Tia for your help ;)


    P.S.: I use Foxit Reader 2.3.
     
  2. GlobalForce

    GlobalForce Regular Poster

    Joined:
    Jun 30, 2004
    Posts:
    3,581
    Location:
    Garden State, USA
    Hi Jean Marc,

    I presume you're after OCR software. Unless Bob D. or someone else makes a suggestion, you'll find more on Wikipedia.
    If there's a copy of MS Office onboard, you can probably use its document imaging feature (06 article now).


    S
     
    Last edited: Sep 19, 2008
  3. raakii

    raakii Registered Member

    Joined:
    Sep 1, 2008
    Posts:
    593
    ABBYY FineReader and ScanSoft OmniPage are the leaders in this arena.
     
  4. Bob D

    Bob D Registered Member

    Joined:
    Apr 18, 2005
    Posts:
    1,150
    Location:
    Mass., USA
    Hi Jean Marc
    As you know, "true" PDFs are text searchable.
    However, when you scan a document to PDF, it typically becomes a raster image (vs. vector content, as created by Acrobat), and so loses its native PDF "intelligence".

    There are 3 types of PDFs (described briefly here):
    http://www.pdftocad.com/

    Other than GlobalForce's OCR software recommendation, I cannot offer any suggestions.

    Cheers
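
To make the raster-vs-text distinction above a bit more concrete, here is a minimal Python sketch (standard library only) that guesses whether a PDF carries any font resources or only embedded images. The function names `classify_pdf_bytes` and `classify_pdf_file` are hypothetical, made up for this illustration. It is a crude byte-level heuristic, not a real PDF parser: files that store their objects in compressed object streams may come back as "unknown", and a "text" result does not guarantee that the text can actually be copied or searched.

```python
import re


def classify_pdf_bytes(data: bytes) -> str:
    """Crude guess at the kind of PDF: 'text', 'image-only', or 'unknown'.

    Assumption: font and image dictionaries are stored uncompressed,
    so their names are visible in the raw bytes of the file.
    """
    # "/Font" as a whole name (does not match "/FontDescriptor").
    has_font = re.search(rb"/Font\b", data) is not None
    # Image XObjects are declared with "/Subtype /Image".
    has_image = re.search(rb"/Subtype\s*/Image\b", data) is not None

    if has_font:
        return "text"        # some text layer is probably present
    if has_image:
        return "image-only"  # likely a plain scan with no text layer
    return "unknown"


def classify_pdf_file(path: str) -> str:
    """Read a PDF from disk and classify it with the heuristic above."""
    with open(path, "rb") as fh:
        return classify_pdf_bytes(fh.read())
```

A result of "image-only" would suggest the file is a plain scan, which is the case where OCR software such as the products named in this thread comes in.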
     
  5. Jean Marc

    Jean Marc Registered Member

    Joined:
    Nov 21, 2005
    Posts:
    40
    Thank you all for your advice ;)

    Apart from PDFs made from scanned images, I've got a "weird" PDF...

    At first glance, it seems to be a "true" PDF: for instance, I can highlight text and use the copy command (the document is not encrypted), but when I paste the copied text from the clipboard, the characters are unreadable!

    http://img519.imageshack.us/img519/8576/img3261909jp8.jpg


    And above all, I can't perform a search... o_O

    Any solution?

    Tia
     
  6. GlobalForce

    GlobalForce Regular Poster

    Joined:
    Jun 30, 2004
    Posts:
    3,581
    Location:
    Garden State, USA
    Give this a shot if it's a commonly available file. If it's from a friend, ask how it was generated. If you still can't get it working, search for "making fair use of cut and paste restricted PDF files" and browse the top ten results; it may well be DRM-related.

    S
     
    Last edited: Sep 20, 2008
  7. Bob D

    Bob D Registered Member

    Joined:
    Apr 18, 2005
    Posts:
    1,150
    Location:
    Mass., USA
    I'm not familiar with Foxit, but did you copy the characters as Plain Text, Formatted Text, or Rich Content?
    Again, the PDF may not be a "true" PDF, but a hybrid.
    If the file contains nothing proprietary or personal, maybe you can post it so that members can try working with it.
     