Save this snippet as a script named greppdf somewhere in your PATH, and make it executable:
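    #!/bin/bash
    # Search every PDF in the current directory, one page at a time.
    for PDF in *.pdf
    do
        # Page count, read from the "Pages:" line of pdfinfo's output
        NB_PAGES=$(pdfinfo "$PDF" | grep "^Pages:" | cut -f 2 -d ":")
        for (( PAGE=1; PAGE<=$NB_PAGES; PAGE++ ))
        do
            # Extract a single page to stdout, grep it, and prefix each match
            # with the file name and page number. "$@" is quoted so that
            # patterns containing spaces reach grep intact.
            pdftotext -f "$PAGE" -l "$PAGE" "$PDF" - | grep -i "$@" |
                while IFS= read -r line; do echo "$PDF:$PAGE:$line"; done
        done
    done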
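The page-count step is the least obvious part of the script, so here is a minimal sketch of it in isolation. The helper name `page_count` and the sample text are made up for illustration; the sample only imitates the shape of pdfinfo output, where the page count appears on a line starting with `Pages:`.

```shell
# Hypothetical helper: read pdfinfo-style text on stdin, print the page count.
page_count() {
    # Keep the "Pages:" line, take the field after the colon, drop whitespace.
    grep "^Pages:" | cut -d ":" -f 2 | tr -d '[:space:]'
}

# Made-up sample shaped like pdfinfo output (not taken from a real file)
sample='Title:          Example
Pages:          12
Page size:      595 x 842 pts'

count=$(printf '%s\n' "$sample" | page_count)
echo "$count"
```

Anchoring the pattern with `^Pages:` keeps the match on the page-count line only, even if another field happens to contain the word "Pages".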
Now you can search all the PDF files in the current directory with the command below. Regular grep options can be passed as well:
    greppdf "programming"
This prints the file name and the page (slide) number for each line where the string “programming” is found, followed by the matching line itself.
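Because each match is printed as `file:page:line`, the output is easy to post-process. As a rough sketch, the hypothetical helper below (not part of greppdf) counts matches per file. The sample records are invented, and the helper assumes file names contain no `:` character.

```shell
# Hypothetical helper: read "file:page:line" records on stdin and print
# "file: N" for each file, where N is the number of matching lines.
matches_per_file() {
    awk -F: '{ n[$1]++ } END { for (f in n) print f ": " n[f] }' | sort
}

# Made-up sample records in the format greppdf prints
sample='intro.pdf:3:Programming basics
intro.pdf:7:functional programming
slides.pdf:12:Why programming matters'

summary=$(printf '%s\n' "$sample" | matches_per_file)
echo "$summary"
```

With the real script this would be used as `greppdf "programming" | matches_per_file`.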
Requirements: the poppler-utils package, which provides pdfinfo and pdftotext, needs to be installed on your system.
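One way to check the requirement before running a search is to look for the two commands in your PATH. This is only a sketch; `missing_tools` is a hypothetical helper name, and the package name may differ between distributions.

```shell
# Hypothetical helper: print each named command that is not found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools pdfinfo pdftotext)
if [ -n "$missing" ]; then
    # Report on stderr so the message does not mix with search results
    echo "Not found in PATH (is poppler-utils installed?):" >&2
    echo "$missing" >&2
fi
```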