How do you search for strings within a zip archive?
I'm tinkering with EPUB3 files, and I wanted to be able to find certain strings within .epub files. I had a look around and immediately found zgrep and family. The trouble is that zgrep expects a single compressed file (gzip, typically), not a zip archive holding many files.
So, without further ado, I wrote the following script, which I called, naturally, zipgrep. It uses grep and unzip, which it assumes to be available on the PATH. Not wanting to have to pick through the argument list, I decided to mark the end of arguments to grep with the traditional '--', after which I could stack up as many zip file names as I liked.
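To make the '--' convention concrete, here is a minimal sketch of how a command line divides at the marker. The helper name split_at_marker and the sample file names are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: report how a command line is divided at the first '--'.
split_at_marker() {
    seen=0
    for arg in "$@"; do
        if [ "$seen" -eq 0 ] && [ "$arg" = "--" ]; then
            seen=1                          # marker found: the rest are archives
        elif [ "$seen" -eq 0 ]; then
            printf 'grep arg: %s\n' "$arg"  # flags and pattern, passed to grep
        else
            printf 'archive:  %s\n' "$arg"  # zip files to search
        fi
    done
    # Succeed only if the marker was present.
    [ "$seen" -eq 1 ]
}

split_at_marker -i -n 'nav.*toc' -- book1.epub book2.epub
```

Everything before the marker is treated as grep's business and everything after it as an archive name, so neither side has to be parsed or guessed at.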
It was a case of not enough time in the library, or on Google in this case. As soon as I had it working, I discovered the original zipgrep.
All was not lost. The original zipgrep handles a single archive using egrep and unzip, with the nice wrinkle of optional sets of filenames to include in, or exclude from, the search. However, I liked the ability to search multiple zip archives, and grep can be converted to any of its relatives with an appropriate flag, so I decided to hang on to son of zipgrep. All I needed was a new name: hence zargrep.
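As a quick illustration of converting grep into its relatives, the sketch below runs plain grep with -E (the egrep behaviour) and -F (the fgrep behaviour) on made-up sample input.

```shell
# Sample input: three lines, fed to grep on stdin.
sample=$(printf 'chapter1\nchapter22\nnotes\n')

# -E: extended regular expressions, as egrep would use ('+' means "one or more").
ere=$(printf '%s\n' "$sample" | grep -E '^chapter[0-9]+$')

# -F: fixed strings, as fgrep would use ('.' matches only a literal dot).
fixed=$(printf 'a.b\naxb\n' | grep -F 'a.b')

printf '%s\n' "$ere" "$fixed"
```

Since the flags before '--' are handed to grep untouched, the same switches should carry over to zargrep.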
You can retrieve it here. It has been tested on OS X against multiple EPUB3 files.
Because they are zip files, this should also work for jar files, but I haven't yet tried it.
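One way to check that an unfamiliar file is a zip container before searching it is to look for the "PK" signature that zip files normally begin with. This is only a rough sketch: the helper name looks_like_zip is hypothetical, the demo files are throwaway fakes rather than real archives, and an archive with data prepended to it (a self-extracting one, say) would not pass the check.

```shell
# Hypothetical helper: succeed if the file starts with the two-byte "PK"
# signature that zip-based containers normally begin with.
looks_like_zip() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Demonstration on two throwaway files (not real archives).
demo=$(mktemp -d) || exit 1
printf 'PK\003\004rest-of-header' > "$demo/fake.jar"
printf 'plain text\n' > "$demo/notes.txt"

for f in "$demo/fake.jar" "$demo/notes.txt"; do
    if looks_like_zip "$f"; then
        echo "${f##*/}: zip signature found"
    else
        echo "${f##*/}: no zip signature"
    fi
done
rm -rf "$demo"
```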
#!/bin/bash
# Greps files in one or more zip archives.
# Same argument sequence as for grep, except that
# zip file arguments must be separated from flags and
# patterns by --. If no -- is found in the argument list, returns error.
# Uses bash features (arrays, substring expansion), hence the bash shebang.
usage() {
    echo "Usage:" >&2
    echo "$0 <grep flags> <pattern> -- zipfiles ..." >&2
}
# Collect grep's flags and pattern until the '--' marker is reached.
declare -a args
filesmarked=
while [ $# -gt 0 ]; do
    if [ "$1" = "--" ]; then
        filesmarked=1
        shift
        break
    fi
    args+=("$1")
    shift
done
if [ -z "$filesmarked" ]; then
    echo "No '--' marker for zipfiles args." >&2
    usage
    exit 1
fi
# Unpack into a private scratch directory that is removed on exit.
tmpfile=$(mktemp -d "${TMPDIR:-/tmp}/zipgrep.XXXXXX") || exit 1
trap 'rm -rf "$tmpfile"' EXIT
wd=$(pwd)
cd "$tmpfile" || exit 1
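while [ $# -gt 0 ]; do
    zipfile="$1"
    zfile="$1"    # name as given, used when reporting matches
    shift
    # If zipfile is not absolute, set it relative to wd
    if [ "${zipfile:0:1}" != / ]; then
        zipfile="$wd/${zipfile}"
    fi
    unzip "$zipfile" >/dev/null
    # Search every extracted file with the caller's grep flags and pattern.
    # Using -exec ... + means grep is not run at all if nothing was extracted.
    result=$(find . -type f -exec grep "${args[@]}" {} +)
    if [ -n "$result" ]; then
        echo "zip: $zfile"
        echo "$result"
    fi
    # Empty the scratch directory before the next archive.
    cd "$wd" || exit 1
    rm -rf "$tmpfile"
    mkdir "$tmpfile"
    cd "$tmpfile" || exit 1
done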