Serial copying from disk images to folder in Bash
(Brought over from SE.)
This is a Bash script that copies files stored inside disk images to a directory, using a defined structure provided via a JSON file. I've included the external programs it requires and the test I used so that you can test it too.
Any comments regarding programming style and improvements are welcome.
Overview
The following is a Bash shell script that copies files stored inside disk images into a directory in the filesystem.
The script takes two parameters:
- The first one is optional and defines a root directory (existing or not) that will contain the files being copied.
- The second one, optional when the first one is given, is a path to a valid JSON-formatted file that describes:
- which disk images will be opened,
- which files inside each disk image will be copied, and
- which path inside the directory root will be used as the destination for the files being copied.
The first parameter defaults to the current directory when not given. The second one defaults to a file named steps.json located in the current directory. If the first parameter is not given, the second one cannot be given either.
Prerequisites
This script requires the following external programs to work correctly:
- The JSON parsing program jq.
- The disk image manipulation utility udisksctl.
To install these dependencies, use one of the following commands:
- Ubuntu
  $ sudo apt install jq udisks2
- Fedora
  $ sudo dnf install jq udisks2
Script
The complete script is below. It can be marked as executable to avoid having to prepend bash to its execution command. There are no restrictions on the directory in which this script may reside. For the purpose of the test below, the directory where it resides has read/write permissions.
imgdisk-copy.sh
#!/bin/bash
# Copying files contained inside disk images via JSON recipe.
# Aura Lesse Programmer
# December 12th, 2018
# Is a string contained in another? Return 0 if so; 1 if not.
# By fjarlq, from https://stackoverflow.com/a/8811800/5397930
contains() {
    string="$1"
    substring="$2"
    if test "${string#*$substring}" != "$string"; then
        return 0
    else
        return 1
    fi
}
# Obtain the absolute path of a given directory.
# By dogbane, from https://stackoverflow.com/a/3915420
abspath() {
    dir="$1"
    echo "$(cd "$(dirname "$dir")"; pwd -P)/$(basename "$dir")"
}
# The main script starts here.
# If no first parameter is given, assume current directory.
if [ -z "$1" ]; then
    DESTROOT="."
else
    # Omit any trailing slash
    DESTROOT=$(abspath "${1%/}")
fi
# If no second parameter is given, assume file "steps.json".
# If no first parameter is given, this can't be either.
if [ -z "$2" ]; then
    CONF="./steps.json"
else
    CONF="$2"
fi
# Create the root directory where the files will be put.
mkdir -p "$DESTROOT"
# How many disks will be processed?
LIMIT=$(cat "$CONF" | jq -r length)
i=0
while [ "$i" -lt "$LIMIT" ]; do
    # For each disk, get its file name.
    DISK=$(cat "$CONF" | jq -r .["$i"].disk)
    echo "$DISK"
    # Setup a loop device for the disk and get its name.
    RES=$(udisksctl loop-setup -f "$DISK")
    LOOP=$(echo "$RES" | cut -f5 -d' ' | head -c -2)
    # Using the loop device obtained, mount the disk.
    # Obtain the mount root directory afterwards.
    RES=$(udisksctl mount -b "$LOOP")
    SRCDIR=$(echo "$RES" | sed -nE 's|.*at (.*)\.|\1|p')
    # How many file sets will be copied?
    NOITEMS=$(cat "$CONF" | jq -r ".["$i"].files | length")
    j=0
    while [ "$j" -lt "$NOITEMS" ]; do
        # For each file set, obtain which files will be copied and where.
        FSRC=$(cat "$CONF" | jq -r .["$i"].files["$j"].src)
        FDEST=$(cat "$CONF" | jq -r .["$i"].files["$j"].dest)
        # Make the destination directory.
        mkdir -p "$DESTROOT"/"$FDEST"
        echo " ""$FSRC"
        if contains "$FSRC" "\*"; then
            # If a wildcard is used in the file set, copy by file expansion (option -t).
            pushd "$SRCDIR" > /dev/null
            cp -t "$DESTROOT"/"$FDEST" $FSRC
            popd > /dev/null
        else
            # Else, copy normally.
            cp "$SRCDIR"/"$FSRC" "$DESTROOT"/"$FDEST"
        fi
        j=$(($j + 1))
    done
    # Once all the file sets are copied, unmount the disk
    # and delete its associated loop device.
    udisksctl unmount -b "$LOOP" > /dev/null
    udisksctl loop-delete -b "$LOOP"
    i=$(($i + 1))
done
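The two parsing steps in the loop (extracting the loop device and the mount point from udisksctl's messages) can be tried in isolation. The sketch below feeds them sample messages; the exact wording of those messages is an assumption inferred from what the cut and sed expressions expect, and real udisksctl output may differ between versions.

```shell
#!/bin/bash
# Assumed sample messages (hypothetical device and mount point);
# they are not captured from a real udisksctl run.
setup_msg='Mapped file disk01.img as /dev/loop0.'
mount_msg='Mounted /dev/loop0 at /media/user/DISK01.'

# Fifth space-separated field, minus the trailing "." and newline.
LOOP=$(echo "$setup_msg" | cut -f5 -d' ' | head -c -2)

# Everything between "at " and the final period.
SRCDIR=$(echo "$mount_msg" | sed -nE 's|.*at (.*)\.|\1|p')

echo "$LOOP"
echo "$SRCDIR"
```

Note that the field-based extraction counts spaces, so under these assumptions an image file name containing a space would shift the device name out of the fifth field.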
Test set
This script was tested with the following disk set: Microsoft C Compiler 4.0. The first 3 .img disks inside the ZIP (disk01.img, disk02.img, and disk03.img) should be placed in the same directory as the script.
The corresponding JSON recipe used for the test is below. It is also placed in the same directory as the script for convenience.
steps.json
[
    {
        "disk": "disk01.img",
        "files": [
            { "src": "*", "dest": "bin" }
        ]
    },
    {
        "disk": "disk02.img",
        "files": [
            { "src": "*.EXE", "dest": "bin" }
        ]
    },
    {
        "disk": "disk03.img",
        "files": [
            { "src": "LINK.EXE", "dest": "bin" },
            { "src": "*.H", "dest": "include" },
            { "src": "SYS/*.H", "dest": "include/sys" },
            { "src": "SLIBC.LIB", "dest": "lib" },
            { "src": "SLIBFP.LIB", "dest": "lib" },
            { "src": "EM.LIB", "dest": "lib" },
            { "src": "LIBH.LIB", "dest": "lib" }
        ]
    }
]
The test is performed by opening a terminal and executing the following command:
$ ./imgdisk-copy.sh testing/
The command will output each disk image name as it is mounted, and under it the names of the files being copied (unexpanded), as follows:
disk01.img
*
disk02.img
*.EXE
disk03.img
LINK.EXE
*.H
SYS/*.H
SLIBC.LIB
SLIBFP.LIB
EM.LIB
LIBH.LIB
The result will be a directory named testing, located next to the script, with the following structure:
testing/
├── bin
│   ├── C1.EXE
│   ├── C2.EXE
│   ├── C3.EXE
│   ├── CL.EXE
│   ├── CV.EXE
│   ├── EXEMOD.EXE
│   ├── EXEPACK.EXE
│   ├── LIB.EXE
│   ├── LINK.EXE
│   ├── MAKE.EXE
│   ├── MSC.EXE
│   └── SETENV.EXE
├── include
│   ├── sys
│   │   ├── LOCKING.H
│   │   ├── STAT.H
│   │   ├── TIMEB.H
│   │   ├── TYPES.H
│   │   └── UTIME.H
│   ├── ASSERT.H
│   ├── CONIO.H
│   ├── CTYPE.H
│   ├── DIRECT.H
│   ├── DOS.H
│   ├── ERRNO.H
│   ├── FCNTL.H
│   ├── FLOAT.H
│   ├── IO.H
│   ├── LIMITS.H
│   ├── MALLOC.H
│   ├── MATH.H
│   ├── MEMORY.H
│   ├── PROCESS.H
│   ├── SEARCH.H
│   ├── SETJMP.H
│   ├── SHARE.H
│   ├── SIGNAL.H
│   ├── STDARG.H
│   ├── STDDEF.H
│   ├── STDIO.H
│   ├── STDLIB.H
│   ├── STRING.H
│   ├── TIME.H
│   ├── V2TOV3.H
│   └── VARARGS.H
└── lib
    ├── EM.LIB
    ├── LIBH.LIB
    ├── SLIBC.LIB
    └── SLIBFP.LIB
2 answers
The following answer was given by SE user Edward. The original source can be found here.
The other answer gave some really good advice; this is intended as a complementary answer with still more things to think about.
Put default arguments at the top of the script
If someone wanted to change the default arguments, they'd have to hunt through the code to find them. I typically prefer to put them at the top of the script and then only overwrite them if command line arguments are passed. For example:
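#!/bin/bash
# default arguments
TARGET=./target
JSON=steps.json
# Command line args are both optional: TARGET JSON
# (-n is true when the argument is non-empty, so a default is
# only overwritten when an argument was actually passed)
if [[ -n "$1" ]] ; then
    TARGET="$1"
fi
if [[ -n "$2" ]] ; then
    JSON="$2"
fi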
Use install to copy files
DOS archives may or may not have proper permission bits set and may need to have a complex path created before copying the file. We can manage all of this easily with install, which is also a basic part of every Linux installation:
echo "installing $src on $disk to $dst"
install -p --mode=664 -D "$TMPDIR"/$src -t "$TARGET"/$dst/
With the -p argument we preserve the original timestamp. The --mode argument explicitly sets the mode for each file (you could, of course, change this to something else if you cared to). The combination of -D and -t tells install to create the destination directory if it doesn't already exist.
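The behaviour described above can be checked without any disk image. The following is a minimal sketch assuming GNU coreutils (for install's -D/-t combination and for stat -c); the temporary directories and the sample file are placeholders, not paths from the answer.

```shell
#!/bin/bash
# Throwaway source and destination locations (placeholders).
src_dir=$(mktemp -d)
dst_root=$(mktemp -d)
echo "sample" > "$src_dir/LINK.EXE"

# -p keeps the timestamp, --mode sets the permissions, and -D together
# with -t creates the missing destination directory before copying.
install -p --mode=664 -D "$src_dir/LINK.EXE" -t "$dst_root/bin/"

ls "$dst_root/bin"
stat -c '%a' "$dst_root/bin/LINK.EXE"
```

Because the mode is set explicitly, the resulting permissions do not depend on the caller's umask.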
Do more with jq
Since you're already requiring a dependency on jq, it makes sense to use its capabilities more thoroughly. As you know, it has the ability to apply one or more filters sequentially to the result of the previous step. We can use this to great advantage and only call jq once like this:
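# use jq to create disk, src, dst triplets to feed to inst
# (the loop reads from a process substitution rather than a pipe, so it
# runs in the current shell and the variables set by inst stay visible
# to the rest of the script; read -r keeps backslashes intact)
while read -r line
do inst ${line}
done < <(jq -r -c '.[] | {disk, file: .files[]} | {disk, src: .file.src, dst: .file.dest} | [.disk,.src,.dst] |@sh ' "$JSON")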
As you can see from the comment, this extracts (disk, src, dst) triplets.
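As a possible variation (not what the answer uses), the triplets could be emitted tab-separated with jq's @tsv filter and split directly by read, which avoids the extra quoting layer. The sketch below shows only the reading side: emit_triplets is a hypothetical stand-in that prints what such a jq call is assumed to produce, so the example runs without jq, and list_copies is likewise a name made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for a call such as:
#   jq -r '.[] | .disk as $d | .files[] | [$d, .src, .dest] | @tsv' "$JSON"
emit_triplets() {
    printf '%s\t%s\t%s\n' disk01.img '*' bin disk03.img 'SYS/*.H' include/sys
}

# Split each line on tabs into three variables; quoting the variables
# keeps the wildcards unexpanded while they are printed.
list_copies() {
    local tab disk src dst
    tab=$(printf '\t')
    emit_triplets | while IFS=$tab read -r disk src dst; do
        printf 'disk=%s src=%s dst=%s\n' "$disk" "$src" "$dst"
    done
}

list_copies
```

In a real script the printf inside the loop would be replaced by the call that mounts the disk and copies the files.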
Create a function to do the work
Given the above advice, what we need is the inst routine to actually do the work. Here's one way to write that:
# working variables
TMPDIR=
LASTDISK=
# given disk, src, dst triplet
# mount the disk in a temporary dir
# (if not already mounted)
# and install from src to dst
# src may contain wildcards
function inst () {
    disk=$(eval echo $1)
    src=$(eval echo $2)
    dst=$(eval echo $3)
    if [[ "$disk" != "$LASTDISK" ]] ; then
        cleanup
        TMPDIR="$(mktemp -d)"
        echo "mounting $disk on $TMPDIR"
        if sudo mount -r "$disk" "$TMPDIR" ; then
            LASTDISK="$disk"
        else
            echo "Failed to mount $disk"
            sudo rmdir "$TMPDIR"
            # nothing is mounted, so reset the state and skip the copy
            TMPDIR=
            LASTDISK=
            return 1
        fi
    fi
    echo "installing $src on $disk to $dst"
    # $src and $dst are left unquoted so that wildcards in src expand
    install -p --mode=664 -D "$TMPDIR"/$src -t "$TARGET"/$dst/
}
Notice that I've used a number of bash-isms here that make this non-portable, but since you've explicitly called out bash, I'm assuming this is OK. I've also chosen to use sudo mount and sudo umount instead of udisksctl. Either could work, of course; it's a matter of preference as to which is used. On one hand, mount is always available but on the other, it requires sudo privileges. Most of this will be self-explanatory, except for cleanup, which is described in the next suggestion.
Use a cleanup function
It's annoying when a script fails for some reason and then leaves temporary files or other junk lying around as a result. One technique that's handy for this is to use bash's trap builtin.
# unmount and remove bind dir TMPDIR if
# TMPDIR is not empty
function cleanup {
    if [[ ! -z "$TMPDIR" ]] ; then
        sudo umount "$TMPDIR"
        sudo rm -rf "$TMPDIR"
    fi
}
# rest of script ...
trap cleanup EXIT
This tells bash that no matter how we get to the exit (either normally or via some fatal error) it needs to invoke the specified function, which I typically name cleanup for obvious reasons.
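The effect of an EXIT trap can be observed without mounting anything. In the sketch below, run_job and workdir are hypothetical names: the job creates a scratch directory, registers a cleanup function, and then fails on purpose, and the caller checks whether the directory survived.

```shell
#!/bin/bash
# The function body is a subshell, so the trap and the exit below
# affect only the job, not the calling script.
run_job() (
    workdir=$(mktemp -d)
    cleanup() {
        if [ -n "$workdir" ]; then
            rm -rf "$workdir"
        fi
    }
    trap cleanup EXIT
    echo "$workdir"     # report the scratch directory to the caller
    exit 1              # simulate a fatal error part-way through
)

dir=$(run_job) || echo "job failed with status $?"
if [ -d "$dir" ]; then
    echo "left behind: $dir"
else
    echo "cleaned up: $dir"
fi
```

The same pattern applies when the cleanup step is an unmount rather than a plain directory removal.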
The following answer was given by SE user Oh My Goodness. The original source can be found here.
- Instead of cat "$x" | command or echo "$x" | command, use command <$x (vs cat) or command <<<$x (vs echo): it saves a fork, and a here-string is not word-split, so it needs no quotes (a file name containing spaces used with < still does).
- Instead of if [ x -lt y ] use if [[ x -lt y ]]: [[ is a bash keyword (help [[ for details), so its operands are not word-split, and it adds some functionality.
- Functions return their last exit value already so contains() can be shortened to the following (whether you prefer this is up to you):

      contains() {
          test "${1#*$2}" != "$1"
      }

- Use bash defaulting mechanism instead of if [[ -z, as in CONF=${2:-./steps.json}.
- Use for ((i=0; i<$LIMIT; i++)) instead of i=0; while ... .
- Test the exit values of things that shouldn't fail, as in mkdir -p "$DESTROOT" || exit 1. Any invocation of cd or pushd should be checked for success, always! A general purpose DIE() function can replace the naked exit and take an error message as an argument. If nothing should fail, set -e or trap DIE ERR (the first argument is a function name) does this globally.
- Constructions like jq -r ".["$i"].files | length") and echo " ""$FSRC" are kind of weird and the inner double quotes probably should be removed.
- In a language where every variable is a global, it's a good habit to use fewer variables. For example, RES=$(foo); LOOP=$( echo "$RES" | ...) can just be LOOP=$( foo | ...).
- Your get-conf pattern should be in a function like the following, where conf holds the contents of the JSON file (read once, as described in the last item):

      get_conf() {
          jq -r "$1" <<<"$conf"
      }

- Pruning code paths is important in an interpreted language. Since the wildcard copy method works for regular copies too, just use that one unconditionally and remove if contains ... "\*".
- You don't need to escape wildcards like * in double quotes. When in doubt about what will be interpolated, use single quotes. Quoting in bash can be very complex and take a long time to learn; an advanced understanding of it will help to avoid common bugs.
- Since you are using commands that aren't standard, it's a good idea to set PATH in the script, or as an optional config directive, and to check that they're there before you begin, as in the following example:

      require() {
          for cmd in "$@"; do
              type "$cmd" >/dev/null || exit 1
          done
      }
      # Code here...
      require jq udisksctl

- Read CONF just once, into a variable: conf=$(<$CONF), and query that. Then you can edit the config while the script runs.
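Several of the points above (parameter defaulting, a general-purpose error function, and a dependency check) can be combined in a short preamble. This is only a sketch: die and the command list passed to require are illustrative choices, and command -v is used here in place of type as the lookup.

```shell
#!/bin/bash
# die: print a message to stderr and exit with a failure status.
die() {
    echo "error: $*" >&2
    exit 1
}

# require: abort unless every named command can be found in PATH.
require() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null || die "missing command: $cmd"
    done
}

# Defaults via parameter expansion, mirroring the question's defaults.
DESTROOT=${1:-.}
CONF=${2:-./steps.json}

# The real script would list jq and udisksctl here; common utilities
# are checked instead so the sketch runs on a machine without them.
require mkdir cp
echo "root=$DESTROOT conf=$CONF"
```

Because die calls exit, a failed check stops the whole script before any disk is touched.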