Serial copying from disk images to folder in Bash

(Brought over from SE.)

This is a Bash script that copies files stored inside disk images to a directory, using a defined structure provided via a JSON file. I've included the external programs it requires and the test I used so that you can test it too.

Any comments regarding programming style and improvements are welcome.

Overview

The following is a Bash shell script that copies files stored inside disk images into a directory in the filesystem.

The script takes two parameters:

The first one is optional and defines a root directory (existing or not) that will contain the files being copied.
The second one is also optional, but it can only be supplied when the first one is given. It is a path to a valid JSON-formatted file that describes:
1. which disk images will be opened,
2. which files inside each disk image will be copied, and
3. which path inside the directory root will be used as the destination for the files being copied.

The first parameter defaults to the current directory when not given. The second one defaults to a file named steps.json located in the current directory. If the first parameter is not given, the second one can't be either.
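
As a rough sketch of how these defaults resolve, the helper below applies the same fallbacks with the `${var:-default}` expansion. The name `resolve_args` is hypothetical and not part of the script; the sketch only mirrors the rules described above and skips the absolute-path conversion that the real script performs.

```bash
#!/bin/bash
# Hypothetical helper: print the destination root and recipe path that a
# given argument list would resolve to.
resolve_args() {
    # ${1:-.} falls back to "." when $1 is unset or empty.
    local destroot="${1:-.}"
    # Omit a single trailing slash, but leave a bare "/" alone.
    if [ "$destroot" != "/" ]; then
        destroot="${destroot%/}"
    fi
    local conf="${2:-./steps.json}"
    printf '%s %s\n' "$destroot" "$conf"
}

resolve_args                      # prints: . ./steps.json
resolve_args testing/             # prints: testing ./steps.json
resolve_args testing/ disks.json  # prints: testing disks.json
```

Because the parameters are positional, there is no way to pass only the recipe path; a caller who wants a custom recipe in the current directory would pass `.` as the first argument.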

Prerequisites

This script requires the following external programs to work correctly:

The JSON parsing program jq.
The disk image manipulation utility udisksctl.

To install these dependencies, use one of the following commands:

Ubuntu
```
$ sudo apt install jq udisks2
```
Fedora
```
$ sudo dnf install jq udisks2
```
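
A script can also verify these prerequisites itself before doing any work. The following is a minimal sketch that relies on the standard `command -v` lookup; `missing_commands` is an illustrative name, not something the script defines.

```bash
#!/bin/bash
# Print the name of every listed command that cannot be found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

missing=$(missing_commands jq udisksctl)
if [ -n "$missing" ]; then
    echo "The following required commands were not found:" >&2
    echo "$missing" >&2
fi
```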

Script

The complete script is below. It can be marked as executable to avoid having to prepend bash to its execution command. There are no restrictions on which directory the script may reside in. For the purpose of the test below, the directory where it resides has read/write permissions.

`imgdisk-copy.sh`

#!/bin/bash

# Copying files contained inside disk images via JSON recipe.
# Aura Lesse Programmer
# December 12th, 2018

# Is a string contained in another? Return 0 if so; 1 if not.
# By fjarlq, from https://stackoverflow.com/a/8811800/5397930
contains() {
    string="$1"
    substring="$2"

    if test "${string#*$substring}" != "$string"; then
        return 0
    else
        return 1
    fi
}

# Obtain the absolute path of a given directory.
# By dogbane, from https://stackoverflow.com/a/3915420
abspath() {
    dir="$1"
    echo "$(cd "$(dirname "$dir")"; pwd -P)/$(basename "$dir")"
}

# The main script starts here.

# If no first parameter is given, assume current directory.
if [ -z "$1" ]; then
    DESTROOT="."
else
    # Omit any trailing slash
    DESTROOT=$(abspath "${1%/}")
fi

# If no second parameter is given, assume file "steps.json".
# If no first parameter is given, this can't be either.
if [ -z "$2" ]; then
    CONF="./steps.json"
else
    CONF="$2"
fi

# Create the root directory where the files will be put.
mkdir -p "$DESTROOT"

# How many disks will be processed?
LIMIT=$(cat "$CONF" | jq -r length)

i=0
while [ "$i" -lt "$LIMIT" ]; do
    # For each disk, get its file name.
    DISK=$(cat "$CONF" | jq -r .["$i"].disk)

    echo "$DISK"

    # Setup a loop device for the disk and get its name.
    RES=$(udisksctl loop-setup -f "$DISK")
    LOOP=$(echo "$RES" | cut -f5 -d' ' | head -c -2)

    # Using the loop device obtained, mount the disk.
    # Obtain the mount root directory afterwards.
    RES=$(udisksctl mount -b "$LOOP")
    SRCDIR=$(echo "$RES" | sed -nE 's|.*at (.*)\.|\1|p')

    # How many file sets will be copied?
    NOITEMS=$(cat "$CONF" | jq -r ".["$i"].files | length")
    j=0
    while [ "$j" -lt "$NOITEMS" ]; do
        # For each file set, obtain which files will be copied and where.
        FSRC=$(cat "$CONF" | jq -r .["$i"].files["$j"].src)
        FDEST=$(cat "$CONF" | jq -r .["$i"].files["$j"].dest)

        # Make the destination directory.
        mkdir -p "$DESTROOT"/"$FDEST"

        echo "    ""$FSRC"

        if contains "$FSRC" "\*"; then
            # If a wildcard is used in the file set, copy by file expansion (option -t).
            pushd "$SRCDIR" > /dev/null
            cp -t "$DESTROOT"/"$FDEST" $FSRC
            popd > /dev/null
        else
            # Else, copy normally.
            cp "$SRCDIR"/"$FSRC" "$DESTROOT"/"$FDEST"
        fi

        j=$(($j + 1))
    done

    # Once all the file sets are copied, unmount the disk
    # and delete its associated loop device.
    udisksctl unmount -b "$LOOP" > /dev/null
    udisksctl loop-delete -b "$LOOP"

    i=$(($i + 1))
done
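
The two parsing steps in the script can be tried without udisksctl by feeding them sample messages. The messages below are assumptions inferred from what the script's `cut` and `sed` calls expect; they were not captured from a real udisksctl run, and the exact wording may differ between udisks versions.

```bash
#!/bin/bash
# Sample messages in the shape the script's parsing expects (assumed).
loop_msg='Mapped file disk01.img as /dev/loop0.'
mount_msg='Mounted /dev/loop0 at /media/user/DISK01.'

# Field 5 is "/dev/loop0."; head -c -2 (GNU coreutils) drops the final dot
# and the trailing newline.
loop=$(echo "$loop_msg" | cut -f5 -d' ' | head -c -2)

# Keep whatever sits between "at " and the final dot.
srcdir=$(echo "$mount_msg" | sed -nE 's|.*at (.*)\.|\1|p')

printf 'loop device: %s\nmount point: %s\n' "$loop" "$srcdir"
```

Counting space-separated fields means that an image path containing a space would shift the device name out of field 5, so this approach relies on image names without spaces.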

Test set

This script was tested with the following disk set: Microsoft C Compiler 4.0. The first 3 .img disks inside the ZIP (disk01.img, disk02.img, and disk03.img) should be placed in the same directory as the script.

The corresponding JSON recipe used for the test is below. It is also placed in the same directory as the script for convenience.

`steps.json`

[
    {
        "disk": "disk01.img",
        "files": [
            { "src": "*", "dest": "bin" }
        ]
    },
    {
        "disk": "disk02.img",
        "files": [
            { "src": "*.EXE", "dest": "bin" }
        ]
    },
    {
        "disk": "disk03.img",
        "files": [
            { "src": "LINK.EXE", "dest": "bin" },
            { "src": "*.H", "dest": "include" },
            { "src": "SYS/*.H", "dest": "include/sys" },
            { "src": "SLIBC.LIB", "dest": "lib" },
            { "src": "SLIBFP.LIB", "dest": "lib" },
            { "src": "EM.LIB", "dest": "lib" },
            { "src": "LIBH.LIB", "dest": "lib" }
        ]
    }
]
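
Since a malformed recipe would only show up halfway through a run, it may be worth checking its shape first. The sketch below is one possible check, assuming jq 1.5 or later; `recipe_is_valid` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if every entry has a "disk" key and a
# "files" array whose items all carry both "src" and "dest".
recipe_is_valid() {
    jq -e 'map(has("disk") and (.files | map(has("src") and has("dest")) | all)) | all' \
        "$1" >/dev/null 2>&1
}

# Only attempt the check when jq is actually installed.
if command -v jq >/dev/null 2>&1; then
    if recipe_is_valid steps.json; then
        echo "steps.json looks well-formed"
    else
        echo "steps.json is missing, unreadable, or malformed" >&2
    fi
fi
```

With `-e`, jq exits with a non-zero status when the final result is false, and it also fails when an entry has no `files` array at all, so both cases end up in the error branch.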

The test is performed by opening a terminal and executing the following command:

$ ./imgdisk-copy.sh testing/

The command will output each disk image name as it is mounted, and under it the names of the files being copied (unexpanded), as follows:

disk01.img
    *
disk02.img
    *.EXE
disk03.img
    LINK.EXE
    *.H
    SYS/*.H
    SLIBC.LIB
    SLIBFP.LIB
    EM.LIB
    LIBH.LIB

The result will be a directory named testing, next to the script, with the following structure:

testing/
├── bin
│   ├── C1.EXE
│   ├── C2.EXE
│   ├── C3.EXE
│   ├── CL.EXE
│   ├── CV.EXE
│   ├── EXEMOD.EXE
│   ├── EXEPACK.EXE
│   ├── LIB.EXE
│   ├── LINK.EXE
│   ├── MAKE.EXE
│   ├── MSC.EXE
│   └── SETENV.EXE
├── include
│   ├── sys
│   │   ├── LOCKING.H
│   │   ├── STAT.H
│   │   ├── TIMEB.H
│   │   ├── TYPES.H
│   │   └── UTIME.H
│   ├── ASSERT.H
│   ├── CONIO.H
│   ├── CTYPE.H
│   ├── DIRECT.H
│   ├── DOS.H
│   ├── ERRNO.H
│   ├── FCNTL.H
│   ├── FLOAT.H
│   ├── IO.H
│   ├── LIMITS.H
│   ├── MALLOC.H
│   ├── MATH.H
│   ├── MEMORY.H
│   ├── PROCESS.H
│   ├── SEARCH.H
│   ├── SETJMP.H
│   ├── SHARE.H
│   ├── SIGNAL.H
│   ├── STDARG.H
│   ├── STDDEF.H
│   ├── STDIO.H
│   ├── STDLIB.H
│   ├── STRING.H
│   ├── TIME.H
│   ├── V2TOV3.H
│   └── VARARGS.H
└── lib
    ├── EM.LIB
    ├── LIBH.LIB
    ├── SLIBC.LIB
    └── SLIBFP.LIB

posted 3 months ago

CC BY-NC-SA 4.0

aura-lsprog-86‭

The following answer was given by SE user Edward. The original source can be found here.

The other answer gave some really good advice; this is intended as a complementary answer with still more things to think about.

Put default arguments at the top of the script

If someone wanted to change the default arguments, they'd have to hunt through the code to find them. I typically prefer to put them at the top of the script and then only overwrite them if command line arguments are passed. For example:

#!/bin/bash

# default arguments
TARGET=./target 
JSON=steps.json

# Both command line args are optional: TARGET JSON
# Override a default only when the argument is non-empty.
if [[ -n "$1" ]] ; then
    TARGET="$1"
fi
if [[ -n "$2" ]] ; then
    JSON="$2"
fi

Use `install` to copy files

DOS archives may or may not have proper permission bits set, and a complex path may need to be created before copying a file. We can manage all of this easily with install, which is part of GNU coreutils and so is available on practically every Linux installation:

echo "installing $src on $disk to $dst"
install -p --mode=664 -D "$TMPDIR"/$src -t "$TARGET"/$dst/

With the -p argument we preserve the original timestamp. The mode argument explicitly sets the mode for each file (you could, of course, change this to something else if you cared to). The combination of -D and -t tells install to create the destination directory if it doesn't already exist.
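
To see these options in isolation, the sketch below installs one file into a target directory that does not exist yet. It assumes GNU coreutils (for `install -D -t` and `stat -c`); the scratch directory and the file name are only stand-ins for a mounted disk image.

```bash
#!/bin/bash
# Scratch area standing in for a mounted disk image and a target root.
work=$(mktemp -d)
mkdir -p "$work/src"
printf 'demo\n' > "$work/src/LINK.EXE"

# -D together with -t creates the missing target directory;
# --mode sets the permissions and -p keeps the source timestamp.
install -p --mode=664 -D -t "$work/target/bin" "$work/src/LINK.EXE"

# Read back the permission bits of the installed copy, then tidy up.
mode=$(stat -c %a "$work/target/bin/LINK.EXE")
echo "installed LINK.EXE with mode $mode"
rm -rf "$work"
```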

Do more with `jq`

Since you're already requiring a dependency on jq, it makes sense to use its capabilities more thoroughly. As you know, it has the ability to apply one or more filters sequentially to the result of the previous step. We can use this to great advantage and only call jq once like this:

# use jq to create disk, src, dst triplets to feed to inst
jq -r -c '.[] | {disk, file: .files[]} | {disk, src: .file.src, dst: .file.dest} | [.disk,.src,.dst] | @sh' "$JSON" | while read -r line; do
    inst ${line}
done

As you can see from the comment, this extracts (disk, src, dst) triplets.
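
A possible variation, shown here only as a sketch, is to have jq emit tab-separated fields with `@tsv` and let `read` split them, which avoids the need to strip shell quoting afterwards. It assumes that none of the fields contain tabs or newlines and that none are empty; `list_triplets` is an illustrative name.

```bash
#!/bin/bash
# Emit one tab-separated "disk, src, dest" line per file set in the recipe
# read from standard input.
list_triplets() {
    jq -r '.[] | .disk as $disk | .files[] | [$disk, .src, .dest] | @tsv'
}

recipe='[{"disk": "disk03.img", "files": [{"src": "SYS/*.H", "dest": "include/sys"}]}]'
tab=$(printf '\t')

# Only run the demonstration when jq is installed.
if command -v jq >/dev/null 2>&1; then
    printf '%s\n' "$recipe" | list_triplets |
        while IFS="$tab" read -r disk src dst; do
            # Quoted expansions keep the wildcard in src unexpanded here.
            printf 'disk=%s src=%s dst=%s\n' "$disk" "$src" "$dst"
        done
fi
```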

Create a function to do the work

Given the above advice, what we need is the inst routine to actually do the work. Here's one way to write that:

# working variables
TMPDIR=
LASTDISK=

# given disk, src, dst triplet
# mount the disk in a temporary dir
# (if not already mounted)
# and install from src to dst
# src may contain wildcards
function inst () {
    disk=$(eval echo $1)
    src=$(eval echo $2)
    dst=$(eval echo $3)
    if [[ "$disk" != "$LASTDISK" ]] ; then 
        cleanup
        TMPDIR="$(mktemp -d)"
        echo "mounting $disk on $TMPDIR"
        if sudo mount -r "$disk" "$TMPDIR" ; then 
            LASTDISK="$disk"
        else
            echo "Failed to mount $disk" >&2
            sudo rmdir "$TMPDIR"
            # Forget the directory and skip the copy: nothing is mounted.
            TMPDIR=
            return 1
        fi
    fi
    echo "installing $src on $disk to $dst"
    install -p --mode=664 -D "$TMPDIR"/$src -t "$TARGET"/$dst/
}

Notice that I've used a number of bash-isms here that make this non-portable, but since you've explicitly called out bash, I'm assuming this is OK. I've also chosen to use sudo mount and sudo umount instead of udisksctl. Either could work, of course; it's a matter of preference as to which is used. On one hand, mount is always available but on the other, it requires sudo privileges. Most of this will be self-explanatory, except for cleanup which is described in the next suggestion.

Use a cleanup function

It's annoying when a script fails for some reason and then leaves temporary files or other junk lying around as a result. One technique that's handy for this is to use bash's trap builtin with the EXIT pseudo-signal.

# unmount and remove the temporary mount dir TMPDIR
# if TMPDIR is not empty
function cleanup {
    if [[ ! -z "$TMPDIR" ]] ; then
        sudo umount "$TMPDIR"
        sudo rm -rf "$TMPDIR"
    fi
}

# rest of script ...

trap cleanup EXIT

This tells bash that no matter how we get to the exit (either normally or via some fatal error) it needs to invoke the specified function, which I typically name cleanup for obvious reasons.
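
The behaviour is easy to observe in a small experiment that needs neither sudo nor a disk image. In this sketch, `with_scratch_dir` is a made-up function that creates a temporary directory, fails on purpose, and relies on its EXIT trap to remove the directory.

```bash
#!/bin/bash
# The body runs in a subshell (note the parentheses), so its EXIT trap fires
# when the function finishes rather than when the calling shell exits.
with_scratch_dir() (
    scratch=$(mktemp -d) || exit 1
    trap 'rm -rf "$scratch"' EXIT

    echo "$scratch"   # report the directory so the caller can inspect it
    false             # simulate a step that fails
    exit 1            # leave early; the trap still removes the directory
)

dir=$(with_scratch_dir) || echo "function failed, as simulated"
[ -d "$dir" ] || echo "scratch directory was removed on exit"
```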

posted 3 months ago

CC BY-NC-SA 4.0

aura-lsprog-86‭

The following answer was given by SE user Oh My Goodness. The original source can be found here.

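Instead of cat "$x" | command or echo "$x" | command, use command <"$x" (vs cat) or command <<<"$x" (vs echo): it saves a fork. Keep the quotes on the file redirection, because bash reports an "ambiguous redirect" when an unquoted expansion splits into several words.
Instead of if [ x -lt y ] use if [[ x -lt y ]]: [[ is a bash keyword (help [[ for details) that performs no word splitting or pathname expansion on its operands and adds some functionality, such as pattern matching. Note that [ is itself a bash builtin, so neither form costs a fork.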
Functions return their last exit value already so contains() can be shortened to the following (whether you prefer this is up to you):
```
contains() {
    test "${1#*$2}" != "$1"
}
```
Use bash's default-value expansion instead of if [[ -z, as in CONF=${2:-./steps.json}.
Use for ((i=0; i<$LIMIT; i++)) instead of i=0; while ....
Test the exit values of things that shouldn't fail, as in mkdir -p "$DESTROOT" || exit 1. Any invocation of cd or pushd should be checked for success, always! A general purpose DIE() function can replace the naked exit and take an error message as an argument. If nothing should fail, set -e or trap DIE ERR (the first argument is a function name) does this globally.
Constructions like jq -r ".["$i"].files | length" and echo "    ""$FSRC" are kind of weird and the inner double quotes probably should be removed.
In a language where every variable is a global, it's a good habit to use fewer variables. For example, RES=$(foo); LOOP=$( echo "$RES" | ...) can just be LOOP=$( foo | ...).
Your get-conf pattern should be in a function like:
```
get_conf() {
    # $conf is assumed to hold the JSON text itself, e.g. conf=$(<"$CONF")
    jq -r "$1" <<<"$conf"
}
```
Pruning code paths is important in an interpreted language. Since the wildcard copy method works for regular copies too, just use that one unconditionally and remove if contains ... "\*".
You don't need to escape wildcards like * in double quotes. When in doubt about what will be interpolated, use single quotes. Quoting in bash can be very complex and take a long time to learn; an advanced understanding of it will help to avoid common bugs.
Since you are using commands that aren't standard, it's a good idea to set PATH in the script, or as an optional config directive, and to check that they're there before you begin, as in the following example:
```
require() {
    for cmd in "$@"; do
        type "$cmd" >/dev/null || exit 1
    done
}

# Code here...

require jq udisksctl
```
Read CONF just once, into a variable: conf=$(<"$CONF"), and query that. Then you can even edit the config file while the script runs without affecting it.
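
Taken together, a few of the points above might look like the following sketch. It is bash-specific on purpose, and the names `die` and `list_disks` are illustrative rather than taken from the original script.

```bash
#!/bin/bash
# Illustrative helpers only; die and list_disks are not from the original.

# Print a message on stderr and stop the script with a failing status.
die() {
    printf 'error: %s\n' "$*" >&2
    exit 1
}

# Succeeds when $2 occurs literally inside $1. The function's status is the
# status of its last command, so no explicit return is needed.
contains() {
    [[ "$1" == *"$2"* ]]
}

# Number each argument, using a C-style loop and a here-string.
list_disks() {
    local disks=("$@") i upper
    for ((i = 0; i < ${#disks[@]}; i++)); do
        # tr reads the here-string on stdin; no separate echo is involved.
        upper=$(tr '[:lower:]' '[:upper:]' <<<"${disks[i]}") || die "tr failed"
        printf '%d: %s\n' "$i" "$upper"
    done
}

list_disks disk01.img disk02.img
contains 'SYS/*.H' '*' && echo "wildcard found"
```

Quoting `"$2"` inside the `[[ ... ]]` pattern makes bash match it literally, which is why a bare `*` can be passed to `contains` without any escaping.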

posted 3 months ago

CC BY-NC-SA 4.0

aura-lsprog-86‭
