20 changes: 15 additions & 5 deletions README.md
@@ -63,22 +63,32 @@ Once GaPTools is setup, to execute it on the included sample study, run the belo
./dbgap-docker.bash -i ./input_files/1000_Genomes_Study/ -o ./output_files/1000_Genomes_Study -m ./input_files/1000_Genomes_Study/metadata.json up
```

GaPTools uses Apache Airflow behind the scenes as the workflow orchestrator to perform all the validation tasks. To view the validation results of the dbGaP validation tool, browse to the following URL:
GaPTools uses [Apache Airflow](https://airflow.apache.org/) behind the scenes as the workflow orchestrator to perform all the validation tasks. To view the validation results of the dbGaP validation tool, browse to the following URL:

```
http://<your_docker_host_ip>:8080
```

If you are running this locally on a workstation, this can often be found at http://localhost:8080.


At the end of the workflow, the output files will be created under the specified output directory.

## Usage

To use GaPTools for your study, modify the above command and pass as input parameters:

__-i__ -- path to the input files for your study
- __`-i path/to/INPUT_DIR`__ -- path to the input files for your study (may also use `--input [...]` on Linux OS's)

- __`-o path/to/OUTPUT_DIR`__ -- path where output files should be generated (may also use `--output [...]` on Linux OS's)

- __`-m path/to/metadata.json`__ -- path to the manifest file for your study (may also use `--manifest [...]` on Linux OS's)

__-o__ -- path where output files should be generated
- __`-h`__ -- print full usage information at the command line (may also use `--help` on Linux OS's)

### Note on macOS

__-m__ -- path to the manifest file for your study
On macOS, only the short versions of the command line options are supported. (`-i`, `-o`, `-m`, `-h`)

## Stop Docker Containers

@@ -88,4 +98,4 @@ Once your study is processed, run the below command to stop the GaPTools service
```

## Contact
If you have any questions or to report any issues, please contact us at: [dbgap-help@ncbi.nlm.nih.gov](dbgap-help@ncbi.nlm.nih.gov)
If you have any questions or to report any issues, please contact us at: [dbgap-help@ncbi.nlm.nih.gov](mailto:dbgap-help@ncbi.nlm.nih.gov)
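
The option list in the README change above can be illustrated with a small sketch that prints a suitable command line for a given OS type: short options on macOS, long options elsewhere. The helper name `gaptools_cmd` and the `my_study` paths are hypothetical placeholders, not part of the repository; only the flags themselves come from the README.

```shell
#!/usr/bin/env bash
# Sketch only: print (do not run) a GaPTools command line for an OS type.
# gaptools_cmd and the my_study paths are hypothetical placeholders.
gaptools_cmd() {
    ostype="$1"
    in_dir="./input_files/my_study"
    out_dir="./output_files/my_study"
    manifest="./input_files/my_study/metadata.json"
    case "$ostype" in
        darwin*)
            # macOS: only the short options are supported.
            echo "./dbgap-docker.bash -i $in_dir -o $out_dir -m $manifest up"
            ;;
        *)
            # Linux: the long options are accepted as well.
            echo "./dbgap-docker.bash --input $in_dir --output $out_dir --manifest $manifest up"
            ;;
    esac
}

# Show the command line for the current shell's OS type (set by bash).
gaptools_cmd "${OSTYPE:-}"
```

The short-option form works on both platforms, so it is the safer choice for instructions that need to cover macOS and Linux users alike.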
59 changes: 41 additions & 18 deletions dbgap-docker.bash
@@ -5,18 +5,18 @@
#================================================================
#% SYNOPSIS
#+ ${SCRIPT_NAME} [-h] [-i, --input [input directory]]
#+ [-o, --output [output directory]]
#+ [-o, --output [output directory]]
#+ [-m, --manifest [manifest file]]
#+ [up|down]
#+
#% DESCRIPTION
#% This script runs GaPTools to validate data files
#% to be submitted to dbGaP. It loads a docker-compose
#% to be submitted to dbGaP. It loads a Docker Compose
#% file and runs required docker containers.
#%
#%
#% OPTIONS
#% -i, --input Input directory containing the
#% -i, --input Input directory containing the
#% data files to be validated.
#% -o, --output Output directory
#% -m, --manifest Manifest file with metadata
@@ -31,6 +31,9 @@
#% docker
#% docker-compose
#%
#% NOTES
#% macOS and other BSD-based systems do not support long options (--help, etc)
#% Use the short option equivalents (-i, -o, -m, -h)
#%
#================================================================
# END_OF_HEADER
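
The NOTES entry in the header says that macOS and other BSD-based systems do not support long options. This change branches on `$OSTYPE`; a possible alternative, shown here only as a sketch and not part of this change, is to probe `getopt` itself. The util-linux (enhanced) `getopt` exits with status 4 for `getopt --test`, while the BSD version does not. The function name `has_enhanced_getopt` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: detect whether the installed getopt supports long options.
# has_enhanced_getopt is a hypothetical helper, not part of dbgap-docker.bash.
has_enhanced_getopt() {
    status=0
    # Enhanced (util-linux) getopt signals itself with exit status 4 here.
    getopt --test > /dev/null 2>&1 || status=$?
    [ "$status" -eq 4 ]
}

if has_enhanced_getopt; then
    echo "long options available (--input, --output, --manifest, --help)"
else
    echo "short options only (-i, -o, -m, -h)"
fi
```

Probing the tool rather than the OS name would also cover other BSD-based systems and a macOS host where GNU getopt has been installed, at the cost of one extra process per run.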
@@ -53,12 +56,22 @@ OPTIONS=ht:i:o:m:
LONGOPTS=help,input:,output:,manifest:

# pass arguments only via -- "$@" to separate them correctly
! PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTS --name "$0" -- "$@")
if [[ "$OSTYPE" == "darwin"* ]]; then
# macOS uses the BSD getopt, which does not support long options.
! PARSED=$(getopt $OPTIONS "$@")
else
! PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTS --name "$0" -- "$@")
fi

if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
# e.g. return value is 1
# then getopt has complained about wrong arguments to stdout
echo "Wrong number/type of arguments"
usage
if [[ "$OSTYPE" == "darwin"* ]]; then
# macOS uses the BSD getopt, which does not support long options.
echo "WARNING: On macOS: long options (e.g. '--help') are not supported."
fi
exit 2
fi
# read getopt output to handle the quoting:
@@ -72,7 +85,7 @@ while true; do
exit
;;
-i|--input)
INPUT_DIR="$2"
INPUT_DIR="$2"
INPUT_DIR="$(echo -e "${INPUT_DIR}" | tr -d '[[:space:]]')"
if [ -z "$INPUT_DIR" ]; then
echo $INPUT_DIR "Please provide the input directory"
@@ -84,10 +97,10 @@ while true; do
usage
exit 2
fi
shift 2
shift 2
;;
-o|--output)
OUTPUT_DIR="$2"
OUTPUT_DIR="$2"
OUTPUT_DIR="$(echo -e "${OUTPUT_DIR}" | tr -d '[[:space:]]')"
if [ -z "$OUTPUT_DIR" ]; then
echo $OUTPUT_DIR "Please provide the output directory"
@@ -98,11 +111,11 @@ while true; do
echo $OUTPUT_DIR "Please provide the output directory"
usage
exit 2
fi
fi
shift 2
;;
-m| --manifest)
MANIFEST="$2"
MANIFEST="$2"
MANIFEST="$(echo -e "${MANIFEST}" | tr -d '[[:space:]]')"
if [ -z "$MANIFEST" ]; then
echo $MANIFEST "Please provide full path to the manifest file"
@@ -113,7 +126,7 @@ while true; do
echo $MANIFEST "Please provide full path to the manifest file"
usage
exit 2
fi
fi
shift 2
;;
--)
@@ -128,6 +141,16 @@ while true; do
esac
done

if command -v docker-compose > /dev/null ; then
DOCKER_COMPOSE_COMMAND="docker-compose"
elif docker compose version > /dev/null 2>&1 ; then
DOCKER_COMPOSE_COMMAND="docker compose"
else
echo "Docker Compose not found"
usage
exit 2
fi

##################
# Verify up/down arguments were supplied properly
##################
@@ -144,11 +167,11 @@ if ! [[ "${DSTATE}" =~ ^(up|down)$ ]]; then
fi

#########################
# Run the docker-compose commands to bring the environment down
# Run the $DOCKER_COMPOSE_COMMAND commands to bring the environment down
# Skips the rest of the validation and exit the script
#########################
if [ $DSTATE == "down" ]; then
docker-compose -f docker-compose-CeleryExecutor.yml down
$DOCKER_COMPOSE_COMMAND -f docker-compose-CeleryExecutor.yml down
exit
fi

@@ -204,7 +227,7 @@ if [ ! -f "$MANIFEST" ]; then
fi

########################
# Create a .env file to be used by docker-compose
# Create a .env file to be used by $DOCKER_COMPOSE_COMMAND
########################
echo "OUTPUT_VOL=${OUTPUT_DIR}" > .env
echo "INPUT_VOL=${INPUT_DIR}" >> .env
@@ -250,15 +273,15 @@ docker_check() {
return 0
fi
done < <(docker ps)
return 1
return 1
}

#########################
# Run the docker-compose commands to bring the environment up
# Run the $DOCKER_COMPOSE_COMMAND commands to bring the environment up
#########################
if [ $DSTATE == "up" ]; then
docker pull ncbi/gaptools:latest
docker-compose -f docker-compose-CeleryExecutor.yml up -d
$DOCKER_COMPOSE_COMMAND -f docker-compose-CeleryExecutor.yml up -d
i=0
while ! docker_check
do
@@ -268,12 +291,12 @@ if [ $DSTATE == "up" ]; then
echo "Timed out waiting for webserver to start"
echo "Check the docker container logs by executing the command \"docker logs [container_name]\""
echo "E.g. \"docker logs gaptools_webserver_1\""
echo
echo
exit 2
fi
sleep 10
done

echo ""
echo "The airflow server has started on port 8080. Visit "
echo "http://<your_docker_host_ip>:8080"
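
The Docker Compose detection introduced in this change (prefer the standalone `docker-compose` binary, fall back to the `docker compose` plugin) can be sketched as a standalone function. The name `detect_compose` is hypothetical; the script itself sets `DOCKER_COMPOSE_COMMAND` inline. This sketch writes its redirections as `> /dev/null 2>&1` so that both output streams of the probe are discarded.

```shell
#!/usr/bin/env bash
# Sketch only: pick whichever Docker Compose entry point is available.
# detect_compose is a hypothetical helper, not part of dbgap-docker.bash.
detect_compose() {
    if command -v docker-compose > /dev/null 2>&1; then
        # Standalone binary found on PATH.
        echo "docker-compose"
    elif docker compose version > /dev/null 2>&1; then
        # Compose is available as a docker CLI plugin.
        echo "docker compose"
    else
        return 1
    fi
}

if DOCKER_COMPOSE_COMMAND=$(detect_compose); then
    echo "Using: $DOCKER_COMPOSE_COMMAND"
else
    echo "Docker Compose not found" >&2
fi
```

Because the plugin form is two words, the variable has to be expanded unquoted when it is run (as the script does with `$DOCKER_COMPOSE_COMMAND -f ...`) so that the shell splits it into a command and its first argument.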