Underwater acoustic indices and ocean biodiversity

This project was done in collaboration with Ocean Science Analytics. For more information, please check this post on Waveform Analytics' website.

Summary of code

This repository contains the code used to prepare the data and to build the dashboard. The data preparation was done using Python, and the dashboard was built using R Shiny.

Additional links

Interactive data dashboard

Documentation site

Data overview

This project combines fish annotations, acoustic indices, and environmental data. The data is stored in a DuckDB database and CSV files, and is explored through an R Shiny dashboard for analysis and visualization.

Data Sources:

DuckDB Database:
- The main data source is a DuckDB database (mbon11.duckdb) containing several tables related to fish annotations, acoustic indices, and seascaper data.
CSV Files:
- Additional data is sourced from CSV files, including:
  - Index Categories: Updated index categories for analysis (Updated_Index_Categories_v2.csv).
  - Site Information: Information about different sites where data was collected (BioSound_Datasets.csv).

Data Types:

Fish Annotations:
- Data related to fish presence and annotations from different locations (e.g., Key West, May River).
- Tables: t_fish_keywest, t_fish_mayriver.
Acoustic Indices:
- Acoustic index values, i.e. metrics computed from the sound recordings.
- Tables: t_aco2, t_aco_norm2.
Seascaper Data:
- Environmental data and water classes from the Seascaper tool.
- Table: t_seascaper.
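
As a rough illustration of how the tables listed above could be inspected from Python, the sketch below previews a few rows of one table. It assumes the `duckdb` Python package is installed and that `mbon11.duckdb` is available locally. The helper names `build_preview_query` and `preview_table` are hypothetical and are not part of this repository.

```python
# Table names as listed in this README.
KNOWN_TABLES = {
    "t_fish_keywest",
    "t_fish_mayriver",
    "t_aco2",
    "t_aco_norm2",
    "t_seascaper",
}


def build_preview_query(table, limit=5):
    """Return a SELECT statement that previews the first rows of a known table."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"Unknown table: {table!r}")
    # The table name is checked against the allow-list above, so it is safe
    # to interpolate it into the SQL string.
    return f"SELECT * FROM {table} LIMIT {int(limit)}"


def preview_table(db_path, table, limit=5):
    """Open the database read-only and return the preview rows as tuples."""
    import duckdb  # third-party; imported here so the query helper has no dependencies

    con = duckdb.connect(db_path, read_only=True)
    try:
        return con.execute(build_preview_query(table, limit)).fetchall()
    finally:
        con.close()
```

For example, `preview_table("mbon11.duckdb", "t_seascaper")` would return up to five rows from the Seascaper table, assuming the database file sits in the working directory.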

Data Formats:

R Data Frames:
- Data is manipulated and analyzed using R data frames created from the DuckDB tables and CSV files.
Pandas DataFrames:
- In Python, data is handled using Pandas DataFrames, especially in the data_wrangler.py and tidy_biosound_data.ipynb files.
Jupyter Notebook:
- The tidy_biosound_data.ipynb file is a Jupyter notebook that contains code for data preparation and analysis, including merging and cleaning data.

Data Analysis and Visualization:

R Shiny Dashboard:
- The R Shiny dashboard includes multiple tabs for visualizing data, including:
  - Time series plots with annotations.
  - Boxplots comparing index values by species.
  - Heatmaps showing relationships between acoustic indices and water classes.
  - Download options for generated plots and data.
Plotting Libraries:
- Libraries such as ggplot2, dygraphs, and plotly are used for creating visualizations in the R Shiny application.

Data Preparation Functions:

Data Wrangling:
- Functions in data_wrangler.py are used to prepare and normalize data, handle annotations, and combine datasets.
Normalization:
- The normalize_df function normalizes acoustic indices to a range between -1 and 1.
Annotation Preparation:
- Functions like annotation_prep_kw_style and annotation_prep_mr_style are used to prepare annotations for Key West and May River datasets, respectively.
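
The normalization step can be sketched as below. This README does not describe how `normalize_df` is implemented, so the example assumes plain min-max scaling and uses plain Python lists instead of Pandas DataFrames so that it runs without dependencies. The function names and the column names `index_a` and `index_b` are hypothetical.

```python
def scale_to_unit_range(values):
    """Min-max scale a sequence of numbers to the range [-1, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map it to 0.0 (an arbitrary choice).
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def normalize_columns(columns):
    """Scale every column (name -> list of numbers) independently."""
    return {name: scale_to_unit_range(vals) for name, vals in columns.items()}


# Hypothetical acoustic index columns, for illustration only.
indices = {
    "index_a": [10.0, 15.0, 20.0],
    "index_b": [0.0, 1.0, 4.0],
}
normalized = normalize_columns(indices)
print(normalized)
```

Scaling each column independently puts indices with very different native ranges on a common scale, so the smallest value of each index maps to -1 and the largest to 1.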
