ECHO

ECHO - Explainable Computation for Hearing Outputs


Learning Interpretability Tool for Audio Models

Interpreting how deep learning models make decisions is crucial, especially in high-stakes applications like speech recognition, emotion detection, and speaker identification. While the Learning Interpretability Tool (LIT) enables exploration of text and tabular models, there's a lack of equivalent tools for voice-based models. Voice data poses additional challenges due to its temporal nature and multi-modal representations (e.g., waveform, spectrogram).

ECHO extends the interpretability paradigm to audio models, providing researchers and developers with tools to analyze and debug speech models with greater transparency. Through interactive visualizations, attention mechanisms, and perturbation analyses, you can gain deeper insights into how your audio models make decisions.

Features

  • Audio Data Management: Upload and manage audio datasets with metadata
  • Waveform Visualization: Interactive waveform viewer with playback controls
  • Model Prediction Analysis: Examine model predictions and confidence scores
  • Attention Visualization: Explore attention patterns in transformer-based audio models
  • Embedding Analysis: Visualize high-dimensional audio embeddings in 2D/3D space
  • Saliency Mapping: Identify important regions in audio input using gradient-based methods
  • Perturbation Tools: Apply various audio perturbations to test model robustness (see the sketch after this list)
  • Interactive Dashboard: Comprehensive interface for exploring model behavior
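
As a concrete example of the perturbation tooling, the sketch below adds white Gaussian noise at a chosen signal-to-noise ratio - one of the simplest robustness tests. This is a minimal illustration, not ECHO's implementation; the file name and SNR value are placeholders.

# Minimal perturbation sketch: mix Gaussian noise into a waveform at a target SNR.
# "sample.wav" and the 10 dB setting are illustrative placeholders.
import numpy as np
import librosa

def add_noise(waveform: np.ndarray, snr_db: float) -> np.ndarray:
    """Return the waveform with white Gaussian noise added at snr_db dB SNR."""
    signal_power = np.mean(waveform ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=waveform.shape)
    return waveform + noise

waveform, sr = librosa.load("sample.wav", sr=16000)
noisy = add_noise(waveform, snr_db=10.0)
# Compare model outputs on `waveform` vs `noisy` to gauge robustness.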

Tech Stack

  • Frontend: React 18 + TypeScript + Vite
  • UI Framework: Tailwind CSS + shadcn/ui components
  • State Management: TanStack Query
  • Data Visualization: Custom React components with Chart.js integration
  • Audio Processing: Web Audio API
  • Backend: FastAPI + Python 3.11
  • Models: Transformer-based audio models (Whisper, Wav2Vec2)
  • Storage: Redis for caching predictions and results
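
The Redis layer follows a straightforward cache-aside pattern: hash the uploaded audio, use the hash as the key, and skip inference on a hit. The sketch below is illustrative only - the route path, key scheme, and run_model stub are assumptions, not ECHO's actual API.

# Illustrative caching pattern with FastAPI + redis-py (not ECHO's actual routes).
import hashlib
import json

import redis
from fastapi import FastAPI, UploadFile

app = FastAPI()
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

def run_model(audio_bytes: bytes) -> dict:
    # Placeholder standing in for real model inference.
    return {"label": "speech", "confidence": 0.0}

@app.post("/predict")  # hypothetical endpoint
async def predict(file: UploadFile):
    audio_bytes = await file.read()
    key = "prediction:" + hashlib.sha256(audio_bytes).hexdigest()
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                 # cache hit: skip inference
    result = run_model(audio_bytes)
    cache.set(key, json.dumps(result), ex=3600)   # expire after one hour
    return result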

Prerequisites

  • Frontend:

    • Node.js (v18 or higher)
    • npm or bun package manager
  • Backend:

    • Python 3.11
    • Docker (for Redis)

Installation

1. Clone the repository

git clone https://github.com/AnasSAV/ECHO.git
cd ECHO

2. Set up the Frontend

cd Frontend
npm install
npm run dev

3. Start Redis server in Docker

# In a new terminal
cd Backend
docker compose up -d
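
To confirm the container is reachable before starting the backend (assuming Redis is exposed on its default localhost:6379), a quick check from Python:

# Connectivity check for the Redis container started above (default port assumed).
import redis
print(redis.Redis(host="localhost", port=6379).ping())  # True if Redis is up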

4. Set up the Backend

Using Python venv (recommended)

cd Backend
python -m venv .venv
# Activate the environment (run the line matching your OS)
.venv\Scripts\activate       # Windows
source .venv/bin/activate    # Linux or macOS
python -m pip install --upgrade pip
pip install -r requirements.txt
uvicorn app.main:app --reload

Using Miniconda (alternative)

# Initialize conda for your shell (only needed the first time; `cmd.exe` is for the Windows Command Prompt - use `conda init bash` or `conda init zsh` on other shells)
conda init cmd.exe

# Navigate to your project folder
cd Backend

# Create the environment with Python 3.10
conda create -n ECHO python=3.10 -y

# Activate the environment
conda activate ECHO

# Install dependencies
conda install -c pytorch -c nvidia -c conda-forge fastapi uvicorn starlette httpx python-multipart python-dotenv pydantic-settings anyio numpy pandas librosa pysoundfile transformers pytorch torchvision torchaudio pytorch-cuda=12.1 redis-py pytest pytest-asyncio requests -y

# Start the backend server
uvicorn app.main:app --reload
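
Whichever setup you use, uvicorn listens on port 8000 by default and FastAPI serves interactive API docs at /docs. A quick sanity check, assuming the default port and that the requests package is installed (or simply open the URL in a browser):

# Verify the backend is up: FastAPI exposes Swagger UI at /docs by default.
import requests
print(requests.get("http://localhost:8000/docs").status_code)  # expect 200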

5. Access the Application

Open your browser and navigate to http://localhost:8080 for the frontend. The FastAPI backend started with uvicorn listens on http://localhost:8000 by default.

Project Structure

ECHO/
├── Frontend/                # React frontend application
│   ├── components/          # React components
│   │   ├── analysis/        # Analysis and perturbation tools
│   │   ├── audio/           # Audio visualization components
│   │   ├── layout/          # Layout components
│   │   ├── panels/          # Dashboard panels
│   │   ├── ui/              # Reusable UI components
│   │   └── visualization/   # Data visualization components
│   ├── hooks/               # Custom React hooks
│   ├── lib/                 # Utility functions
│   └── pages/               # Page components
│
├── Backend/                 # FastAPI backend application
│   ├── app/                 # Application code
│   │   ├── api/             # API routes and endpoints
│   │   ├── core/            # Core functionality
│   │   └── services/        # Business logic services
│   ├── data/                # Sample datasets
│   ├── tests/               # Backend tests
│   └── uploads/             # User-uploaded audio files
│
├── CODE_OF_CONDUCT.md       # Community guidelines
├── CONTRIBUTING.md          # Contribution guidelines
├── LICENSE                  # MIT License
├── README.md                # Project documentation
└── SECURITY.md              # Security policy

Available Scripts

Frontend

  • npm run dev - Start development server
  • npm run build - Build for production
  • npm run lint - Run ESLint
  • npm run preview - Preview production build

Backend

  • pytest - Run backend tests
  • uvicorn app.main:app --reload - Start the API server in development mode

Usage

  1. Upload Audio Data: Use the audio uploader to load your audio files
  2. Select Models: Choose from available audio models for analysis
  3. Explore Visualizations (see the sketch after this list):
    • Examine waveforms and spectrograms
    • View model predictions and confidence scores
    • Explore attention patterns and embedding spaces
    • Generate saliency maps to highlight important audio regions
  4. Apply Perturbations: Test model robustness with various audio perturbations
  5. Analyze Results: Use the interactive dashboard to gain insights
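
For reference, the sketch below shows - outside of ECHO - the kind of raw data behind the prediction and attention views in steps 2-3: class probabilities and per-layer attention maps from a transformer audio model. The checkpoint id and file name are placeholders; substitute your own.

# Stand-alone illustration of the data behind the prediction and attention views.
import librosa
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

model_id = "superb/wav2vec2-base-superb-ks"   # placeholder checkpoint
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id).eval()

waveform, _ = librosa.load("sample.wav", sr=16000)   # placeholder file
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

probs = outputs.logits.softmax(dim=-1)[0]
top = int(probs.argmax())
print(model.config.id2label[top], float(probs[top]))  # prediction + confidence
# outputs.attentions holds one (batch, heads, frames, frames) tensor per layer -
# the raw material for attention heatmaps.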

Contributing

We welcome contributions! Please read our Contributing Guidelines for more information.

Security

For security-related issues, please refer to our Security Policy.

Authors

Mentor

  • Dr. Uthayasanker Thayasivam - NLP Researcher, Senior Lecturer, and Head of the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka

Acknowledgments

  • Inspired by Google's Learning Interpretability Tool (LIT)
  • Built with modern React ecosystem and TypeScript
  • Special thanks to the open-source community for the amazing tools and libraries

Roadmap

  • Backend API enhancements for model serving
  • Support for more audio model architectures
  • Advanced perturbation techniques
  • Real-time audio processing capabilities
  • Export functionality for visualizations
  • Multi-language support
  • Plugin system for custom analysis tools

License

This project is licensed under the MIT License - see the LICENSE file for details.


Built for audio model interpretability
