This project is a command-line application written in Go that demonstrates image compression using Singular Value Decomposition (SVD), a fundamental concept in linear algebra. The application takes an input image, compresses it to a specified level, and saves the result as a new image.
- Image Compression: Compresses images by reducing the amount of data required to represent them.
- Adjustable Compression Level: Allows users to control the level of compression by specifying the number of singular values to keep.
- Command-Line Interface: Easy-to-use CLI for specifying input and output files and the compression level.
- Pure Go Implementation: The core linear algebra concepts (SVD, Gram-Schmidt) are implemented from scratch in Go.
The image compression process is based on the mathematical technique of Singular Value Decomposition (SVD). Here's a high-level overview of the steps involved:
- Load and Convert to Grayscale: The input image is loaded and converted to grayscale. This simplifies the process by working with a single color channel.
- Image to Matrix: The grayscale image is converted into a matrix, where each element of the matrix represents the intensity of a pixel.
- Singular Value Decomposition (SVD): The SVD algorithm is applied to the image matrix. SVD decomposes the original matrix (A) into three other matrices:
  - U: An orthogonal matrix.
  - S (Sigma): A diagonal matrix containing the singular values of the original matrix.
  - V_T (V Transpose): The transpose of an orthogonal matrix.

  The decomposition is represented as:

      A = U * S * V_T

- Truncation (Compression): The compression happens by truncating the S, U and V matrices. We keep the top 'k' singular values from the S matrix, and the corresponding columns from the U and V matrices. The number 'k' is a parameter that controls the compression level. A smaller 'k' results in higher compression but lower image quality.
- Reconstruction: The compressed image matrix is reconstructed by multiplying the truncated matrices:

      A_compressed = U_k * S_k * V_k_T

- Matrix to Image: The reconstructed matrix is converted back into an image, which is then saved to the specified output file.
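
The truncation and reconstruction steps above can be sketched in Go as follows. This is a minimal illustration, not the project's actual code: the function name `reconstructRankK` and the slice-of-slices layout (U as m×r and V as n×r with singular vectors stored in columns, singular values sorted in descending order) are assumptions made for the example.

```go
package main

import "fmt"

// reconstructRankK rebuilds an m×n matrix from the first k singular triplets.
// Assumed layout (hypothetical, for this sketch only):
//   u: m×r, column t is the t-th left singular vector
//   s: r singular values in descending order
//   v: n×r, column t is the t-th right singular vector
func reconstructRankK(u [][]float64, s []float64, v [][]float64, k int) [][]float64 {
	if k > len(s) {
		k = len(s) // cannot keep more singular values than exist
	}
	if k < 0 {
		k = 0
	}
	m, n := len(u), len(v)
	out := make([][]float64, m)
	for i := range out {
		out[i] = make([]float64, n)
		for j := 0; j < n; j++ {
			// Entry (i, j) of U_k * S_k * V_k_T.
			var sum float64
			for t := 0; t < k; t++ {
				sum += u[i][t] * s[t] * v[j][t]
			}
			out[i][j] = sum
		}
	}
	return out
}

func main() {
	// A = diag(3, 1), whose SVD is U = I, S = diag(3, 1), V = I.
	u := [][]float64{{1, 0}, {0, 1}}
	s := []float64{3, 1}
	v := [][]float64{{1, 0}, {0, 1}}

	fmt.Println(reconstructRankK(u, s, v, 1)) // [[3 0] [0 0]]
	fmt.Println(reconstructRankK(u, s, v, 2)) // [[3 0] [0 1]]
}
```

With k = 1 only the largest singular value survives, so the smaller diagonal entry is lost; with k = 2 the original matrix comes back exactly.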
The core of the SVD implementation relies on the Gram-Schmidt process for QR decomposition, which is used iteratively to find the eigenvalues and eigenvectors of the matrix.
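
As a rough illustration of that idea, the sketch below factors a matrix with classical Gram-Schmidt and then repeats the step A <- R * Q, which for a symmetric matrix typically drives the diagonal toward the eigenvalues. It is not the project's code: the function names are hypothetical, and applying the iteration to the symmetric matrix A_T * A (whose eigenvalues are the squared singular values of A) is an assumption about how the pieces fit together.

```go
package main

import (
	"fmt"
	"math"
)

// newMatrix allocates a rows×cols matrix of zeros.
func newMatrix(rows, cols int) [][]float64 {
	m := make([][]float64, rows)
	for i := range m {
		m[i] = make([]float64, cols)
	}
	return m
}

// qrGramSchmidt factors a square matrix a into q (orthonormal columns) and
// r (upper triangular) with classical Gram-Schmidt, so that a = q * r.
func qrGramSchmidt(a [][]float64) (q, r [][]float64) {
	n := len(a)
	q, r = newMatrix(n, n), newMatrix(n, n)
	for j := 0; j < n; j++ {
		// Start from column j of a.
		v := make([]float64, n)
		for i := 0; i < n; i++ {
			v[i] = a[i][j]
		}
		// Remove the components along the columns of q found so far.
		for k := 0; k < j; k++ {
			for i := 0; i < n; i++ {
				r[k][j] += q[i][k] * a[i][j]
			}
			for i := 0; i < n; i++ {
				v[i] -= r[k][j] * q[i][k]
			}
		}
		// Normalize what is left to get column j of q.
		for i := 0; i < n; i++ {
			r[j][j] += v[i] * v[i]
		}
		r[j][j] = math.Sqrt(r[j][j])
		if r[j][j] == 0 {
			continue // linearly dependent column: leave q's column as zeros
		}
		for i := 0; i < n; i++ {
			q[i][j] = v[i] / r[j][j]
		}
	}
	return q, r
}

// multiply returns the product of two square matrices of the same size.
func multiply(a, b [][]float64) [][]float64 {
	n := len(a)
	out := newMatrix(n, n)
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			for k := 0; k < n; k++ {
				out[i][j] += a[i][k] * b[k][j]
			}
		}
	}
	return out
}

// eigenvaluesQR runs the unshifted QR iteration (A <- R*Q) a fixed number of
// times and returns the diagonal, which approximates the eigenvalues of a
// symmetric matrix. The caller's matrix is not modified.
func eigenvaluesQR(a [][]float64, iterations int) []float64 {
	for it := 0; it < iterations; it++ {
		q, r := qrGramSchmidt(a)
		a = multiply(r, q)
	}
	eig := make([]float64, len(a))
	for i := range a {
		eig[i] = a[i][i]
	}
	return eig
}

func main() {
	// Symmetric example standing in for A_T * A; its eigenvalues are 3 and 1.
	ata := [][]float64{{2, 1}, {1, 2}}
	for _, ev := range eigenvaluesQR(ata, 60) {
		fmt.Printf("eigenvalue %.4f, singular value %.4f\n", ev, math.Sqrt(ev))
	}
}
```

A fixed iteration count keeps the sketch short; a practical version would stop once the off-diagonal entries fall below a tolerance, and modified Gram-Schmidt or Householder reflections are usually preferred for numerical stability.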
- Clone the repository:

      git clone https://github.com/tekeoglan/img-comp.git
      cd img-comp

- Build the application:

      go build -o img-comp ./cmd/main.go

- Run the application:

      ./img-comp -input <path_to_input_image> -output <path_to_output_image> -k <compression_level>

Command-line flags:
- -input: (Required) Path to the input image file.
- -output: (Required) Path to save the compressed image file.
- -k: (Optional) Number of singular values to keep (compression level). Defaults to 50. A smaller 'k' means more compression.
Example:
./img-comp -input assets/blue_sky.png -output assets/blue_sky_compressed.png -k 100
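
To get a feel for how 'k' relates to the amount of data kept, the following sketch counts the numbers held by the truncated factors: k columns of U (m values each), k columns of V (n values each), and k singular values, for k * (m + n + 1) in total versus m * n pixels. The 256×256 size and the helper name `storedValues` are assumptions chosen for illustration. The count describes the rank-k representation only; the size of the saved image file depends on the image encoder, not on this number.

```go
package main

import "fmt"

// storedValues returns how many numbers the rank-k factors U_k, S_k and V_k
// hold for an m×n matrix: k*m + k + k*n.
func storedValues(m, n, k int) int {
	return k * (m + n + 1)
}

func main() {
	m, n := 256, 256 // hypothetical image size
	for _, k := range []int{10, 50, 100} {
		kept := storedValues(m, n, k)
		fmt.Printf("k=%d: %d of %d values (%.1f%%)\n",
			k, kept, m*n, 100*float64(kept)/float64(m*n))
	}
	// k=10: 5130 of 65536 values (7.8%)
	// k=50: 25650 of 65536 values (39.1%)
	// k=100: 51300 of 65536 values (78.3%)
}
```

For a square n×n image the factors stop being smaller than the original once k exceeds roughly n/2, which is why small values of 'k' are where the technique pays off.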
img-comp/
├───go.mod
├───README.md
├───assets/
├───cmd/
│   └───main.go
├───img/
│   ├───img_comp.go
│   ├───img_load.go
│   └───img_matrix.go
├───matrix/
│   ├───eigen.go
│   ├───matrix.go
│   ├───qr_decomp.go
│   └───svd.go
└───utils/
    └───logic.go
- cmd/main.go: The entry point of the application, responsible for parsing command-line arguments and orchestrating the image compression process.
- img/: This package contains all the logic related to image handling, such as loading, saving, converting to grayscale, and converting between images and matrices.
- matrix/: This package implements the core linear algebra operations from scratch, including matrix multiplication, transposition, QR decomposition via Gram-Schmidt, and Singular Value Decomposition (SVD).
- utils/: A package for utility functions.
- assets/: Contains sample images for testing the application.
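
The kind of conversion the img/ package is described as handling can be sketched with the Go standard library alone. The code below is an illustration under assumptions, not the package's real API: `imageToMatrix`, `matrixToGray`, and `demoImage` are hypothetical names, and the in-memory test image stands in for a file loaded from disk.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"math"
)

// imageToMatrix converts any image to a rows×cols matrix of gray levels
// in the range 0-255, using the standard library's grayscale color model.
func imageToMatrix(img image.Image) [][]float64 {
	b := img.Bounds()
	m := make([][]float64, b.Dy())
	for y := range m {
		m[y] = make([]float64, b.Dx())
		for x := range m[y] {
			c := color.GrayModel.Convert(img.At(b.Min.X+x, b.Min.Y+y)).(color.Gray)
			m[y][x] = float64(c.Y)
		}
	}
	return m
}

// matrixToGray converts a matrix back to a grayscale image. Values are
// clamped to 0-255 and rounded, because a reconstructed matrix can contain
// entries slightly outside the valid pixel range.
func matrixToGray(m [][]float64) *image.Gray {
	rows, cols := len(m), 0
	if rows > 0 {
		cols = len(m[0])
	}
	img := image.NewGray(image.Rect(0, 0, cols, rows))
	for y := 0; y < rows; y++ {
		for x := 0; x < cols; x++ {
			v := math.Round(math.Max(0, math.Min(255, m[y][x])))
			img.SetGray(x, y, color.Gray{Y: uint8(v)})
		}
	}
	return img
}

// demoImage builds a 2×1 in-memory image: one white pixel, one black pixel.
func demoImage() image.Image {
	img := image.NewRGBA(image.Rect(0, 0, 2, 1))
	img.Set(0, 0, color.RGBA{R: 255, G: 255, B: 255, A: 255})
	img.Set(1, 0, color.RGBA{A: 255})
	return img
}

func main() {
	m := imageToMatrix(demoImage())
	fmt.Println(m) // [[255 0]]

	back := matrixToGray(m)
	fmt.Println(back.GrayAt(0, 0).Y, back.GrayAt(1, 0).Y) // 255 0
}
```

Reading a real file would add `image.Decode` plus a blank import such as `_ "image/png"` to register the decoder, and writing one would use `png.Encode` from `image/png`.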
This project has no external dependencies. All the required code is part of the Go standard library or implemented within the project itself.
This program is not optimized and performs poorly on large inputs. I built it to apply my linear algebra knowledge in a practical programming project. I don't recommend compressing images larger than 256×256 pixels.
This project is licensed under the MIT License. See the LICENSE file for details.