The blazing-fast scikit-learn compatible ML library for Go
Say "Goodbye" to slow ML, "Sci-Go" to fast learning!
SciGo = Statistical Computing In Go
SciGo brings the power and familiarity of scikit-learn to the Go ecosystem, offering:
- Blazing Fast: Native Go implementation with built-in parallelization
- scikit-learn Compatible: Familiar Fit/Predict API for easy migration
- LightGBM Support: Full compatibility with Python LightGBM models (.txt/JSON/string)
- Well Documented: Complete API documentation with examples on pkg.go.dev
- Streaming Support: Online learning algorithms for real-time data
- Zero Heavy Dependencies: Pure Go implementation (only scientific essentials)
- Comprehensive: Regression, classification, clustering, tree-based models, and more
- Production Ready: Extensive tests, benchmarks, and error handling
- Superior to leaves: Not just inference, but full training, convenience features, and numerical precision
go get github.com/YuminosukeSato/scigo@latest
- Docker:
docker run --rm -it ghcr.io/yuminosukesato/scigo:latest
- Go Install:
go install github.com/YuminosukeSato/scigo/examples/quick-start@latest
Tip: For complete API documentation with examples, visit pkg.go.dev/scigo
package main
import (
	"fmt"
	"log"
	"github.com/YuminosukeSato/scigo/sklearn/lightgbm"
	"gonum.org/v1/gonum/mat"
)
func main() {
	// Placeholder inputs: substitute your own 100x4 features, labels, and test rows.
	X := mat.NewDense(100, 4, make([]float64, 100*4))
	y := mat.NewDense(100, 1, make([]float64, 100))
	XTest := mat.NewDense(10, 4, make([]float64, 10*4))
	// Train and predict with the one-line helper.
	result := lightgbm.QuickTrain(X, y)
	quickPreds := result.Predict(XTest)
	// Or use AutoFit for automatic hyperparameter tuning.
	best := lightgbm.AutoFit(X, y)
	// Load a model trained with Python LightGBM directly.
	model := lightgbm.NewLGBMClassifier()
	model.LoadModel("python_model.txt") // check the returned error, if any
	predictions, err := model.Predict(XTest)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(quickPreds, best, predictions)
}
package main
import (
	"fmt"
	"log"
	"github.com/YuminosukeSato/scigo/linear"
	"gonum.org/v1/gonum/mat"
)
func main() {
	// Create and train model - just like scikit-learn!
	model := linear.NewLinearRegression()
	// Training data: 4 samples, 2 features
	X := mat.NewDense(4, 2, []float64{
		1, 1,
		1, 2,
		2, 2,
		2, 3,
	})
	y := mat.NewDense(4, 1, []float64{
		2, 3, 3, 4,
	})
	// Fit the model
	if err := model.Fit(X, y); err != nil {
		log.Fatal(err)
	}
	// Make predictions on 2 unseen samples
	XTest := mat.NewDense(2, 2, []float64{
		1.5, 1.5,
		2.5, 3.5,
	})
	predictions, err := model.Predict(XTest)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Ready, Set, SciGo! Predictions:", predictions)
}
The documentation includes comprehensive examples for all major APIs. Visit the Go Doc links above or use go doc locally:
# View package documentation
go doc github.com/YuminosukeSato/scigo/preprocessing
go doc github.com/YuminosukeSato/scigo/linear
go doc github.com/YuminosukeSato/scigo/metrics
# View specific function documentation
go doc github.com/YuminosukeSato/scigo/preprocessing.StandardScaler.Fit
go doc github.com/YuminosukeSato/scigo/linear.LinearRegression.Predict
go doc github.com/YuminosukeSato/scigo/metrics.MSE
# Run example tests
go test -v ./preprocessing -run Example
go test -v ./linear -run Example
go test -v ./metrics -run Example
- ✅ Linear Regression - Full scikit-learn compatible implementation with QR decomposition
- ✅ SGD Regressor - Stochastic Gradient Descent for large-scale learning
- ✅ SGD Classifier - Linear classifiers with SGD training
- ✅ Passive-Aggressive - Online learning for classification and regression
- ✅ StandardScaler - Standardizes features by removing mean and scaling to unit variance
- ✅ MinMaxScaler - Scales features to a given range (e.g., [0,1] or [-1,1])
- ✅ OneHotEncoder - Encodes categorical features as one-hot numeric arrays
- ✅ LightGBM - Full Python model compatibility (.txt/JSON/string formats)
  - LGBMClassifier - Binary and multiclass classification
  - LGBMRegressor - Regression with multiple objectives
  - QuickTrain - One-liner training with automatic model selection
  - AutoFit - Automatic hyperparameter tuning
  - Superior to leaves - training + convenience features
- 🚧 Random Forest (Coming Soon)
- 🚧 XGBoost compatibility (Coming Soon)
- ✅ MiniBatch K-Means - Scalable K-Means for large datasets
- 🚧 DBSCAN (Coming Soon)
- 🚧 Hierarchical Clustering (Coming Soon)
- ✅ Incremental Learning - Update models with new data batches
- ✅ Partial Fit - scikit-learn compatible online learning
- ✅ Concept Drift Detection - DDM and ADWIN algorithms
- ✅ Streaming Pipelines - Real-time data processing with channels
SciGo implements the familiar scikit-learn API with full compatibility:
// Just like scikit-learn!
model.Fit(X, y) // Train the model
model.Predict(X) // Make predictions
model.Score(X, y) // Evaluate the model
model.PartialFit(X, y) // Incremental learning
// New in v0.3.0 - Full scikit-learn compatibility
model.GetParams(deep) // Get model parameters
model.SetParams(params) // Set model parameters
weights, _ := model.ExportWeights() // Export model weights
model.ImportWeights(weights) // Import with guaranteed reproducibility
// Streaming - unique to Go!
model.FitStream(ctx, dataChan) // Streaming training
- Complete Weight Reproducibility - Guaranteed identical outputs with same weights
- gRPC/Protobuf Support - Distributed training and prediction
- Full Parameter Management - GetParams/SetParams for all models
- Model Serialization - Export/Import with full precision
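To make the shape of the Fit, Predict, and Score calls listed above concrete, here is a minimal standard-library sketch. The `Estimator` interface and `simpleLinear` type are hypothetical and work on plain slices; SciGo's real interfaces take gonum matrices and may differ in names and signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// Estimator is a hypothetical interface mirroring the Fit/Predict/Score trio.
type Estimator interface {
	Fit(x, y []float64) error
	Predict(x []float64) ([]float64, error)
	Score(x, y []float64) (float64, error)
}

// simpleLinear fits y = slope*x + intercept by ordinary least squares.
type simpleLinear struct {
	slope, intercept float64
	fitted           bool
}

func (m *simpleLinear) Fit(x, y []float64) error {
	if len(x) == 0 || len(x) != len(y) {
		return errors.New("x and y must be non-empty and of equal length")
	}
	n := float64(len(x))
	var sx, sy, sxx, sxy float64
	for i := range x {
		sx += x[i]
		sy += y[i]
		sxx += x[i] * x[i]
		sxy += x[i] * y[i]
	}
	denom := n*sxx - sx*sx
	if denom == 0 {
		return errors.New("x has zero variance")
	}
	m.slope = (n*sxy - sx*sy) / denom
	m.intercept = (sy - m.slope*sx) / n
	m.fitted = true
	return nil
}

func (m *simpleLinear) Predict(x []float64) ([]float64, error) {
	if !m.fitted {
		return nil, errors.New("model is not fitted")
	}
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = m.slope*v + m.intercept
	}
	return out, nil
}

// Score returns R^2 = 1 - SS_res/SS_tot, the default regressor score in scikit-learn.
func (m *simpleLinear) Score(x, y []float64) (float64, error) {
	pred, err := m.Predict(x)
	if err != nil {
		return 0, err
	}
	if len(y) == 0 || len(y) != len(pred) {
		return 0, errors.New("x and y must be non-empty and of equal length")
	}
	var mean float64
	for _, v := range y {
		mean += v
	}
	mean /= float64(len(y))
	var ssRes, ssTot float64
	for i := range y {
		ssRes += (y[i] - pred[i]) * (y[i] - pred[i])
		ssTot += (y[i] - mean) * (y[i] - mean)
	}
	if ssTot == 0 {
		return 0, errors.New("y has zero variance")
	}
	return 1 - ssRes/ssTot, nil
}

func main() {
	var model Estimator = &simpleLinear{}
	x := []float64{1, 2, 3, 4}
	y := []float64{3, 5, 7, 9} // y = 2x + 1
	if err := model.Fit(x, y); err != nil {
		fmt.Println("fit failed:", err)
		return
	}
	pred, _ := model.Predict([]float64{5})
	r2, _ := model.Score(x, y)
	fmt.Printf("prediction=%.1f R2=%.1f\n", pred[0], r2)
}
```

Returning an error from every method, rather than panicking, keeps the scikit-learn call pattern while following Go conventions for unfitted models and malformed input.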
SciGo leverages Go's concurrency for exceptional performance:
| Algorithm | Dataset Size | SciGo | scikit-learn (Python) | Speedup |
|---|---|---|---|---|
| Linear Regression | 1M×100 | 245ms | 890ms | 3.6× |
| SGD Classifier | 500K×50 | 180ms | 520ms | 2.9× |
| MiniBatch K-Means | 100K×20 | 95ms | 310ms | 3.3× |
| Streaming SGD | 1M streaming | 320ms | 1.2s | 3.8× |
Benchmarks on MacBook Pro M2, 16GB RAM
| Dataset Size | Memory | Allocations |
|---|---|---|
| 100×10 | 22.8KB | 22 |
| 1,000×10 | 191.8KB | 22 |
| 10,000×20 | 3.4MB | 57 |
| 50,000×50 | 41.2MB | 61 |
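Timing, memory, and allocation figures like those in the tables above are the kind of numbers Go's standard `testing` package reports. The sketch below is only an illustration of how such measurements can be collected: `columnMeans` is a hypothetical workload, not a SciGo function, and the numbers it prints depend on the machine.

```go
package main

import (
	"fmt"
	"testing"
)

// columnMeans is a hypothetical workload: it averages each column of a
// row-major rows x cols matrix and allocates a single output slice.
func columnMeans(data []float64, rows, cols int) []float64 {
	means := make([]float64, cols)
	for i := 0; i < rows; i++ {
		for j := 0; j < cols; j++ {
			means[j] += data[i*cols+j]
		}
	}
	for j := range means {
		means[j] /= float64(rows)
	}
	return means
}

// sink keeps results reachable so the compiler cannot optimise the call away.
var sink []float64

func main() {
	const cols = 10
	for _, rows := range []int{100, 1000} {
		data := make([]float64, rows*cols)
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				sink = columnMeans(data, rows, cols)
			}
		})
		fmt.Printf("%dx%d: %d ns/op, %d B/op, %d allocs/op\n",
			rows, cols, res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
	}
}
```

In this sketch the output size depends only on the column count, so the allocation count should not grow with the number of rows, which is the same pattern the first two rows of the memory table show.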
scigo/
├── linear/           # Linear models
├── sklearn/          # scikit-learn compatible implementations
│   ├── linear_model/ # SGD, Passive-Aggressive
│   ├── cluster/      # Clustering algorithms
│   └── drift/        # Concept drift detection
├── metrics/          # Evaluation metrics
├── core/             # Core abstractions
│   ├── model/        # Base model interfaces
│   ├── tensor/       # Tensor operations
│   └── parallel/     # Parallel processing
├── datasets/         # Dataset utilities
└── examples/         # Usage examples
Comprehensive evaluation metrics with full documentation:
- Regression Metrics:
  - MSE (Mean Squared Error) - pkg.go.dev/metrics.MSE
  - RMSE (Root Mean Squared Error) - pkg.go.dev/metrics.RMSE
  - MAE (Mean Absolute Error) - pkg.go.dev/metrics.MAE
  - R² (Coefficient of Determination) - pkg.go.dev/metrics.R2Score
  - MAPE (Mean Absolute Percentage Error) - pkg.go.dev/metrics.MAPE
  - Explained Variance Score - pkg.go.dev/metrics.ExplainedVarianceScore
- Classification: Accuracy, Precision, Recall, F1-Score, ROC-AUC (coming)
- Clustering: Silhouette Score, Davies-Bouldin Index (coming)
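For readers who want the definitions behind the regression metrics above, the following standard-library sketch computes MSE, RMSE, MAE, and R² on plain slices. The lowercase function names are local to this example and are not the signatures of SciGo's `metrics` package, which should be checked on pkg.go.dev.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// checkLengths rejects empty or mismatched inputs.
func checkLengths(yTrue, yPred []float64) error {
	if len(yTrue) == 0 || len(yTrue) != len(yPred) {
		return errors.New("inputs must be non-empty and of equal length")
	}
	return nil
}

// mse is the mean of the squared residuals.
func mse(yTrue, yPred []float64) (float64, error) {
	if err := checkLengths(yTrue, yPred); err != nil {
		return 0, err
	}
	var sum float64
	for i := range yTrue {
		d := yTrue[i] - yPred[i]
		sum += d * d
	}
	return sum / float64(len(yTrue)), nil
}

// rmse is the square root of the MSE, in the same units as the target.
func rmse(yTrue, yPred []float64) (float64, error) {
	v, err := mse(yTrue, yPred)
	if err != nil {
		return 0, err
	}
	return math.Sqrt(v), nil
}

// mae is the mean of the absolute residuals.
func mae(yTrue, yPred []float64) (float64, error) {
	if err := checkLengths(yTrue, yPred); err != nil {
		return 0, err
	}
	var sum float64
	for i := range yTrue {
		sum += math.Abs(yTrue[i] - yPred[i])
	}
	return sum / float64(len(yTrue)), nil
}

// r2Score returns 1 - SS_res/SS_tot.
func r2Score(yTrue, yPred []float64) (float64, error) {
	if err := checkLengths(yTrue, yPred); err != nil {
		return 0, err
	}
	var mean float64
	for _, v := range yTrue {
		mean += v
	}
	mean /= float64(len(yTrue))
	var ssRes, ssTot float64
	for i := range yTrue {
		ssRes += (yTrue[i] - yPred[i]) * (yTrue[i] - yPred[i])
		ssTot += (yTrue[i] - mean) * (yTrue[i] - mean)
	}
	if ssTot == 0 {
		return 0, errors.New("yTrue has zero variance")
	}
	return 1 - ssRes/ssTot, nil
}

func main() {
	yTrue := []float64{3, -0.5, 2, 7}
	yPred := []float64{2.5, 0, 2, 8}
	m, err := mse(yTrue, yPred)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The remaining calls share the same length check, which has already passed.
	r, _ := rmse(yTrue, yPred)
	a, _ := mae(yTrue, yPred)
	r2, _ := r2Score(yTrue, yPred)
	fmt.Printf("MSE=%.4f RMSE=%.4f MAE=%.4f R2=%.4f\n", m, r, a, r2)
}
```

For this input the squared residuals are 0.25, 0.25, 0, and 1, so the MSE is 0.375 and the MAE is 0.5.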
# Run tests
go test ./...
# Run benchmarks
go test -bench=. -benchmem ./...
# Check coverage (76.7% overall coverage)
go test -cover ./...
# Run linter (errcheck, govet, ineffassign, staticcheck, unused, misspell)
make lint-full
# Run examples to see API usage
go test -v ./preprocessing -run Example
go test -v ./linear -run Example
go test -v ./metrics -run Example
go test -v ./core/model -run Example
- ✅ Test Coverage: 76.7% (target: 70%+)
- ✅ Linting: golangci-lint with comprehensive checks
- ✅ Documentation: Complete godoc for all public APIs
- ✅ Examples: Comprehensive example functions for all major APIs
Check out the examples directory:
- Linear Regression - Basic regression
- Streaming Learning - Online learning demo
- Iris Classification - Classic dataset
- Error Handling - Robust error management
We welcome contributions! Please see our Contributing Guide.
# Clone the repository
git clone https://github.com/YuminosukeSato/scigo.git
cd scigo
# Install dependencies
go mod download
# Run tests
go test ./...
# Run linter
golangci-lint run
SciGo uses automated continuous delivery for releases:
- Automatic Release: Every push to the main branch triggers an automatic patch version release
- Version Management: Versions are automatically incremented (e.g., 0.4.0 → 0.4.1)
- Release Assets: Binaries for Linux, macOS, and Windows are automatically built and attached
- Docker Images: Docker images are automatically built and pushed to GitHub Container Registry (ghcr.io)
- Documentation: pkg.go.dev is automatically updated with the latest version
- Merge PR to main: When a PR is merged to the main branch
- Automatic Tests: CI runs all tests and coverage checks
- Version Bump: Patch version is automatically incremented
- Create Release: GitHub Release is created with:
  - Multi-platform binaries (Linux, macOS, Windows)
  - Release notes from CHANGELOG.md
  - Docker image at ghcr.io/yuminosukesato/scigo:VERSION
- Post-Release: An issue is created to track post-release verification tasks
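The patch bump in the workflow above follows the usual MAJOR.MINOR.PATCH rule. As a rough illustration only (the project's CI may compute versions differently, and `bumpPatch` is a hypothetical helper), the increment can be sketched as:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bumpPatch increments the patch component of a MAJOR.MINOR.PATCH version,
// preserving an optional leading "v".
func bumpPatch(version string) (string, error) {
	prefix := ""
	if strings.HasPrefix(version, "v") {
		prefix, version = "v", version[1:]
	}
	parts := strings.Split(version, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", version)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("invalid version component %q", p)
		}
		nums[i] = n
	}
	return fmt.Sprintf("%s%d.%d.%d", prefix, nums[0], nums[1], nums[2]+1), nil
}

func main() {
	next, err := bumpPatch("0.4.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(next)
}
```

Components are parsed as integers rather than compared as strings, so 0.4.9 becomes 0.4.10 and not 0.4.1.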
For major or minor version releases, create and push a tag manually:
git tag v0.5.0 -m "Release v0.5.0"
git push origin v0.5.0
Pushing the tag triggers the release workflow defined in release.yml.
- ✅ Linear models
- ✅ Online learning
- ✅ Basic clustering
- 🚧 Tree-based models
- Neural Networks (MLP)
- Deep Learning integration
- Model serialization (ONNX export)
- GPU acceleration
- Distributed training
- AutoML capabilities
- Model versioning
- A/B testing framework
- API Documentation - Complete API reference with examples
- Package Index - Browse all packages
| API | Package | Documentation |
|---|---|---|
| StandardScaler | preprocessing | pkg.go.dev/preprocessing.StandardScaler |
| MinMaxScaler | preprocessing | pkg.go.dev/preprocessing.MinMaxScaler |
| OneHotEncoder | preprocessing | pkg.go.dev/preprocessing.OneHotEncoder |
| LinearRegression | linear | pkg.go.dev/linear.LinearRegression |
| BaseEstimator | core/model | pkg.go.dev/model.BaseEstimator |
- scikit-learn Migration Guide - Complete guide for Python developers
- API Stability Analysis - v1.0.0 roadmap and compatibility
- Streaming Guide (Coming Soon)
- Performance Tuning (Coming Soon)
- Inspired by scikit-learn
- Built with Gonum
- Error handling by CockroachDB errors
SciGo is licensed under the MIT License. See LICENSE for details.
- Author: Yuminosuke Sato
- GitHub: @YuminosukeSato
- Repository: https://github.com/YuminosukeSato/scigo
- Issues: GitHub Issues
Made with ❤️ and lots of ☕ in Go
Development-only parity tests compare the Go implementation against scikit-learn outputs.
They are not part of the default go test; use the parity build tag explicitly.
Steps
- Generate golden data
  - Use uv instead of pip.
  - Command: uv run --with scikit-learn --with numpy --with scipy python scripts/golden/gen_logreg.py
- Run parity tests
  - Command: go test ./sklearn/linear_model -tags=parity -run Parity -v
One-liner
make parity-linear
Notes
- The current LogisticRegression uses a simplified gradient descent. Tolerances will be tightened once lbfgs/newton-cg is implemented.
- The golden file is written to tests/golden/logreg_case1.json.
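As a rough sketch of what a parity check does, the program below loads golden values from JSON, recomputes probabilities in Go, and compares them within a tolerance. Everything here is an assumption made for illustration: the `goldenCase` field names, the inline values, and the tolerance are hypothetical, and the real schema of tests/golden/logreg_case1.json may differ. In the repository, such a check would live in a test file guarded by the parity build tag.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"math"
)

// goldenCase is a hypothetical layout for a golden file.
type goldenCase struct {
	X         [][]float64 `json:"X"`
	Coef      []float64   `json:"coef"`
	Intercept float64     `json:"intercept"`
	Proba     []float64   `json:"proba"`
}

// predictProba applies the logistic function to each row's linear score.
func predictProba(g goldenCase) []float64 {
	out := make([]float64, len(g.X))
	for i, row := range g.X {
		z := g.Intercept
		for j, v := range row {
			z += g.Coef[j] * v
		}
		out[i] = 1 / (1 + math.Exp(-z))
	}
	return out
}

// maxAbsDiff returns the largest element-wise absolute difference.
func maxAbsDiff(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("length mismatch")
	}
	var worst float64
	for i := range a {
		worst = math.Max(worst, math.Abs(a[i]-b[i]))
	}
	return worst, nil
}

func main() {
	// Illustrative values only; they are not taken from the repository.
	raw := []byte(`{"X": [[0, 0], [1, 2]], "coef": [1, -1],
		"intercept": 0, "proba": [0.5, 0.2689414213699951]}`)
	var g goldenCase
	if err := json.Unmarshal(raw, &g); err != nil {
		fmt.Println("bad golden data:", err)
		return
	}
	diff, err := maxAbsDiff(predictProba(g), g.Proba)
	if err != nil {
		fmt.Println("comparison failed:", err)
		return
	}
	const tol = 1e-6 // assumed tolerance for this sketch
	fmt.Printf("max abs diff %.2e, within tolerance: %v\n", diff, diff <= tol)
}
```

Comparing against a tolerance rather than requiring exact equality matters because the Go and Python solvers may converge to slightly different coefficients.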