GitHub - InfoLab-SKKU/infodeslib

Infodeslib: Python Library for Dynamic Ensemble Learning using Late Fusion of Multimodal Data

Research on dynamic selection (DS) techniques in ensemble learning has increased notably, and has led to a variety of methods for choosing which classifiers to combine for a specific instance, or set of instances, at prediction time. Despite this progress, the design and development of DS approaches with late fusion settings, and their explainability, remain unexplored. This work proposes an open-source Python library, Infodeslib, to address this gap. The library implements several DS techniques, including four dynamic classifier selection (DCS) and seven dynamic ensemble selection (DES) techniques, all of which are integrated with late data fusion settings and novel explainability features. Infodeslib offers flexibility and customization options, making it a versatile tool for complex applications that require the fusion of multimodal data together with explainability.

Multimodal data, which integrates information from diverse sources or sensor modalities, is a common and essential setting for real-world problems, and it enhances the robustness and depth of data analysis. Such data can be fused in two main ways: early fusion, where the modalities are combined at the feature level before model training, and late fusion, where each modality is processed separately and the results are combined at the decision level.
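The difference between the two fusion strategies can be sketched with plain scikit-learn, independently of Infodeslib. This is an illustrative sketch only: the three column groups stand in for separate modalities (an assumption made for the example), and averaging class probabilities is just one simple way to combine decisions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Assumption for illustration: treat three column groups as three "modalities".
groups = [slice(0, 10), slice(10, 20), slice(20, 30)]


def make_model():
    # Scaling keeps logistic regression well-conditioned on these features.
    return make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))


# Early fusion: all features are concatenated and a single model is trained.
early_model = make_model().fit(X_train, y_train)
early_pred = early_model.predict(X_test)

# Late fusion: one model per modality, combined at the decision level
# (here by averaging the predicted class probabilities).
late_models = [make_model().fit(X_train[:, g], y_train) for g in groups]
late_proba = np.mean(
    [m.predict_proba(X_test[:, g]) for m, g in zip(late_models, groups)],
    axis=0,
)
# The labels in this dataset are 0 and 1, so the argmax index is the label.
late_pred = np.argmax(late_proba, axis=1)

print("early fusion acc:", round((early_pred == y_test).mean(), 3))
print("late fusion acc: ", round((late_pred == y_test).mean(), 3))
```

Dynamic selection replaces the fixed averaging step in the late-fusion branch with a per-query choice of which models to trust.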

For more details, please check our paper.

Documentation

https://infodeslib.readthedocs.io/en/latest/

Installation

pip install infodeslib

Requirement

Install SHAP (version 0.41.0).

Example

Loading necessary libraries and dataset:

import warnings
warnings.filterwarnings('ignore') 

import pandas as pd 
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

from sklearn.svm import SVC 
from sklearn.ensemble import RandomForestClassifier 
from sklearn.neighbors import KNeighborsClassifier 

from sklearn.metrics import accuracy_score 

# Load a simple open dataset
data = load_breast_cancer()
df = pd.DataFrame(data.data, columns = data.feature_names)
df['target'] = data.target

Split the dataset into a training set for the model pool, a validation set used for dynamic selection (DSEL), and a test set:

X = df.drop(['target'], axis=1) 
y = df.target 

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
X_pool, X_dsel, y_pool, y_dsel   = train_test_split(X_train, y_train, test_size=0.30, random_state=42)
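Because the second split is applied to the 70% left after the first one, the data ends up divided roughly 49% pool, 21% DSEL, and 30% test. A quick check of the resulting sizes, using the same calls as above on the raw arrays:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Stage 1: hold out 30% of all samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)
# Stage 2: hold out 30% of the remainder as DSEL; the rest trains the pool.
X_pool, X_dsel, y_pool, y_dsel = train_test_split(
    X_train, y_train, test_size=0.30, random_state=42
)

total = len(X)
for name, part in [("pool", X_pool), ("DSEL", X_dsel), ("test", X_test)]:
    share = len(part) / total
    print("{:>4}: {:3d} samples ({:.0%} of the data)".format(name, len(part), share))
```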

Create the models and the feature sets (one feature set per model):

model1 = SVC(probability=True, random_state=42)
model2 = RandomForestClassifier(random_state=42) 
model3 = KNeighborsClassifier() 

feature_set1 = data.feature_names[:10] 
feature_set2 = data.feature_names[10:20]
feature_set3 = data.feature_names[20:]

model_pool = [model1, 
              model2, 
              model3]

feature_sets = [feature_set1, 
                feature_set2, 
                feature_set3]

Train the models (pool):

for i, (model, features) in enumerate(zip(model_pool, feature_sets)):
    # Each model is trained only on its own feature set
    model.fit(X_pool[features], y_pool)

    acc = round(model.score(X_dsel[features], y_dsel), 3)
    print("[DSEL] Model {} acc: {}".format(i, acc))

    acc = round(model.score(X_test[features], y_test), 3)
    print("[Test] Model {} acc: {}".format(i, acc))

Usage of our library:

import shap 
from infodeslib.des.knorau import KNORAU 

# initializing 
knorau = KNORAU(model_pool, feature_sets, k=7)
knorau.fit(X_dsel, y_dsel)

Testing

preds =  knorau.predict(X_test)  

acc = round(accuracy_score(y_test, preds), 3) 
print("[Test] acc: {}".format(acc))
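To give an intuition for what a KNORA-U style selector does, here is a minimal from-scratch sketch of the general KNORA-Union idea: for each query, find its k nearest neighbours in DSEL, give every model one vote per neighbour it classifies correctly, and take the weighted majority. It is not Infodeslib's implementation, which may differ in how the neighbourhood is computed and how ties are handled. The helper name `knora_u_predict`, the choice of base models, and the neighbour search over the full feature space are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def knora_u_predict(models, feature_idx, X_dsel, y_dsel, X_query, k=7):
    """Hypothetical helper: KNORA-U style vote, one feature subset per model."""
    X_dsel, y_dsel, X_query = map(np.asarray, (X_dsel, y_dsel, X_query))
    classes = np.unique(y_dsel)
    pairs = list(zip(models, feature_idx))

    # correct[m, i] is True if model m classifies DSEL sample i correctly.
    correct = np.array([m.predict(X_dsel[:, idx]) == y_dsel for m, idx in pairs])

    # Region of competence: the k nearest DSEL samples of each query
    # (assumption: distances are computed over all features).
    nn = NearestNeighbors(n_neighbors=k).fit(X_dsel)
    _, neigh = nn.kneighbors(X_query)  # shape (n_query, k)

    # weights[q, m] = number of neighbours of query q that model m gets right.
    weights = correct[:, neigh].sum(axis=2).T

    # preds[q, m] = label predicted by model m for query q.
    preds = np.array([m.predict(X_query[:, idx]) for m, idx in pairs]).T

    votes = np.stack(
        [(weights * (preds == c)).sum(axis=1) for c in classes], axis=1
    ).astype(float)

    # If no model is right on any neighbour, fall back to a plain majority vote.
    empty = weights.sum(axis=1) == 0
    if empty.any():
        votes[empty] = np.stack(
            [(preds[empty] == c).sum(axis=1) for c in classes], axis=1
        )
    return classes[np.argmax(votes, axis=1)]


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)
X_pool, X_dsel, y_pool, y_dsel = train_test_split(
    X_train, y_train, test_size=0.30, random_state=42
)

# One column subset per model, mirroring the three feature sets above.
feature_idx = [np.arange(0, 10), np.arange(10, 20), np.arange(20, 30)]
models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier(), GaussianNB()]
for model, idx in zip(models, feature_idx):
    model.fit(X_pool[:, idx], y_pool)

preds = knora_u_predict(models, feature_idx, X_dsel, y_dsel, X_test, k=7)
print("[Sketch] acc: {}".format(round((preds == y_test).mean(), 3)))
```

The point of the sketch is that the weights are recomputed for every query, so a model that is unreliable in one region of the data can still dominate the vote in another.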

Explainability

colors = {0: 'red', 1: 'green'}  

knorau = KNORAU(model_pool, feature_sets, k=7, colors=colors)
knorau.fit(X_dsel, y_dsel)

index = 18
query = X_test.iloc[[index]]

# Set plot=True to plot the explanation for this query
knorau.predict(query, plot=True)

Citation

We would appreciate it if you could cite our work when using our code.

@inproceedings{juraev2024infodeslib,
  title={Infodeslib: Python Library for Dynamic Ensemble Learning using Late Fusion of Multimodal Data},
  author={Juraev, Firuz and El-Sappagh, Shaker and Abuhmed, Tamer},
  booktitle={ACM SIGKDD, Fourth Workshop on Knowledge-infused Learning (KiL'24)},
  year={2024}
}
