benjamin-elder/README.md

🧩 Selected Contributions

Some of the work below was developed in private enterprise repositories and later upstreamed to public repositories by my team. Public commit history may not reflect individual authorship.

🔹 Agent Lifecycle Toolkit (ALTK)

Repo: https://agenttoolkit.github.io/agent-lifecycle-toolkit
Tech: Python, LangChain

What I worked on:

  • Designed and implemented the ToolGuard component for enforcing LLM/agent adherence to user policies
  • Work described in an arXiv preprint, currently under conference review: https://arxiv.org/abs/2510.14842

🔹 M3 Train and Eval

Repo: https://github.com/ibm/m3-train-eval
Tech: Python, DeepSpeed

What I worked on:

  • Designed and ran multi-GPU tuning experiments (LoRA/ALoRA) for models with transformer, MoE, and state-space architectures
  • Added DPO-style training capability
  • Created a tool-calling agent and evaluation harness to validate training efficacy

🔹 Uncertainty Quantification 360 (UQ360)

Repo: https://github.com/IBM/UQ360
Tech: Python, Uncertainty Quantification

What I worked on:

  • Led development of a black-box meta-model uncertainty quantification algorithm
  • Lead author of the AAAI-accepted publication describing this approach: https://arxiv.org/abs/2012.08625

🔹 NeuNetS

Repo: https://github.com/pmservice/NeuNetS
Tech: Python, Jupyter

What I worked on:

  • Contributed as a member of the development team that delivered this component as a production feature
