
Hi, I'm Darshan Fofadiya.

I am a Senior Applied Scientist at Amazon, but at my core I have always been obsessed with systems.

Not just software systems, but the elegant, ruthless efficiency of natural ones. How does a neuron decide to fire? How does gravity organize chaos into galaxies? How does a child learn to speak not by memorizing dictionaries, but by grasping concepts?

For the last 11 years, I have worked as a builder. At Amazon, I operated at the bleeding edge of technology, handling millions of requests and petabytes of data. I learned the hard, practical lessons of engineering: that complexity is the enemy, and that "working" is not the same as "optimal."

But as I scaled these systems, I couldn't shake the feeling that we were brute-forcing intelligence. We were building bigger engines, not smarter drivers. We were training models to predict the next word, but not to think the next thought.

This dissatisfaction drove me back to first principles—to the physics, biology, and mathematics that underpin true intelligence. My work today is an attempt to bridge that gap: to bring the efficiency of biology and the structure of physics into the chaotic world of Artificial Intelligence.


1. Finding Order in Chaos (The Physics of Data)

The Project: Adaptive Density-Aware Clustering

My first major confrontation with "brute force" was at Amazon. We needed to organize 450,000 unique semantic tags across 20 languages. The standard approach was to use rigid, pre-defined rules. The result was chaos—"mega-clusters" that swallowed distinct concepts and stripped them of meaning.

I realized this wasn't a data problem; it was a physics problem.

I stopped treating data points as static labels and started treating them as stars in a galaxy.

I developed an algorithm based on gravitational stability: instead of forcing points into clusters, I let them "fall" into dense regions naturally. I built an "exponential growth trigger"—a mathematical tripwire that detected when a cluster was growing too fast (crossing a density bridge) and forced it to collapse back into stability.

It worked not because I added more compute, but because I respected the intrinsic geometry of the data.
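To make the mechanism concrete, here is a minimal sketch in Python, not the production algorithm. It grows the cluster containing a seed point by sweeping a connectivity radius, and a growth trigger rolls back to the last stable membership the moment a single step blows the cluster up, which is what happens when it crosses a density bridge. The function name, the thresholds (`growth_ratio_limit`, `min_size`), and the toy data are all illustrative assumptions.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.neighbors import radius_neighbors_graph

def stable_cluster(X, seed_idx, eps_schedule, growth_ratio_limit=1.6, min_size=20):
    """Grow the cluster containing `seed_idx` by sweeping the radius eps.

    At each step the cluster is the connected component of the eps-graph
    that contains the seed. If a single step multiplies membership by more
    than `growth_ratio_limit`, we assume the cluster has crossed a density
    bridge into a neighbor and collapse back to the last stable set.
    """
    prev = np.array([seed_idx])
    for eps in eps_schedule:
        graph = radius_neighbors_graph(X, radius=eps)
        _, labels = connected_components(graph, directed=False)
        current = np.flatnonzero(labels == labels[seed_idx])
        if len(prev) >= min_size and len(current) > growth_ratio_limit * len(prev):
            return prev  # growth trigger fired: roll back to stability
        prev = current
    return prev

# Toy data: two dense blobs joined by a sparse bridge of 8 points.
rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.3, size=(400, 2))
blob_b = rng.normal(0.0, 0.3, size=(400, 2)) + np.array([4.0, 0.0])
bridge = np.column_stack([np.linspace(0.5, 3.5, 8), np.zeros(8)])
X = np.vstack([blob_a, blob_b, bridge])

members = stable_cluster(X, seed_idx=0, eps_schedule=np.arange(0.10, 0.61, 0.05))
print(len(members))  # roughly the size of blob_a, not the merged mega-cluster
```

Without the trigger, the sweep would eventually absorb the bridge and both blobs into one mega-cluster; with it, the sweep stops at the last membership that grew smoothly.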

2. The Efficiency of Silence (The Biology of Learning)

The Project: GatedAdam Optimization

As I moved deeper into Deep Learning, I was struck by how wasteful it is. The human brain is a marvel of sparsity; it runs on roughly 20 watts because it knows exactly what not to learn. It forgets the noise to keep the signal.

Standard AI optimizers do the opposite. They update every weight, every time, effectively trying to memorize the entire universe at once.

I wanted to build a "braking system" for AI. I developed GatedAdam, an optimizer inspired by biological synapses.

I introduced a "Benefit Score"—a gatekeeper that asks, "Is this update actually worth the energy?" If the answer is no, the weight stays dormant. By pivoting to Tanh activations (which naturally saturate) and enforcing a "saturation bonus," I forced the network to make hard choices.

The result was a model with 93% structural sparsity. It was a network that had learned the value of silence.
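The scoring rule below is a stand-in, not the published Benefit Score, and the sketch omits the saturation bonus. It only shows the shape of the mechanism: compute a standard Adam update, then let a per-weight gate decide whether applying it is worth the energy. `GatedAdamSketch` and `gate_threshold` are hypothetical names.

```python
import torch

class GatedAdamSketch(torch.optim.Optimizer):
    """Adam with a per-weight benefit gate (illustrative sketch)."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                 gate_threshold=1e-4):
        defaults = dict(lr=lr, betas=betas, eps=eps,
                        gate_threshold=gate_threshold)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            beta1, beta2 = group["betas"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if not state:
                    state["step"] = 0
                    state["m"] = torch.zeros_like(p)
                    state["v"] = torch.zeros_like(p)
                state["step"] += 1
                t = state["step"]
                m, v = state["m"], state["v"]
                # Standard Adam moment updates with bias correction.
                m.mul_(beta1).add_(p.grad, alpha=1 - beta1)
                v.mul_(beta2).addcmul_(p.grad, p.grad, value=1 - beta2)
                m_hat = m / (1 - beta1 ** t)
                v_hat = v / (1 - beta2 ** t)
                update = group["lr"] * m_hat / (v_hat.sqrt() + group["eps"])
                # Benefit gate: weights whose proposed update is too small
                # to matter stay dormant this step.
                gate = update.abs() > group["gate_threshold"]
                p.sub_(update * gate)

# Toy usage with a Tanh network, matching the saturation-friendly
# activations described above.
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Tanh(),
                            torch.nn.Linear(8, 1))
opt = GatedAdamSketch(model.parameters(), gate_threshold=1e-4)
loss = model(torch.randn(4, 8)).pow(2).mean()
loss.backward()
opt.step()
```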

3. Ideas Before Words (The Cognitive Architecture)

The Project: Latent Semantic Planning

My most recent obsession is the "Stochastic Parrot" problem. Current LLMs are brilliant improvisers, but they are terrible planners. They write sentences word-by-word, often forgetting where they started by the time they reach the end. This is "Associative Drift."

I realized that humans don't think in words; we think in concepts. The sentence is just the delivery mechanism.

To fix this, I architected the Idea-Gated Transformer.

I ripped open the standard architecture and inserted a "System 2" module—a planner that looks 20 steps into the future. It forces the model to commit to a "Bag of Concepts" (an Idea) before it is allowed to generate a single token.

It was a risky architectural bet, but it paid off. By forcing the model to "plan then speak," we achieved a level of semantic coherence that statistical probability alone could never provide.
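The repository below describes this as "differentiable vocabulary pruning"; here is a minimal sketch of what such a gate could look like at the output head. Everything in it (the module names, the mean-pooled "idea" vector, the shapes) is an illustrative assumption, and it shows only the gating step, not the 20-step lookahead of the full planner.

```python
import torch
import torch.nn as nn

class IdeaGatedHead(nn.Module):
    """Sketch of 'plan then speak': commit to a bag of concepts, then gate
    the next-token logits so out-of-idea tokens are suppressed."""

    def __init__(self, d_model, vocab_size):
        super().__init__()
        self.planner = nn.Sequential(            # the 'System 2' summary
            nn.Linear(d_model, d_model), nn.Tanh(),
            nn.Linear(d_model, vocab_size),      # concept logits per token type
        )
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, hidden_states):
        # hidden_states: (batch, seq, d_model) from any decoder backbone.
        idea = hidden_states.mean(dim=1)                  # commit to one idea
        concept_gate = torch.sigmoid(self.planner(idea))  # soft bag of concepts
        token_logits = self.lm_head(hidden_states[:, -1])  # next-token logits
        # Soft vocabulary pruning: tokens outside the planned concept set
        # receive large negative logits and near-zero probability.
        return token_logits + torch.log(concept_gate + 1e-9)

# Toy usage with random states standing in for a real transformer backbone.
head = IdeaGatedHead(d_model=64, vocab_size=1000)
hidden = torch.randn(2, 16, 64)
print(head(hidden).shape)  # torch.Size([2, 1000])
```

Because the gate is a sigmoid rather than a hard mask, the planner remains trainable end-to-end with the language modeling loss.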


The Future

I am no longer just building systems that scale; I am building systems that reason.

My journey has taken me from the server rooms of Amazon to the abstract landscapes of high-dimensional optimization. I am driven by a curiosity that spans disciplines—from the entropy of thermodynamics to the plasticity of the brain.

I believe the next great leap in AI won't come from making models larger. It will come from making them deeper. It will come from understanding the physics of intelligence.


📫 Connect

  • Email: fofadiyadarshan@gmail.com

Popular repositories

  1. sparselab (Python)
     Actually-sparse dynamic training for PyTorch. CPU-native, Apple Silicon first. Pluggable routers, drop-in SparseLinear.

  2. idea-gated-transformers (Python)
     Official PyTorch implementation of Idea-Gated Transformers, a novel architecture that enforces semantic coherence via differentiable vocabulary pruning. Includes training code and reproduction scripts.

  3. deepseek-mhc-explained
     Simple explanation for Deepseek mHC

  4. DarshanFofadiya
     Senior Applied Scientist (Amazon) pivoting to fundamental research. Author of "Idea-Gated Transformers" (ICML 2026 Submission).

  5. DarshanFofadiya.github.io
     Senior Applied Scientist (Amazon) pivoting to fundamental research. Author of "Idea-Gated Transformers" (ICML 2026 Submission).

  6. blog-comments
     Comments for darshanfofadiya.com