Research Preview

A better way to do attention

We're building a novel attention mechanism for transformers. In our experiments it outperforms the best baseline at 125M and 350M parameters, and the advantage grows with scale.

Leads at 350M scale
Zero extra compute cost
Three scales validated

Experimental Results

Validated at three scales

Compared against established baselines under identical training conditions. Only the attention mechanism differs.

Scale             Our method vs best baseline
30M parameters    Baseline leads
125M parameters   Ours leads
350M parameters   Ours leads (gap widens)

Lower perplexity = better. Full benchmark details available under NDA.
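
For readers less familiar with the metric, here is a minimal sketch of how perplexity is conventionally computed: the exponential of the mean per-token negative log-likelihood. The function name and the example input are illustrative assumptions, not our evaluation code or our results.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats).

    `token_nlls` is a hypothetical list of per-token losses; this helper is
    an illustration of the standard definition, not the benchmark harness.
    """
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Sanity check: a model that assigns probability 1/4 to every token has a
# per-token loss of ln(4), so its perplexity is approximately 4.
uniform_over_four = [math.log(4)] * 10
print(perplexity(uniform_over_four))
```

Because the exponential is monotonic, a lower average loss on the same evaluation text always means a lower perplexity, which is why the two are used interchangeably when ranking models.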

Scaling Trend

The gap accelerates

At small scale (30M parameters), the best baseline wins. At medium scale (125M), our method overtakes it. At the largest scale tested so far (350M), the lead widens. The crossover and the widening gap are the key result.

Small scale: Baseline wins
Established methods have an edge at small sizes

Medium scale: Crossover
Our method overtakes the best baseline

Larger scale: Lead widens
The advantage accelerates. Next: scaling to 1B+.

About

Social Spider Labs

We're building novel, compute-efficient attention mechanisms for large language models. Our approach is bio-inspired and, in our experiments at 125M and 350M parameters, produces better models at no additional compute cost.

Currently scaling to billion-parameter models.

Interested?

We're looking for compute partners and early collaborators. Detailed results available under NDA.