Research Preview

A better way to do attention

We're building a novel attention mechanism for transformers. It outperforms the best established baseline at 125M and 350M parameters, and the advantage grows with scale.

Leads

At 350M scale

Zero

Extra compute cost

3

Scales validated

Experimental Results

Validated at three scales

Our method is compared against established baselines under identical training conditions; only the attention mechanism differs.

Scale	Our method vs best baseline
30M parameters	Baseline leads
125M parameters	Ours leads
350M parameters	Ours leads (gap widens)

Results are compared by perplexity (lower is better). Full benchmark details are available under NDA.
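
For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is commonly computed, assuming the standard definition (the exponential of the mean per-token negative log-likelihood, in nats). The function name and the sample values are hypothetical and for illustration only; they are not our evaluation code or our results.

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log).

    Assumes the standard definition: exp(mean NLL). Hypothetical helper,
    not the evaluation code behind the table above.
    """
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative inputs only: a model that assigns probability 1/4 to every
# token has a perplexity of about 4 (it is "as confused as" a 4-way guess).
uniform_over_4 = [math.log(4)] * 10
sharper_model = [math.log(2)] * 10   # probability 1/2 per token

print(perplexity(uniform_over_4))
print(perplexity(sharper_model))
```

A model that assigns higher probability to the correct tokens has a lower mean negative log-likelihood and therefore a lower perplexity, which is why lower is better.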

Scaling Trend

The gap widens with scale

At small scale (30M parameters), the best baseline wins. At medium scale (125M), our method overtakes it. At larger scale (350M), the lead widens further. The crossover and the widening gap are the key result.

Small scale

Baseline wins

Established methods have an edge at small sizes

Medium scale

Crossover

Our method overtakes the best baseline

Larger scale

Lead widens

The advantage grows with scale; 1B+ parameters is next

About

Social Spider Labs

We're building novel, compute-efficient attention mechanisms for large language models. Our approach is bio-inspired and, at 125M parameters and above, produces better models at no additional compute cost.

Currently scaling to billion-parameter models.

Interested?

We're looking for compute partners and early collaborators. Detailed results available under NDA.

Get in Touch