Open-math estimators for the AI Stack. Inputs are transparent, formulas are visible, presets are labelled. Useful for comparisons, dangerous as fake precision.
Describe the agent you are building; get a deterministic-first eval plan with real teeth.
Type what your agent does and get back an evaluation plan built on a method, not a vibe: cases engineered so the lazy solution fails and only the generalizing one passes, weighted across the three pillars — functional correctness, agentic efficiency, trajectory health. Each case names its deterministic checker, how a model could game it, and the pass bar. Grounded in the failure library behind claude-code-evals; the framework is the deliverable.
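As a rough illustration of what one such case could look like in code, here is a minimal sketch. The `EvalCase` fields, the pillar identifiers, and the pass-rate interpretation of the "pass bar" are assumptions made for this example, not the tool's actual schema.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Pillar names taken from the description above; the identifiers are hypothetical.
PILLARS = ("functional_correctness", "agentic_efficiency", "trajectory_health")


@dataclass
class EvalCase:
    name: str
    pillar: str                     # one of PILLARS
    checker: Callable[[str], bool]  # deterministic: same output, same verdict
    gaming_risk: str                # how a model could pass without generalizing
    pass_bar: float                 # assumed meaning: minimum pass rate over repeated runs


def pass_rate(case: EvalCase, outputs: Sequence[str]) -> float:
    """Fraction of run outputs that the deterministic checker accepts."""
    if not outputs:
        raise ValueError("need at least one run output")
    return sum(1 for out in outputs if case.checker(out)) / len(outputs)


def case_passes(case: EvalCase, outputs: Sequence[str]) -> bool:
    """A case passes when its pass rate meets or exceeds its pass bar."""
    return pass_rate(case, outputs) >= case.pass_bar


# Hypothetical case: the agent must return the exact sum, not a plausible guess.
sum_case = EvalCase(
    name="adds_two_numbers",
    pillar="functional_correctness",
    checker=lambda out: out.strip() == "4",
    gaming_risk="hard-coding the answer seen in a public example",
    pass_bar=0.75,
)
```

Keeping the checker a pure function of the output is what makes the verdict reproducible; the `gaming_risk` field is free text here only because the description says each case names it.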
When the frontier labs expect human-level AI — and what each step unlocks.
A living read on the AGI timeline: each lab’s forecast, the capability ladder behind it, and what to do at each rung. Updated every time a new model moves the date.
Funnel, unit economics, and the distribution lever, for any consumer AI subscription.
A working model of consumer AI subscription growth. Pick a lab, size the funnel, watch LTV, CAC, ARPU, and payback move, and test the one lever that decides the economics: where the next subscriber comes from, whether that is paid media, the apps you already own, the device, or the carrier bill. Every input is editable; the formulas are the deliverable.
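The kind of arithmetic involved can be sketched with the textbook subscription formulas. This is a minimal sketch under stated assumptions: constant monthly churn, margin-based LTV, and made-up per-channel CAC values. The tool's own formulas and defaults may differ.

```python
def unit_economics(arpu_monthly: float, gross_margin: float,
                   monthly_churn: float, cac: float) -> dict:
    """Textbook subscription unit economics (assumed model, not the tool's).

    arpu_monthly  : average revenue per user per month, in dollars
    gross_margin  : fraction of revenue kept after serving costs, in (0, 1]
    monthly_churn : fraction of subscribers lost each month, in (0, 1]
    cac           : cost to acquire one subscriber, in dollars
    """
    if arpu_monthly <= 0 or cac <= 0:
        raise ValueError("arpu_monthly and cac must be positive")
    if not 0 < gross_margin <= 1:
        raise ValueError("gross_margin must be in (0, 1]")
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")

    margin_per_month = arpu_monthly * gross_margin
    ltv = margin_per_month / monthly_churn  # expected lifetime = 1 / churn months
    return {
        "ltv": ltv,
        "ltv_to_cac": ltv / cac,
        "payback_months": cac / margin_per_month,
    }


# Hypothetical CAC per acquisition channel; the numbers are illustrative only.
HYPOTHETICAL_CAC = {
    "paid_media": 120.0,
    "owned_apps": 30.0,
    "device": 45.0,
    "carrier_bill": 60.0,
}

for channel, cac in HYPOTHETICAL_CAC.items():
    result = unit_economics(arpu_monthly=20.0, gross_margin=0.5,
                            monthly_churn=0.05, cac=cac)
    print(channel, result)
```

Holding ARPU, margin, and churn fixed while varying only CAC isolates the distribution lever: LTV stays the same across channels, so payback and LTV-to-CAC move with acquisition cost alone.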
Orbital vs terrestrial AI compute — the launch-cost breakeven in $/GPU-hour.
Run the same GPU rack in orbit and on the ground, and find the launch cost at which space wins on $/GPU-hour. Then meet the real constraint: not lift cost, but the mass of the radiator you must launch — and never service — to dump the heat in a vacuum. Sourced defaults, every input editable, the breakeven math is the deliverable.
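A simplified version of that breakeven can be sketched as follows. The cost model is an assumption made for illustration: identical hardware cost and lifetime in both settings, no servicing, and orbital power treated as paid for through launched mass. The radiator sizing uses the Stefan-Boltzmann law for a single-sided panel and ignores absorbed sunlight and Earth infrared. All parameter names and values are hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)


def radiator_area_m2(heat_w: float, emissivity: float, temp_k: float) -> float:
    """Area needed to radiate heat_w from one side at temperature temp_k."""
    return heat_w / (emissivity * SIGMA * temp_k ** 4)


def ground_cost_per_gpu_hour(hw_usd: float, life_hours: float, power_kw: float,
                             usd_per_kwh: float, pue: float) -> float:
    """Amortized hardware plus grid electricity, including facility overhead (PUE)."""
    return hw_usd / life_hours + power_kw * pue * usd_per_kwh


def orbit_cost_per_gpu_hour(hw_usd: float, life_hours: float, mass_kg: float,
                            launch_usd_per_kg: float) -> float:
    """Amortized hardware plus launch; mass_kg covers GPU, solar, and radiator."""
    return (hw_usd + mass_kg * launch_usd_per_kg) / life_hours


def breakeven_launch_usd_per_kg(life_hours: float, power_kw: float,
                                usd_per_kwh: float, pue: float,
                                mass_kg: float) -> float:
    """Launch price at which the two $/GPU-hour figures above are equal."""
    return power_kw * pue * usd_per_kwh * life_hours / mass_kg


# Illustrative inputs only: a 1 kW GPU and a radiator of assumed areal density.
area = radiator_area_m2(heat_w=1000.0, emissivity=0.9, temp_k=300.0)
radiator_kg = area * 5.0          # assumed 5 kg per m^2
total_kg = 20.0 + radiator_kg     # assumed 20 kg for GPU, structure, and solar
print(breakeven_launch_usd_per_kg(40_000.0, 1.0, 0.08, 1.2, total_kg))
```

Because the breakeven price is inversely proportional to launched mass, every extra kilogram of radiator lowers the launch cost that orbit must hit, which is why radiator mass rather than lift cost ends up as the binding constraint.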
Datacenters, the power plants feeding them, and the labs — on one US map.
An interactive map of the physical AI buildout: the headline training datacenters (Stargate, Hyperion, Colossus) and major hyperscale hubs, the power plants tied to them (Palo Verde, Vogtle, the nuclear restart deals), and frontier-lab HQs. Marker size scales with capacity; tap any site for detail. Custom-drawn, no map API.
How building with AI goes wrong — incidents and the anti-patterns behind them.
A growing checklist of AI build failures: named, sourced public incidents plus the reusable anti-patterns they teach (unreviewed output, prompt injection, over-automation, capability overtrust). Read it yourself or point your agent at it before you ship — each card has a run-through checklist.
Balance grid energy, Blackwell compute, parameter-scaling models, SaaS, and humanoids.
Can you orchestrate the AI Stack to align a safe superintelligence? Acquire modular nuclear power, secure Blackwell GPUs, manage parameters, and deploy physical systems while answering realistic Fermi estimation questions and navigating policy bottlenecks.
Tokens, watts, dollars — the cost of running a model at scale.
Plug in workload shape (model size, precision, monthly tokens) and get back the MW, capex, annual GPU rental, water draw, and cost-per-million-tokens. Lab presets included; every input is editable, every formula is visible.
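A back-of-the-envelope version of that calculation might look like the sketch below. It is a minimal sketch, not the tool's formulas: throughput per GPU, GPU wattage, PUE, and rental price are hypothetical inputs, a month is taken as 30 days, and capex and water draw are left out.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}

HOURS_PER_MONTH = 30 * 24          # assumed 30-day month
SECONDS_PER_MONTH = HOURS_PER_MONTH * 3600
HOURS_PER_YEAR = 8760


def estimate(monthly_tokens: int, tokens_per_sec_per_gpu: float,
             gpu_watts: float, pue: float, usd_per_gpu_hour: float,
             params_billion: float, precision: str) -> dict:
    """Rough serving estimate; every input is an assumption supplied by the caller."""
    # Weight memory: 1e9 params times bytes per param, divided by 1e9 bytes per GB.
    weights_gb = params_billion * BYTES_PER_PARAM[precision]

    # GPUs needed to sustain the monthly token volume at full utilization.
    tokens_per_gpu_month = tokens_per_sec_per_gpu * SECONDS_PER_MONTH
    gpus = math.ceil(monthly_tokens / tokens_per_gpu_month)

    megawatts = gpus * gpu_watts * pue / 1e6
    annual_rental_usd = gpus * usd_per_gpu_hour * HOURS_PER_YEAR
    monthly_rental_usd = gpus * usd_per_gpu_hour * HOURS_PER_MONTH
    usd_per_million_tokens = monthly_rental_usd / (monthly_tokens / 1e6)

    return {
        "weights_gb": weights_gb,
        "gpus": gpus,
        "megawatts": megawatts,
        "annual_rental_usd": annual_rental_usd,
        "usd_per_million_tokens": usd_per_million_tokens,
    }


# Illustrative workload only; none of these numbers come from a lab preset.
print(estimate(monthly_tokens=2_592_000_000_000, tokens_per_sec_per_gpu=1000,
               gpu_watts=700.0, pue=1.2, usd_per_gpu_hour=2.0,
               params_billion=70.0, precision="fp16"))
```

The sketch assumes full utilization, so its cost per million tokens is a floor; dividing `tokens_per_sec_per_gpu` by a realistic utilization factor raises the GPU count and every figure derived from it.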