Magma
Foundation Model for Multimodal AI Agents
About
Magma is presented as the first foundation model for multimodal AI agents, developed by Microsoft Research and collaborators. It aims to bridge verbal, spatial, and temporal intelligence so that AI agents can perceive the multimodal world and take goal-driven actions in both digital and physical environments. The model is pretrained on large, heterogeneous vision-language datasets that include images, videos, and robotics data. Pretraining uses two novel techniques: Set-of-Mark (SoM) for action grounding and Trace-of-Mark (ToM) for action planning. Magma demonstrates state-of-the-art performance in UI navigation and robotic manipulation, and competitive results on vision-language tasks, spatial reasoning, and video QA.
Color Palette
Background White: #FFFFFF
Text Black: #000000
Magma Brand Blue/Purple: #5C3E9E
Link Blue: #007bff
Typography
Sans-serif: Headings and Body Text