Almost two months after ChatGPT’s image generation feature went viral, Microsoft has finally added a similar capability to Copilot. The move looks like an attempt to catch up with rivals in the AI race.
Earlier this spring, OpenAI reshaped how users engage with visual content by embedding image creation directly within GPT-4o’s native capabilities. Rather than leaning on external engines like DALL·E, ChatGPT began producing images natively—responding to prompts with greater precision, readable text, and clear stylistic alignment. The upgrade not only improved output fidelity but also introduced dynamic editing, letting users manipulate visuals with follow-up commands or modify uploaded photos in real time.
That feature struck a chord. In one week, users generated over 700 million images, a sign of both the demand for responsive visual AI and the virality of GPT-4o’s rollout.
Now, Microsoft is integrating similar functionality into its Copilot platform. The upgrade lets Copilot users build visuals from scratch, refine them with detailed instructions, apply stylistic changes to existing images, and even render legible text—all within the same interface. Microsoft is positioning the feature as a major leap forward for creative professionals and casual users alike.
However, the move has drawn commentary about its timing. Microsoft’s update arrives weeks after ChatGPT’s image features dominated the spotlight, and Google’s Gemini has also been iterating quickly. Although Microsoft unveiled the changes during its recent 50th anniversary event, much of what was presented mirrored capabilities already offered by competitors.
Mustafa Suleyman, who now leads Microsoft’s AI division, has outlined a vision of Copilot as an assistant that understands individual users on a personal level. Delivering on that ambition, though, may require faster execution and more original features, not just parity.
With user expectations rising and innovation cycles accelerating, Microsoft’s challenge lies in keeping pace without simply retracing the paths others have already carved.