Computer Use
Highlights
1. Anthropic introduces “Computer Use,” a public beta that allows Claude to interact with computer interfaces.
2. Microsoft’s new autonomous agents in Copilot Studio and Dynamics 365 empower organizations to streamline processes across sales, service, finance, and supply chains.
3. Perplexity AI’s Pro upgrade introduces a Reasoning Mode in Pro Search, enabling multi-step analysis for complex queries with integrated code execution, problem-solving, and data analysis features.
4. Inflection for Enterprise’s new Agentic Workflows lets businesses entrust actions to AI agents, which integrate with enterprise systems and execute tasks automatically and accurately.
5. Midjourney introduced a web tool that lets anyone upload images and make “powerful” edits.
6. Mochi 1, a new open-source state-of-the-art video generation model, offers high-fidelity motion and strong prompt adherence at 480p.
7. GLP-1 drugs could revolutionize treatment for a wide range of health conditions, offering transformative benefits for public health, longevity, and various industries.
8. An article from SAP points out that generative AI enables companies to develop new product ideas and designs by leveraging AI’s ability to combine and evaluate data swiftly.
Innovation Insights
1. Your problem-solving idea flow, AI-augmented (Gianni Giacomelli)
The article explores using AI to augment human creativity and problem-solving by blending human insight with AI’s processing capabilities. Rather than simply querying machines, it emphasizes a multi-step approach where humans guide AI to improve ideation quality. The process involves defining the problem thoroughly, considering multiple perspectives, and using AI to provide insights through analogies, constraints, and recombination of ideas. AI’s role is iterative, supporting creativity by prompting, critiquing, and generating solutions, while humans maintain control to steer away from errors or overly generic outputs.
2. How to lean on GenAI for new product ideas (SAP)
Generative AI has quickly become a powerful tool for innovation across industries, enabling companies to develop new product ideas and designs by leveraging AI’s ability to combine and evaluate data swiftly. Successful GenAI implementation, however, requires more than just prompting AI; it demands thoughtful supervision, prompt engineering, and a focus on real-world customer problems. For ideation, GenAI accelerates brainstorming and supports divergent thinking by rapidly generating diverse viewpoints. Proprietary platforms are recommended to protect sensitive data and ensure innovation remains unique to the organization.
AI Innovations
1. Alphabet
New generative AI tools like MusicFX DJ, Music AI Sandbox, and YouTube’s Dream Track enable creators to generate, refine, and share high-quality music in real time, using text prompts and intuitive controls to experiment with unique sounds and genres (Google).
SynthID-Text is a production-ready watermarking system for large language models that preserves text quality while enabling efficient and accurate detection of synthetic text, without affecting model training or performance, and has been validated in large-scale experiments (Nature).
2. Apple
The iOS 18.2 developer beta introduces Apple Intelligence enhancements, including Genmoji for custom emoji creation, Visual Intelligence for object recognition, and ChatGPT integration with Siri for tasks like itinerary and meal planning. Users can also enjoy the Image Playground and Image Wand tools, which offer creative options for personalized illustrations and improved sketches (Engadget).
3. Microsoft
Microsoft’s new autonomous agents in Copilot Studio and Dynamics 365 empower organizations to streamline processes across sales, service, finance, and supply chains, allowing teams to build, manage, and scale custom AI-driven workflows to boost productivity and reduce costs (Microsoft).
4. Anthropic
Claude.ai’s new analysis tool lets users run JavaScript code within the platform, enabling complex data analysis and visualization (Anthropic).
Anthropic’s new Claude 3.5 models, Claude 3.5 Sonnet and Claude 3.5 Haiku, offer advanced capabilities in coding and general intelligence. Haiku delivers high performance at increased speed and affordability, while Sonnet introduces “computer use,” a public beta that allows Claude to interact with computer interfaces (Anthropic).
5. OpenAI
The new sCM approach for continuous-time consistency models significantly increases sampling speed, achieving quality comparable to diffusion models with only two sampling steps. This enables real-time generation on consumer hardware and makes high-quality, efficient generative AI accessible across domains such as image, audio, and video (OpenAI).
6. Perplexity
Perplexity AI’s Pro upgrade introduces a Reasoning Mode in Pro Search, enabling multi-step analysis for complex queries with integrated code execution, problem-solving, and data analysis features, aiming to support advanced research across fields like academia, law, marketing, and software development (Testing Catalog).
7. Hugging Face
Hugging Face’s new service, HUGS, allows for efficient, zero-configuration deployment of open AI models on varied hardware in user-controlled infrastructure (Hugging Face).
8. Meta
Meta’s newly released quantized Llama models for mobile devices offer faster processing speeds and reduced memory needs (Meta).
Meta FAIR has released several open-source AI research tools, including SAM 2.1 for object segmentation, the multimodal Meta Spirit LM, Layer Skip for faster LLM inference, and datasets like Meta Open Materials 2024 for materials discovery. The release also covers tools for post-quantum cryptography, language model training, improved cross-lingual sentence encoding, and synthetic preference generation, all aimed at advancing open science and AI-driven innovation across fields (Meta).
9. Inflection
Inflection for Enterprise’s new Agentic Workflows lets businesses entrust actions to AI agents, which integrate with enterprise systems and execute tasks automatically and accurately, going beyond typical chatbot capabilities (Inflection).
10. Midjourney
Midjourney introduced a web tool that lets anyone upload images and make “powerful” edits (PetaPixel).
11. Ideogram
Ideogram Canvas offers a versatile digital workspace for image creation and editing, featuring Magic Fill for targeted inpainting and Extend for outpainting, allowing users to edit, expand, and blend images with precision (Ideogram).
12. Runway
Runway’s new tool, Act-One, enables creators to generate realistic and expressive character animations using only a simple video of an actor’s performance (Runway).
13. ElevenLabs
ElevenLabs’ new Voice Design tool allows users to create unique custom voices from text prompts. It expands creative possibilities for game developers, indie filmmakers, and storytellers by enabling personalized, real-time character voicing without relying on pre-recorded audio (Tom’s Guide).
14. Asana
Asana’s new AI Studio allows teams to build and deploy customized no-code AI agents to streamline workflows and automate tasks (ZDNet).
15. Other models
Mochi 1, a new open-source state-of-the-art video generation model, offers high-fidelity motion and strong prompt adherence at 480p (GenmoAI).
IBM’s Granite 3.0 models, including high-performing language models, efficiency-focused Mixture-of-Experts, advanced Granite Guardian safety models, and state-of-the-art time series models, offer robust solutions for enterprise AI (IBM).
Harvard’s CHIEF AI model, trained on extensive pathology imaging data, achieves 96% accuracy in detecting various cancers (Decrypt).
Haiper 2.0, a free AI video generation model, launches with hyper-realistic video quality and faster production speeds (Tom’s Guide).
Stable Diffusion 3.5 introduces a range of customizable, high-performance image generation models, available as Large, Large Turbo, and Medium variants. They run on consumer hardware and are free for most uses under a permissive license, balancing quality, prompt adherence, and versatile output styles (Stability AI).
Cohere For AI has released Aya Expanse, a new family of high-performance multilingual models available in 8B and 32B parameter versions, which excel across 23 languages by utilizing innovations in synthetic data generation, preference training, and model merging to achieve state-of-the-art multilingual performance (Cohere).
Other Innovations
1. Weight loss drug
GLP-1 receptor agonists, originally developed for diabetes, have rapidly expanded their use for weight loss and other conditions, with potential to treat cardiovascular disease, addiction, Alzheimer’s, and even delay aging. These drugs work by mimicking a hormone that regulates blood sugar and appetite, while also reducing inflammation and improving cell health across multiple organs. As competition increases and prices drop, GLP-1 drugs could revolutionize treatment for a wide range of health conditions, offering transformative benefits for public health, longevity, and various industries (The Economist).
2. Food production
A new wave of biotech startups is developing microbial proteins made from bacteria that consume carbon dioxide, aiming to create sustainable food alternatives that could revolutionize the food industry and reduce agricultural emissions (MIT Technology Review).
3. Data storage on DNA
A new user-friendly DNA data storage method leverages selective methylation to encode information, allowing non-experts to archive data efficiently and potentially overcoming the time and cost barriers of conventional DNA synthesis-based storage systems (Nature).
4. Robots
EngineAI Robotics launched SE01, a full-size humanoid robot that overcomes the challenge of achieving a natural human-like gait, positioning it as a milestone in humanoid robotics for industrial and educational applications (Globe Newswire).
5. Messaging
Daze, an AI-powered messaging app for Gen Z, has garnered significant prelaunch traction, driven by creative demo videos showcasing its free-form, visually rich chat experience (TechCrunch).