Text-to-music app
Highlights
1. AI and robotics can undermine the skill-building process of professionals. A new approach to integrating technology that preserves and enhances skill development is needed.
2. Claude AI’s new Artifacts feature allows users to visualize content like games and documents interactively in dedicated windows.
3. Meta 3D Gen is a new text-to-3D asset generation pipeline that creates high-quality 3D shapes and textures in under a minute.
4. Suno has launched an iOS app enabling users to generate original music from text prompts.
5. RunwayML’s new Gen-3 Alpha model offers hyper-realistic video production capabilities.
6. Scientists have created a robot powered by a lab-grown artificial brain made from human stem cells, using brain-on-chip technology.
Innovation Insights
1. How can we preserve human ability in the age of machines? (Ideas Made to Matter)
In “The Skill Code,” Matt Beane argues that workers need healthy challenge, complexity, and connection to build skills. Intelligent technologies like AI and robots threaten all three by disrupting the expert-novice relationship that skill development depends on. Beane’s research suggests that while technologies can enhance learning, as seen with the PackBot in bomb disposal, they often undermine the skill-building process. He advocates a reworked approach to integrating technology that preserves and enhances these essential elements of skill development, ensuring both productivity and the maintenance of critical human abilities.
2. What are AI agents? (MIT Technology Review)
AI agents are advanced AI systems that autonomously make decisions and perform complex tasks in dynamic environments, offering enhanced capabilities over traditional AI assistants. These agents, which can process multimodal inputs like language, audio, and video, were prominently featured in Google’s unveiling of Astra and OpenAI’s GPT-4o, highlighting their potential to revolutionize both personal and professional tasks. Despite their promise, AI agents currently face significant limitations in reliability, reasoning, and long-term task management, indicating that their development is still in its early stages. However, they represent a significant shift towards more powerful and versatile AI interactions that could greatly enhance productivity and user experience.
AI Innovations
1. OpenAI
OpenAI’s new “CriticGPT” model, based on GPT-4, is designed to identify and critique errors in AI-generated code, helping human reviewers evaluate AI outputs more accurately during Reinforcement Learning from Human Feedback (RLHF) (Ars Technica).
2. Anthropic
Claude.ai Pro and Team users can now organize their chats into Projects, allowing them to collaborate more effectively by sharing curated knowledge and chat activity, and utilizing the latest Claude 3.5 Sonnet model for enhanced idea generation, strategic decision-making, and project execution (Anthropic).
Claude AI’s new Artifacts feature allows users to visualize content like games, documents, and pixel art interactively in dedicated windows, enabling easy editing, sharing, and iteration (Tom’s Guide).
3. Meta
Instagram is launching its “AI Studio” tool, allowing some creators to make AI chatbot versions of themselves as part of an early test in the US (The Verge).
Meta 3D Gen (3DGen) is a text-to-3D asset generation pipeline that creates high-quality 3D shapes and textures in under a minute. It supports physically based rendering (PBR) and generative retexturing, and Meta reports that it outperforms industry baselines in prompt fidelity and visual quality while being significantly faster (Meta).
4. Alphabet
Gemma 2, an advanced and efficient AI model from Google DeepMind, is now available to researchers and developers in 9B and 27B parameter sizes, offering top performance and seamless integration across various AI tools and hardware (Google).
The upcoming Google Pixel 9 will introduce “Google AI,” featuring new machine learning capabilities such as “Add Me” for group photos, the “Studio” generative image assistant, and a privacy-focused “Pixel Screenshots” feature that adds searchable metadata to user-captured screenshots (Android Authority).
5. Others
Suno has launched an iOS app enabling users to generate original music from text prompts (VentureBeat).
Character.AI now allows users to engage in phone calls with AI avatars in multiple languages, enhancing language practice, mock interviews, and role-playing games (TechCrunch).
RunwayML’s new Gen-3 Alpha model, now available via a paid plan starting at $12/month per editor, offers hyper-realistic video production capabilities with features like imaginative transitions and expressive human characters (VentureBeat).
Perplexity’s Pro Search has been upgraded to handle more complex queries, advanced math and programming computations, and multi-step reasoning, making research faster, more efficient, and capable of delivering comprehensive, in-depth answers and analyses (The Verge).
Moshi Chat, a new speech AI model from Kyutai, offers a local and offline alternative to GPT-4o with promising features but currently lacks the responsiveness and coherence of its competitors (Tom’s Guide).
Other Innovations
1. Robotics
GXO Logistics and Agility Robotics have signed a multi-year Robots-as-a-Service (RaaS) agreement to deploy Agility’s humanoid robot, Digit, in GXO’s logistics operations, marking the industry’s first commercial deployment of humanoid robots and the first RaaS deployment of humanoid robots (Agility Robotics).
Chinese scientists have created a robot powered by a lab-grown artificial brain made from human stem cells, using brain-on-chip technology to perform tasks like gripping objects and avoiding obstacles, potentially advancing brain-like computing and hybrid human-robot intelligence (South China Morning Post).
West Japan Railway has introduced a 12-meter-tall humanoid robot with large arms and a Wall-E-like head to perform maintenance tasks such as painting and trimming tree branches along train lines (The Guardian).
Figure 01, a humanoid robot by Figure, which previously demonstrated coffee-making skills, is now autonomously participating in BMW’s car assembly process using neural networks that map pixels to actions (Interesting Engineering).
2. Neurons that encode words’ meaning
For the first time, scientists have identified individual neurons in the prefrontal cortex that encode the meanings of specific words, creating a highly detailed brain map that could help researchers understand how the brain categorizes and stores linguistic information (Nature).
3. Chemical recycling
Researchers have developed a rapid chemical recycling process that can break down mixed-material fabrics into reusable molecules within 15 minutes, potentially helping to address the waste generated by the fast fashion industry (Nature).
4. Bionic legs
A new neural interface allows a bionic leg to be controlled by the brain, making it feel more like a natural part of the wearer’s body and enhancing mobility and balance (MIT Technology Review).