AI Analyzing Speech Patterns
Highlights
1. OpenAI announced new API features, such as vision fine-tuning and prompt caching, as well as Canvas, a new interface for writing and coding in ChatGPT.
2. Microsoft announced an updated Copilot AI, integrating features like voice interaction, vision-based assistance, and daily summaries.
3. Google has begun rolling out a new video search feature in Google Lens.
4. Meta announced a new feature that will introduce AI-generated images into Facebook and Instagram feeds.
5. Pika Labs launched Pika 1.5, a major upgrade to its AI video generator.
6. AI is being developed to diagnose mental-health conditions by analyzing subtle speech patterns that are imperceptible to humans.
Innovation Insights
1. How AI agents will help us make better decisions (Fast Company)
AI agents will transform decision-making by enhancing how we receive, research, reason, and act on new information. By documenting and analyzing human decision processes, AI can identify weaknesses and strengths, helping professionals improve their reasoning and action steps. AI agents can automate tasks like researching or assessing information, such as evaluating an event speaker’s suitability, and provide tailored recommendations before humans even engage with the content. Developing AI agents will lead to cross-organizational sharing of decision-making best practices and individualized growth plans. Ultimately, AI agents will augment and refine one of the most critical and underdeveloped skills: making effective decisions.
2. How to Accelerate Progress on AI (Bain)
Bain’s rapid and successful AI deployment provides three key lessons for accelerating AI progress in organizations. First, involving employees is crucial; Bain achieved 80% adoption in pilot offices by making AI tools central to employees’ daily tasks, empowering them to create over 2,000 custom AI tools and fostering innovation through competitions. Second, creating a detailed playbook ensures scalable implementation, covering leadership approval, an AI code of conduct, regional advocates, training, and fostering citizen innovation. Lastly, tapping into the AI ecosystem through partnerships with tech leaders like OpenAI and Microsoft has helped Bain continuously refine its AI tools, leading to a “rolling revolution” of AI adoption across functions, boosting both internal and client outcomes.
3. The Intersection of Design Thinking and AI: Enhancing Innovation (IDEO U)
The integration of AI and design thinking offers transformative potential by combining AI’s data-driven capabilities with design thinking’s human-centered approach to problem-solving. AI can enhance each phase of the design thinking process by providing deeper user insights through data analysis, accelerating idea generation, and improving prototyping and testing efficiency. However, AI cannot replace human creativity and intuition; instead, it amplifies these abilities, enabling faster iterations and more informed decision-making. Key benefits include improved creativity, efficiency, and accuracy in testing, but challenges such as balancing human intuition, ensuring ethical AI use, and overcoming resistance to change must be addressed.
AI Innovations
1. OpenAI
OpenAI announced four new API features—model distillation, prompt caching, vision fine-tuning, and the Realtime API—aimed at helping developers customize models, create speech-based applications, improve image recognition, and reduce operational costs (Inc.).
OpenAI launched “Canvas,” a new ChatGPT interface designed for writing and coding projects that allows users to edit AI-generated content directly in an adjacent workspace (TechCrunch).
2. Microsoft
Microsoft announced an updated Copilot AI, designed to be a personalized and supportive AI companion, integrating features like voice interaction, vision-based assistance, and daily summaries to enhance productivity and simplify users’ daily lives (Microsoft).
Microsoft announced new AI-powered features for Copilot+ PCs and Windows 11, including Recall for quick access to previously viewed content, Click to Do for interactive task suggestions, improved offline search capabilities, and image enhancement tools, all aimed at improving user experience and productivity (Maginative).
3. Alphabet
Google has begun rolling out a new video search feature in Google Lens, allowing users to record short clips, ask questions about them, and receive relevant search results, including AI-generated responses in supported regions, enhancing interaction with video content (Android Authority).
Gemini Live is now available for all users (TechRadar).
Google introduced a multifunctional Quick Insert key and new AI capabilities for Chromebook Plus (TechCrunch).
Google is escalating its competition with OpenAI by developing AI models with enhanced reasoning capabilities, focusing on chain-of-thought prompting to solve multistep problems (Quartz).
Gemini 1.5 Flash-8B is now production-ready, offering 50% lower cost, double the rate limits, and improved latency, making it suitable for high-volume, simple tasks and available for free to developers via Google AI Studio and the Gemini API (Google).
4. Meta
Meta announced a new feature that will introduce AI-generated images into Facebook and Instagram feeds, including some that may incorporate users’ faces, based on their interests or trends, while offering options for personalization and removal (The Verge).
Meta announced Movie Gen, an AI-powered video generator (The Verge).
5. AMD
AMD has unveiled its first small language model, AMD-135M, which uses speculative decoding to improve inference efficiency and is released as open source for developers (AMD).
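For readers unfamiliar with speculative decoding, the idea can be sketched in a few lines: a small, fast draft model proposes several tokens, and the larger target model checks them, keeping the ones it agrees with. The sketch below is a simplified greedy variant and is not AMD's implementation. The names `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical, and the two "models" are stand-in arithmetic functions rather than real language models.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: ask the target model for one token at a time."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes up to k tokens, the target verifies."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1. The cheap draft model proposes a short run of tokens.
        proposal: List[int] = []
        ctx = list(out)
        for _ in range(min(k, limit - len(out))):
            token = draft(ctx)
            proposal.append(token)
            ctx.append(token)
        # 2. The target checks each proposed token in order. A real system
        #    scores all k positions in one batched forward pass; this loop
        #    only illustrates the accept/reject logic.
        for token in proposal:
            expected = target(out)
            if token == expected:
                out.append(token)       # draft token accepted
            else:
                out.append(expected)    # rejected: keep the target's token instead
                break                   # discard the rest of the proposal
    return out


def toy_target(ctx: List[int]) -> int:
    """Stand-in for a large model (assumption: purely illustrative arithmetic)."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def toy_draft(ctx: List[int]) -> int:
    """Stand-in for a small model that agrees with the target most of the time."""
    guess = toy_target(ctx)
    return guess if len(ctx) % 3 else (guess + 1) % 50


if __name__ == "__main__":
    prompt = [1, 2, 3]
    fast = speculative_decode(toy_target, toy_draft, prompt, max_new_tokens=10)
    slow = greedy_decode(toy_target, prompt, max_new_tokens=10)
    print(fast == slow)
```

In this greedy form the output is identical to decoding with the target model alone; the potential speedup comes from verifying several draft tokens per target pass.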
6. NVIDIA
NVIDIA has launched NVLM 1.0, an open-source large language model that matches the performance of leading proprietary models like GPT-4 (Yahoo).
7. Pika
Pika Labs launched Pika 1.5, a major upgrade to its AI video generator focused on hyper-realism and advanced video effects (Tom’s Guide).
8. Black Forest Labs
Black Forest Labs has launched an API for its Flux image generation models, introduced a faster Flux1.1 Pro model, and is expanding availability through partners (TechCrunch).
9. Pinterest
Pinterest has launched generative AI tools for advertisers, enabling them to transform product imagery by adding lifestyle backgrounds to enhance Pinterest Product Pins (TechCrunch).
10. Other models
Emu3 is a new suite of state-of-the-art multimodal models trained solely on next-token prediction, overcoming previous limitations in multimodal tasks by tokenizing images, text, and videos into a discrete space and training a unified transformer on these multimodal sequences (Beijing Academy of AI).
MIT spinoff Liquid AI has introduced its Liquid Foundation Models (LFMs), a series of non-transformer multimodal AI models that outperform traditional transformer-based models like Meta’s Llama and Microsoft’s Phi, achieving state-of-the-art performance while using significantly less memory (VentureBeat).
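As a rough illustration of the approach described in the Emu3 item above, the sketch below places text and image tokens in one shared discrete vocabulary and turns the combined sequence into next-token prediction examples. It is a toy under stated assumptions and not Emu3's code: the byte-level text tokens, the 16-bin pixel quantizer (a stand-in for a learned vision tokenizer), and all names such as `build_sequence` and `next_token_pairs` are hypothetical.

```python
from typing import List, Tuple

TEXT_VOCAB = 256                    # assumption: byte-level text tokens use IDs 0..255
IMAGE_CODES = 16                    # assumption: tiny image codebook for illustration
IMAGE_OFFSET = TEXT_VOCAB           # image codes occupy IDs 256..271
BOI = IMAGE_OFFSET + IMAGE_CODES    # hypothetical begin-of-image marker
EOI = BOI + 1                       # hypothetical end-of-image marker


def tokenize_text(text: str) -> List[int]:
    """Map text to byte-level token IDs."""
    return list(text.encode("utf-8"))


def tokenize_image(pixels: List[int]) -> List[int]:
    """Quantize grayscale pixels (0..255) into IMAGE_CODES discrete bins."""
    return [IMAGE_OFFSET + (p * IMAGE_CODES) // 256 for p in pixels]


def build_sequence(text: str, pixels: List[int]) -> List[int]:
    """Concatenate both modalities into a single token sequence."""
    return tokenize_text(text) + [BOI] + tokenize_image(pixels) + [EOI]


def next_token_pairs(seq: List[int]) -> List[Tuple[List[int], int]]:
    """Build (context, next token) training examples from one sequence."""
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


if __name__ == "__main__":
    seq = build_sequence("hi", [0, 255])
    for context, target in next_token_pairs(seq):
        print(context, "->", target)
```

Because every modality ends up as IDs in the same vocabulary, a single model can be trained on the pairs without modality-specific prediction heads, which is the point the item makes.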
11. Mental health
AI is being developed to diagnose mental-health conditions by analyzing subtle speech patterns that are imperceptible to humans, offering more accurate and faster diagnoses across different languages and regions (The Economist).
12. Storytime
Ello launched “Storytime,” a new AI-powered feature that allows children to co-create personalized stories by choosing characters, settings, and plots, while the AI adapts to the child’s reading level and teaches phonics-based skills (TechCrunch).
Other Innovations
1. Stethoscope
Lapsi Health has launched Keikku, a digital stethoscope designed as a health tracking platform with advanced features like AI-based acoustic analysis, aiming to transform the traditional stethoscope into a comprehensive tool for monitoring chronic heart and lung conditions (TechCrunch).
2. Hotel
The world’s first 3D-printed hotel, El Cosmico, is under construction in the Texas desert, featuring camping areas, vacation homes, permanent residences, and shared amenities, all designed with curvilinear structures inspired by the landscape and built using Icon’s 3D-printing technology (New Atlas).
3. Chip
BrainChip announced Akida Pico, a new ultra-low-power neuromorphic chip for AI processing in power-constrained edge devices like wearables and smart appliances (IEEE Spectrum).
4. Robots
Fourier has launched GR-2, an advanced humanoid robot featuring significant upgrades in hardware, design, and software, including enhanced dexterity, improved modularity, and AI-driven capabilities (PR Newswire).
Engineers at ETH Zurich have modified the quadrupedal robot ANYmal to climb standard ladders using custom hook-like paws and reinforcement learning, achieving a 90% success rate in real-world testing (TechXplore).
Researchers are using generative AI models like Stable Diffusion to create visual training data for robots, enabling them to learn tasks by analyzing image-based patterns, which could improve robot training for both simulations and real-world applications (MIT Tech Review).
5. Brain map
The evolution of animal brains has driven biodiversity, complex behaviors, and body development. Advances such as the mapping of the fruit fly’s brain connectome could now lead to breakthroughs in AI, technology, and the understanding of human brains, though they also raise concerns about the future trajectory of brain-like systems (The Economist).