AI Robotics Tutorial
Highlights
1. Hugging Face has released a comprehensive tutorial on their LeRobot platform, designed for AI-powered robotics. This advancement makes robotics more accessible to people with varying levels of expertise.
2. Fine-tuning is now available for GPT-4o.
3. Microsoft’s new compact language model, Phi-3.5, outperforms some competitors.
4. Salesforce has introduced two new tools: the Einstein SDR Agent for lead engagement and the Einstein Sales Coach Agent for sales coaching.
5. Ideogram 2, a text-to-image model, is now available, making it easier to specify text, fonts, and styles in image generation.
6. Researchers have identified various ways AI can enhance logistics. While logistics is commonly associated with predictive AI, GenAI also has valuable applications in this field.
Innovation Insights
1. How artificial intelligence is transforming logistics (Ideas Made to Matter)
AI is transforming logistics by integrating traditional AI, generative AI, and operations research to optimize tasks such as vehicle routing and dynamic pricing. AI technologies offer advantages over traditional methods by generalizing complex logistics challenges without needing specific algorithms for every scenario, enabling faster and more flexible solutions. Generative AI is proving valuable in managing unexpected variables and optimizing routes to reduce waste and emissions, as seen with Uber Freight’s efforts to minimize empty truck miles. Moreover, AI’s continuous learning capabilities improve logistics management by automatically adapting to changes, eliminating the need for manual adjustments.
2. How to fine-tune AI for prosperity (MIT Technology Review)
To achieve economic prosperity through AI, there must be a strategic shift toward using AI to boost productivity across all sectors, not just those directly related to technology. For significant economic impact, AI needs to be integrated more broadly to support innovation, manufacturing, and the creation of new jobs. Economists predict that for generative AI to realize the potential to increase productivity and drive economic growth, there needs to be collaborative efforts between AI developers and various industries to tailor AI solutions to specific needs. The success of AI in fostering widespread economic growth depends on both technological advancements and policy interventions that promote equitable distribution and application of AI capabilities.
AI Innovations
1. Hugging Face
Hugging Face has released a comprehensive tutorial on their LeRobot platform, enabling developers of all skill levels to build and train their own AI-powered robots (VentureBeat).
2. Google
Google’s new Prompt Gallery in AI Studio enhances developer tools by offering a range of free, pre-built prompts for the Gemini API (VentureBeat).
3. OpenAI
Fine-tuning is now available for GPT-4o, allowing developers to customize the model with specific datasets for enhanced performance and accuracy in various applications, with 1 million free training tokens offered daily through September 23 (OpenAI).
4. Small language model
Microsoft’s new small language model, Phi-3.5, outperforms competitors like Gemini and GPT-4o in reasoning, math, and other benchmarks, offering efficient performance in a compact format that can be run locally or on IoT devices (Tom’s Guide).
The Mistral-NeMo-Minitron 8B delivers unparalleled accuracy on multiple benchmarks, outperforming other state-of-the-art language models of similar size while maintaining efficiency and reduced resource usage (NVIDIA).
5. Meta
Meta Reality Labs’ Sapiens models excel in human-centric vision tasks like pose estimation and depth prediction, offering high-resolution inference and adaptability with minimal fine-tuning (Meta).
6. NVIDIA
At Gamescom 2024, Nvidia unveiled advancements in digital humans, avatar technology, and small language models for gaming, along with updates to their RTX graphics, G-Sync technology, and GeForce Now cloud gaming service (VentureBeat).
7. Salesforce
Salesforce has released xGen-MM, a suite of open-source multimodal AI models designed to enhance visual language understanding by combining text, images, and other data types (VentureBeat).
Salesforce has introduced two autonomous AI sales agents, Einstein SDR Agent and Einstein Sales Coach Agent, designed to automate lead engagement and provide sales training through role-play simulations (Salesforce).
8. Image
Ideogram has launched its version 2 AI image generator, significantly enhancing customization, photorealism, and design-oriented features (Tom’s Guide).
Adobe’s Magic Fixup, a new AI image editing model trained on video data, automates complex edits while preserving artistic intent (DIY Photography).
9. Jamba 1.5
AI21 has released Jamba 1.5, an enhanced hybrid model combining transformers with Structured State Space (SSM) architecture, designed to improve performance, accuracy, and functionality for agentic AI applications (VentureBeat).
10. Living computers
Some researchers are developing “living computers” using lab-grown human neurons, like FinalSpark’s Neuroplatform, which employs brain organoids connected to electrodes and can be rented for $500 a month (Live Science).
11. Benchmark
The new Geekbench AI benchmark, now out of beta, allows users to test the performance of CPUs, GPUs, and NPUs across various devices (Ars Technica).
12. Geoengineering model
Andrew Ng’s new online tool, Planet Parasol, allows users to experiment with a solar geoengineering model to explore potential climate intervention scenarios and their effects, though it currently lacks the ability to simulate the broader societal and ecological risks involved (MIT Technology Review).
13. Video
Luma Labs has released Dream Machine 1.5, an upgraded AI video generation model that offers enhanced realism, better prompt adherence, improved text rendering, and faster video creation capabilities (Tom’s Guide).
Lightricks’ LTX Studio, a video storytelling app, has expanded its availability and features, offering powerful tools like text-to-video generation, enhanced editing control, and collaborative capabilities to empower filmmakers and creators to produce visual content more efficiently (Tech Times).
Hotshot has launched its new text-to-video AI generator model as a free public preview, offering users the ability to create up to 10-second videos at 720p resolution (VentureBeat).
Hedra’s new AI character generation model, version 1.5, enhances realistic head movements, facial expressions, and lip-syncing, while also introducing a “stylize” feature for customizing characters’ appearances, making it a significant upgrade for creators aiming to produce more lifelike AI-generated videos (Tom’s Guide).
14. Text-to-speech
ElevenLabs has globally released its AI Reader app, which now supports text-to-speech narration in 32 languages and features AI-generated voices, including those of deceased celebrities (The Verge).
15. 3D
Meshy-4 revolutionizes 3D generative AI by enhancing geometry quality, providing refined text-to-3D workflows, and introducing a retry feature for improved control (Meshy).
16. Brain Pacemaker
Researchers have developed a personalized approach to deep brain stimulation using AI, which tailors electrical stimulation to individual symptoms of Parkinson’s patients, significantly reducing their most bothersome symptoms and improving their quality of life (NY Times).
17. LLM understanding
MIT CSAIL researchers found that large language models (LLMs) can develop an internal understanding of reality through controlled experiments, indicating that these models may learn language semantics beyond mere mimicry, as they improve their generative abilities and develop deeper linguistic meaning (MIT News).
18. AI for soft skills
CodeSignal’s new “Conversation Practice” feature uses AI to help learners develop soft skills, such as communication and leadership, by simulating real-time workplace conversations with AI partners (Forbes).
An AI assistant developed by MIT CSAIL researchers monitors and coordinates human and AI team members to align their actions and improve collaboration in various tasks, such as search-and-rescue missions, medical procedures, and video games (MIT News).
Other Innovations
1. Tinkering with human evolution
As CRISPR technology advances and becomes easier to administer, the potential to alter the human genome could lead to a future where gene editing is used not only to prevent diseases but also to enhance human traits, raising significant ethical, social, and biological implications for the future evolution of our species (MIT Technology Review).
2. Robots
Agibot has unveiled five new humanoid robots, including the versatile Expedition and Lingxi series, with advancements in power, perception, communication, and control, aiming for mass production and commercial deployment to integrate robots into everyday life and accelerate industry innovation (Kr Asia).
Unitree has introduced the G1 humanoid robot, priced at $16,000, which is designed for research and capable of tasks like walking, climbing stairs, and handling delicate objects, but it will require user training and imitation learning to perform specific tasks, such as making breakfast (The Verge).
Renovate Robotics has unveiled Rufus V1, a refined rooftop robot designed to quickly and accurately install shingles (TechCrunch).
3. 3D display
Samsung’s new Odyssey 3D gaming monitor offers glasses-free 3D gaming through eye-tracking technology and a lenticular lens, allowing users to switch seamlessly between 2D and 3D modes (The Verge).