News
1. First-person video dataset
Meta has announced Ego-Exo4D, a significant dataset and benchmark suite developed over two years by Meta’s FAIR, Project Aria, and 15 university partners, aimed at advancing research in video learning and multimodal perception. This dataset uniquely captures both first-person “egocentric” views from wearable cameras and multiple “exocentric” views from surrounding cameras, providing AI models with a comprehensive perspective on complex human skills (Meta).
2. Gemini Pro
Google has introduced Gemini Pro, a lightweight version of its advanced GenAI model family, to its fully managed AI development platform, Vertex AI, where it is now accessible in public preview through the new Gemini Pro API. The API, which is currently free within certain limits, supports 38 languages and offers features such as chat functionality and filtering, allowing developers to build applications with advanced reasoning and coding skills. An additional endpoint processes both text and imagery (TechCrunch).
3. AI in Chemistry
MIT researchers have developed a machine learning-based model that can quickly calculate the elusive transition states of chemical reactions, a process traditionally reliant on time-consuming quantum chemistry methods. This new approach, which can generate accurate transition state structures within seconds, holds potential for aiding in the design of new reactions and catalysts, and for modeling natural chemical processes like those involved in the evolution of life on Earth (MIT).
Noting the exceptional problem-solving abilities that OpenAI’s GPT-4 has demonstrated in various fields, a new paper introduces “Coscientist,” a multi-LLM-based intelligent agent capable of autonomously designing, planning, and performing complex scientific experiments by integrating web and documentation search, coding environments, and robotic experimentation platforms (Nature).
Researchers used a graph neural network, trained on experimental data, to pinpoint chemical substructures responsible for selective antibiotic properties in more than 12 million compounds. The study identified a novel class of antibiotics that proved effective both in vitro and in vivo against Gram-positive bacteria such as Staphylococcus aureus (Nature).
4. Midjourney’s Update
Midjourney version 6 has been released with significant improvements, including more realistic and detailed images and the ability to generate legible text within images. The update, which requires users to learn new prompting methods, also features enhanced image prompting along with improved coherence and model knowledge (VentureBeat).
5. Video Generation
Google Research introduces VideoPoet, an LLM capable of various video generation tasks like text-to-video, image-to-video, video stylization, inpainting, outpainting, and video-to-audio. Unlike diffusion-based models, VideoPoet integrates multiple video generation capabilities within a single LLM, offering a more efficient and versatile approach to video generation across different modalities (Google).
6. Text to 3D
Text-to-CAD, a new tool from Zoo, allows users to generate editable CAD models from text descriptions, distinguishing itself from other text-to-3D models by creating B-Rep surfaces instead of point clouds. This enables the generated STEP files to be imported into any CAD program for editing, offering a more practical and versatile tool for CAD design, with future updates to include editing via KittyCAD Language code in their Modeling App (Zoo.dev).
Articles
1. Prompt Engineering (OpenAI)
The guide on OpenAI’s platform emphasizes the importance of writing clear and detailed instructions for effective prompt engineering. It advises specifying desired outputs and providing context or examples to guide the AI’s responses, ensuring more accurate and relevant results.
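As a rough illustration of that advice, the sketch below assembles a prompt from clearly labeled parts: the task, background context, worked examples, and the desired output format. The `build_prompt` helper and its parameter names are hypothetical, chosen only for this example; they are not taken from OpenAI’s guide, and the code does not call any API.

```python
def build_prompt(task, context="", examples=(), output_format=""):
    """Assemble a prompt from labeled sections (hypothetical helper, not an OpenAI API)."""
    # Start with a clear, specific instruction.
    parts = [f"Task: {task.strip()}"]
    # Add background the model needs to answer well.
    if context:
        parts.append(f"Context: {context.strip()}")
    # Show the model what good input/output pairs look like.
    for i, (sample_input, sample_output) in enumerate(examples, start=1):
        parts.append(
            f"Example {i} input: {sample_input}\n"
            f"Example {i} output: {sample_output}"
        )
    # State the desired shape of the answer explicitly.
    if output_format:
        parts.append(f"Output format: {output_format.strip()}")
    return "\n\n".join(parts)


prompt = build_prompt(
    task="Summarize the article in two sentences.",
    context="The reader is a chemist with no machine learning background.",
    examples=[("A long article about catalysts...", "Two plain-language sentences.")],
    output_format="Plain text, no bullet points.",
)
print(prompt)
```

The resulting string can be passed to whichever model or client is in use. Keeping each part in its own labeled section makes it easier to see which instruction is missing when a response goes off track.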
2. The science events to watch for in 2024 (Nature)
OpenAI’s anticipated release of GPT-5 promises to advance AI capabilities, while major space missions like NASA’s Artemis II and China’s Chang’e-6 aim to explore the Moon and beyond. Additionally, efforts to combat diseases and climate change are underway, with the World Mosquito Program fighting vector-borne diseases and the International Court of Justice addressing climate change obligations. Moreover, the development of exascale supercomputers in Europe and the U.S. is set to revolutionize computational capabilities in various scientific fields.
3. The road ahead reaches a turning point in 2024 (GatesNotes)
In “The Year Ahead: 2024,” Bill Gates reflects on 2023 as a significant year, marked by personal milestones and the adoption of AI for serious purposes. He expresses optimism about the future despite challenges like global conflicts and climate change, discusses the potential of AI in fields including health and education, and emphasizes its role in driving innovation and addressing global inequities.
4. Cancer-fighting CAR-T cells could be made inside body with viral injection (Nature)
Scientists are developing methods to create CAR-T cells directly inside the body using a viral injection, potentially making this innovative but expensive cancer treatment more accessible. This approach, which avoids the need to extract and re-introduce T cells, was demonstrated in experiments with monkeys, showing promise for future human trials and potentially reducing treatment costs.