Large Language Model

Anthropic AI has launched Claude 3.5 Sonnet, marking the first release in its new Claude 3.5 model family. This latest iteration of Claude brings significant advancements in AI capabilities, setting a new benchmark in the industry...
Large Language Models (LLMs) have gained significant attention in the field of simultaneous speech-to-speech translation (SimulS2ST). This technology has become crucial for low-latency communication in various scenarios, such as international conferences, live broadcasts, and online subtitles....

Fireworks AI Releases Firefunction-v2: An Open Weights Function Calling Model with Function Calling Capability on Par with GPT4o at 2.5x the Speed and 10%...

Fireworks AI releases Firefunction-v2, an open-source function-calling model designed to excel in real-world applications. It integrates with multi-turn conversations, instruction following, and parallel function...

Unveiling the Shortcuts: How Retrieval Augmented Generation (RAG) Influences Language Model Behavior and Memory Utilization

Researchers from Microsoft, the University of Massachusetts, Amherst, and the University of Maryland, College Park, address the challenge of understanding how Retrieval Augmented Generation...

Meta FAIR’s Groundbreaking AI Releases: Enhancing Creativity, Efficiency, and Responsibility in Open Science AI Research and Development

Meta's Fundamental AI Research (FAIR) team has announced several significant advancements in artificial intelligence research, models, and datasets. These contributions, grounded in openness, collaboration,...

Key Metrics for Evaluating Large Language Models (LLMs)

Evaluating Large Language Models (LLMs) is a challenging problem in language modeling, as real-world problems are complex and variable. Conventional benchmarks frequently fail to...

Together AI Introduces Mixture of Agents (MoA): An AI Framework that Leverages the Collective Strengths of Multiple LLMs to Improve State-of-the-Art Quality

In a significant leap forward for AI, Together AI has introduced an innovative Mixture of Agents (MoA) approach, Together MoA. This new model harnesses...

DataComp for Language Models (DCLM): An AI Benchmark for Language Model Training Data Curation

Data curation is essential for developing high-quality training datasets for language models. This process includes techniques such as deduplication, filtering, and data mixing, which...

TopicGPT: A Prompt-based AI Framework that Uses Large Language Models (LLMs) to Uncover Latent Topics in a Text Collection

Topic modeling is a technique to uncover the underlying thematic structure in large text corpora. Traditional topic modeling methods, such as Latent Dirichlet Allocation...

Microsoft Research Launches AutoGen Studio: A Low-Code Platform Revolutionizing Multi-Agent AI Workflow Development and Deployment

Microsoft Research has announced the release of AutoGen Studio, a low-code interface designed to streamline the creation, testing, and deployment of multi-agent AI workflows....

Meet DeepSeek-Coder-V2 by DeepSeek AI: The First Open-Source AI Model to Surpass GPT4-Turbo in Coding and Math, Supporting 338 Languages and 128K Context Length

Code intelligence focuses on creating advanced models capable of understanding and generating programming code. This interdisciplinary area leverages natural language processing and software engineering...

NVIDIA AI Releases HelpSteer2 and Llama3-70B-SteerLM-RM: An Open-Source Helpfulness Dataset and a 70 Billion Parameter Language Model Respectively

Nvidia recently announced the release of two groundbreaking technologies in artificial intelligence: HelpSteer2 and Llama3-70B-SteerLM-RM. These innovations promise to significantly enhance the capabilities of...

Apple Releases 4M-21: A Very Effective Multimodal AI Model that Solves Tens of Tasks and Modalities

Large language models (LLMs) have made significant strides in handling multiple modalities and tasks, but they still need to improve their ability to process...

Separating Fact from Logic: Test of Time ToT Benchmark Isolates Reasoning Skills in LLMs for Improved Temporal Understanding

Temporal reasoning involves understanding and interpreting the relationships between events over time, a crucial capability for intelligent systems. This field of research is essential...

Galileo Introduces Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High...

0
The Galileo Luna represents a significant advancement in language model evaluation. It is specifically designed to address the prevalent issue of hallucinations in large...

Yandex Introduces YaFSDP: An Open-Source AI Tool that Promises to Revolutionize LLM Training by...

0
Developing large language models requires substantial investments in time and GPU resources, translating directly into high costs. The larger the model, the more pronounced...

Gretel AI Releases a New Multilingual Synthetic Financial Dataset on HuggingFace 🤗 for AI...

0
Detecting personally identifiable information PII in documents involves navigating various regulations, such as the EU’s General Data Protection Regulation (GDPR) and various U.S. financial...

Snowflake AI Research Team Unveils Arctic: An Open-Source Enterprise-Grade Large Language Model (LLM) with...

0
Snowflake AI Research has launched the Arctic, a cutting-edge open-source large language model (LLM) specifically designed for enterprise AI applications, setting a new standard...

Google DeepMind Releases RecurrentGemma: One of the Strongest 2B-Parameter Open Language Models Designed for...

0
Language models are the backbone of modern artificial intelligence systems, enabling machines to understand and generate human-like text. These models, which process and predict...

Recent articles

🐝 🐝 Join the Fastest Growing AI Research Newsletter...

X