Page 110 of 130 • 1560 Total Articles

createLiveAI

📰 Latest Intelligence

Research
📄 arXiv Machine Learning

Accelerating RLHF Training with Reward Variance Increase

A reward adjustment model is proposed to make reinforcement learning from human feedback (RLHF) more efficient when training large language models: it increases reward variance while preserving preferences. Integrated into the group relative policy optimization (GRPO) algorithm, it yields GRPOVI, which the authors report significantly accelerates RLHF training compared to existing methods.

Policy
📄 arXiv Machine Learning

Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks

A new Bayesian framework using Prior-data Fitted Networks (PFNs) has been developed to improve neural scaling law extrapolation by quantifying uncertainty, addressing limitations of existing point estimation methods. The approach demonstrates superior performance in real-world scenarios, especially with limited data, enabling more reliable decision-making in applications like resource investment.

policy machine-learning
Research
📄 arXiv Machine Learning

Calibrated Value-Aware Model Learning with Stochastic Environment Models

This paper examines the limitations of the MuZero loss and similar value-aware model learning methods, revealing that they are uncalibrated surrogate losses that may not accurately recover the true model and value functions. The authors propose corrective measures and analyze the impact of model architectures and auxiliary losses, finding that while deterministic models can suffice for value prediction, calibrated stochastic models offer advantages.

Research
📄 arXiv Machine Learning

A new system called CDR-Agent, based on large language models, has been developed to assist emergency department clinicians by autonomously identifying and applying appropriate Clinical Decision Rules (CDRs) from unstructured clinical notes, thereby reducing cognitive load and improving decision accuracy. Validated on synthetic and real datasets, CDR-Agent significantly outperforms baseline models in CDR selection accuracy and helps make more efficient and cautious imaging decisions, minimizing unnecessary interventions.

research machine-learning
Research
📄 arXiv Machine Learning

Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data

The paper introduces CompFlow, a novel method for reinforcement learning that models target dynamics using flow matching and optimal transport principles, enabling better generalization and more accurate estimation of the dynamics gap via Wasserstein distance. This approach facilitates an active exploration strategy that reduces performance gaps in environments with shifted dynamics, outperforming existing baselines across various RL benchmarks.
