Page 60 of 130 • 1560 Total Articles

createLiveAI

Continue exploring the latest AI breakthroughs, technology insights, and industry analysis. Page 60 of our comprehensive AI news collection.

All Articles 1560 Business 249 Ethics 150 General 142 Policy 12 Research 793 Startups 13 Technology 201

📰 Latest Intelligence

Showing 12 articles on page 60 of 130


MarkTechPost

Ethics

📄 MarkTechPost

Aug 31, 2025

What is AI Agent Observability? Top 7 Best Practices for Reliable AI

Agent observability represents a comprehensive approach to instrumenting, tracing, and monitoring AI agents throughout their entire lifecycle, from initial planning and tool invocation to memory management and final outputs. This discipline enables teams to debug failures, assess safety and quality, manage latency and operational costs, and ensure compliance with governance standards. By integrating traditional telemetry methods, such as traces, metrics, and logs, with LLM-specific signals like token usage, hallucination rates, and tool success metrics, agent observability leverages emerging standards like OpenTelemetry (OTel) GenAI conventions to facilitate standardized, portable monitoring across diverse AI systems
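As a rough illustration of the idea in this summary, the sketch below records timed spans for an agent run and attaches LLM-specific attributes to them. It uses only the Python standard library rather than the OpenTelemetry SDK. The `span` helper, `fake_llm`, and `run_agent` are hypothetical names, and the attribute keys are modeled on the OTel GenAI conventions, so they should be checked against the current specification before use.

```python
import time
from contextlib import contextmanager

# In-memory span store; a real setup would export to a telemetry backend.
SPANS = []

@contextmanager
def span(name, attributes=None):
    """Record one timed step of an agent run with optional attributes."""
    record = {"name": name, "attributes": dict(attributes or {}), "status": "ok"}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Mark the span as failed, then let the error propagate.
        record["status"] = "error"
        record["attributes"]["error.type"] = type(exc).__name__
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)

def fake_llm(prompt):
    """Hypothetical stand-in for a model call; returns text plus token counts."""
    return {"text": "4", "input_tokens": len(prompt.split()), "output_tokens": 1}

def run_agent(question):
    """Toy agent: one planning call and one tool call, each inside a span."""
    with span("agent.run"):
        with span("llm.plan", {"gen_ai.request.model": "example-model"}) as s:
            reply = fake_llm(question)
            # Attribute names assumed to follow the OTel GenAI conventions.
            s["attributes"]["gen_ai.usage.input_tokens"] = reply["input_tokens"]
            s["attributes"]["gen_ai.usage.output_tokens"] = reply["output_tokens"]
        with span("tool.calculator", {"gen_ai.tool.name": "calculator"}) as s:
            result = 2 + 2
            s["attributes"]["tool.success"] = True
    return result

def tool_success_rate(spans):
    """Fraction of tool spans flagged successful, or None if there are none."""
    tools = [s for s in spans if s["name"].startswith("tool.")]
    if not tools:
        return None
    return sum(bool(s["attributes"].get("tool.success")) for s in tools) / len(tools)
```

Calling `run_agent("What is 2 + 2 ?")` and then inspecting `SPANS` gives one record per step, with inner spans recorded before the enclosing `agent.run` span because each record is appended when its block exits.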

MarkTechPost

General

📄 MarkTechPost

Aug 31, 2025

Alibaba Qwen Team Releases Mobile-Agent-v3 and GUI-Owl: Next-Generation Multi-Agent Framework for GUI Automation

Alibaba's Qwen team has developed Mobile-Agent-v3 and GUI-Owl, a next-generation multi-agent framework designed to automate graphical user interface (GUI) tasks across mobile, desktop, and web platforms. These models leverage advanced vision-language capabilities, with GUI-Owl built on the Qwen2.5-VL foundation and trained on extensive GUI interaction datasets, enabling it to understand screens, reason about tasks, and execute actions in a human-like manner. The key innovation lies in GUI-Owl's unified, end-to-end multimodal architecture that integrates perception, grounding, reasoning, planning, and action

MarkTechPost

Technology

📄 MarkTechPost

Aug 30, 2025

A Coding Guide to Building a Brain-Inspired Hierarchical Reasoning AI Agent with Hugging Face Models

A recent tutorial demonstrates how to develop a brain-inspired hierarchical reasoning AI agent using a free, locally-run Hugging Face model, specifically the Qwen2.5-1.5B-Instruct. By decomposing complex problems into subgoals, solving them with Python, and iteratively critiquing and synthesizing results, the approach mimics hierarchical planning and execution, enhancing reasoning capabilities without relying on large-scale models or costly APIs. This method showcases how structured, multi-layered reasoning workflows can be implemented efficiently on accessible hardware, emphasizing the potential for scalable, interpretable AI systems inspired by human
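The decompose, solve, critique, and synthesize loop described in this summary might be sketched as follows. This is not the tutorial's code: the planner and critic here are hypothetical hard-coded stubs standing in for calls to the Qwen2.5-1.5B-Instruct model, and the restricted `exec` call is an illustration only, not a secure sandbox.

```python
def plan(problem):
    """Hypothetical planner: a real agent would ask the model for subgoals."""
    return [
        ("sum_1_to_10", "result = sum(range(1, 11))"),
        ("square", "result = 7 ** 2"),
    ]

def solve(code):
    """Run one subgoal's Python snippet and return its `result` variable."""
    scope = {}
    # Limiting builtins narrows what the snippet can call; it is NOT a sandbox.
    exec(code, {"__builtins__": {"sum": sum, "range": range}}, scope)
    return scope.get("result")

def critique(name, value):
    """Hypothetical critic: accept any integer answer, reject everything else."""
    return isinstance(value, int)

def synthesize(results):
    """Combine accepted subgoal answers into one final string."""
    return ", ".join(f"{name}={value}" for name, value in results.items())

def hierarchical_agent(problem, max_rounds=2):
    """Plan subgoals, solve each with Python, keep answers the critic accepts."""
    results = {}
    for _ in range(max_rounds):
        subgoals = plan(problem)
        for name, code in subgoals:
            if name in results:
                continue  # already solved and accepted in an earlier round
            value = solve(code)
            if critique(name, value):
                results[name] = value
        if len(results) == len(subgoals):
            break  # every subgoal has an accepted answer
    return synthesize(results)
```

In a fuller version, rejected subgoals would be sent back to the model with the critic's feedback instead of being retried unchanged.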

MarkTechPost

Business

📄 MarkTechPost

Aug 30, 2025

Microsoft AI Introduces rStar2-Agent: A 14B Math Reasoning Model Trained with Agentic Reinforcement Learning to Achieve Frontier-Level Performance

Microsoft has developed rStar2-Agent, a 14-billion-parameter large language model that advances mathematical reasoning by employing agentic reinforcement learning, enabling the model to interact dynamically with a Python execution environment to verify, explore, and refine its reasoning steps. This approach overcomes the limitations of traditional Chain-of-Thought methods, which often compound subtle errors by simply "thinking longer," by teaching the model to "think smarter" through active tool use and iterative self-correction.

Microsoft
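To make the "verify, explore, and refine" idea from the rStar2-Agent summary concrete, here is a small inference-time sketch in which each proposed answer is checked by actually running a computation, and the result is fed back before the next attempt. It does not reproduce the reinforcement learning training. `propose`, `run_tool`, and `solve_with_tool` are hypothetical names, and the canned guesses stand in for model output.

```python
def propose(question, feedback):
    """Hypothetical policy: a real model would generate the next attempt."""
    attempts = [5, 7, 6]  # canned guesses for x in x*x + x == 42
    return attempts[len(feedback)]

def run_tool(candidate):
    """Execution environment: check the candidate by computing, not guessing."""
    value = candidate * candidate + candidate
    return {"candidate": candidate, "value": value, "ok": value == 42}

def solve_with_tool(question, max_steps=3):
    """Alternate between proposing an answer and verifying it with the tool."""
    feedback = []
    for _ in range(max_steps):
        candidate = propose(question, feedback)
        observation = run_tool(candidate)
        feedback.append(observation)  # the next proposal can see this result
        if observation["ok"]:
            return candidate, feedback
    return None, feedback
```

The point of the design is that a wrong intermediate step is caught by execution rather than carried forward through a longer chain of text.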

General

📈 VentureBeat AI

Aug 30, 2025

How Sakana AI's new evolutionary algorithm builds powerful AI models without expensive retraining

M2N2, an evolutionary algorithm from Sakana AI, is a model merging technique for building versatile, multi-skilled AI agents without extensive retraining or large datasets. It combines multiple specialized models into a single agent, reducing computational costs and speeding deployment across diverse tasks.
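For readers unfamiliar with model merging in general, the simplest form is plain linear interpolation of two models' parameters, sketched below. This is not M2N2 itself, whose evolutionary procedure is not described in this summary. The function name and the dict-of-lists weight format are assumptions made for illustration.

```python
def merge_weights(model_a, model_b, alpha=0.5):
    """Linearly interpolate two models' parameters, name by name.

    alpha=1.0 returns model_a's weights and alpha=0.0 returns model_b's.
    """
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return merged
```

No gradient steps or training data are involved in a merge like this, which is why merging approaches can be far cheaper than retraining.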

MarkTechPost

Research

📄 MarkTechPost

Aug 29, 2025

Microsoft AI Lab Unveils MAI-Voice-1 and MAI-1-Preview: New In-House Models for Voice AI

Microsoft AI Lab has launched two new in-house AI models, MAI-Voice-1 and MAI-1-preview, marking a significant step in the company's independent AI research efforts. MAI-Voice-1 is a transformer-based speech synthesis model capable of generating a full minute of high-fidelity, natural-sounding audio in under one second on a single GPU. It supports multilingual and multi-speaker scenarios, with applications in interactive assistants and podcast narration, and is integrated into Microsoft products like Copilot Daily.

Microsoft NVIDIA +1

Towards Data Science

Research

📄 Towards Data Science

Aug 29, 2025

Toward Digital Well-Being: Using Generative AI to Detect and Mitigate Bias in Social Networks

Recent research explores how machine learning and generative AI can be leveraged to detect and mitigate bias within social networks, addressing the challenge of unlearning ingrained prejudices. By employing advanced AI models, such as generative adversarial networks (GANs) and natural language processing techniques, the study demonstrates potential methods for identifying biased content and promoting more equitable online interactions. This development signifies a crucial step toward enhancing digital well-being by fostering fairer social media environments through targeted bias reduction strategies.

Machine Learning NLP

Towards Data Science

Research

📄 Towards Data Science

Aug 29, 2025

Unlocking Multimodal Video Transcription with Gemini

A recent development introduces a method for transcribing videos with integrated speaker identification using a single prompt, streamlining the process of extracting both dialogue and speaker attribution simultaneously. This innovation leverages multimodal models, such as Google's Gemini, to enhance accuracy and efficiency in video transcription tasks by combining audio and visual cues within a unified framework.

Google AI

The Hacker News

Ethics

📄 The Hacker News

Aug 29, 2025

Can Your Security Stack See ChatGPT? Why Network Visibility Matters

Generative AI platforms such as ChatGPT, Google Gemini, Microsoft Copilot, and Anthropic's Claude are becoming integral to organizational workflows, enhancing productivity across various tasks. However, their widespread adoption introduces significant data security challenges, as sensitive information can be inadvertently shared through prompts, uploaded files, or browser extensions that circumvent traditional security measures, necessitating advanced data leak prevention strategies tailored to AI environments.

GPT Claude +2

MarkTechPost

Research

📄 MarkTechPost

Aug 29, 2025

How to Cut Your AI Training Bill by 80%? Oxford's New Optimizer Delivers 7.5x Faster Training by Optimizing How a Model Learns

Researchers at the University of Oxford have developed a novel optimizer called Fisher-Orthogonal Projection (FOP) that significantly reduces the computational costs associated with AI model training, achieving up to an 87% reduction in GPU expenses. By rethinking the way gradients are handled during training, FOP effectively optimizes the learning process, enabling models such as vision transformers trained on ImageNet-1K to be trained 7.5 times faster and more efficiently. This innovation addresses a critical bottleneck in AI development, where the high cost of GPU compute limits experimentation and progress across startups, research labs, and

NVIDIA Transformers

MarkTechPost

General

📄 MarkTechPost

Aug 29, 2025

OpenAI Releases an Advanced Speech-to-Speech Model and New Realtime API Capabilities including MCP Server Support, Image Input, and SIP Phone Calling Support

OpenAI has launched Realtime API and GPT-Realtime, its most advanced speech-to-speech model, marking a significant step forward in voice AI technology by enabling direct audio processing through a unified system that reduces latency and preserves speech nuances. This architectural innovation replaces traditional pipelines that chain separate speech-to-text, language processing, and text-to-speech models, resulting in measurable performance improvements, such as a 26% increase in reasoning accuracy on the Big Bench Audio evaluation and enhanced instruction-following capabilities. Despite these advancements, the performance gains remain incremental, with GPT-Realtime achieving 82.8% accuracy

GPT

VentureBeat AI

Business

📈 VentureBeat AI

Aug 28, 2025

In crowded voice AI market, OpenAI bets on instruction-following and expressive speech to win enterprise adoption

OpenAI has introduced gpt-realtime, a new speech model designed to generate highly naturalistic and expressive voices, aiming to enhance the realism of AI-generated speech in enterprise applications. This development seeks to increase adoption of AI voice technology across industries by providing more human-like and engaging audio outputs, potentially transforming applications such as virtual assistants, customer service, and multimedia content creation.

GPT


Page 60 of 130 • Showing articles 709-720 of 1560
