
AI & Machine Learning News Hub

Research, releases, and applied work in AI & ML

Latest
MIT News - Artificial intelligence · Study: Firms often use automation to control certain workers’ wages
cs.LG updates on arXiv.org · Endogenous Regime Switching Driven by Scalar-Irreducible Learning Dynamics
cs.LG updates on arXiv.org · A Self-Attentive Meta-Optimizer with Group-Adaptive Learning Rates and Weight Decay
cs.LG updates on arXiv.org · Transformation Categorization Based on Group Decomposition Theory Using Parameter Division
cs.LG updates on arXiv.org · Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search
cs.LG updates on arXiv.org · MP-ISMoE: Mixed-Precision Interactive Side Mixture-of-Experts for Efficient Transfer Learning
cs.LG updates on arXiv.org · Continual Distillation of Teachers from Different Domains
cs.LG updates on arXiv.org · Lookahead Drifting Model
cs.LG updates on arXiv.org · Single-Position Intervention Fails: Distributed Output Templates Drive In-Context Learning
cs.LG updates on arXiv.org · EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
cs.LG updates on arXiv.org · Investigating Trustworthiness of Nonparametric Deep Survival Models for Alzheimer's Disease Progression Analysis
cs.LG updates on arXiv.org · Improving Medical VQA through Trajectory-Aware Process Supervision
cs.LG updates on arXiv.org · Designing a double deep reinforcement learning selection tool for resilient demand prediction
cs.LG updates on arXiv.org · LAWS: Learning from Actual Workloads Symbolically -- A Self-Certifying Parametrized Cache Architecture for Neural Inference, Robotics, and Edge Deployment
cs.LG updates on arXiv.org · FlatASCEND: Autoregressive Clinical Sequence Generation with Continuous Time Prediction and Association-Based Pharmacological Testing
cs.LG updates on arXiv.org · Sparse Autoencoder Decomposition of Clinical Sequence Model Representations: Feature Complexity, Task Specialisation, and Mortality Prediction
cs.LG updates on arXiv.org · Confronting Label Indeterminacy in Automated Bail Decisions
cs.LG updates on arXiv.org · A Physics-Aware Framework for Short-Term GPU Power Forecasting of AI Data Centers
cs.LG updates on arXiv.org · RetentiveKV: State-Space Memory for Uncertainty-Aware Multimodal KV Cache Eviction
cs.LG updates on arXiv.org · A Regulatory Governance Framework for AI-Driven Financial Fraud Detection in U.S. Banking: Integrating OCC, SR 11-7, CFPB, and FinCEN Compliance Requirements for Model Development, Validation, and Monitoring Lifecycles

By Source

Feeds organized so you can skim by site.

MIT News - Artificial intelligence
Study: Firms often use automation to control certain workers’ wages · 14h ago · A new study shows that rather than use automation to pursue maximal efficiency, U.S. firms have often used it to replace employees who enjoy a “wage premium,” earning higher salaries than other comparable workers.
Games people — and machines — play: Untangling strategic reasoning to advance AI · 1d ago · MIT Assistant Professor Gabriele Farina explores his approach to untangling strategic reasoning to advance AI.
Beacon Biosignals is mapping the brain during sleep · 6d ago · Beacon Biosignals is creating a model to help diagnose and treat brain disorders, based on data collected while people sleep at home. The firm was founded by MIT alumnus Jake Donoghue and former MIT researcher Jarrett Revels.
Improving understanding with language · 6d ago · MIT senior Olivia Honeycutt studies the brain and linguistics to explore how cognition, education, language, language learning, education, policy, and the ways we communicate can shape our views of the world.
Making the case for curiosity-driven science · 7d ago · President Sally Kornbluth spoke in front of a packed crowd about growing challenges to the U.S. research ecosystem as funding for America’s top research universities becomes increasingly strained.
Solving the “Whac-a-mole dilemma”: A smarter way to debias AI vision models · 7d ago · A new debiasing approach called WRING resolves the "Whac-a-Mole dilemma" of existing debiasing approaches that can create or amplify existing biases.
The MIT-IBM Computing Research Lab launches to shape the future of AI and quantum computing · 8d ago · IBM and MIT announced the launch of the MIT-IBM Computing Research Lab, advancing their long-standing collaboration to shape the next era of computing that combines AI, algorithms, and quantum computing. The new lab evolved from the MIT-IBM...
Enabling privacy-preserving AI training on everyday devices · 8d ago · MIT researchers developed a technique that accelerates a privacy-preserving approach for training AI models on edge devices. Their new framework could enable more accurate, efficient, and secure AI models to be used in under-resourced setti...
A faster way to estimate AI power consumption · 10d ago · The EnergAIzer technique can predict how much power a certain AI workload will consume when run on a particular processor. This method could help data center operators and algorithm developers improve the sustainability of AI workloads.
MIT scientists build the world’s largest collection of Olympiad-level math problems, and open it to everyone · 13d ago · MIT CSAIL scientists have compiled the largest high-quality dataset of proof-based math problems ever created. It can help researchers test AI models’ mathematical reasoning, while capturing the full range of mathematical perspectives and p...
cs.LG updates on arXiv.org
14h ago · 20 items
Endogenous Regime Switching Driven by Scalar-Irreducible Learning Dynamics · 14h ago · arXiv:2605.04054
A Self-Attentive Meta-Optimizer with Group-Adaptive Learning Rates and Weight Decay · 14h ago · arXiv:2605.04055
Transformation Categorization Based on Group Decomposition Theory Using Parameter Division · 14h ago · arXiv:2605.04056
Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search · 14h ago · arXiv:2605.04057
MP-ISMoE: Mixed-Precision Interactive Side Mixture-of-Experts for Efficient Transfer Learning · 14h ago · arXiv:2605.04058
Continual Distillation of Teachers from Different Domains · 14h ago · arXiv:2605.04059
Lookahead Drifting Model · 14h ago · arXiv:2605.04060
Single-Position Intervention Fails: Distributed Output Templates Drive In-Context Learning · 14h ago · arXiv:2605.04061
EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation · 14h ago · arXiv:2605.04062
Investigating Trustworthiness of Nonparametric Deep Survival Models for Alzheimer's Disease Progression Analysis · 14h ago · arXiv:2605.04063
stat.ML updates on arXiv.org
14h ago · 20 items
A Consistency-Centric Approach to Set-Based Optimization with Multiple Models of Unranked Fidelity · 14h ago · arXiv:2605.04051
Heterogeneous Ordinal Structure Learning with Bayesian Nonparametric Complexity Discovery · 14h ago · arXiv:2605.04191
Entropic Riemannian Neural Optimal Transport · 14h ago · arXiv:2605.04255
Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization · 14h ago · arXiv:2605.04269
Perturbation is All You Need for Extrapolating Language Models · 14h ago · arXiv:2605.04344
Multiscale Euclidean Network Trajectories: Second-Moment Geometry, Attribution, and Change Points · 14h ago · arXiv:2605.04589
Jacobian-Velocity Bounds for Deployment Risk Under Covariate Drift · 14h ago · arXiv:2605.04932
Scalable inference of spatial regions and temporal signatures from time series · 14h ago · arXiv:2605.05008
Hypergraph Generation via Structured Stochastic Diffusion · 14h ago · arXiv:2605.05024
Proximal Projection for Doubly Sparse Regularized Models · 14h ago · arXiv:2605.05093
20 loaded
Artificial intelligence for predicting transient hypocalcemia after total thyroidectomy · 18h ago · Transient hypocalcemia is a common complication of total thyroidectomy. This study aimed to evaluate whether machine learning (ML)-based models could enhance early risk prediction and support a robust clinical decision support system for hy...
Non-invasive profiling of the tumour microenvironment with spatial ecotypes · 1d ago · Multicellular programs in the tumour microenvironment (TME) drive cancer pathogenesis and response to therapy but remain challenging to identify and profile clinically [1–3]. Here, we present a machine-learning framework for multi-analyte prof...
AI agents in research: when productivity comes at the cost of apprenticeship · 2d ago · Letter to the Editor
Responses to the AI grant flood must prioritize fairness as part of excellence · 2d ago · Research funding agencies are battling a wave of AI-assisted applications. Countermeasures should not entrench existing power structures.
Seesaw signatures capture trajectory-like transcriptomic shifts and enable compact tumour cell classification across cancers · 2d ago · Accurate identification of tumor cells remains a major challenge in single-cell cancer research because malignant and normal cells often differ only subtly and vary greatly across datasets. Here we show that Seesaw pairs, defined by consist...
Micro$\mathbb{S}$plit: semantic unmixing of fluorescent microscopy data · 2d ago · Fluorescence microscopy is constrained by optical limits, fluorophore chemistry and finite photon budgets, imposing trade-offs between imaging speed, resolution and phototoxicity. Here we introduce Micro$\mathbb{S}$plit, ...
Inference of latent epidemic regimes and generative simulations reveal how inequality and mobility shape COVID-19 transmission · 3d ago · Epidemic waves in large metropolitan areas unfold heterogeneously across territories shaped by persistent socioeconomic inequalities. Explaining how transmission intensifies, stabilises, and shifts across the urban landscape remains a centr...
These powerful tools reveal the ‘control knobs’ of the genome · 3d ago · By accelerating the identification of DNA sequences that control gene expression, assays are revealing the hidden grammar of the regulatory genome — and giving scientists the means to rewrite it
Hierarchical dynamic model for risk-stratified screening of nasopharyngeal carcinoma · 3d ago · Early detection of nasopharyngeal carcinoma through Epstein-Barr virus serology is hampered by a low positive predictive value. This study aims to develop a hierarchical dynamic model to refine risk stratification among individuals initiall...
Unsupervised transfer learning enables multi-animal tracking without training annotation · 3d ago · Quantitative ethology necessitates accurate tracking of animal locomotion, especially for population-level analyses involving multiple individuals. However, current methods mostly rely on laborious annotations for supervised training and ha...
Apple Machine Learning Research
18h ago · 10 items
What Matters in Practical Learned Image Compression · 18h ago · One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be…
Text-Conditional JEPA for Learning Semantically Rich Visual Representations · 18h ago · Image-based Joint-Embedding Predictive Architecture (I-JEPA) offers a promising approach to visual self-supervised learning through masked…
SpecMD: A Comprehensive Study on Speculative Expert Prefetching · 1d ago · Mixture-of-Experts (MoE) models enable sparse expert activation, meaning that only a subset of the model’s parameters is used during each…
Normalizing Flows with Iterative Denoising · 1d ago · Normalizing Flows (NFs) are a classical family of likelihood-based methods that have received revived attention. Recent efforts such as…
From Where Things Are to What They’re For: Benchmarking Spatial–Functional Intelligence for Multimodal LLMs · 1d ago · True spatial intelligence for multimodal agents transcends low-level geometric perception, evolving from knowing where things are to…
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing · 2d ago · Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during…
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning · 3d ago · Multi-tool-integrated reasoning enables LLM-empowered tool-use agents to solve complex tasks by interleaving natural-language reasoning with…
Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents · 6d ago · This paper was accepted at the Fifth Workshop on Natural Language Generation, Evaluation, and Metrics at ACL 2026. Tool-calling agents are…
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2026 · 7d ago · Apple is presenting new research at the annual International Conference on Acoustics, Speech and Signal Processing (ICASSP), which takes…
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows · 7d ago · Normalizing flows (NFs) are end-to-end likelihood-based generative models for continuous data, and have recently regained attention with…
Quanta Magazine
1d ago · 5 items
Google DeepMind News
1d ago · 20 items
AlphaEvolve: How our Gemini-powered coding agent is scaling impact across fields · 1d ago · Discover how AlphaEvolve optimizes algorithms for genomics, quantum physics, global infrastructure, and more to accelerate scientific progress and solve real-world challenges.
Enabling a new model for healthcare with AI co-clinician · 7d ago · Google DeepMind is researching the path toward an AI co-clinician that could work under physician authority to assist doctors and patients, enabling new models for AI-augmented care.
Announcing our partnership with the Republic of Korea · 10d ago · Google DeepMind partners with Korea's MSIT to establish an AI Campus to help accelerate scientific breakthroughs, support local talent, and advance AI safety research
Decoupled DiLoCo: A new frontier for resilient, distributed AI training · 15d ago · Google’s new distributed architecture keeps AI training runs on track across distant data centers, with exceptional efficiency – even when hardware fails.
Partnering with industry leaders to accelerate AI transformation · 16d ago · Google DeepMind is partnering with leading consultancies to bridge the AI adoption gap and drive agentic transformation with frontier models and expert research.
Gemini 3.1 Flash TTS: the next generation of expressive AI speech · 22d ago · Gemini 3.1 Flash TTS is now available across Google products.
Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning · 24d ago · Gemini Robotics ER 1.6 upgrades spatial reasoning and multi-view understanding, unlocking new capabilities like instrument reading for autonomous robots.
Gemma 4: Byte for byte, the most capable open models · 35d ago · Gemma 4: our most intelligent open models to date, purpose-built for advanced reasoning and agentic workflows.
Gemini 3.1 Flash Live: Making audio AI more natural and reliable · 42d ago · Gemini 3.1 Flash Live is now available across Google products.
Protecting people from harmful manipulation · 43d ago · Google DeepMind releases new findings and an evaluation framework to measure AI's potential for harmful manipulation in areas like finance and health, with the goal of enhancing AI safety.
MIT News - Machine learning
1d ago · 20 items
Games people — and machines — play: Untangling strategic reasoning to advance AI · 1d ago · MIT Assistant Professor Gabriele Farina explores his approach to untangling strategic reasoning to advance AI.
Beacon Biosignals is mapping the brain during sleep · 6d ago · Beacon Biosignals is creating a model to help diagnose and treat brain disorders, based on data collected while people sleep at home. The firm was founded by MIT alumnus Jake Donoghue and former MIT researcher Jarrett Revels.
Solving the “Whac-a-mole dilemma”: A smarter way to debias AI vision models · 7d ago · A new debiasing approach called WRING resolves the "Whac-a-Mole dilemma" of existing debiasing approaches that can create or amplify existing biases.
The MIT-IBM Computing Research Lab launches to shape the future of AI and quantum computing · 8d ago · IBM and MIT announced the launch of the MIT-IBM Computing Research Lab, advancing their long-standing collaboration to shape the next era of computing that combines AI, algorithms, and quantum computing. The new lab evolved from the MIT-IBM...
Enabling privacy-preserving AI training on everyday devices · 8d ago · MIT researchers developed a technique that accelerates a privacy-preserving approach for training AI models on edge devices. Their new framework could enable more accurate, efficient, and secure AI models to be used in under-resourced setti...
A faster way to estimate AI power consumption · 10d ago · The EnergAIzer technique can predict how much power a certain AI workload will consume when run on a particular processor. This method could help data center operators and algorithm developers improve the sustainability of AI workloads.
MIT scientists build the world’s largest collection of Olympiad-level math problems, and open it to everyone · 13d ago · MIT CSAIL scientists have compiled the largest high-quality dataset of proof-based math problems ever created. It can help researchers test AI models’ mathematical reasoning, while capturing the full range of mathematical perspectives and p...
Teaching AI models to say “I’m not sure” · 14d ago · MIT CSAIL's “Reinforcement Learning with Calibration Rewards” technique improves AI confidence estimates without sacrificing performance, addressing a root cause of hallucination in reasoning models.
Jacob Andreas and Brett McGuire named Edgerton Award winners · 20d ago · MIT associate professors Jacob Andreas and Brett McGuire have been selected as the winners of the 2026 Harold E. Edgerton Faculty Achievement Award for exceptional contributions to teaching, research, and service at MIT.
Bringing AI-driven protein-design tools to biologists everywhere · 20d ago · OpenProtein.AI is helping biologists stay on the cutting edge of AI with a no-code platform for protein engineering. It was founded by MIT alumni Tristan Bepler and Tim Lu.
Microsoft Research
2d ago · 10 items
Microsoft at NSDI 2026: Advances in large-scale networked systems · 2d ago · Microsoft researchers share advances in building and operating large-scale distributed systems, spanning datacenters, networking, and the growing intersection with AI during NSDI ’26.
Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale · 6d ago · Safe agents don’t guarantee a safe ecosystem of interconnected agents. Microsoft Research examines what breaks when AI agents interact and why network-level risks require new approaches.
AutoAdapt: Automated domain adaptation for large language models · 15d ago · AutoAdapt automates the design and tuning of domain adaptation workflows for large language models. It improves performance without requiring additional compute, making deployment more accessible.
Can we AI our way to a more sustainable world? · 17d ago · Doug Burger, sustainability expert Amy Luers, and optimization researcher Ishai Menache examine the global emissions implications of datacenter operations, efficiency gains, and AI's potential across electrification, materials, and food sys...
New Future of Work: AI is driving rapid change, uneven benefits · 28d ago · For the past five years, the New Future of Work report has captured how work is changing. This year, the shift feels especially sharp. Previous editions have focused on technology’s role in increasing productivity by automating tasks, accel...
Ideas: Steering AI toward the work future we want · 28d ago · On the Microsoft Research Podcast, Chief Scientist Jaime Teevan & researchers Jenna Butler, Jake Hofman, & Rebecca Janssen unpack the New Future of Work Report 2025 & explore what an ideal AI-driven working world looks like (it’s not just d...
ADeLe: Predicting and explaining AI performance across tasks · 36d ago · AI benchmarks report how large language models (LLMs) perform on specific tasks but provide little insight into their underlying capabilities that drive their performance. They do not explain failures or reliably predict outcomes on new tas...
AsgardBench: A benchmark for visually grounded interactive planning · 41d ago · AsgardBench evaluates whether embodied agents can revise their plans based on visual observations as tasks unfold. By focusing on perception-driven planning, it exposes key limitations and guides improvements in agent reliability.
GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation · 42d ago · Vision-language models (VLMs) use images and text to plan robot actions, but they still struggle to decide what actions to take and where to take them. Most systems split these decisions into two steps: a VLM generates a plan in natural lan...
Will machines ever be intelligent? · 45d ago · In Episode 1 of “The Shape of Things to Come,” technologists Subutai Ahmad & Nicolò Fusi join Microsoft’s Doug Burger to compare how large language models work with how the human brain learns & what it means for AI’s future.
Future of Life Institute
21d ago · 20 items
FLI’s President and CEO on Trump’s support for an AI ‘kill switch’ · 21d ago
FLI CEO’s statement on the attack against Sam Altman’s home · 26d ago
Prominent Scientists, Faith Leaders, Policymakers and Artists Call for a Prohibition on Superintelligence, as Poll Shows Americans Don’t Want It · 40d ago
Statement: Head of US Policy on the White House AI legislative recommendations · 45d ago
Governor DeSantis Directs Florida State Agencies to Partner with Future of Life Institute to Shield Families from AI Harm · 59d ago
“This is What it Means to be Pro-Human” Declares Broad Coalition of Conservative, Progressive, and Civil Society Groups in Statement of Shared Principles on AI · 63d ago
Statement from Max Tegmark on the Department of War’s ultimatum · 69d ago
Future of Life Institute Launches Multimillion Dollar Nationwide AI Regulation Campaign · 86d ago
AI Company Safety Practices Fall Short of Public Commitments and Show Structural Weaknesses, as Top Performers Widen the Gap · 155d ago
The U.S. Public Wants Regulation (or Prohibition) of Expert‑Level and Superhuman AI · 199d ago
inFERENCe
71d ago · 15 items
The Future of Software · 71d ago · The world of software is undergoing a shift not seen since the advent of compilers in the 1970s. Compilers were the original vibe coding: they automatically generate complex machine code that human programmers had to manually write before. ...
Deep Learning is Powerful Because It Makes Hard Things Easy - Reflections 10 Years On · 96d ago · Ten years ago this week, I wrote a post called "Deep Learning is Easy - Learn Something Harder". The post blew up, top spot on HackerNews. Needless to say, it didn't age well.
Discrete Diffusion: Continuous-Time Markov Chains · 350d ago · A tutorial explaining some intuitions behind continuous time Markov chains for machine learners interested in discrete diffusion models.
We may finally crack Maths. But should we? · 1064d ago · Automating mathematical theorem proving has been a long standing goal of artificial intelligence and indeed computer science. It's one of the areas I became very interested in recently. This is because I feel we may have the ingredients nee...
Mortal Komputation: On Hinton's argument for superhuman AI. · 1073d ago · Last week in Cambridge was Hinton bonanza. He visited the university town where he was once an undergraduate in experimental psychology, and gave a series of back-to-back talks, Q&A sessions, interviews, dinners, etc. He was stopped on the ...
Autoregressive Models, OOD Prompts and the Interpolation Regime · 1134d ago · A few years ago I was very much into maximum likelihood-based generative modeling and autoregressive models (see this, this or this). More recently, my focus shifted to characterising inductive biases of gradient-based optimization focussin...
We May be Surprised Again: Why I take LLMs seriously. · 1142d ago · "Deep Learning is Easy, Learn something Harder" - I proclaimed in one of my early and provocative blog posts from 2016. While some observations were fair, that post is now evidence that I clearly underestimated the impact simple techniques ...
Implicit Bayesian Inference in Large Language Models · 1526d ago · This intriguing paper kept me thinking long enough for me to decide it's time to resurrect my blogging (I started writing this during ICLR review period, and realised it might be a good idea to wait until that's concluded) * Sang Michael ...
Eastern European Guide to Writing Reference Letters · 1529d ago · Excruciating. One phrase I often use to describe what it's like to read reference letters for Eastern European applicants to PhD and Master's programs in Cambridge. Even objectively outstanding students often receive dull, short, factual, a...
Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models · 1792d ago · This post is written with my PhD student and now guest author Patrik Reizinger [https://twitter.com/rpatrik96] and is part 4 of a series of posts on causal inference: * Part 1: Intro to causal inference and do-calculus [https://www.inferenc...
The Gradient
77d ago · 15 items
VITALab
252d ago · 10 items
Brain Latent Progression Individual-based spatiotemporal disease progression on 3D Brain MRIs via latent diffusion · 252d ago · This article aims at reviewing an Alzheimer’s spatiotemporal disease progression predictive model called Brain Latent Progression (BrLP). All in all, this is ...
A Survey of popular LLM Evaluation Metrics · 260d ago · Large Language Models (LLMs) are increasingly applied to critical domains such as medical report generation, where accuracy and trust are essential. Evaluati...
Open-Source Large Language Models in Radiology: A Review and Tutorial for Practical Research and Clinical Deployment · 269d ago · Open-Source Large Language Models in Radiology
MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation · 338d ago · MemSAM
Simplifying Deep Temporal Difference Learning · 395d ago · tl;dr The authors propose PQN, a simplified deep online Q-Learning that uses very small replay buffers. Normalization and parallelized sampling from vectoriz...
EchoPrime: Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation · 409d ago · Objective EchoPrime is a foundation model designed for comprehensive echocardiographic interpretation. Unlike previous models that use single views or static...
DeepSeek-V3 Technical Report · 450d ago · DeepSeek-V3
Variational Autoencoders for Generating Synthetic Tractography-Based Bundle Templates in a Low-Data Setting · 478d ago · Highlights
Implicit neural representations · 506d ago · Implicit neural networks
Foundations of diffusion networks · 527d ago · Diffusion networks As there’s a lot of recent developments around image generation and diffusion models in general, I took a deep dive in the fundamentals of...
The Stanford AI Lab Blog
1437d ago · 15 items
LinkBERT: Improving Language Model Training with Document Link · 1437d ago · Language Model Pretraining Language models (LMs), like BERT and the GPT series, achieve remarkable performance on many natural language processing (NLP) tasks. They are now the foundation of today’s NLP systems. These models serve imp...
Stanford AI Lab Papers and Talks at ACL 2022 · 1443d ago · The official Stanford AI Lab blog
Stanford AI Lab Papers and Talks at ICLR 2022 · 1473d ago · The official Stanford AI Lab blog
Discovering the systematic errors made by machine learning models · 1491d ago · Discovering systematic errors with cross-modal embeddings
Grading Complex Interactive Coding Programs with Reinforcement Learning · 1501d ago · The official Stanford AI Lab blog
Understanding Deep Learning Algorithms that Leverage Unlabeled Data, Part 1: Self-training · 1533d ago · The official Stanford AI Lab blog
Stanford AI Lab Papers and Talks at AAAI 2022 · 1535d ago · The official Stanford AI Lab blog
How to Improve User Experience (and Behavior): Three Papers from Stanford's Alexa Prize Team · 1556d ago · Introduction
Reward Isn't Free: Supervising Robot Learning with Language and Video from the Web · 1567d ago · This work was conducted as part of SAIL and CRFM.
BanditPAM: Almost Linear-Time k-medoids Clustering via Multi-Armed Bandits · 1602d ago · The official Stanford AI Lab blog
