1. ChatGPT · 卢诗翰 · Possibly a hugely consequential piece of news for the AI industry, and maybe beyond it: facing the relentless advance of Chinese AI, OpenAI is preparing…
Category: Social discussion/Opinion · Source: weibo_ai_trending · Score: 100 · Author: 卢诗翰 · Time: 2026-02-12 21:31
A trending Weibo discussion claims OpenAI may loosen its policy on adult-oriented content, accompanied by rumors of a controversial executive dismissal; because it touches on AI content boundaries and gender issues, the topic is seen as potentially shaping both industry rules and public opinion.
- The post claims OpenAI is preparing to allow adult-oriented content and frames this as a development that could significantly affect the AI industry.
- It alleges that a female vice president who objected was accused of "discrimination against men" and then dismissed, making the story highly contentious.
- The information comes from Weibo discussion rather than any official announcement; at this point it is opinion and rumor rather than confirmed news.
- Engagement is high (393 reposts, 347 comments, 8,815 likes), indicating notable topic heat.
#ChatGPT #Weibo #SocialDiscussion
2. Large Models · 新京报我们视频 · [#Qwen3.5 costs only 5% of Google's large model#] On February 16, Lunar New Year's Eve, Alibaba open-sourced its new-generation large model Qwen…
Category: Product/Launch · Source: weibo_ai_trending · Score: 52 · Author: 新京报我们视频 · Time: 2026-02-16 18:00
On February 16 Alibaba open-sourced Qwen3.5-Plus, claiming it matches or exceeds much larger models with far fewer activated parameters at roughly 5% of the cost of Google's comparable model, underscoring the price-performance and engineering efficiency of Chinese open-source LLMs.
- Alibaba open-sourced the new-generation large model Qwen3.5-Plus.
- Reports claim its performance rivals Gemini 3 Pro and describe it as one of the strongest open-source models.
- Roughly 397B total parameters with about 17B activated, emphasizing a "small beats large" architectural approach.
- The post puts its cost at about 5% of Google's large model and claims performance above the larger Qwen3-Max.
- The Weibo topic spread widely (845 reposts, 807 comments, 1,439 likes).
#LargeModels #Weibo #ProductLaunch #Alibaba #Qwen3.5-Plus #Video #OpenSourceModels
3. Sora · 暴食症患者李舜生 · 👅 Is this Sora from America's ChatGPT? It looks so advanced 😡 It's Seedance from China's ByteDance, so wouldn't that have a big imp…
Category: Social discussion/Opinion · Source: weibo_ai_trending · Score: 45 · Author: 暴食症患者李舜生 · Time: 2026-02-15 12:26
A Weibo post tries to identify whether a viral AI video came from OpenAI's Sora or ByteDance's Seedance and raises environmental concerns, reflecting growing public attention to both the provenance and the external costs of AI video technology.
- The core of the discussion is telling a popular generative-video capability apart as Sora versus ByteDance's Seedance.
- The poster conveys an impression that "the technology is very advanced" while questioning its possible environmental impact.
- The topic drew high engagement on Weibo (218 reposts, 131 comments, 3,847 likes), indicating strong public interest.
#Sora #Weibo #SocialDiscussion #Seedance
4. spmallick/learnopencv
Category: Open-source project · Source: github_search · Score: 100 · Author: spmallick · Time: 2026-02-10T21:36:54Z
The GitHub repository spmallick/learnopencv collects a large body of computer-vision, deep-learning, and AI tutorials from the LearnOpenCV blog together with companion code, giving developers a one-stop, reproducible resource that covers cutting-edge topics.
- The repository is positioned as the code companion to LearnOpenCV blog posts, focused on practical computer vision, deep learning, and AI.
- Coverage is broad, spanning the YOLO family, VLMs/LLMs, RAG, 3D reconstruction, SLAM, diffusion models, medical imaging, robotics, and more.
- Most articles link to corresponding code, emphasizing a reproducible path from paper walkthrough to working implementation.
- It includes many edge-deployment and engineering topics, such as Jetson, Arduino, Raspberry Pi, ROS2, and Carla.
- The repository is continuously updated and tracks recently popular models and trends (e.g., SAM2/3, Qwen, Gemma, VideoRAG).
#GitHub #repo #OpenSource #OpenCV #YOLO #RAG #Agent
5. leejet/stable-diffusion.cpp
Category: Open-source project · Source: github_search · Score: 24 · Author: leejet · Time: 2026-02-10T18:51:55Z
stable-diffusion.cpp is a lightweight, cross-platform diffusion-model inference project built on pure C/C++ and ggml; it has recently kept adding support for FLUX.2, Qwen Image, Wan2.2, Z-Image, and more, making it possible to deploy many kinds of image/video generation through one unified, low-dependency backend.
- A pure C/C++ implementation with minimal external dependencies, positioned as a llama.cpp-style local inference solution.
- Already covers SD1.x/2.x/XL, SD3/3.5, the FLUX family, Qwen Image, Z-Image, and Wan2.1/2.2, including image-editing and video models.
- Supports common capabilities such as LoRA, ControlNet, LCM, TAESD, ESRGAN, and PhotoMaker, and is compatible with a variety of generation workflows.
- Backends span CPU/CUDA/Vulkan/Metal/OpenCL/SYCL, with platform coverage across Linux/macOS/Windows/Android.
- Accepts ckpt/pth, safetensors, and GGUF weight formats, and offers bindings for multiple languages plus several third-party UI integrations.
#GitHub #repo #OpenSource #stable-diffusion.cpp #C++ #ggml #FLUX #QwenImage
6. jamiepine/voicebox
Category: Open-source project · Source: github_search · Score: 14 · Author: jamiepine · Time: 2026-02-10T14:59:43Z
The open-source GitHub project Voicebox ships a local-first voice synthesis studio with voice cloning, TTS, and multi-track editing; its significance lies in offering a privacy-controlled, API-extensible alternative to cloud voice platforms.
- Positioned as a locally run, free and open-source voice cloning and synthesis tool, pitched against ElevenLabs
- Currently built on Qwen3-TTS, it clones a voice from a small number of audio samples and integrates Whisper for transcription
- Provides DAW-like multi-track timeline editing, clipping, dialogue mixing, and generation-history management
- The desktop app uses Tauri (Rust) + React with a FastAPI backend, and a REST API allows integration into external applications
- On Apple Silicon, MLX/Metal delivers a 4-5x inference speedup; both macOS and Windows are now supported
#GitHub #repo #OpenSource #Voicebox #TTS #Qwen3-TTS #Whisper #Tauri
7. Start It With ChatGPT
Category: Video/Talk · Source: youtube_rss · Score: 100 · Author: OpenAI · Time: 2026-02-13T23:43:33+00:00
The video, themed "Start It With ChatGPT", emphasizes using ChatGPT to quickly turn even far-fetched ideas into action; its significance is lowering the barrier from idea to execution and encouraging personal innovation.
- The core message encourages users to use ChatGPT to start and push forward any creative idea.
- The content is motivational and action-oriented, stressing "just start" over waiting for a perfect plan.
- The source is a YouTube video entry, likely a short-form or promotional piece.
- The background music is credited as Joe Cocker's live version of "The Letter" at The Fillmore.
#YouTube #Video #ChatGPT
8. Fix With ChatGPT
Category: Video/Talk · Source: youtube_rss · Score: 100 · Author: OpenAI · Time: 2026-02-13T23:43:29+00:00
This YouTube video, titled "Fix With ChatGPT", centers on using ChatGPT to sharpen hands-on skills and apply them on the road, illustrating the practical value of AI assistants in everyday problem solving.
- The video is titled "Fix With ChatGPT" and focuses on using ChatGPT for repairs and troubleshooting.
- The description, "Sharpen your skills and hit the road", points to skill-building with a travel angle.
- The available information is brief; it appears to be a lightweight showcase aimed at a general audience.
- Sourced from a YouTube RSS feed, with a directly accessible video link.
#YouTube #Video #ChatGPT
9. Gemini 3 Deep Think: Accelerating mechanical engineering and rapid prototyping
Category: Video/Talk · Source: youtube_rss · Score: 48 · Author: Google DeepMind · Time: 2026-02-12T16:12:16+00:00
An R&D lead in Google's Platforms and Devices division tests Gemini 3 Deep Think, which reasons over geometric constraints from text and image inputs to generate a 3D-printable turbine-blade design, showing how it can accelerate mechanical design and rapid prototyping while reducing reliance on CAD experts.
- Targets engineering workflows, turning logical requirements into executable physical solutions
- Uses text prompts and image references as inputs for design reasoning
- Reasons over geometric constraints to generate a 3D-printable turbine-blade design
- The task normally demands specialist CAD skills; Deep Think can shorten the design and prototyping cycle
- Tested and validated by the R&D lead of Google's Platforms and Devices division
#YouTube #Video #Gemini 3 Deep Think #Google #CAD
10. Flash-SD-KDE: Accelerating SD-KDE with Tensor Cores
Category: Research/Paper · Source: arxiv_search · Score: 100 · Author: Elliot L. Epstein · Time: 2026-02-10T23:56:03Z
The paper proposes Flash-SD-KDE, which reorders the computation to exploit GPU Tensor Cores and dramatically accelerates SD-KDE, preserving its statistical advantages while pushing large-scale density estimation into the practical range.
- SD-KDE has a better asymptotic convergence rate than classical KDE, but the empirical score computation makes it slow in practice.
- The authors rearrange the SD-KDE computation into a matrix-multiplication-friendly form that can fully exploit GPU Tensor Cores.
- On a 32K-sample, 16-dimensional task, it is up to 47x faster than a strong GPU SD-KDE baseline.
- Under the same setup, it is up to roughly 3,300x faster than scikit-learn's KDE.
- On a large-scale task with 1M samples, 16 dimensions, and 131K queries, a single GPU finishes in 2.3 seconds.
#arXiv #paper #Research #SD-KDE #Tensor Cores #Flash-SD-KDE
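The summary does not spell out the rearrangement, but the standard trick for making kernel sums Tensor-Core-friendly is to route pairwise squared distances through a single matrix multiply. Here is a minimal plain-Gaussian-KDE sketch of that idea (my own NumPy illustration, not the paper's Flash-SD-KDE kernel):

```python
import numpy as np

def gaussian_kde_matmul(samples, queries, bandwidth):
    """Gaussian KDE where the pairwise distance matrix comes from one GEMM:
    ||q - x||^2 = ||q||^2 + ||x||^2 - 2 * (q . x)."""
    n, d = samples.shape
    sq_s = np.sum(samples ** 2, axis=1)   # (n,)
    sq_q = np.sum(queries ** 2, axis=1)   # (m,)
    cross = queries @ samples.T           # (m, n) -- the Tensor-Core-friendly step
    dist2 = sq_q[:, None] + sq_s[None, :] - 2.0 * cross
    kernel = np.exp(-dist2 / (2.0 * bandwidth ** 2))
    norm = n * (np.sqrt(2.0 * np.pi) * bandwidth) ** d
    return kernel.sum(axis=1) / norm

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 16))   # samples
q = rng.normal(size=(8, 16))      # query points
density = gaussian_kde_matmul(x, q, bandwidth=0.5)
```

On GPU, the `queries @ samples.T` product is the step that maps onto Tensor Cores; the rest is cheap elementwise work.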
11. Hardware Co-Design Scaling Laws via Roofline Modelling for On-Device LLMs
Category: Research/Paper · Source: arxiv_search · Score: 98 · Author: Luoyang Sun · Time: 2026-02-10T23:51:00Z
The paper proposes hardware co-design scaling laws for on-device LLMs, coupling training loss with Roofline latency modeling to optimize the accuracy-latency trade-off directly; on Jetson Orin this sharply accelerates architecture selection and beats Qwen2.5-0.5B at matched latency.
- Proposes hardware co-design laws: training loss is modeled explicitly from architecture hyperparameters, while a Roofline model characterizes inference latency.
- Evaluates 1,942 candidate architectures on NVIDIA Jetson Orin, training 170 selected models on 10B tokens each to fit the scaling laws.
- Coupling the loss scaling law with the latency model establishes a direct accuracy-latency mapping and yields a Pareto frontier.
- Frames architecture search as joint optimization of accuracy and performance, producing a feasible design region under industrial hardware and application budgets.
- Cuts the architecture-selection cycle from months to days; at the same latency as Qwen2.5-0.5B, WikiText-2 perplexity drops by 19.42%.
- The authors describe this as the first actionable hardware co-design scaling-law framework for on-device LLMs and plan to open-source code and checkpoints.
#arXiv #paper #Research #Scaling Law #Jetson Orin #VLA
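The Roofline side of the coupling is simple to state: a kernel's attainable time is bounded by either compute throughput or memory traffic, whichever is slower. A minimal sketch of that latency estimate (the hardware numbers below are placeholders, not the paper's Jetson Orin measurements):

```python
def roofline_latency_s(flops, bytes_moved, peak_flops_per_s, mem_bw_bytes_per_s):
    """Roofline time estimate: a kernel takes at least as long as its compute
    time and at least as long as its memory-transfer time, so the estimate
    is the max of the two."""
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / mem_bw_bytes_per_s
    return max(compute_time, memory_time)

# Placeholder device numbers (not Jetson Orin specs): 10 TFLOP/s peak
# compute and 200 GB/s memory bandwidth, for a layer doing 1 GFLOP
# while moving 40 MB.
t = roofline_latency_s(1e9, 40e6, 10e12, 200e9)   # memory-bound: 2e-4 s
```

Summing this estimate over a model's layers gives the latency half of the accuracy-latency mapping; the loss scaling law supplies the other half.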
12. Simple LLM Baselines are Competitive for Model Diffing
Category: Research/Paper · Source: arxiv_search · Score: 95 · Author: Elias Kempf · Time: 2026-02-10T23:45:26Z
Standard LLM evaluations only test capabilities or dispositions that evaluators designed them for, missing unexpected differences such as behavioral shifts bet…
- Standard LLM evaluations only test capabilities or dispositions that evaluators designed them for, missing unexpected differences such as b…
- Model diffing addresses this limitation by automatically surfacing systematic behavioral differences
- Recent approaches include LLM-based methods that generate natural language descriptions and sparse autoencoder (SAE)-based methods that ide…
- However, no systematic comparison of these approaches exists nor are there established evaluation criteria
- We address this gap by proposing evaluation metrics for key desiderata (generalization, interestingness, and abstraction level) and use the…
- Our results show that an improved LLM-based baseline performs comparably to the SAE-based method while typically surfacing more abstract be…
#arXiv #paper #Research
13. Causal Effect Estimation with Learned Instrument Representations
Category: Research/Paper · Source: arxiv_search · Score: 92 · Author: Frances Dean · Time: 2026-02-10T23:41:11Z
Instrumental variable (IV) methods mitigate bias from unobserved confounding in observational causal inference but rely on the availability of a valid instrume…
- Instrumental variable (IV) methods mitigate bias from unobserved confounding in observational causal inference but rely on the availability…
- In this paper, we propose a representation learning approach that constructs instrumental representations from observed covariates, which e…
- Our model (ZNet) achieves this through an architecture that mirrors the structural causal model of IVs; it decomposes the ambient feature s… (i.e., relevance, exclusion restriction, and instrumental unconfoundedness)
- Importantly, ZNet is compatible with a wide range of downstream two-stage IV estimators of causal effects
#arXiv #paper #Research
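ZNet is said to plug into standard downstream two-stage IV estimators; for orientation, here is a minimal two-stage least squares (2SLS) sketch on synthetic data with a single hand-specified instrument (my own illustration, not ZNet's learned instrument representation):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
u = rng.normal(size=n)   # unobserved confounder
z = rng.normal(size=n)   # instrument: moves the treatment, not the outcome directly
t = z + u + rng.normal(size=n)               # treatment
y = 2.0 * t + 3.0 * u + rng.normal(size=n)   # outcome; true causal effect is 2

# All variables are (approximately) mean-zero, so no-intercept regressions suffice.
# Naive OLS of y on t is biased upward by the confounder u.
naive = (t @ y) / (t @ t)

# Stage 1: project the treatment onto the instrument.
t_hat = z * ((z @ t) / (z @ z))
# Stage 2: regress the outcome on the projected treatment.
iv_est = (t_hat @ y) / (t_hat @ t_hat)
```

The naive estimate lands near 3 because the confounder inflates it, while the instrumented estimate recovers the true effect near 2.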
14. LiveMedBench: A Contamination-Free Medical Benchmark for LLMs with Automated Rubric Evaluation
Category: Research/Paper · Source: arxiv_search · Score: 90 · Author: Zhiling Yan · Time: 2026-02-10T23:38:25Z
The deployment of Large Language Models (LLMs) in high-stakes clinical settings demands rigorous and reliable evaluation. However, existing medical benchmarks …
- The deployment of Large Language Models (LLMs) in high-stakes clinical settings demands rigorous and reliable evaluation
- However, existing medical benchmarks remain static, suffering from two critical limitations: (1) data contamination, where test sets inadve…
- Furthermore, current evaluation metrics for open-ended clinical reasoning often rely on either shallow lexical overlap (e.g., ROUGE) or subjective LLM-as-a-Judge scoring, both inadequate for verifying clinical correctness
- To bridge these gaps, we introduce LiveMedBench, a continuously updated, contamination-free, and rubric-based benchmark that weekly harvest…
#arXiv #paper #Research #Agent
15. ENIGMA: EEG-to-Image in 15 Minutes Using Less Than 1% of the Parameters
Category: Research/Paper · Source: arxiv_search · Score: 88 · Author: Reese Kneeland · Time: 2026-02-10T23:20:51Z
To be practical for real-life applications, models for brain-computer interfaces must be easily and quickly deployable on new subjects, effective on affordable…
- To be practical for real-life applications, models for brain-computer interfaces must be easily and quickly deployable on new subjects, eff…
- To directly address these current limitations, we introduce ENIGMA, a multi-subject electroencephalography (EEG)-to-Image decoding model th…6M benchmarks, while fine-tuning effectively on new subjects with as little as 15 minutes of data
- ENIGMA boasts a simpler architecture and requires less than 1% of the trainable parameters necessary for previous approaches
- Our approach integrates a subject-unified spatio-temporal backbone along with a set of multi-subject latent alignment layers and an MLP pro…
- We evaluate our approach using a broad suite of image reconstruction metrics that have been standardized in the adjacent field of fMRI-to-I…
#arXiv #paper #Research
16. Beyond Calibration: Confounding Pathology Limits Foundation Model Specificity in Abdominal Trauma CT
Category: Research/Paper · Source: arxiv_search · Score: 85 · Author: Jineel H Raythatha · Time: 2026-02-10T23:08:06Z
Purpose: Translating foundation models into clinical practice requires evaluating their performance under compound distribution shift, where severe class imbal…
- Purpose: Translating foundation models into clinical practice requires evaluating their performance under compound distribution shift, wher…
- This challenge is relevant for traumatic bowel injury, a rare but high-mortality diagnosis
- We investigated whether specificity deficits in foundation models are associated with heterogeneity in the negative class
- Methods: This retrospective study used the multi-institutional, RSNA Abdominal Traumatic Injury CT dataset (2019-2023), comprising scans fr…
- Two foundation models (MedCLIP, zero-shot; RadDINO, linear probe) were compared against three task-specific approaches (CNN, Transformer, E…
- Models were trained on 3,147 patients (2…
#arXiv #paper #Research
17. Theoretical Analysis of Contrastive Learning under Imbalanced Data: From Training Dynamics to a Pruning Solution
Category: Research/Paper · Source: arxiv_search · Score: 82 · Author: Haixu Liao · Time: 2026-02-10T23:06:12Z
Contrastive learning has emerged as a powerful framework for learning generalizable representations, yet its theoretical understanding remains limited, particu…
- Contrastive learning has emerged as a powerful framework for learning generalizable representations, yet its theoretical understanding rema…
- Such an imbalance can degrade representation quality and induce biased model behavior, yet a rigorous characterization of these effects is …
- In this work, we develop a theoretical framework to analyze the training dynamics of contrastive learning with Transformer-based encoders u…
- Our results reveal that neuron weights evolve through three distinct stages of training, with different dynamics for majority features, min…
- We further show that minority features reduce representational capacity, increase the need for more complex architectures, and hinder the s…
- Inspired by these neuron-level behaviors, we show that pruning restores performance degraded by imbalance and enhances feature separation, …
#arXiv #paper #Research
18. Synthesizing the Kill Chain: A Zero-Shot Framework for Target Verification and Tactical Reasoning on the Edge
Category: Research/Paper · Source: arxiv_search · Score: 80 · Author: Jesse Barkley · Time: 2026-02-10T23:00:19Z
Deploying autonomous edge robotics in dynamic military environments is constrained by both scarce domain-specific training data and the computational limits of…
- Deploying autonomous edge robotics in dynamic military environments is constrained by both scarce domain-specific training data and the com…
- This paper introduces a hierarchical, zero-shot framework that cascades lightweight object detection with compact Vision-Language Models (V…
- Grounding DINO serves as a high-recall, text-promptable region proposer, and frames with high detection confidence are passed to edge-class…
- We evaluate this pipeline on 55 high-fidelity synthetic videos from Battlefield 6 across three tasks: false-positive filtering (up to 100% …5%), and fine-grained vehicle classification (55-90%)
- We further extend the pipeline into an agentic Scout-Commander workflow, achieving 100% correct asset deployment and a 9…
#arXiv #paper #Research
19. Physically Interpretable AlphaEarth Foundation Model Embeddings Enable LLM-Based Land Surface Intelligence
Category: Research/Paper · Source: arxiv_search · Score: 78 · Author: Mashrekur Rahman · Time: 2026-02-10T22:58:50Z
Satellite foundation models produce dense embeddings whose physical interpretability remains poorly understood, limiting their integration into environmental d…
- Satellite foundation models produce dense embeddings whose physical interpretability remains poorly understood, limiting their integration …
- Using 12.1 million samples across the Continental United States (2017--2023), we first present a comprehensive interpretability analysis of Google A…
- Combining linear, nonlinear, and attention-based methods, we show that individual embedding dimensions map onto specific land surface prope… ($R^2 = 0.90$; temperature and elevation approach $R^2 = 0.97$)
#arXiv #paper #Research
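The linear analysis mentioned above amounts to probing embedding dimensions with a regularized linear model against a physical target. A self-contained ridge-probe sketch on synthetic stand-in data (illustrative only; these are not AlphaEarth embeddings, and the target is a made-up "temperature"):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 2000, 64
emb = rng.normal(size=(n, d))       # stand-in for per-pixel foundation-model embeddings
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.0, 0.5]       # only a few dimensions carry the signal
target = emb @ w_true + 0.1 * rng.normal(size=n)   # synthetic "temperature"

# Ridge probe in closed form: w = (X^T X + lam * I)^{-1} X^T y
lam = 1e-2
w = np.linalg.solve(emb.T @ emb + lam * np.eye(d), emb.T @ target)
pred = emb @ w
r2 = 1.0 - np.sum((target - pred) ** 2) / np.sum((target - target.mean()) ** 2)
```

A high probe $R^2$ with weight mass concentrated on a few dimensions is the kind of evidence that individual dimensions map onto specific properties.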
20. Learning Self-Interpretation from Interpretability Artifacts: Training Lightweight Adapters on Vector-Label Pairs
Category: Research/Paper · Source: arxiv_search · Score: 75 · Author: Keenan Pepper · Time: 2026-02-10T22:50:02Z
Self-interpretation methods prompt language models to describe their own internal states, but remain unreliable due to hyperparameter sensitivity. We show that…
- Self-interpretation methods prompt language models to describe their own internal states, but remain unreliable due to hyperparameter sensi…
- We show that training lightweight adapters on interpretability artifacts, while keeping the LM entirely frozen, yields reliable self-interp…
- A scalar affine adapter with just $d_\text{model}+1$ parameters suffices: trained adapters generate sparse autoencoder feature labels that …
- The learned bias vector alone accounts for 85% of improvement, and simpler adapters generalize better than more expressive alternatives
- Controlling for model knowledge via prompted descriptions, we find self-interpretation gains outpace capability gains from 7B to 72B parame…
- Our results demonstrate that self-interpretation improves with scale, without modifying the model being interpreted
#arXiv #paper #Research
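The bullets describe the smallest adapter as scalar affine with $d_\text{model}+1$ parameters, which I read as one shared scale plus a per-dimension bias. A minimal sketch of that parameterization (training loop, SAE feature labels, and the frozen LM are omitted; the class name is my own):

```python
import numpy as np

class ScalarAffineAdapter:
    """One shared scalar scale plus a per-dimension bias vector:
    d_model + 1 trainable parameters in total, with the LM kept frozen."""
    def __init__(self, d_model):
        self.scale = 1.0                  # 1 parameter
        self.bias = np.zeros(d_model)     # d_model parameters
    def __call__(self, activation):
        return self.scale * activation + self.bias
    def num_params(self):
        return 1 + self.bias.size

adapter = ScalarAffineAdapter(d_model=4096)
v = np.random.default_rng(3).normal(size=4096)
out = adapter(v)   # identity at init: scale = 1, bias = 0
```

The tiny parameter count is the point: per the summary, the learned bias vector alone accounts for most of the improvement, and simpler adapters generalize better than more expressive ones.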