每日信息看板 · 2026-02-17

Generated 2026-02-18 18:57 · Items 17
17
Items
2
Categories
2
Sources
8
LLM Calls
12156
LLM Tokens
0
Cost (USD)
ce2f0bba6b8b4195abaf7e8c69cbdd8c
Run ID

Daily Focus

每日看板 · 2026-02-17 · 2026-02-18 08:54 · Open
Issues: 3Reports: 3Day: 0m
  • 未记录具体事项
  • 提醒补充关键经历
  • 建议采用三条简记法
P0(先做)
AI Vibe Coding 规范化:流程、工具与复刻思路
Next: 选一个可量化的小功能按报告 Playbook 跑通一轮,并用“成本/质量/可审阅性”三指标决定订阅或自建方案。
Clawdbot(OpenClaw)实现用例与复现排障指南
Next: 把你当前的配置文件(脱敏)+启动日志+平台回调报错贴出,我按 Playbook 逐步定位卡点

按分类

研究/论文14开源项目3

按来源

arxiv_search14github_search3
1. mudler/LocalAI
分类:开源项目来源:github_search分数:100作者:mudler时间:2026-02-17T23:14:42Z
LocalAI 是一个可本地部署、兼容 OpenAI/Anthropic 等接口的开源推理平台,支持文本/图像/音频与多硬件后端,重要性在于让开发者以较低成本在本地或私有环境构建可控的 AI 服务。
  • 定位为 OpenAI 的开源替代方案,提供可直接替换的 REST API,并强调本地与 on-prem 推理能力。
  • 支持多模态能力与多模型家族(LLM、图像生成、语音等),且可在消费级硬件运行,部分场景不依赖 GPU。
  • 提供 Docker/容器化与 macOS 安装方式,并具备自动后端检测以适配 NVIDIA/AMD/Intel 等 GPU 环境。
  • 项目在 2025-2026 持续高频迭代,新增 MCP、Realtime API、Anthropic 支持、统一 GPU 后端及多项新语音后端。
  • 已形成 LocalAI 家族生态(如 LocalAGI、LocalRecall、Cogito、Wiz、SkillServer),面向 agent 与工作流扩展。
#GitHub #repo #开源项目 #LocalAI #Docker #MCP #Agent
2. carla-simulator/carla
分类:开源项目来源:github_search分数:32作者:carla-simulator时间:2026-02-17T23:18:58Z
CARLA在GitHub提供面向自动驾驶研发的开源仿真平台,并推进UE5.5开发分支,支持多传感器与场景验证,对自动驾驶训练、测试和生态集成具有关键价值。
  • CARLA是用于自动驾驶开发、训练与验证的开源仿真器,提供代码、协议和可自由使用的城市场景资产。
  • 当前仓库为UE5.5开发分支(ue5-dev),与UE4.26分支并行且差异显著,需按需求选择版本。
  • UE5.5版本明确要求Ubuntu 22.04或Windows 11,不支持Ubuntu 20.04及Windows 10及以下系统。
  • 项目提供完整文档与构建流程(Linux/Windows)、Python API、蓝图库和资产目录,便于二次开发。
  • 生态包含Leaderboard、Scenario Runner、ROS桥接、强化学习与AutoWare等配套仓库,覆盖评测到集成落地。
#GitHub #repo #开源项目 #CARLA #Unreal Engine 5.5 #Python API #ROS
3. Lightning-AI/litgpt
分类:开源项目来源:github_search分数:31作者:Lightning-AI时间:2026-02-17T22:53:51Z
Lightning AI 开源 LitGPT 项目,提供20+主流LLM从预训练、微调到评测与部署的一体化高性能流程,因其低抽象与高可控设计可显著降低训练推理成本并加速生产落地。
  • 项目主打“从零实现、无抽象层”,强调可调试性、性能与企业级可控性。
  • 支持20+模型族与多尺寸版本(如Llama、Qwen、Gemma、Phi、Mistral等),覆盖广泛场景。
  • 提供完整工作流:pretrain、continued pretrain、finetune、evaluate、deploy、test。
  • 集成Flash Attention、FSDP、LoRA/QLoRA/Adapter与fp4/8/16/32量化,支持低显存GPU。
  • 支持从单卡到1000+ GPUs/TPUs扩展,并提供已验证的YAML训练配方。
  • 采用Apache 2.0许可证,便于企业商业化使用与二次开发。
#GitHub #repo #开源项目 #LitGPT #Lightning AI #LLM #PyTorch
4. Ensemble-size-dependence of deep-learning post-processing methods that minimize an (un)fair score: motivating examples and a proof-of-concept solution
分类:研究/论文来源:arxiv_search分数:100作者:Christopher David Roberts时间:2026-02-17T18:59:55Z
该论文揭示最小化aCRPS的部分后处理与深度学习方法会因成员依赖导致集合规模敏感和过度离散,并提出轨迹Transformer在不同训练与推理集合规模下仍能稳健提升或保持预报可靠性。
  • aCRPS在成员可交换且条件独立时对集合规模公平无偏,但结构化成员依赖会破坏这一前提。
  • 作者用线性逐成员校准与基于集合维自注意力的深度学习两种方法展示:表面aCRPS改进可能伴随系统性不可靠(过度离散)。
  • 提出trajectory transformers:将PoET思路改为沿提前期做自注意力,避免在集合维引入依赖,从而保留aCRPS所需条件独立性。
  • 在ECMWF次季节系统的周平均2米气温预报上,该方法可减小系统偏差,并在不同训练规模(3/9)和实时规模(9/100)下保持或提升可靠性。
#arXiv #paper #研究/论文 #aCRPS #Transformer #ECMWF
5. Operationalising the Superficial Alignment Hypothesis via Task Complexity
分类:研究/论文来源:arxiv_search分数:98作者:Tomás Vergara-Browne时间:2026-02-17T18:59:39Z
该论文以“任务复杂度”形式化表层对齐假说并实证表明:预训练已蕴含高性能能力,而后训练将调用这些能力所需程序长度从GB级大幅压缩到KB级,这对理解模型适配成本与训练分工很关键。
  • 提出任务复杂度指标:达到目标性能所需最短程序长度,用于精确定义SAH。
  • 将既有支持SAH的不同论证统一为“寻找更短任务程序”的不同策略。
  • 在数学推理、机器翻译和指令跟随任务上估计复杂度,发现依赖预训练模型时可非常低。
  • 实验显示仅靠预训练虽可达到强性能,但访问这些性能可能需要GB级程序。
  • 后训练可将达到同等性能的复杂度降低数个数量级,常仅需几KB信息。
#arXiv #paper #研究/论文 #Superficial Alignment Hypothesis
6. Dex4D: Task-Agnostic Point Track Policy for Sim-to-Real Dexterous Manipulation
分类:研究/论文来源:arxiv_search分数:95作者:Yuxuan Kuang时间:2026-02-17T18:59:31Z
Learning generalist policies capable of accomplishing a plethora of everyday tasks remains an open challenge in dexterous manipulation. In particular, collecti…
  • Learning generalist policies capable of accomplishing a plethora of everyday tasks remains an open challenge in dexterous manipulation
  • In particular, collecting large-scale manipulation data via real-world teleoperation is expensive and difficult to scale
  • While learning in simulation provides a feasible alternative, designing multiple task-specific environments and rewards for training is sim…
  • We propose Dex4D, a framework that instead leverages simulation for learning task-agnostic dexterous skills that can be flexibly recomposed…
  • Specifically, Dex4D learns a domain-agnostic 3D point track conditioned policy capable of manipulating any object to any desired pose
  • We train this 'Anypose-to-Anypose' policy in simulation across thousands of objects with diverse pose configurations, covering a broad spac…
#arXiv #paper #研究/论文
7. Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
分类:研究/论文来源:arxiv_search分数:92作者:Zhen Wu时间:2026-02-17T18:59:11Z
While recent advances in humanoid locomotion have achieved stable walking on varied terrains, capturing the agility and adaptivity of highly dynamic human moti…
  • While recent advances in humanoid locomotion have achieved stable walking on varied terrains, capturing the agility and adaptivity of highl…
  • In particular, agile parkour in complex environments demands not only low-level robustness, but also human-like motion expressiveness, long…
  • In this paper, we present Perceptive Humanoid Parkour (PHP), a modular framework that enables humanoid robots to autonomously perform long-…
  • Our approach first leverages motion matching, formulated as nearest-neighbor search in a feature space, to compose retargeted atomic human …
  • This framework enables the flexible composition and smooth transition of complex skill chains while preserving the elegance and fluidity of…
  • Next, we train motion-tracking reinforcement learning (RL) expert policies for these composed motions, and distill them into a single depth…
#arXiv #paper #研究/论文
8. CrispEdit: Low-Curvature Projections for Scalable Non-Destructive LLM Editing
分类:研究/论文来源:arxiv_search分数:90作者:Zarif Ikram时间:2026-02-17T18:58:04Z
A central challenge in large language model (LLM) editing is capability preservation: methods that successfully change targeted behavior can quietly game the e…
  • A central challenge in large language model (LLM) editing is capability preservation: methods that successfully change targeted behavior ca…
  • We present CrispEdit, a scalable and principled second-order editing algorithm that treats capability preservation as an explicit constrain…
  • CrispEdit formulates editing as constrained optimization and enforces the constraint by projecting edit updates onto the low-curvature subs…
  • At the crux of CrispEdit is expressing capability constraint via Bregman divergence, whose quadratic form yields the Gauss-Newton Hessian e…
  • We make this second-order procedure efficient at the LLM scale using Kronecker-factored approximate curvature (K-FAC) and a novel matrix-fr…
  • Across standard model-editing benchmarks, CrispEdit achieves high edit success while keeping capability degradation below 1% on average acr…
#arXiv #paper #研究/论文
9. Stabilizing Test-Time Adaptation of High-Dimensional Simulation Surrogates via D-Optimal Statistics
分类:研究/论文来源:arxiv_search分数:88作者:Anna Zimmel时间:2026-02-17T18:55:18Z
Machine learning surrogates are increasingly used in engineering to accelerate costly simulations, yet distribution shifts between training and deployment ofte…
  • Machine learning surrogates are increasingly used in engineering to accelerate costly simulations, yet distribution shifts between training…
  • g
  • , unseen geometries or configurations)
  • Test-Time Adaptation (TTA) can mitigate such shifts, but existing methods are largely developed for lower-dimensional classification with s…
  • We address this challenge by proposing a TTA framework based on storing maximally informative (D-optimal) statistics, which jointly enables…
  • When applied to pretrained simulation surrogates, our method yields up to 7% out-of-distribution improvements at negligible computational c…
#arXiv #paper #研究/论文
10. Solving Parameter-Robust Avoid Problems with Unknown Feasibility using Reinforcement Learning
分类:研究/论文来源:arxiv_search分数:85作者:Oswin So时间:2026-02-17T18:53:31Z
Recent advances in deep reinforcement learning (RL) have achieved strong results on high-dimensional control tasks, but applying RL to reachability problems ra…
  • Recent advances in deep reinforcement learning (RL) have achieved strong results on high-dimensional control tasks, but applying RL to reac…
  • This mismatch can result in policies that perform poorly on low-probability states that are still within the safe set
  • A natural alternative is to frame the problem as a robust optimization over a set of initial conditions that specify the initial state, dyn…
  • We propose Feasibility-Guided Exploration (FGE), a method that simultaneously identifies a subset of feasible initial conditions under whic…
  • Empirical results demonstrate that FGE learns policies with over 50% more coverage than the best existing method for challenging initial co…
#arXiv #paper #研究/论文
11. Developing AI Agents with Simulated Data: Why, what, and how?
分类:研究/论文来源:arxiv_search分数:82作者:Xiaoran Liu时间:2026-02-17T18:53:27Z
As insufficient data volume and quality remain the key impediments to the adoption of modern subsymbolic AI, techniques of synthetic data generation are in hig…
  • As insufficient data volume and quality remain the key impediments to the adoption of modern subsymbolic AI, techniques of synthetic data g…
  • Simulation offers an apt, systematic approach to generating diverse synthetic data
  • This chapter introduces the reader to the key concepts, benefits, and challenges of simulation-based synthetic data generation for AI train…
#arXiv #paper #研究/论文 #Agent
12. Avey-B
分类:研究/论文来源:arxiv_search分数:80作者:Devang Acharya时间:2026-02-17T18:50:40Z
Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets. Their effectiveness stems from self-att…
  • Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets
  • Their effectiveness stems from self-attention's ability to deliver high-quality bidirectional contextualization with sequence-level paralle…
  • Recently, Avey was introduced as an autoregressive, attention-free alternative that naturally admits an encoder-only adaptation
  • In this paper, we reformulate Avey for the encoder-only paradigm and propose several innovations to its architecture, including decoupled s…
  • Results show that this reformulated architecture compares favorably to four widely used Transformer-based encoders, consistently outperform…
#arXiv #paper #研究/论文
13. Task-Agnostic Continual Learning for Chest Radiograph Classification
分类:研究/论文来源:arxiv_search分数:78作者:Muthu Subash Kavitha时间:2026-02-17T18:47:30Z
Clinical deployment of chest radiograph classifiers requires models that can be updated as new datasets become available without retraining on previously ob- s…
  • Clinical deployment of chest radiograph classifiers requires models that can be updated as new datasets become available without retraining…
  • We study, for the first time, a task-incremental continual learning setting for chest radiograph classification, in which heterogeneous che…
  • We propose a continual adapter-based routing learning strategy for Chest X-rays (CARL-XRay) that maintains a fixed high-capacity backbone a…
  • A latent task selector operates on task-adapted features and leverages both current and historical context preserved through compact protot…
  • This design supports stable task identification and adaptation across sequential updates while avoiding raw-image storage
  • Experiments on large-scale public chest radiograph datasets demonstrate robust performance retention and reliable task-aware inference unde…
#arXiv #paper #研究/论文
14. Decision Quality Evaluation Framework at Pinterest
分类:研究/论文来源:arxiv_search分数:75作者:Yuqi Tian时间:2026-02-17T18:45:55Z
Online platforms require robust systems to enforce content safety policies at scale. A critical component of these systems is the ability to evaluate the quali…
  • Online platforms require robust systems to enforce content safety policies at scale
  • A critical component of these systems is the ability to evaluate the quality of moderation decisions made by both human agents and Large La…
  • However, this evaluation is challenging due to the inherent trade-offs between cost, scale, and trustworthiness, along with the complexity …
  • To address this, we present a comprehensive Decision Quality Evaluation Framework developed and deployed at Pinterest
  • The framework is centered on a high-trust Golden Set (GDS) curated by subject matter experts (SMEs), which serves as a ground truth benchma…
  • We introduce an automated intelligent sampling pipeline that uses propensity scores to efficiently expand dataset coverage
#arXiv #paper #研究/论文
15. The Geometry of Alignment Collapse: When Fine-Tuning Breaks Safety
分类:研究/论文来源:arxiv_search分数:72作者:Max Springer时间:2026-02-17T18:39:15Z
Fine-tuning aligned language models on benign tasks unpredictably degrades safety guardrails, even when training data contains no harmful content and developer…
  • Fine-tuning aligned language models on benign tasks unpredictably degrades safety guardrails, even when training data contains no harmful c…
  • We show that the prevailing explanation, that fine-tuning updates should be orthogonal to safety-critical directions in high-dimensional pa…
  • We then resolve this through a novel geometric analysis, proving that alignment concentrates in low-dimensional subspaces with sharp curvat…
  • While initial fine-tuning updates may indeed avoid these subspaces, the curvature of the fine-tuning loss generates second-order accelerati…
  • We formalize this mechanism through the Alignment Instability Condition, three geometric properties that, when jointly satisfied, lead to s…
  • Our main result establishes a quartic scaling law: alignment loss grows with the fourth power of training time, governed by the sharpness o…
#arXiv #paper #研究/论文
16. Enhancing Building Semantics Preservation in AI Model Training with Large Language Model Encodings
分类:研究/论文来源:arxiv_search分数:70作者:Suhyung Jang时间:2026-02-17T18:26:36Z
Accurate representation of building semantics, encompassing both generic object types and specific subtypes, is essential for effective AI model training in th…
  • Accurate representation of building semantics, encompassing both generic object types and specific subtypes, is essential for effective AI …
  • Conventional encoding methods (e
  • g
  • , one-hot) often fail to convey the nuanced relationships among closely related subtypes, limiting AI's semantic comprehension
  • To address this limitation, this study proposes a novel training approach that employs large language model (LLM) embeddings (e
  • , OpenAI GPT and Meta LLaMA) as encodings to preserve finer distinctions in building semantics
#arXiv #paper #研究/论文
17. This human study did not involve human subjects: Validating LLM simulations as behavioral evidence
分类:研究/论文来源:arxiv_search分数:68作者:Jessica Hullman时间:2026-02-17T18:18:38Z
A growing literature uses large language models (LLMs) as synthetic participants to generate cost-effective and nearly instantaneous responses in social scienc…
  • A growing literature uses large language models (LLMs) as synthetic participants to generate cost-effective and nearly instantaneous respon…
  • However, there is limited guidance on when such simulations support valid inference about human behavior
  • We contrast two strategies for obtaining valid estimates of causal effects and clarify the assumptions under which each is suitable for exp…
  • Heuristic approaches seek to establish that simulated and observed human behavior are interchangeable through prompt engineering, model fin…
  • While useful for many exploratory tasks, heuristic approaches lack the formal statistical guarantees typically required for confirmatory re…
  • In contrast, statistical calibration combines auxiliary human data with statistical adjustments to account for discrepancies between observ…
#arXiv #paper #研究/论文