Avey-B
Daily Information Dashboard · 2026-02-17
2026-02-17T18:50:40Z
Published
AI Summary
- Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets
- Their effectiveness stems from self-attention's ability to deliver high-quality bidirectional contextualization with sequence-level parallelism, as popularized by BERT-style architectures
- Recently, Avey was introduced as an autoregressive, attention-free alternative that naturally admits an encoder-only adaptation
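- In this paper, we reformulate Avey for the encoder-only paradigm and propose several innovations to its architecture, including decoupled static and dynamic parameterizations, stability-oriented normalization, and neural compression
- Results show that this reformulated architecture compares favorably to four widely used Transformer-based encoders, consistently outperforming them on standard token-classification and information-retrieval benchmarks while scaling more efficiently to long contexts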
#arXiv #paper #research/paper
Content Excerpt
Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets. Their effectiveness stems from self-attention's ability to deliver high-quality bidirectional contextualization with sequence-level parallelism, as popularized by BERT-style architectures. Recently, Avey was introduced as an autoregressive, attention-free alternative that naturally admits an encoder-only adaptation. In this paper, we reformulate Avey for the encoder-only paradigm and propose several innovations to its architecture, including decoupled static and dynamic parameterizations, stability-oriented normalization, and neural compression. Results show that this reformulated architecture compares favorably to four widely used Transformer-based encoders, consistently outperforming them on standard token-classification and information-retrieval benchmarks while scaling more efficiently to long contexts.