Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

Meulemans et al.

To repeat or not to repeat: Insights from scaling LLM under token-crisis

Xue et al.

IACS-LRILT: Machine Translation for Low-Resource Indic Languages

Suman et al.

A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation

Fel et al.

Beyond Invariance: Test-Time Label-Shift Adaptation for Addressing "Spurious" Correlations

Sun et al.

Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization

Grinsztajn et al.

Speculative Decoding with Big Little Decoder

Kim et al.

Blockwise Parallel Transformers for Large Context Models

Liu & Abbeel

Combinatorial Optimization with Policy Adaptation using Latent Space Search

Chalumeau et al.

Recurrent Linear Transformers


Octopus: A Multitask Model and Toolkit for Arabic Natural Language Generation

Elmadany, Nagoudi & Abdul-Mageed

Cache me if you Can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models

Stogiannidis et al.

Kunstig intelligens og Nasjonalbiblioteket [Artificial Intelligence and the National Library]


GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models

robson & Ioannidis

NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration

Sridhar et al.

TabLib: A Dataset of 627M Tables with Context

Eggert et al.

3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers

Chen et al.

Lemur: Harmonizing Natural Language and Code for Language Agents

Xu et al.

FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning

Xu et al.

Ring Attention with Blockwise Transformers for Near-Infinite Context

Liu, Zaharia & Abbeel

A Foundational Large Language Model for Edible Plant Genomes

Mendoza-Revilla et al.


Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners

Yadav et al.

SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design


CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a \$10,000 Budget; An Extra \$4,000 Unlocks 81.8% Accuracy

Li, Wang & Xie

SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate through Compiler Co-design


AmnioML: Amniotic Fluid Segmentation and Volume Prediction with Uncertainty Quantification

Csillag et al.

Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression

Raventós et al.

Script, Language, and Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Lee, Lee & Hwang

ViNT: A Foundation Model for Visual Navigation

Shah et al.

Long-range Language Modeling with Self-retrieval

Rubin & Berant

Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent

Lin et al.

Understanding and Mitigating Hardware Failures in Deep Learning Training Systems

He et al.

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

Bonnet et al.

Resolution based Incremental Scaling Methodology for CNNs

Lim, Lee & Ha

Anticipatory Music Transformer

Thickstun et al.

HunEmBERT: a fine-tuned BERT-model for classifying sentiment and emotion in political communication

Üveges & Ring

SqueezeLLM: Dense-and-Sparse Quantization

Kim et al.

RoBERTweet: A BERT Language Model for Romanian Tweets

Tăiatu et al.

Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces

Boige et al.

Performance-optimized deep neural networks are evolving into worse models of inferotemporal visual cortex

Linsley et al.

Adversarial alignment: Breaking the trade-off between the strength of an attack and its relevance to human perception

Linsley et al.

LexGPT 0.1: pre-trained GPT-J models with Pile of Law


Unifying (Machine) Vision via Counterfactual World Modeling

Bear et al.

Extracting Reward Functions from Diffusion Models

Nuti, Franzmeyer & Henriques

Masked Autoencoders with Multi-Window Attention Are Better Audio Learners

Yadav et al.

Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

Subramanian et al.


Full Stack Optimization of Transformer Inference

Kim et al.

Temporally Consistent Transformers for Video Generation

Yan et al.

Hardware Software Co-design and Architectural Optimization of Deep Learning Models for Natural Language Processing

Wattanawong & Keutzer

Blockwise Parallel Transformer for Long Context Large Models

Liu & Abbeel

Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation

Minixhofer, Pfeiffer & Vulic

Self Information Update for Large Language Models through Mitigating Exposure Bias

Yu & Ji

Emergent Agentic Transformer from Chain of Hindsight Experience

Liu & Abbeel

The False Promise of Imitating Proprietary LLMs

Gudibande et al.

Beyond Model Efficiency: Data Optimizations for Machine Learning Systems


GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP

Khondaker et al.

BERT を用いた日本語の意味変化の分析 [Analysis of Semantic Change in Japanese Using BERT]


Dolphin: A Challenging and Diverse Benchmark for Arabic NLG

Nagoudi et al.

Difference-Masking: Choosing What to Mask in Continued Pretraining

Wilf et al.

Video Prediction Models as Rewards for Reinforcement Learning

Escontrela et al.

Exploring Large Language Models for Classical Philology

Riemenschneider & Frank

CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models

Minixhofer, Pfeiffer & Vulic

Training Diffusion Models with Reinforcement Learning

Black et al.

An Inverse Scaling Law for CLIP Training

Li, Wang & Xie

Varta: A Large-Scale Headline-Generation Dataset for Indic Languages

Aralikatte et al.

Harnessing the Power of BERT in the Turkish Clinical Domain: Pretraining Approaches for Limited Data Scenarios

Türkmen et al.

Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation

Kirstain et al.

Enriching Biomedical Knowledge for Low-resource Language Through Large-Scale Translation

Phan et al.

Poisoning Language Models During Instruction Tuning

Wan et al.



Random Sharpness-Aware Minimization

Liu et al.

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

Singh et al.

Learning Probabilistic Models from Generator Latent Spaces with Hat EBM

Hill et al.

Pruning's Effect on Generalization Through the Lens of Training and Regularization

Jin et al.

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition

Gandhi, von Platen & Rush

Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models

Maroudas et al.

Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models

Liu et al.

MetaFormer Baselines for Vision

Yu et al.

Do Language Models Understand Measurements?

Park, Ryu & Choi

BioBERTurk: Exploring Turkish Biomedical Language Model Development Strategies in Low Resource Setting

Türkmen et al.

A Comprehensive Analysis of Subword Tokenizers for Morphologically Rich Languages


Optimizing Hierarchical Image VAEs for Sample Quality

Luhman & Luhman

MTet: Multi-domain Translation for English and Vietnamese

Ngo et al.

Integrative dissection of gene regulatory elements at base resolution

Chen et al.

EleutherAI: Going Beyond “Open Science” to “Science in the Open”

Phang et al.

IndoLib: A Natural Language Processing Toolkit for Low-Resource South Asian Languages


Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

Kumar et al.

ConserWeightive Behavioral Cloning for Reliable Offline Reinforcement Learning

Nguyen, Zheng & Grover

An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification

Chalkidis et al.

Population-Based Reinforcement Learning for Combinatorial Optimization

Grinsztajn, Furelos-Blanco & Barrett

Temporally Consistent Video Transformer for Long-Term Video Prediction

Yan et al.


Divide to adapt: Mitigating confirmation bias for domain adaptation of black-box predictors

Yang et al.

FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders

Wang et al.

Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction

Kolluru et al.

Describing Differences between Text Distributions with Natural Language

Zhong et al.

Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

Notin et al.

EBM Life Cycle: MCMC Strategies for Synthesis, Defense, and Density Modeling

Hill et al.

hmBERT: Historical Multilingual Language Models for Named Entity Recognition

Schweter et al.

Multimodal Masked Autoencoders Learn Transferable Representations

Geng et al.

Zero-Shot and Few-Shot Learning for Lung Cancer Multi-Label Classification using Vision Transformer

Guo & Fan

Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval

Gao & Callan

Inception Transformer

Si et al.

BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla

Bhattacharjee et al.

Semi-self-supervised Automated ICD Coding

Hlynsson et al.



Generating Disentangled Arguments With Prompts: A Simple Event Extraction Framework That Works

Si et al.

Multilingual multi-aspect explainability analyses on machine reading comprehension models

Cui et al.

ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language

Phan et al.

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

Liu et al.

Long Document Re-ranking with Modular Re-ranker

Gao & Callan

Odor Descriptor Understanding through Prompting


On the Design of 2D Human Pose Estimation Networks using Accelerated Neuroevolution and Novel Keypoint Representations




Revisiting transposed convolutions for interpreting raw waveform sound event recognition CNNs by sonification

Yadav & Foster

Training on Test Data with Bayesian Adaptation for Covariate Shift

Zhou & Levine

O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information

Soares et al.

JAX vs PyTorch: A simple transformer benchmark


The Challenge of Appearance-Free Object Tracking with Feedforward Neural Networks

Malik et al.

AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation

Nagoudi, Elmadany & Abdul-Mageed

Clustering Monolingual Vocabularies to Improve Cross-Lingual Generalization


Pretrained Neural Models for Turkish Text Classification

Okur & Sertbaş

Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations

Araujo et al.

BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation

Xu, Van Durme & Murray

ReasonBERT: Pre-trained to Reason with Distant Supervision

Deng et al.



Performance of chemical structure string representations for chemical image recognition using transformers

Rajan, Zielesny & Steinbeck

An Approach to Extractive Bangla Question Answering Based On BERT-Bangla And BQuAD

Saha et al.

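TRC로 월 몇만원에 GPU 수십개급의.. TPU 사용 가능 [With TRC, TPUs Equivalent to Dozens of GPUs.. Available for a Few Tens of Thousands of Won per Month]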


Characterizing Possible Failure Modes in Physics-Informed Neural Networks

Krishnapriyan et al.

An Empirical Exploration in Quality Filtering of Text Data



Contextualized Query Embeddings for Conversational Search

Lin, Yang & Lin

DECIMER1.0: Deep Learning for Chemical Image Recognition using Transformers

Rajan, Zielesny & Steinbeck

Clinical BERT Models Trained on Pseudo Re-identified MIMIC-III Notes

Lehman et al.

Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model

Kummervold et al.

Categorising Vaccine Confidence with Transformer-Based Machine Learning Model: The Nuances of Vaccine Sentiment on Twitter

Kummervold et al.

City-Scale Simulation Of Covid-19 Pandemic & Intervention Policies Using Agent-Based Modelling

Suryawanshi et al.

CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing

Elnaggar et al.

Arabic Compact Language Modelling for Resource Limited Devices

Alyafeai & Ahmad

Igor Ivanov: Harnessing Machine Learning Skills to Reduce Damages from Tropical Storms

Radiant Earth Foundation

Computer Vision and Deep Learning for Environment-Adaptive Control of Robotic Lower-Limb Exoskeletons

Laschowski et al.

InAugment: Improving Classifiers via Internal Augmentation

Arar, Shamir & Bermano

IndT5: A Text-to-Text Transformer for 10 Indigenous Languages

Nagoudi et al.

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

Ramesh et al.

Self-Supervised Representation Learning with Relative Predictive Coding

Tsai et al.

Virtual Sensing and Sensors Selection for Efficient Temperature Monitoring in Indoor Environments

Brunello et al.




Don't see your TRC-supported work here?

Please let us know about it by filling out this short form.