Publications

2024

October

Birdie: Advancing State Space Language Modeling with Dynamic Mixtures of Training Objectives

Blouir et al.

COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences

Liu et al.

COADREADx: A comprehensive algorithmic dissection of colorectal cancer unravels salient biomarkers and actionable insights into its discrete progression

Palaniappan, Muthamilselvan & Sarathi

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Miranda et al.

From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages

Kiulian et al.

Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance

Nakamoto et al.

Cliqueformer: Model-Based Optimization with Structured Transformers

Kuba, Abbeel & Levine

Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging

Morrison et al.

Beyond Oversmoothing: Evaluating DDPM and MSE for Scalable Speech Synthesis in ASR

Minixhofer, Klejch & Bell

One Step Diffusion via Shortcut Models

Frans et al.

AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning

Pramanik et al.

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

Jiang et al.

ACER: Automatic Language Model Context Extension via Retrieval

Gao, Zhang & Callan

A community effort to optimize sequence-based deep learning models of gene regulation

Rafi et al.

Conic Activation Functions

Fu & Cohen

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Ivison et al.

ElasticTok: Adaptive Tokenization for Image and Video

Yan et al.

HaloClass: Salt-Tolerant Protein Classification with Protein Language Models

Narang et al.

Temperature Optimization for Bayesian Deep Learning

Ng et al.

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

Wang et al.

Distilling an End-to-End Voice Assistant Without Instruction Training Data

Held et al.

Tracking objects that change in appearance with phase synchrony

Muzellec et al.

June

BulkRNABert: Cancer prognosis from bulk RNA-seq based language models

Gélard et al.

Cambrian-1: A fully open, vision-centric exploration of multimodal LLMs

Tong et al.

Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization

Chalumeau et al.

Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation

Frohmann et al.

SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization

Ahn et al.

Autoregressive Image Generation without Vector Quantization

Li et al.

Generative Visual Instruction Tuning

Hernandez, Villegas & Ordonez

Loss landscape geometry reveals stagewise development of transformers

Wang et al.

ContrastiveMix: Overcoming Code-Mixing Dilemma in Cross-Lingual Transfer for Information Retrieval

Do, Lee & Hwang

COMMIT: Code-Mixing English-Centric Large Language Model for Multilingual Instruction Tuning

Lee, Jung & Hwang

Re-evaluating the Need for Visual Signals in Unsupervised Grammar Induction

Li et al.

ScriptMix: Mixing Scripts for Low-resource Language Parsing

Lee, Lee & Hwang

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Ivison et al.

Pecan: Cost-Efficient ML Data Preprocessing with Automatic Transformation Ordering and Hybrid Placement

Graur et al.

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training

Gao et al.

NAFlora-1M: Continental-Scale High-Resolution Fine-Grained Plant Classification Dataset

Park et al.

Medical Vision Generalist: Unifying Medical Imaging Tasks in Context

Ren et al.

SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Dagli et al.

Robust Modeling through Causal Priors and Data Purification in Machine Learning

Bhat

EvIL: Evolution Strategies for Generalisable Imitation Learning

Sapora et al.

Learning to Explore for Stochastic Gradient MCMC

Kim et al.

March

Text Filtering Classifiers for Medium-Resource Languages

Daðason & Loftsson

Comparing human and machine visual perception

Veerabadran

DeepFake Video Detection using Vision Transformer

Hussien & Mohamed

IT5: Text-to-text Pretraining for Italian Language Understanding and Generation

Sarti & Nissim

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Lee et al.

BC-Predict: Mining of signal biomarkers and multilevel validation of cascade classifier for early-stage breast cancer subtyping and prognosis

Muthamilselvan & Palaniappan

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Khazatsky et al.

Deep Manifold Learning for Reading Comprehension and Logical Reasoning Tasks with Polytuplet Loss

Lu & Rodriguez

MicroDiffusion: Implicit Representation-Guided Diffusion for 3D Reconstruction from Limited 2D Microscopy Projections

Hui et al.

MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling

Limisiewicz et al.

Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform

Huang et al.

Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages

van Noord et al.

IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages

Khan et al.

Advancements in Hand-Drawn Chemical Structure Recognition through an Enhanced DECIMER Architecture

Rajan et al.

Eyes wide shut? Exploring the visual shortcomings of multimodal LLMs

Tong et al.

Can a Confident Prior Replace a Cold Posterior?

Marek, Paige & Izmailov

Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models

Li et al.

2023

December

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Lu et al.

BantuLM: Enhancing Cross-Lingual Learning in the Bantu Language Family

Mohamed et al.

Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models

Tamber, Pradeep & Lin

Discovering modular solutions that generalize compositionally

Schug et al.

Diffusion Models With Learned Adaptive Noise

Sahoo et al.

NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation

Thakur et al.

OCaTS: an Online Cost-aware Teacher-Student Framework to Reduce the Calls to Large Language Models

Stogiannidis

Understanding Physical Dynamics with Counterfactual World Modeling

Venkatesh et al.

CogMemLM: Human-Like Memory Mechanisms Improve Performance and Cognitive Plausibility of LLMs

Thoma et al.

Compress & Align: Curating Image-Text Data with Human Knowledge

Zhang et al.

Better Quality Pre-training Data and T5 Models for African Languages

Oladipo et al.

Multilingual Lottery Tickets to Pretrain Language Models

Lee & Hwang

JASMINE: Arabic GPT Models for Few-Shot Learning

Nagoudi et al.

Dolphin: A Challenging and Diverse Benchmark for Arabic NLG

Nagoudi et al.

An LLM Compiler for Parallel Function Calling

Kim et al.

QTSumm: Query-Focused Summarization over Tabular Data

Zhao et al.

Rejuvenating image-GPT as Strong Visual Representation Learners

Ren et al.

Sequential Modeling Enables Scalable Learning for Large Vision Models

Bai et al.

October

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

Meulemans et al.

To repeat or not to repeat: Insights from scaling LLM under token-crisis

Xue et al.

IACS-LRILT: Machine Translation for Low-Resource Indic Languages

Suman et al.

A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation

Fel et al.

Beyond Invariance: Test-Time Label-Shift Adaptation for Addressing "Spurious" Correlations

Sun et al.

Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization

Grinsztajn et al.

Speculative Decoding with Big Little Decoder

Kim et al.

Blockwise Parallel Transformers for Large Context Models

Liu & Abbeel

Proving test set contamination in black box language models

Oren et al.

Combinatorial Optimization with Policy Adaptation using Latent Space Search

Chalumeau et al.

Recurrent Linear Transformers

Pramanik

Octopus: A Multitask Model and Toolkit for Arabic Natural Language Generation

Elmadany, Nagoudi & Abdul-Mageed

Cache me if you Can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models

Stogiannidis et al.

Artificial Intelligence and the National Library of Norway

Brygfjeld

GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models

robson & Ioannidis

NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration

Sridhar et al.

TabLib: A Dataset of 627M Tables with Context

Eggert et al.

3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers

Chen et al.

Lemur: Harmonizing Natural Language and Code for Language Agents

Xu et al.

FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning

Xu et al.

Ring Attention with Blockwise Transformers for Near-Infinite Context

Liu, Zaharia & Abbeel

A Foundational Large Language Model for Edible Plant Genomes

Mendoza-Revilla et al.

June

Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners

Yadav et al.

SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design

Guo

CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a $10,000 Budget; An Extra $4,000 Unlocks 81.8% Accuracy

Li, Wang & Xie

AmnioML: Amniotic Fluid Segmentation and Volume Prediction with Uncertainty Quantification

Csillag et al.

Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression

Raventós et al.

Script, Language, and Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Lee, Lee & Hwang

ViNT: A Foundation Model for Visual Navigation

Shah et al.

Long-range Language Modeling with Self-retrieval

Rubin & Berant

Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent

Lin et al.

Understanding and Mitigating Hardware Failures in Deep Learning Training Systems

He et al.

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

Bonnet et al.

Resolution based Incremental Scaling Methodology for CNNs

Lim, Lee & Ha

Anticipatory Music Transformer

Thickstun et al.

HunEmBERT: a fine-tuned BERT-model for classifying sentiment and emotion in political communication

Üveges & Ring

SqueezeLLM: Dense-and-Sparse Quantization

Kim et al.

RoBERTweet: A BERT Language Model for Romanian Tweets

Tăiatu et al.

Gradient-Informed Quality Diversity for the Illumination of Discrete Spaces

Boige et al.

Performance-optimized deep neural networks are evolving into worse models of inferotemporal visual cortex

Linsley et al.

Adversarial alignment: Breaking the trade-off between the strength of an attack and its relevance to human perception

Linsley et al.

LexGPT 0.1: pre-trained GPT-J models with Pile of Law

Lee

Unifying (Machine) Vision via Counterfactual World Modeling

Bear et al.

Extracting Reward Functions from Diffusion Models

Nuti, Franzmeyer & Henriques

Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

Subramanian et al.

May

Full Stack Optimization of Transformer Inference

Kim et al.

Temporally Consistent Transformers for Video Generation

Yan et al.

Hardware Software Co-design and Architectural Optimization of Deep Learning Models for Natural Language Processing

Wattanawong & Keutzer

Blockwise Parallel Transformer for Long Context Large Models

Liu & Abbeel

Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation

Minixhofer, Pfeiffer & Vulić

Self Information Update for Large Language Models through Mitigating Exposure Bias

Yu & Ji

Emergent Agentic Transformer from Chain of Hindsight Experience

Liu & Abbeel

The false promise of imitating proprietary language models

Gudibande et al.

Beyond Model Efficiency: Data Optimizations for Machine Learning Systems

Kuchnik

GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP

Khondaker et al.

Analysis of Semantic Change in Japanese Using BERT

小林千真

Difference-Masking: Choosing What to Mask in Continued Pretraining

Wilf et al.

Video Prediction Models as Rewards for Reinforcement Learning

Escontrela et al.

Exploring Large Language Models for Classical Philology

Riemenschneider & Frank

CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models

Minixhofer, Pfeiffer & Vulić

Training Diffusion Models with Reinforcement Learning

Black et al.

An Inverse Scaling Law for CLIP Training

Li, Wang & Xie

Varta: A Large-Scale Headline-Generation Dataset for Indic Languages

Aralikatte et al.

Harnessing the Power of BERT in the Turkish Clinical Domain: Pretraining Approaches for Limited Data Scenarios

Türkmen et al.

Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation

Kirstain et al.

Enriching Biomedical Knowledge for Low-resource Language Through Large-Scale Translation

Phan et al.

Poisoning Language Models During Instruction Tuning

Wan et al.

2022

October

Random Sharpness-Aware Minimization

Liu et al.

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

Singh et al.

Learning Probabilistic Models from Generator Latent Spaces with Hat EBM

Hill et al.

Pruning's Effect on Generalization Through the Lens of Training and Regularization

Jin et al.

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition

Gandhi, von Platen & Rush

Legal-Tech Open Diaries: Lesson learned on how to develop and deploy light-weight models in the era of humongous Language Models

Maroudas et al.

Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models

Liu et al.

MetaFormer Baselines for Vision

Yu et al.

Do Language Models Understand Measurements?

Park, Ryu & Choi

BioBERTurk: Exploring Turkish Biomedical Language Model Development Strategies in Low Resource Setting

Türkmen et al.

A Comprehensive Analysis of Subword Tokenizers for Morphologically Rich Languages

Erkaya

Optimizing Hierarchical Image VAEs for Sample Quality

Luhman & Luhman

MTet: Multi-domain Translation for English and Vietnamese

Ngo et al.

Integrative dissection of gene regulatory elements at base resolution

Chen et al.

EleutherAI: Going Beyond “Open Science” to “Science in the Open”

Phang et al.

IndoLib: A Natural Language Processing Toolkit for Low-Resource South Asian Languages

Timalsina

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

Kumar et al.

ConserWeightive Behavioral Cloning for Reliable Offline Reinforcement Learning

Nguyen, Zheng & Grover

An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification

Chalkidis et al.

Population-Based Reinforcement Learning for Combinatorial Optimization

Grinsztajn, Furelos-Blanco & Barrett

Temporally Consistent Video Transformer for Long-Term Video Prediction

Yan et al.

May

Divide to adapt: Mitigating confirmation bias for domain adaptation of black-box predictors

Yang et al.

FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders

Wang et al.

Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction

Kolluru et al.

Describing Differences between Text Distributions with Natural Language

Zhong et al.

Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

Notin et al.

EBM Life Cycle: MCMC Strategies for Synthesis, Defense, and Density Modeling

Hill et al.

hmBERT: Historical Multilingual Language Models for Named Entity Recognition

Schweter et al.

Multimodal Masked Autoencoders Learn Transferable Representations

Geng et al.

Zero-Shot and Few-Shot Learning for Lung Cancer Multi-Label Classification using Vision Transformer

Guo & Fan

Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval

Gao & Callan

Inception Transformer

Si et al.

BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla

Bhattacharjee et al.

Semi-self-supervised Automated ICD Coding

Hlynsson et al.

xcit

Dagli

Generating Disentangled Arguments With Prompts: A Simple Event Extraction Framework That Works

Si et al.

Multilingual multi-aspect explainability analyses on machine reading comprehension models

Cui et al.

ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language

Phan et al.

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

Liu et al.

Long Document Re-ranking with Modular Re-ranker

Gao & Callan

Odor Descriptor Understanding through Prompting

Sisson

On the Design of 2D Human Pose Estimation Networks using Accelerated Neuroevolution and Novel Keypoint Representations

McNally

2021

September

Revisiting transposed convolutions for interpreting raw waveform sound event recognition CNNs by sonification

Yadav & Foster

Training on Test Data with Bayesian Adaptation for Covariate Shift

Zhou & Levine

O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information

Soares et al.

JAX vs PyTorch: A simple transformer benchmark

Nolan

The Challenge of Appearance-Free Object Tracking with Feedforward Neural Networks

Malik et al.

AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation

Nagoudi, Elmadany & Abdul-Mageed

Clustering Monolingual Vocabularies to Improve Cross-Lingual Generalization

Bassani

Pretrained Neural Models for Turkish Text Classification

Okur & Sertbaş

Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations

Araujo et al.

BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation

Xu, Van Durme & Murray

ReasonBERT: Pre-trained to Reason with Distant Supervision

Deng et al.

ChessCoach

Butner

Performance of chemical structure string representations for chemical image recognition using transformers

Rajan, Zielesny & Steinbeck

An Approach to Extractive Bangla Question Answering Based On BERT-Bangla And BQuAD

Saha et al.

With TRC, you can use TPUs on the scale of dozens of GPUs for just tens of thousands of won a month

Lee

Characterizing Possible Failure Modes in Physics-Informed Neural Networks

Krishnapriyan et al.

An Empirical Exploration in Quality Filtering of Text Data

Gao

April

Contextualized Query Embeddings for Conversational Search

Lin, Yang & Lin

DECIMER 1.0: Deep Learning for Chemical Image Recognition using Transformers

Rajan, Zielesny & Steinbeck

Clinical BERT Models Trained on Pseudo Re-identified MIMIC-III Notes

Lehman et al.

Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model

Kummervold et al.

Categorising Vaccine Confidence with Transformer-Based Machine Learning Model: The Nuances of Vaccine Sentiment on Twitter

Kummervold et al.

City-Scale Simulation of COVID-19 Pandemic & Intervention Policies Using Agent-Based Modelling

Suryawanshi et al.

CodeTrans: Towards Cracking the Language of Silicone's Code Through Self-Supervised Deep Learning and High Performance Computing

Elnaggar et al.

Arabic Compact Language Modelling for Resource Limited Devices

Alyafeai & Ahmad

Igor Ivanov: Harnessing Machine Learning Skills to Reduce Damages from Tropical Storms

Radiant Earth Foundation

Computer Vision and Deep Learning for Environment-Adaptive Control of Robotic Lower-Limb Exoskeletons

Laschowski et al.

InAugment: Improving Classifiers via Internal Augmentation

Arar, Shamir & Bermano

IndT5: A Text-to-Text Transformer for 10 Indigenous Languages

Nagoudi et al.

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

Ramesh et al.

Self-Supervised Representation Learning with Relative Predictive Coding

Tsai et al.

Virtual Sensing and Sensors Selection for Efficient Temperature Monitoring in Indoor Environments

Brunello et al.

2020

2019

2018

Don't see your TRC-supported work here?

Please let us know about it by filling out this short form.