Publications
2024
October
Palaniappan, Muthamilselvan & Sarathi
Kiulian et al.
Cliqueformer: Model-Based Optimization with Structured Transformers
Kuba, Abbeel & Levine
Beyond Oversmoothing: Evaluating DDPM and MSE for Scalable Speech Synthesis in ASR
Minixhofer, Klejch & Bell
One Step Diffusion via Shortcut Models
Frans et al.
ACER: Automatic Language Model Context Extension via Retrieval
Gao, Zhang & Callan
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
Wang et al.
Tracking objects that change in appearance with phase synchrony
Muzellec et al.
September
Linsley et al.
Protein Sequence Modelling with Bayesian Flow Networks
Atkinson et al.
MURI: High-Quality Instruction Tuning Datasets for Low-Resource Languages via Reverse Instructions
Köksal et al.
Benchmarking Quantum Red TEA on CPUs, GPUs, and TPUs
Jaschke et al.
August
Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs
Yadav, Theodoridis & Tan
Gonzales, Ureta & Shrestha
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
Doshi et al.
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
Cornell et al.
STOUT V2.0: SMILES to IUPAC name conversion using transformer models
Rajan, Zielesny & Steinbeck
A Novel One-To-One Framework for Relative Camera Pose Estimation
Aydogdu & Demirci
July
The Use of Clinical Language Models Pretrained on Institutional EHR Data for Downstream Tasks
Suvirat et al.
Understanding Reference Policies in Direct Preference Optimization
Liu, Liu & Cohan
Toucan: Many-to-Many Translation for 150 African Language Pairs
Elmadany, Adebara & Abdul-Mageed
Predicting Emergent Capabilities by Finetuning
Snell et al.
Is Transformer-Based Attention Agnostic of the Pretraining Language and Task?
Martin, Visser & Dunaiski
June
Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization
Chalumeau et al.
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation
Frohmann et al.
Generative Visual Instruction Tuning
Hernandez, Villegas & Ordonez
ContrastiveMix: Overcoming Code-Mixing Dilemma in Cross-Lingual Transfer for Information Retrieval
Do, Lee & Hwang
COMMIT: Code-Mixing English-Centric Large Language Model for Multilingual Instruction Tuning
Lee, Jung & Hwang
ScriptMix: Mixing Scripts for Low-resource Language Parsing
Lee, Lee & Hwang
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Ivison et al.
Graur et al.
May
Scaling White-Box Transformers for Vision
Yang et al.
PureEBM: Universal Poison Purification via Mid-Run Dynamics of Energy-Based Models
Pooladzandi et al.
PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics
Bhat et al.
Subbotko, Jablonski & Bilinski
Learning the Language of Protein Structure
Gaujac et al.
Minixhofer, Ponti & Vulić
A unifying framework for functional organization in early and higher ventral visual cortex
Margalit et al.
April
Comparing GPU and TPU in an Iterative Scenario: A Study on Neural Network-based Image Generation
Roman, Schaarschmidt & Karl
Fast Ensembling with Diffusion Schrödinger Bridge
Kim, Yoon & Lee
Ecological Data and Objectives for Human Alignment
Nagaraj et al.
HMAX Strikes Back: Self-supervised Learning of Human-Like Scale Invariant Representations
Pant et al.
Measuring Cross-lingual Transfer in Bytes
de Souza et al.
Ljubešić et al.
March
Text Filtering Classifiers for Medium-Resource Languages
Daðason & Loftsson
Comparing human and machine visual perception
Veerabadran
DeepFake Video Detection using Vision Transformer
Hussien & Mohamed
Muthamilselvan & Palaniappan
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Khazatsky et al.
Deep Manifold Learning for Reading Comprehension and Logical Reasoning Tasks with Polytuplet Loss
Lu & Rodriguez
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
Limisiewicz et al.
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages
van Noord et al.
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
Khan et al.
Advancements in Hand-Drawn Chemical Structure Recognition through an Enhanced DECIMER Architecture
Rajan et al.
Can a Confident Prior Replace a Cold Posterior?
Marek, Paige & Izmailov
February
What Evidence Do Language Models Find Convincing?
Wan, Wallace & Klein
SMX: Sequential Monte Carlo Planning for Expert Iteration
Macfarlane et al.
Gkouti et al.
Zhou, Finn & Harrison
Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
Pooladzandi & Li
The Developmental Landscape of In-Context Learning
Hoogland et al.
January
TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation
Uludoğan et al.
Industry-sensitive language modeling for business
Borchert et al.
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Ma et al.
Revisiting Adversarial Training at Scale
Wang et al.
Image Sculpting: Precise Object Editing with 3D Geometry Control
Yenphraphai et al.
Cheetah: Natural Language Generation for 517 African Languages
Adebara, Elmadany & Abdul-Mageed
2023
December
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Lu et al.
Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models
Tamber, Pradeep & Lin
Diffusion Models With Learned Adaptive Noise
Sahoo et al.
NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation
Thakur et al.
OCaTS: an Online Cost-aware Teacher-Student Framework to Reduce the Calls to Large Language Models
Stogiannidis et al.
Understanding Physical Dynamics with Counterfactual World Modeling
Venkatesh et al.
CogMemLM: Human-Like Memory Mechanisms Improve Performance and Cognitive Plausibility of LLMs
Thoma et al.
JASMINE: Arabic GPT Models for Few-Shot Learning
Nagoudi et al.
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG
Nagoudi et al.
November
Offline RL for generative design of protein binders
Tarasov et al.
Neural Rendering in the Cloud with Tensor Processing Unit
Soto-Chirinos, Condori-Alejo & Alzamora
Jeon et al.
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Gandhi, von Platen & Rush
October
Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis
Meulemans et al.
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation
Fel et al.
Beyond Invariance: Test-Time Label-Shift Adaptation for Addressing "Spurious" Correlations
Sun et al.
Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization
Grinsztajn et al.
Speculative Decoding with Big Little Decoder
Kim et al.
Pramanik
Octopus: A Multitask Model and Toolkit for Arabic Natural Language Generation
Elmadany, Nagoudi & Abdul-Mageed
Stogiannidis et al.
GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models
robson & Ioannidis
TabLib: A Dataset of 627M Tables with Context
Eggert et al.
Ring Attention with Blockwise Transformers for Near-Infinite Context
Liu, Zaharia & Abbeel
A Foundational Large Language Model for Edible Plant Genomes
Mendoza-Revilla et al.
September
A Manual Evaluation Method of Neural MT for Indigenous Languages
Wiechetek, Pirinen & Kummervold
FinAraT5: A text to text model for financial Arabic text understanding and generation
Zmandar, El-Haj & Rayson
An ML approach to resolution of singularities
Bérczi, Fan & Zeng
Amerio, Cuoco & Fornengo
August
Cabrita: closing the gap for foreign languages
Larcher et al.
A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models
Fu et al.
BridgeData V2: A Dataset for Robot Learning at Scale
Walke et al.
SE(3) Equivariant Augmented Coupling Flows
Midgley et al.
July
Jiang, Fang & Wang
Learning to Model the World with Language
Lin et al.
Developing and Evaluating Tiny to Medium-Sized Turkish BERT Models
Kesgin, Yüce & Amasyali
Bosschart
PASTA: Pretrained Action-State Transformer Agents
Boige et al.
Doddapaneni et al.
Focused Transformer: Contrastive Training for Context Scaling
Tworkowski et al.
Reinforcement Learning from Passive Data via Latent Intentions
Ghosh, Bhateja & Levine
June
Li, Wang & Xie
AmnioML: Amniotic Fluid Segmentation and Volume Prediction with Uncertainty Quantification
Csillag et al.
Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression
Raventós et al.
Lee, Lee & Hwang
ViNT: A Foundation Model for Visual Navigation
Shah et al.
Long-range Language Modeling with Self-retrieval
Rubin & Berant
Anticipatory Music Transformer
Thickstun et al.
HunEmBERT: a fine-tuned BERT-model for classifying sentiment and emotion in political communication
Üveges & Ring
SqueezeLLM: Dense-and-Sparse Quantization
Kim et al.
RoBERTweet: A BERT Language Model for Romanian Tweets
Tăiatu et al.
Linsley et al.
Linsley et al.
Extracting Reward Functions from Diffusion Models
Nuti, Franzmeyer & Henriques
May
Wattanawong & Keutzer
Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation
Minixhofer, Pfeiffer & Vulić
The false promise of imitating proprietary language models
Gudibande et al.
GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP
Khondaker et al.
Kobayashi
Video Prediction Models as Rewards for Reinforcement Learning
Escontrela et al.
Exploring Large Language Models for Classical Philology
Riemenschneider & Frank
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
Minixhofer, Pfeiffer & Vulić
An Inverse Scaling Law for CLIP Training
Li, Wang & Xie
Varta: A Large-Scale Headline-Generation Dataset for Indic Languages
Aralikatte et al.
Türkmen et al.
April
Factorized visual representations in the primate visual system and deep neural networks
Lindsey & Issa
Xiao et al.
March
Reduce, Reuse, Recycle: Selective Reincarnation in Multi-Agent Reinforcement Learning
Formanek et al.
Macé et al.
Pre-processing Matters! Improved Wikipedia Corpora for Open-Domain Question Answering
Tamber, Pradeep & Lin
February
Big Little Transformer Decoder
Kim et al.
Martingale Posterior Neural Processes
Lee et al.
Language-Driven Representation Learning for Robotics
Karamcheti et al.
Toward denoising of 3D CT scans with few data
Liang et al.
Bernardini et al.
Languages are Rewards: Hindsight Finetuning using Human Feedback
Liu, Sferrazza & Abbeel
Model-based Policy Optimization under Approximate Bayesian Inference
Wang, Chen & Murphy
January
Goto & Ito
Does progress on ImageNet transfer to real world datasets?
Fang, Kornblith & Schmidt
2022
December
Generative Approach for Gender Rewriting Task with ArabicT5
Alrowili & Vijay-Shanker
Generating Classical Arabic Poetry using Pre-trained Models
ElOraby et al.
Cervical Cancer Screening on Multi-class Imbalanced Cervigram Dataset using Transfer Learning
Saini & Susan
Deep Learning Methodology for Early Detection and Outbreak Prediction of Invasive Species Growth
Elias
RoSummary: Control Tokens for Romanian News Summarization
Niculescu, Ruseti & Dascalu
Exploring Learning Rate Scaling Rules for Distributed ML Training on Transient Resources
André, Strati & Klimovic
ManyFold: an efficient and flexible library for training and validating protein folding models
Villegas-Morcillo et al.
November
Soto Chirinos
Semi-supervised Automated Clinical Coding Using International Classification of Diseases
Hlynsson et al.
Richter & Pal
NepBERTa: Nepali Language Model Trained in a Large Corpus
Gautam, Timilsina & Bhattarai
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
Bonnet, Midgley & Laterre
Posterior Matching for Arbitrary Conditioning
Strauss & Oliva
October
Random Sharpness-Aware Minimization
Liu et al.
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Gandhi, von Platen & Rush
Maroudas et al.
MetaFormer Baselines for Vision
Yu et al.
Do Language Models Understand Measurements?
Park, Ryu & Choi
Türkmen et al.
Optimizing Hierarchical Image VAEs for Sample Quality
Luhman & Luhman
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
Kumar et al.
ConserWeightive Behavioral Cloning for Reliable Offline Reinforcement Learning
Nguyen, Zheng & Grover
An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification
Chalkidis et al.
Population-Based Reinforcement Learning for Combinatorial Optimization
Grinsztajn, Furelos-Blanco & Barrett
September
Learning by Distilling Context
Snell, Klein & Zhong
GazeRadar: A Gaze and Radiomics-Guided Disease Localization Framework
Bhattacharya, Jain & Prasanna
A Light Recipe to Train Robust Vision Transformers
Debenedetti, Sehwag & Mittal
August
Text-to-Text Multi-view Learning for Passage Re-ranking
Ju, Yang & Wang
Chen
Conviformers: Convolutionally guided Vision Transformer
Vaishnav et al.
ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers
Li, Zhang & Xie
NUS-IDS at CheckThat! 2022: Identifying Check-worthiness of Tweets using CheckthaT5
Du, Gollapalli & Ng
Alrowili & Vijay-Shanker
July
AfriTeVa: Extending “Small Data” Pretraining Approaches to Sequence-to-Sequence Models
Ogundepo et al.
Lightweight Transformers for Conversational AI
Pressel et al.
Detecting and mitigating issues in image-based COVID-19 diagnosis
Silva, Rezende & Ponti
Bhattacharjee et al.
Yes, No or IDK: The Challenge of Unanswerable Yes/No Questions
Sulem, Hay & Roth
Language Modelling with Pixels
Rust et al.
June
Lu et al.
Pre-training and Evaluating Transformer-based Language Models for Icelandic
Daðason & Loftsson
Insights into Pre-training via Simpler Synthetic Tasks
Wu, Li & Liang
Visualizing attention zones in machine reading comprehension models
Cui, Zhang & Liu
Can CNNs Be More Robust Than Transformers?
Wang et al.
Huang et al.
May
Divide to adapt: Mitigating confirmation bias for domain adaptation of black-box predictors
Yang et al.
Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction
Kolluru et al.
Notin et al.
Zero-Shot and Few-Shot Learning for Lung Cancer Multi-Label Classification using Vision Transformer
Guo & Fan
Si et al.
Bhattacharjee et al.
Semi-self-supervised Automated ICD Coding
Hlynsson et al.
Dagli
Generating Disentangled Arguments With Prompts: A Simple Event Extraction Framework That Works
Si et al.
Multilingual multi-aspect explainability analyses on machine reading comprehension models
Cui et al.
Long Document Re-ranking with Modular Re-ranker
Gao & Callan
April
Adversarially robust vision transformers
Debenedetti
Cross-stitched Multi-modal Encoders
Singla et al.
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Wang et al.
Scalable Semi-Modular Inference with Variational Meta-Posteriors
Carmona & Nicholls
March
Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines
Kuchnik et al.
Learning neural audio features without supervision
Yadav & Zeghidour
STaR: Bootstrapping Reasoning With Reasoning
Zelikman, Wu & Goodman
Dynamics of Transmon Ionization
Shillito et al.
KinyaBERT: a Morphology-aware Kinyarwanda Language Model
Nzeyimana & Rubungo
PERT: Pre-training BERT with permuted language model
Cui, Yang & Liu
Active Evaluation: Efficient NLG Evaluation with Few Pairwise Comparisons
Mohankumar & Khapra
IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation
Sarti & Nissim
February
Yadav
CEDILLE: A large autoregressive language model in French
Müller & Laurent
Tensor Processing Units as Quantum Chemistry Supercomputers
Pederson et al.
Laschowski et al.
January
Our Summer of Code Project on TF-GAN
P A, Maynard-Reid & Shor
BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA
Alrowili & Vijay-Shanker
Making and Using AI in the Library: Creating a BERT Model at the National Library of Sweden
Haffenden, Fano & Malmsten
2021
December
NLP Tasks with GreekLegalBERT v2
Apostolopoulou & Briakos
Beckmann
Information retrieval and question answering: A case study on COVID-19 scientific literature
Otegi et al.
How and What to Learn: Taxonomizing Self-Supervised Learning for 3D Action Recognition
Tanfous et al.
Learned Queries for Efficient Local Attention
Arar, Shamir & Bermano
CPPE-5: Medical Personal Protective Equipment Dataset
Dagli & Shaikh
Minixhofer, Klejch & Bell
Minixhofer, Paischer & Rekabsaz
November
RoGPT2: Romanian GPT2 for Text Generation
Niculescu, Ruseti & Dascalu
Alrowili & Vijay-Shanker
October
MutFormer: A context-dependent transformer-based model to predict pathogenic missense mutations
Jiang, Fang & Wang
Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems
Datta et al.
Wang et al.
Kousin
Delphi: Towards Machine Ethics and Norms
Jiang et al.
Cut the CARP: Fishing for zero-shot story evaluation
Matiana et al.
ResNet strikes back: An improved training procedure in timm
Wightman, Touvron & Jégou
September
Yadav & Foster
AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation
Nagoudi, Elmadany & Abdul-Mageed
Pretrained Neural Models for Turkish Text Classification
Okur & Sertbaş
Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations
Araujo et al.
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
Xu, Van Durme & Murray
Butner
Rajan, Zielesny & Steinbeck
Characterizing Possible Failure Modes in Physics-Informed Neural Networks
Krishnapriyan et al.
August
Kirchner
Large Biomedical Question Answering Models with ALBERT and ELECTRA
Alrowili & Vijay-Shanker
Pretrained Transformers for Text Ranking: BERT and Beyond
Lin, Nogueira & Yates
July
Vera: Prediction Techniques for Reducing Harmful Misinformation in Consumer Health Search
Pradeep et al.
gaBERT — an Irish Language Model
Barry et al.
Mascolini et al.
Exploring Listwise Evidence Reasoning with T5 for Fact Verification
Jiang, Pradeep & Lin
June
Job Descriptions Keyword Extraction using Attention based Deep Learning Models with BERT
Mahdi et al.
Dangers of Bayesian Model Averaging under Covariate Shift
Izmailov et al.
Khemchandani et al.
Training Data Augmentation for Code-Mixed Translation
Gupta, Vavre & Sarawagi
GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model
Wang & Komatsuzaki
May
Scientific Claim Verification with VERT5ERINI
Pradeep et al.
Ozyurt et al.
BioELECTRA: Pretrained Biomedical text Encoder using Discriminators
Kanakarajan, Kundumani & Sankarasubbu
Stress Test Evaluation of Biomedical Word Embeddings
Araujo et al.
Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level
Zhong et al.
Jiang et al.
CoolMomentum: a method for stochastic optimization by Langevin dynamics with simulated annealing
Borysenko & Byshkin
DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera
McNally et al.
KLUE: Korean Language Understanding Evaluation
Park et al.
How Deep is your Learning: the DL-HARD Annotated Deep Learning Dataset
Mackie, Dalton & Yates
April
Contextualized Query Embeddings for Conversational Search
Lin, Yang & Lin
DECIMER 1.0: Deep Learning for Chemical Image Recognition using Transformers
Rajan, Zielesny & Steinbeck
Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model
Kummervold et al.
Kummervold et al.
City-Scale Simulation Of Covid-19 Pandemic & Intervention Policies Using Agent-Based Modelling
Suryawanshi et al.
Elnaggar et al.
Arabic Compact Language Modelling for Resource Limited Devices
Alyafeai & Ahmad
Igor Ivanov: Harnessing Machine Learning Skills to Reduce Damages from Tropical Storms
Radiant Earth Foundation
Laschowski et al.
InAugment: Improving Classifiers via Internal Augmentation
Arar, Shamir & Bermano
March
Comparing score aggregation approaches for document retrieval with pretrained transformers
Zhang, Yates & Lin
MPII at the TREC 2020 Deep Learning Track
Li & Yates
Is Attention Better Than Matrix Decomposition?
Geng et al.
Wudenhe & Tseng
February
Galanos
I-BERT: Integer-only BERT Quantization
Kim et al.
Vaishnav et al.
Nemeskey
Symbolic regression for scientific discovery: an application to wind speed forecasting
Abdellaoui & Mehrkanoon
a-emami
Time Series (re)sampling using Generative Adversarial Networks
Dahl & Sørensen
January
The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
Pradeep, Nogueira & Lin
A Multi-Class Hinge Loss for Conditional GANs
Kavalerov, Czaja & Chellappa
Bottleneck Transformers for Visual Recognition
Srinivas et al.
Neural Grammatical Error Correction for Romanian
Cotet, Ruseti & Dascalu
Pande et al.
2020
December
Müller & Salathé
AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding
Antoun, Baly & Hajj
AraGPT2: Pre-Trained Transformer for Arabic Language Generation
Antoun, Baly & Hajj
ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic
Abdul-Mageed, Elmadany & Nagoudi
STOUT: SMILES to IUPAC names using Neural Machine translation
Rajan, Zielesny & Steinbeck
The Depth-to-Width Interplay in Self-Attention
Levine et al.
GottBERT: a pure German Language Model
Scheible et al.
November
HAWQ-V3: Dyadic Neural Network Quantization
Yao et al.
A Little Bit Is Worse Than None: Ranking with Limited Training Data
Zhang, Yates & Lin
Ensemble Predictions of Wildfire Spread Through TPU-Compatible TensorFlow Acceleration
Bonanni & Ihme
Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks
Hui & Belkin
Learning from Task Descriptions
Weller et al.
Modern Control Technologies in Robotics
Chatzopoulos
October
Flexible IR Pipelines with Capreolus
Yates et al.
Chan, Schweter & Möller
Guiding Attention for Self-Supervised Learning with Transformers
Deshpande & Narasimhan
LEGAL-BERT: The Muppets straight out of Law School
Chalkidis et al.
MammoGANesis: Controlled Generation of High-Resolution Mammograms for Radiology Education
Zakka et al.
Neural Audio Fingerprint for High-specific Audio Retrieval based on Contrastive Learning
Chang et al.
Emami
September
Deep multi-stations weather forecasting: explainable recurrent convolutional neural networks
Abdellaoui & Mehrkanoon
Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining
Sai et al.
Kakwani et al.
S3NAS: Fast NPU-aware Neural Architecture Search Methodology
Lee, Kang & Ha
August
GREEK-BERT: The Greeks visiting Sesame Street
Koutsikakis et al.
HASeparator: Hyperplane-Assisted Softmax
Kansizoglou et al.
July
Latent Retrieval for Large-Scale Fact-Checking and Question Answering with NLI training
Samarinas, Hsu & Lee
Playing with Words at the National Library of Sweden - Making a Swedish BERT
Malmsten, Borjeson & Haffenden
Elnaggar et al.
June
Denoising Diffusion Probabilistic Models
Ho, Jain & Abbeel
FinEst BERT and CroSloEngual BERT: less is more in multilingual models
Ulčar & Robnik-Šikonja
How the Google AI Community Used Cloud to Help Biomedical Researchers
Elliott, Kwon & Goncharov
Learning compact generalizable neural representations supporting perceptual grouping
Veerabadran & de Sa
On the Predictability of Pruning Across Scales
Rosenfeld et al.
Reproducible and Portable Workflows for Scientific Computing and HPC in the Cloud
Vaillancourt et al.
Swedish NLP Solutions for Email Classification
Castronuovo
May
COVID-Twitter-BERT: A Natural Language Processing Model To Analyse COVID-19 Content On Twitter
Müller, Salathé & Kummervold
CURL: Contrastive Unsupervised Representations for Reinforcement Learning
Srinivas, Laskin & Abbeel
Lexicon-Enhancement of Embedding-based Approaches Towards the Detection of Abusive Language
Koufakou & Scott
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture
Brix, Bahar & Ney
Diao
April
Imitation Attacks and Defenses for Black-box Machine Translation Systems
Wallace, Stern & Song
March
Comparing Rewinding and Fine-tuning in Neural Network Pruning
Renda, Frankle & Carbin
February
AraBERT: Transformer-based Model for Arabic Language Understanding
Antoun, Baly & Hajj
Schuch et al.
Bhatkalkar et al.
On Identifiability in Transformers
Brunner et al.
January
Attention! A Lightweight 2D Hand Pose Estimation Approach
Santavas et al.
2019
December
de Vries et al.
November
High-Quality Cloud Masking of Landsat 8 Imagery Using Convolutional Neural Networks
Hughes & Kennedy
October
Localization of Fake News Detection via Multitask Transfer Learning
Cruz, Tan & Cheng
Akhmetzyanov
September
Answering questions by learning to rank -- Learning to rank by answering questions
Pîrtoacă, Rebedea & Ruseti
Cross-Lingual Machine Reading Comprehension
Cui et al.
Branwen
August
OpenGPT-2: We Replicated GPT-2 Because You Can Too
Gokaslan & Cohen
Running PyTorch on TPU: a bag of tricks
Chikishev
Towards Ethical Content-Based Detection of Online Influence Campaigns
Crothers, Japkowicz & Victor
July
Benchmarking TPU, GPU, and CPU Platforms for Deep Learning
Wang, Wei & Brooks
Exploring the Use of Lexicons to aid Deep Learning towards the Detection of Abusive Language
Koufakou & Scott
June
A Focus on Neural Machine Translation for African Languages
Martinus & Abbott
Benchmarking Neural Machine Translation for Southern African Languages
Martinus & Abbott
Leahy
May
Neural heuristics for SAT solving
Jaszczur et al.
Single-Path NAS: Device-Aware Efficient ConvNet Design
Stamoulis et al.
April
The Lottery Ticket Hypothesis at Scale
Frankle et al.
March
SciBERT: Pretrained Contextualized Embeddings for Scientific Text
Beltagy, Cohan & Lo
January
Large-Batch Training for LSTM and Beyond
You et al.
2018
November
End-to-end sound source separation conditioned on instrument labels
Slizovskaia et al.
Mapping scRNA-seq data onto cell type taxonomies
Svensson & Pachter
Sample-efficient image segmentation through recurrence
Linsley, Kim & Serre
September
Lee et al.
Learning what and where to attend
Linsley et al.
Maximum Entropy Fine-Grained Classification
Dubey et al.
Don't see your TRC-supported work here?
Please let us know about it by filling out this short form.