Into AI - A Blog
    Blogging my journey through AI.
    • A Paper a Week - Week 8

      Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs, Ovadia et al.

      Posted on February 24, 2024

      Introduction For the eighth post in this series, I read “Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs” by Ovadia et al. Following last week’s paper on RAG as a knowledge injection technique, I found this study comparing RAG with the more traditional approach of task-specific fine-tuning. Surprisingly, the results of... [Read More]
      Tags:
      • week8
      • paper
    • A Paper a Week - Week 7

      Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Lewis et al.

      Posted on February 17, 2024

      Introduction For the seventh post in this series, I read “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” by Lewis et al. Otherwise known as “RAG”, retrieval-augmented generation refers to the idea of providing sequence-to-sequence models with an up-to-date knowledge base that can be leveraged to answer questions or accomplish... [Read More]
      Tags:
      • week7
      • paper
    • A Paper a Week - Week 6

      Neural Machine Translation of Rare Words with Subword Units, Sennrich et al.

      Posted on February 10, 2024

      Introduction For the sixth post in this series, I read “Neural Machine Translation of Rare Words with Subword Units” by Sennrich et al. The primary idea outlined in the paper is to use byte pair encoding to compress vocabularies and improve performance on the NMT task (translating sentences from one... [Read More]
      Tags:
      • week6
      • paper
    • A Paper a Week - Week 5

      BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al.

      Posted on February 1, 2024

      Introduction For the fifth post in this series, I read “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” by Devlin et al. Besides the GPT architecture (decoder-only), the BERT architecture (encoder-only) is the most popular transformer architecture in use today due to its effectiveness at learning downstream tasks... [Read More]
      Tags:
      • week5
      • paper
    • A Paper a Week - Week 4

      Improving Language Understanding by Generative Pre-training, Radford et al.

      Posted on January 27, 2024

      Introduction For the fourth post in this series, I read “Improving Language Understanding by Generative Pre-training” by Radford et al. It felt apt to read the original GPT paper, which set the foundation for the most popular transformer in use today, after learning about all of the transformer components individually the... [Read More]
      Tags:
      • week4
      • paper

    Rob Pitkin  •  2024  •  rob-pitkin.github.io
