Into AI - A Blog
    Blogging my journey through AI.
    • A Paper a Week - Week 8

      Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs, Ovadia et al.

      Posted on February 24, 2024

      Introduction For the eighth post in this series, I read “Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs” by Ovadia et al. Following last week’s paper on RAG as a knowledge injection technique, I found this study comparing RAG with the more traditional approach of task-specific fine-tuning. Surprisingly, the results of... [Read More]
      Tags:
      • week8
      • paper
    • A Paper a Week - Week 7

      Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Lewis et al.

      Posted on February 17, 2024

      Introduction For the seventh post in this series, I read “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” by Lewis et al. Otherwise known as “RAG”, retrieval-augmented generation refers to the idea of providing sequence-to-sequence models with an up-to-date knowledge base that can be leveraged to answer questions or accomplish... [Read More]
      Tags:
      • week7
      • paper
    • A Paper a Week - Week 6

      Neural Machine Translation of Rare Words with Subword Units, Sennrich et al.

      Posted on February 10, 2024

      Introduction For the sixth post in this series, I read “Neural Machine Translation of Rare Words with Subword Units” by Sennrich et al. The primary idea outlined in the paper is to use byte pair encoding to compress vocabularies and improve performance on the NMT task (translating sentences from one... [Read More]
      Tags:
      • week6
      • paper
    • A Paper a Week - Week 5

      BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al.

      Posted on February 1, 2024

      Introduction For the fifth post in this series, I read “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” by Devlin et al. Besides the GPT architecture (decoder-only), the BERT architecture (encoder-only) is the most popular transformer architecture in use today due to its effectiveness at learning downstream tasks... [Read More]
      Tags:
      • week5
      • paper
    • A Paper a Week - Week 4

      Improving Language Understanding by Generative Pre-training, Radford et al.

      Posted on January 27, 2024

      Introduction For the fourth post in this series, I read “Improving Language Understanding by Generative Pre-training” by Radford et al. It felt apt to read the original GPT paper, which set the foundation for the most popular transformer in use today, after learning about all of the transformer components individually the... [Read More]
      Tags:
      • week4
      • paper

    Rob Pitkin  •  2024  •  rob-pitkin.github.io
