
Text Generation with GPT: How to fine-tune a GPT model to… | by Ruben Winastwan | Jan, 2024


If you’re working in the data science or machine learning industry, chances are you’ve heard the term Generative AI before, which refers to AI algorithms capable of creating new content like text, images, or audio. In this article, we’re going to delve into one of the Generative AI models: the GPT model. As you might have guessed, GPT is the foundational model behind ChatGPT and can generate sequences of text.

Specifically, we’ll briefly discuss the fine-tuning and text generation process of a GPT model. While there are many established libraries and platforms out there that we can use to handle this task, they often abstract away many implementation details, leaving us curious about what actually happens under the hood.

Therefore, we’ll explore the fine-tuning and text generation process in low-level detail. This means we’ll cover everything comprehensively, from data preprocessing and model building to setting up the loss function, the fine-tuning process itself, and the logic behind text generation after fine-tuning the model.

So, without further ado, let’s start with the dataset we’ll be using to fine-tune our GPT model!

The dataset we’ll be using is the TED-talk dataset, which we can download directly from the HuggingFace Hub. It’s listed as having a CC-BY 4.0 license, so there’s no need to worry about copyright.

# Load all necessary libraries
!pip install datasets

import torch
import numpy as np
from torch import nn
from transformers import GPT2Tokenizer, GPT2Config, GPT2Model, GPT2PreTrainedModel
from torch.optim import AdamW
from datasets import load_dataset
from tqdm import tqdm
from torch.nn import functional as F

import matplotlib.pyplot as plt
%matplotlib inline

device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_built() else 'cpu'

dataset = load_dataset("gigant/ted_descriptions")
print(len(dataset['train']))

'''
Output:
5705
'''

In total, the dataset comprises 5,705 entries, each containing a URL (url) and a description of a TED event (descr). For the purpose of this article, we’ll only use the descr column.
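To get a feel for the data, we can peek at a single entry and pull out just the descriptions. The snippet below is a minimal sketch (the variable names are my own, not part of the dataset); the actual preprocessing for fine-tuning comes later.

# Inspect one entry and keep only the descr field
sample = dataset['train'][0]
print(sample.keys())          # dict_keys(['url', 'descr'])
print(sample['descr'][:100])  # first 100 characters of the description

# Collect only the descriptions, which is the text we'll fine-tune on
descriptions = dataset['train']['descr']
print(len(descriptions))      # 5705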


