Fine-Tune Your LLMs

1. Exploring Concepts and Costs of Fine-Tuning

Review the fine-tuning process

  • Why Fine-Tune?
    • Fine-tuning improves a model's performance for your particular use case; responses to your prompts are better than with the base model.
    • You can reduce token costs because your prompts can be shorter and require fewer examples than few-shot prompting or retrieval-augmented generation (RAG).
    • The model responds to your queries (prompts) faster, with lower latency.
  • Fine-tuning process
    • Obtain the data: a minimum of 10 examples is required; 50-100 training examples will clearly improve the fine-tuning.
    • Prepare the data: Format your data in the JSONL format (see the example record after this list).
    • Upload the data to the OpenAI server: Use the Files API to upload your data.
    • Create a fine-tuning job: Use the OpenAI API, SDK, or the option in the Playground.
    • Evaluate the model: Use metrics like training loss and training token accuracy.
    • Use the model: Pass it as the model parameter of your chat completions call.
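
For reference, each training record in the chat-format JSONL file is one complete JSON object per line; a minimal invented example in the style of the dataset used later in these notes:

{"messages": [{"role": "system", "content": "This is a customer support chatbot for Kesha's Boutique."}, {"role": "user", "content": "Do you ship internationally?"}, {"role": "assistant", "content": "Yes, we ship to most countries. Please see our shipping page for rates and delivery times."}]}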

Understand the costs of fine-tuning

OpenAI charges based on tokens: both input and output tokens are billed.

  • A fine-tuned model lets you send shorter prompts, i.e., fewer tokens per request.
  • Knowing the number of tokens helps you (1) estimate the costs and (2) reduce the latency (see the token-counting sketch after this list).
  • Pricing
    • Training the fine-tuned model (base price per token × number of training tokens × number of epochs)
    • Using the fine-tuned model during inference
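
A minimal sketch of counting tokens locally with tiktoken (the prompt text is just an illustration; cl100k_base is the encoding used by gpt-3.5-turbo):

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

prompt = "What is the return policy at Kesha's Boutique?"
print(len(encoding.encode(prompt)), "tokens")

# training cost ~= (training tokens / 1000) * price per 1K tokens * number of epochs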

2. Setting Up Your Environment for Fine-Tuning

Explore the OpenAI API for fine-tuning

  • Fine-Tuning Endpoints (each maps to a Python SDK method; see the sketch after this list)
    1. Create a fine-tuning job
    2. List fine-tuning jobs
    3. List fine-tuning events
    4. Retrieve a fine-tuning job
    5. Cancel a fine-tuning job
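
A quick sketch of those endpoint calls in the Python SDK (the job IDs are placeholders):

from openai import OpenAI

client = OpenAI()

client.fine_tuning.jobs.list(limit=10)  # list fine-tuning jobs
client.fine_tuning.jobs.list_events("ftjob-abc123", limit=10)  # list events for a job
client.fine_tuning.jobs.retrieve("ftjob-abc123")  # retrieve a single job
client.fine_tuning.jobs.cancel("ftjob-abc123")  # cancel a running job

Creating a job is shown in detail below: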
from openai import OpenAI

client = OpenAI()
client.fine_tuning.jobs.create(
    training_file="file-IntFuYDWVfJwMp6TpSrJa8aq",
    model="gpt-3.5-turbo",
    hyperparameters={
        "n_epochs": 5
    }
)

# output
FineTuningJob(id='ftjob-ts4hC5Qakf2XzytcGrTW0GRZ', created_at=1707178194, error=None, fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs=5, batch_size='auto', learning_rate_multiplier='auto'), model='gpt-3.5-turbo-0613', object='fine_tuning.job', organization_id='org-RZLvEijW4GW0KmC3rLIAjZlu', result_files=[], status='validating_files', trained_tokens=None, training_file='file-IntFuYDWVfJwMp6TpSrJa8aq', validation_file=None)

Retrieve job status

# Retrieve job status
job_id = "ftjob-ts4hC5Qakf2XzytcGrTW0GRZ"

# Retrieve the state of a fine-tune
# Status field can contain: running or succeeded or failed, etc.
client.fine_tuning.jobs.retrieve(job_id)

# output
FineTuningJob(id='ftjob-ts4hC5Qakf2XzytcGrTW0GRZ', created_at=1707178194, error=None, fine_tuned_model='ft:gpt-3.5-turbo-0613:keysoft::8p3gc9SA', finished_at=1707179589, hyperparameters=Hyperparameters(n_epochs=5, batch_size=1, learning_rate_multiplier=2), model='gpt-3.5-turbo-0613', object='fine_tuning.job', organization_id='org-RZLvEijW4GW0KmC3rLIAjZlu', result_files=['file-4XHPig2LQ1VAUTnVIPqlbWJO'], status='succeeded', trained_tokens=34235, training_file='file-IntFuYDWVfJwMp6TpSrJa8aq', validation_file=None)

Supported Languages and Libraries

  • Python
  • TypeScript/JavaScript
  • Node.js
  • Java
  • .NET

Use GitHub Codespaces

LinkedInLearning/fine-tune-llms-4511663: repository for the LinkedIn Learning course Fine-Tune Your LLMs (github.com)

3. Preparing Data for Fine-Tuning

Providing the specific name of the business in the dataset is important for fine-tuning, since we are teaching the model to answer questions about that business.

Prompts to generate the dataset

Customer Support Automation
Automating responses to customer inquiries on various platforms (email, chatbots, social media).
Collect a dataset of customer inquiries and manually crafted responses. This dataset should cover a wide range of common questions, complaints, and feedback, along with the company's standard responses. Be sure to anonymize personal information.
pip install openai
pip install openai[datalib]
pip install urllib3==1.26.6
pip install python-dotenv
pip install tiktoken

Import the libraries and the environment file to gain access to the OpenAI API key; a key can be generated here: https://platform.openai.com/account/api-keys

import os
from openai import OpenAI

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

client = OpenAI(
    api_key=os.environ['OPENAI_API_KEY']
)

Helper Functions

import json
import tiktoken  # for token counting
import numpy as np
from collections import defaultdict

encoding = tiktoken.get_encoding("cl100k_base")

# input_file=formatted_custom_support.json ; output_file=output.jsonl
def json_to_jsonl(input_file, output_file):

    # Open the JSON file; json.load parses it into a list of dictionaries
    with open(input_file) as f:
        data = json.load(f)

    # Produce JSONL from JSON: write one JSON object per line
    with open(output_file, 'w') as outfile:
        for entry in data:
            json.dump(entry, outfile)
            outfile.write('\n')

def check_file_format(dataset):
    # Format error checks
    format_errors = defaultdict(int)

    for ex in dataset:
        if not isinstance(ex, dict):
            format_errors["data_type"] += 1
            continue

        messages = ex.get("messages", None)
        if not messages:
            format_errors["missing_messages_list"] += 1
            continue

        for message in messages:
            if "role" not in message or "content" not in message:
                format_errors["message_missing_key"] += 1

            if any(k not in ("role", "content", "name", "function_call") for k in message):
                format_errors["message_unrecognized_key"] += 1

            if message.get("role", None) not in ("system", "user", "assistant", "function"):
                format_errors["unrecognized_role"] += 1

            content = message.get("content", None)
            function_call = message.get("function_call", None)

            if (not content and not function_call) or not isinstance(content, str):
                format_errors["missing_content"] += 1

        if not any(message.get("role", None) == "assistant" for message in messages):
            format_errors["example_missing_assistant_message"] += 1

    if format_errors:
        print("Found errors:")
        for k, v in format_errors.items():
            print(f"{k}: {v}")
    else:
        print("No errors found")

# not exact!
# simplified from https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
def num_tokens_from_messages(messages, tokens_per_message=3, tokens_per_name=1):
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3
    return num_tokens

Convert JSON to JSONL

json_to_jsonl('custom_support.json', 'output.jsonl')

Check File Format

https://cookbook.openai.com/examples/chat_finetuning_data_prep

data_path = "output.jsonl"

# Load the dataset
with open(data_path, 'r', encoding='utf-8') as f:
    dataset = [json.loads(line) for line in f]

# Initial dataset stats
print("Num examples:", len(dataset))
print("First example:")
for message in dataset[0]["messages"]:
    print(message)
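
With the dataset loaded, run the helper defined earlier to validate every example before uploading:

# Report any structural problems in the training examples
check_file_format(dataset)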

Cost Estimation

# Get the length of the conversation
conversation_length = []

for msg in dataset:
    messages = msg["messages"]
    conversation_length.append(num_tokens_from_messages(messages))

# Pricing and default n_epochs estimate
MAX_TOKENS_PER_EXAMPLE = 4096
TARGET_EPOCHS = 5
MIN_TARGET_EXAMPLES = 100
MAX_TARGET_EXAMPLES = 25000
MIN_DEFAULT_EPOCHS = 1
MAX_DEFAULT_EPOCHS = 25

n_epochs = TARGET_EPOCHS
n_train_examples = len(dataset)

if n_train_examples * TARGET_EPOCHS < MIN_TARGET_EXAMPLES:
    n_epochs = min(MAX_DEFAULT_EPOCHS, MIN_TARGET_EXAMPLES // n_train_examples)
elif n_train_examples * TARGET_EPOCHS > MAX_TARGET_EXAMPLES:
    n_epochs = max(MIN_DEFAULT_EPOCHS, MAX_TARGET_EXAMPLES // n_train_examples)

n_billing_tokens_in_dataset = sum(min(MAX_TOKENS_PER_EXAMPLE, length) for length in conversation_length)
print(f"Dataset has ~{n_billing_tokens_in_dataset} tokens that will be charged for during training")
print(f"By default, you'll train for {n_epochs} epochs on this dataset")
print(f"By default, you'll be charged for ~{n_epochs * n_billing_tokens_in_dataset} tokens")

num_tokens = n_epochs * n_billing_tokens_in_dataset
# gpt-3.5-turbo	$0.0080 / 1K tokens
cost = (num_tokens/1000) * 0.0080
print(cost)

Upload File

Once you have validated the data, the file needs to be uploaded using the Files API before it can be used with a fine-tuning job.

client.files.create(
    file=open("output.jsonl", "rb"),
    purpose="fine-tune"
)
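
In practice you would capture the returned FileObject and read its id, rather than copying the ID by hand; a small sketch:

uploaded_file = client.files.create(
    file=open("output.jsonl", "rb"),
    purpose="fine-tune"
)
print(uploaded_file.id)  # use this ID as training_file when creating the job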

4. Fine-Tuning a Pretrained LLM

Train a new fine-tuned model

Normal Parameters:

  • training_file
  • model: the model name you want to fine-tune
  • validation_file: to evaluate the model performance during training

Hyperparameters (combined with the parameters above in the sketch after this list):

  • batch_size
  • learning_rate_multiplier
  • n_epochs
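
A sketch combining the normal parameters and hyperparameters in one call (the validation file ID is hypothetical):

client.fine_tuning.jobs.create(
    training_file="file-IntFuYDWVfJwMp6TpSrJa8aq",
    validation_file="file-hypotheticalValidationID",  # optional held-out set
    model="gpt-3.5-turbo",
    hyperparameters={
        "n_epochs": 5,
        "batch_size": 1,
        "learning_rate_multiplier": 2
    }
)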

Upload data, create fine-tuning job, retrieve job status

client.files.create(
    file=open("output.jsonl", "rb"),
    purpose="fine-tune"
)

# output: FileObject(id='file-IntFuYDWVfJwMp6TpSrJa8aq', ...)

client.fine_tuning.jobs.create(
    training_file="file-IntFuYDWVfJwMp6TpSrJa8aq",
    model="gpt-3.5-turbo",
    hyperparameters={
        "n_epochs": 5
    }
)

# Retrieve job status
job_id = "ftjob-ts4hC5Qakf2XzytcGrTW0GRZ"

# Retrieve the state of a fine-tune
# Status field can contain: running or succeeded or failed, etc.
client.fine_tuning.jobs.retrieve(job_id)

Use a fine-tuned model

Base model:

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "This is a customer support chatbot designed to help with common inquiries."},
        {"role": "user", "content": "What is the return policy at Kesha's Boutique?"}
    ]
)
print(response.choices[0].message.content)

output:

I apologize, but as an AI language model, I do not have access to specific information about the return policy at Kesha's Boutique. To find out about their return policy, I recommend visiting their official website or contacting their customer service directly.

Use fine-tuned model:

fine_tuned_model = "ft:gpt-3.5-turbo-0613:keysoft::8p2M8Tzi"

response = client.chat.completions.create(
    model=fine_tuned_model,
    messages=[
        {"role": "system", "content": "This is a customer support chatbot designed to help with common inquiries for Kesha's Boutique."},
        {"role": "user", "content": "What is the return policy at Kesha's Boutique?"}
    ]
)
print(response.choices[0].message.content)

output:

Our return policy allows customers to return items within 30 days of purchase for a full refund, as long as the items are in their original condition. Sale items and certain products may have different return conditions, so please check our return policy page for more details.

Develop a chatbot based on a fine-tuned model

Steps:

  1. Manage the conversation history
  2. Define a system message
  3. Send in customer support questions
  4. End the conversation on “exit”
import openai  # for the openai.APIError exception type

# Sets the persona for the AI assistant using a system message
context = [{'role': 'system', 'content': """This is a customer support chatbot designed to help with common
inquiries for Kesha's Boutique."""}]

def collect_messages(role, message):  # keeps track of the message exchange between user and assistant
    context.append({'role': role, 'content': f"{message}"})

def get_completion():
    try:
        response = client.chat.completions.create(
            model=fine_tuned_model,
            messages=context
        )

        print("\n Assistant: ", response.choices[0].message.content, "\n")
        return response.choices[0].message.content
    except openai.APIError as e:
        print(e)
        return str(e)

# Start the conversation between the user and the AI assistant/chatbot
while True:
    collect_messages('assistant', get_completion())  # stores the response from the AI assistant

    user_prompt = input('User: ')  # input box for entering the prompt

    if user_prompt == 'exit':  # end the conversation with the AI assistant
        print("\n Goodbye")
        break

    collect_messages('user', user_prompt)  # stores the user prompt

5. Evaluating a Fine-Tuned Model

Evaluate a fine-tuned model

When retrieving the job status, you get a FineTuningJob object whose result_files attribute is populated once the job has succeeded.

# Retrieve job status
job_id = "ftjob-ts4hC5Qakf2XzytcGrTW0GRZ"

# Retrieve the state of a fine-tune
# Status field can contain: running or succeeded or failed, etc.
client.fine_tuning.jobs.retrieve(job_id)

output:

FineTuningJob(id='ftjob-ts4hC5Qakf2XzytcGrTW0GRZ', created_at=1707178194, error=None, fine_tuned_model='ft:gpt-3.5-turbo-0613:keysoft::8p3gc9SA', finished_at=1707179589, hyperparameters=Hyperparameters(n_epochs=5, batch_size=1, learning_rate_multiplier=2), model='gpt-3.5-turbo-0613', object='fine_tuning.job', organization_id='org-RZLvEijW4GW0KmC3rLIAjZlu', result_files=['file-4XHPig2LQ1VAUTnVIPqlbWJO'], status='succeeded', trained_tokens=34235, training_file='file-IntFuYDWVfJwMp6TpSrJa8aq', validation_file=None)

Evaluate results:

import io
import pandas as pd

# Once training is finished, you can retrieve the result file listed in result_files
result_file = "file-4XHPig2LQ1VAUTnVIPqlbWJO"

file_data = client.files.content(result_file)

# it's binary, so read it and then wrap it in a file-like object
file_data_bytes = file_data.read()
file_like_object = io.BytesIO(file_data_bytes)

# now read it as CSV to create a DataFrame
df = pd.read_csv(file_like_object)
df

  • step: Number of batches the model has processed
  • train_loss: The difference between the model’s predictions and the actual targets on the training set
  • train_accuracy: The percentage of correct predictions on the training set
  • valid_loss: Like train loss but for the validation set
  • valid_mean_token_accuracy: Measures how accurately the model predicts each token in the validation set
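
To eyeball convergence, you can plot these metrics from the results DataFrame; a minimal sketch, assuming the df from the previous step and that matplotlib is installed:

import matplotlib.pyplot as plt

# train_loss should trend downward as fine-tuning progresses
df.plot(x="step", y="train_loss", title="Fine-tuning training loss")
plt.show()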

Iterate a fine-tuned model

Iteration Options

  • Iterate on your fine-tuned model: you can fine-tune an already fine-tuned model (see the sketch after this list)
  • Iterate on your data quality: collect examples that target specific issues you’re seeing
  • Iterate on your data quantity: adding additional examples for the model to learn from
  • Iterate on your hyperparameters
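
For the first option, pass the name of the existing fine-tuned model as the base model for a new job (the training file ID is hypothetical):

client.fine_tuning.jobs.create(
    training_file="file-hypotheticalNewExamplesID",   # new or corrected examples
    model="ft:gpt-3.5-turbo-0613:keysoft::8p3gc9SA",  # start from the previous fine-tune
    hyperparameters={"n_epochs": 3}
)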