Improving Existing Indic LLM Finetuned Models

2 Techniques to Try Now

Published in GoPenAI, Apr 24, 2024

People ask me what new things they should try when training Indic LLMs, since there are already a lot of Llama 2/3 and Gemma based models.

Two things:

1. Train with DoRA (not plain LoRA)

LoRA freezes the pretrained weights and learns the weight update as the product of two low-rank trainable matrices (A and B).

DoRA goes one step further: it decomposes each pretrained weight matrix into a magnitude vector and a direction matrix. The magnitude vector is trained directly, while the direction matrix is updated through two low-rank trainable matrices (A and B), just as in LoRA.

So DoRA is a LoRA-type configuration plus a trainable magnitude vector. The only parameter overhead over LoRA is that magnitude vector, yet you get a superior training setup that moves closer to full finetuning.
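To make the decomposition concrete, here is a minimal pure-Python sketch of how a DoRA weight could be composed from a frozen weight `W0`, the low-rank matrices `A` and `B`, and a magnitude vector `m`. The helper names, the tiny 2x2 matrices, and the rank of 1 are assumptions chosen for illustration, and the per-column normalisation follows the DoRA paper's formulation rather than any particular library's code.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def column_norms(W):
    """L2 norm of every column of W."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*W)]

def dora_weight(W0, A, B, m):
    """Compose W = m * (W0 + B @ A) / ||W0 + B @ A||, normalised per column."""
    delta = matmul(B, A)  # low-rank update to the direction, as in LoRA
    V = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W0, delta)]
    norms = column_norms(V)
    # Rescale every column to unit length, then apply the trainable magnitude.
    return [[m_j * v / n_j for v, m_j, n_j in zip(row, m, norms)] for row in V]

# Illustrative values only (hypothetical 2x2 layer, rank 1).
W0 = [[3.0, 0.0], [4.0, 2.0]]   # frozen pretrained weight
A = [[0.1, -0.2]]               # trainable, shape r x k
B_init = [[0.0], [0.0]]         # trainable, zero-initialised, shape d x r
m = column_norms(W0)            # trainable magnitude, initialised from W0

# With B at zero the composed weight equals the pretrained weight.
W_start = dora_weight(W0, A, B_init, m)

# After B moves away from zero the direction changes, while each
# column norm stays pinned to the magnitude vector m.
B_trained = [[1.0], [0.5]]
W_after = dora_weight(W0, A, B_trained, m)
```

In practice you would not write this by hand. Recent versions of Hugging Face PEFT expose DoRA through the `use_dora=True` flag on `LoraConfig`, so an existing LoRA setup should need only a small change.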

References:

Answer.AI - Efficient finetuning of Llama 3 with FSDP QDoRA

We're releasing FSDP QDoRA, a scalable and memory-efficient method to close the gap between parameter efficient…

www.answer.ai

Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch

I'm Sebastian: a machine learning & AI researcher, programmer, and author. As Staff Research Engineer Lightning AI, I…

sebastianraschka.com

2. Train with ORPO

The Indic models released so far share one big problem: most of them are just SFT (instruction finetuned) models. So if you ask one a question like “how to make a bomb” or “how to kidnap”, it doesn’t hesitate to provide an answer.

Doing DPO or RLHF as a separate stage is cumbersome given data constraints. ORPO eliminates the need for that separate stage by combining SFT and preference alignment (the DPO/RLHF step) into one.
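As a rough illustration of how the two steps merge, the sketch below computes an ORPO-style loss for a single preference pair: the usual SFT negative log-likelihood on the chosen answer plus a weighted odds-ratio penalty that pushes the chosen answer's odds above the rejected answer's. The function names and the weight `lam=0.1` are assumptions for illustration, and the inputs are assumed to be length-averaged token log-probabilities that a model has already produced.

```python
import math

def log_odds(avg_logp):
    """log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    """SFT loss on the chosen answer plus a weighted odds-ratio penalty."""
    # Standard SFT term: negative log-likelihood of the chosen answer.
    nll = -chosen_avg_logp
    # How much more likely (in odds) the chosen answer is than the rejected one.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    penalty = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * penalty

# Hypothetical log-probabilities: the penalty shrinks as the model
# prefers the chosen answer more strongly over the rejected one.
weak_preference = orpo_loss(-0.5, -0.6)
strong_preference = orpo_loss(-0.5, -2.0)
```

Because both terms come from the same forward passes, there is no separate reference model or second training stage. Libraries such as TRL ship an ORPO trainer, so this is meant only to show what is being optimised.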

Find a good ORPO dataset, translate it into your target Indic language (Hindi, Tamil, Telugu, etc.), and finetune on it so that the instruction and preference alignment tasks are combined into one.
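The translation step could look something like the sketch below. It assumes each record stores its text under `prompt`, `chosen`, and `rejected` keys (a common layout for preference datasets, but check your dataset's actual schema), and it takes the translator as a plain callable so you can plug in whatever translation model or API you use. `fake_translate` is a stand-in for illustration, not a real translator.

```python
def translate_preference_record(record, translate):
    """Translate the prompt and both answers of one preference record."""
    # Assumed field names; adjust to match your dataset.
    fields = ("prompt", "chosen", "rejected")
    return {field: translate(record[field]) for field in fields}

def fake_translate(text):
    """Placeholder translator that only tags the text (hypothetical)."""
    return "[hi] " + text

# A made-up record showing the expected shape.
record = {
    "prompt": "How do I stay safe online?",
    "chosen": "Use strong, unique passwords.",
    "rejected": "Share your password with friends.",
}
translated = translate_preference_record(record, fake_translate)
```

Translating the rejected answer matters as much as the chosen one, since the model learns from the contrast between the two in the target language.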

References:

Fine-tune Llama 3 with ORPO

A Blog post by Maxime Labonne on Hugging Face

huggingface.co

Feel free to follow me on Twitter where I share my learnings from building AI SaaS apps!