Improving Existing Indic LLM Finetuned Models

2 Techniques to Try Now

Ramsri Goutham
GoPenAI

People often ask me what is worth trying next when training Indic LLMs, given that there are already plenty of Llama 2/3 and Gemma based finetunes.

Two things:

1. Train with DoRA (not plain LoRA)

LoRA decomposes the additional trainable parameters into two low-rank matrices (A and B).

DoRA goes one step further: it decomposes the pretrained weight itself into a magnitude vector and a direction component. The direction update is handled exactly like LoRA, with two low-rank trainable matrices (A and B), while the magnitude vector is trained directly, as the formula below shows.
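For the mathematically inclined, this is the reparameterization from the DoRA paper, where W_0 is the pretrained weight, BA is the low-rank update, and the norm is taken column-wise:

```latex
% DoRA reparameterization (Liu et al., 2024):
% magnitude vector m scales a unit-norm direction that
% carries the LoRA-style low-rank update BA.
W' = m \cdot \frac{W_0 + BA}{\lVert W_0 + BA \rVert_c},
\qquad m \text{ initialized to } \lVert W_0 \rVert_c
```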

So in DoRA you have a LoRA-style configuration plus a trainable magnitude vector on top. The overhead is tiny (one scalar per weight column), yet the training dynamics move noticeably closer to full finetuning. Enabling it is a one-flag change, as the sketch below shows.
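Here is a minimal sketch of what that looks like with Hugging Face PEFT, which exposes DoRA through the use_dora flag on LoraConfig (peft >= 0.9.0). The model name, target modules, and hyperparameters are illustrative placeholders, not a prescription:

```python
# Minimal DoRA setup with PEFT: identical to a LoRA setup
# except for use_dora=True. Model name and hyperparameters
# below are illustrative placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    use_dora=True,  # the only change vs. plain LoRA (needs peft >= 0.9.0)
)

model = get_peft_model(base, config)
# The magnitude vectors show up as a small bump in trainable params.
model.print_trainable_parameters()
```

Everything downstream (Trainer, SFTTrainer, etc.) stays the same; DoRA only changes how the adapter itself is parameterized.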

References:

DoRA: Weight-Decomposed Low-Rank Adaptation, Liu et al., 2024, arXiv:2402.09353

2. Train with ORPO

The Indic models released so far have one big problem: most of them are SFT-only (instruction-finetuned) models with no preference alignment. So if you ask something like "how to make a bomb" or "how to kidnap someone", they don't hesitate to answer.

Running a separate DPO or RLHF stage is cumbersome given the data constraints for Indic languages. ORPO removes that extra stage by folding preference alignment into the SFT objective itself: a single loss combines the usual instruction-tuning term with an odds-ratio penalty on rejected responses.

Find a good preference dataset (prompt, chosen response, rejected response), translate it into your target Indic language (Hindi, Tamil, Telugu, etc.), and finetune so that instruction tuning and preference alignment happen in one pass, as in the sketch below.
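A rough end-to-end sketch with TRL's ORPOTrainer, assuming a preference dataset with string prompt/chosen/rejected columns. The dataset name, the translate_to_hindi helper, and all hyperparameters are placeholders you would swap for your own (and note that newer TRL versions rename tokenizer= to processing_class=):

```python
# Sketch: translate a preference dataset and run ORPO with TRL.
# Dataset name, translate_to_hindi, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

def translate_to_hindi(text: str) -> str:
    # Identity placeholder: swap in IndicTrans2, an MT API, etc.
    return text

ds = load_dataset("your-orpo-preference-dataset", split="train")
ds = ds.map(lambda ex: {
    "prompt": translate_to_hindi(ex["prompt"]),
    "chosen": translate_to_hindi(ex["chosen"]),
    "rejected": translate_to_hindi(ex["rejected"]),
})

model_name = "meta-llama/Meta-Llama-3-8B"  # any base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# beta weighs the odds-ratio (preference) term against the SFT loss.
args = ORPOConfig(
    beta=0.1,
    max_length=1024,
    max_prompt_length=512,
    per_device_train_batch_size=2,
    num_train_epochs=1,
    output_dir="orpo-indic",
)

trainer = ORPOTrainer(model=model, args=args, train_dataset=ds, tokenizer=tokenizer)
trainer.train()
```

Because ORPO trains from the base model directly, with no reference model and no separate SFT checkpoint, this single run gives you both instruction following and preference alignment.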

References:

ORPO: Monolithic Preference Optimization without Reference Model, Hong et al., 2024, arXiv:2403.07691

Feel free to follow me on Twitter where I share my learnings from building AI SaaS apps!
