Meet Orca 2, a creation of Microsoft Research that's making waves in the realm of language models.
Orca 2: The Basics
First things first, Orca 2 is a 13 billion parameter model, which may sound like a lot, but it's actually quite small compared to giants like PaLM 2 and GPT-4. However, don't let its size fool you; Orca 2 has the potential to compete with models that are 5-10 times larger.
To get to know Orca 2 better, you can dive into the research paper titled "Orca 2: Teaching Small Language Models How to Reason." But, if you're more of a casual reader like me, I'll break it down for you.
Orca 1 vs. Orca 2
Before we explore Orca 2's unique features, let's briefly touch on Orca 1. Orca 1 was all about learning from rich signals, such as explanation traces. It laid the groundwork for what was to come.
Now, let's see how Orca 2 is different:
Learning Strategies
Orca 2 is a smart learner. It doesn't just crunch numbers; it learns various reasoning techniques, such as step-by-step, recall then generate, recall-reason-generate, and direct answer. It's like having a versatile toolkit for solving problems. Instead of just predicting the next token like a classic large language model (LLM), Orca 2 is trained to pick the reasoning steps required to solve a problem. For example:
Classic LLM: 2 + 2 = ? (predict what word comes next)
It can produce "4", but it lacks the deeper understanding that when 2 objects are added to 2 more objects, we get a total of 4 objects.
Orca 2 aims to give LLMs the power to reason and understand the task at hand.
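To make the contrast concrete, here is a minimal sketch in Python. It is purely illustrative and not taken from the paper: the task text and the instruction wording for each strategy are my own assumptions. It only shows how the same question can be framed as a plain completion or wrapped in an explicit reasoning strategy.

```python
# Purely illustrative: the strategy wording below is assumed, not the paper's exact instructions.

task = "A basket holds 2 apples. You add 2 more. How many apples are in the basket?"

# Classic next-token framing: the model just completes the text.
direct_prompt = f"{task}\nAnswer:"

# Orca-2-style framing: an explicit reasoning strategy is prepended to the task.
strategies = {
    "step_by_step": "Solve the problem step by step, then state the final answer.",
    "recall_then_generate": "First recall the relevant facts, then generate the answer.",
    "recall_reason_generate": "Recall the facts, reason over them, then generate the answer.",
    "direct_answer": "Answer directly, with no explanation.",
}

print("--- classic completion ---")
print(direct_prompt)
for name, instruction in strategies.items():
    print(f"--- {name} ---")
    print(f"{instruction}\n\nTask: {task}")
```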
Imitation Learning Technique
Earlier models used an "imitation learning" technique: they relied on large models to do the reasoning and then fine-tuned smaller models on the generated text. The problem with this approach was that the student never really surpassed the teacher.
Orca 2 Training Strategy
Orca 2 does things differently. It uses "frontier models" like GPT-3, GPT-4, and PaLM 2 to generate training data, but with a twist.
Here's the strategy:
Create a detailed prompt with the task and hints on how to solve it.
Ask the frontier model to perform the task and generate a response.
Use the task and response to fine-tune the small model.
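As a rough sketch of that loop, here is what it might look like in Python. The `frontier_model.generate(...)` call and the field names are hypothetical stand-ins; the real Orca 2 data pipeline is considerably more involved.

```python
# Rough sketch of the three-step loop above; `frontier_model.generate(...)`
# is a hypothetical client, not a real API.

def build_teacher_prompt(task: str, hints: str) -> str:
    """Step 1: a detailed prompt combining the task with hints on how to solve it."""
    return f"{hints}\n\nTask: {task}"

def collect_example(frontier_model, task: str, hints: str) -> dict:
    """Steps 2-3: have the frontier model perform the task and keep the result."""
    teacher_prompt = build_teacher_prompt(task, hints)
    response = frontier_model.generate(teacher_prompt)  # hypothetical API call
    # Keep the task, the hints, and the response; the hints are dropped later,
    # before the small model is fine-tuned.
    return {"task": task, "hints": hints, "response": response}
```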
Prompt Erasure Technique
Another nifty trick Orca 2 employs is "prompt erasure": the hints and instructions that were given to the frontier model are removed from the data used to train the smaller model. The student therefore has to work out, from the task itself, what kind of reasoning it needs to do.
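Continuing the sketch above (again with assumed field names), erasure just means the student's fine-tuning pair keeps the bare task and the reasoned response, while the hints are thrown away:

```python
def erase_prompt(example: dict) -> dict:
    """Turn a (task, hints, response) record into a student fine-tuning pair."""
    return {
        "prompt": example["task"],          # the hints are deliberately left out
        "completion": example["response"],  # the response still carries the reasoning
    }
```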
Cautious Learner
Orca 2 doesn't lock itself into one reasoning recipe; it learns to choose a strategy for each task, including when to reason step by step and when a direct answer is enough.
Instruction Tuning vs. Explanation Tuning
Orca 2 takes "Instruction Tuning" to the next level. Earlier models used this technique to good effect, but the Orca team went further with "Explanation Tuning." Here's how it works:
Start with a set of general-purpose system instructions like "Think step by step" or "generate detailed answers."
Combine these with user prompts from various tasks to create a dataset of (system instructions, user prompts, LLM answers).
Train the student model to predict the LLM answers given the system instructions and user prompts.
The key concept here is extracting answers with detailed explanations from LLMs based on system instructions.
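Here is an illustrative sketch of what one explanation-tuning record might look like. The instruction strings, field names, and the `teacher.generate(...)` call are assumptions for the example, not the paper's exact setup.

```python
# Illustrative only: instruction wording and field names are assumed.

system_instructions = [
    "Think step by step.",
    "Generate detailed answers.",
]

def make_explanation_record(system_instruction: str, user_prompt: str, teacher) -> dict:
    """Build one (system instruction, user prompt, LLM answer) triple."""
    answer = teacher.generate(f"{system_instruction}\n\n{user_prompt}")  # hypothetical API
    return {
        "system": system_instruction,
        "prompt": user_prompt,
        "answer": answer,  # the student is trained to predict this field
    }
```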
In a nutshell, Orca 2 is like a bright student who not only learns but also teaches itself how to be the best at solving problems. It's a small model with big potential, and we can't wait to see where it takes us in the world of language models!