There is an unending tsunami of news and announcements about new large language models (LLMs) from large tech providers and the open source community. You may be wondering:
Are open source LLMs as good as proprietary ones like GPT-4?
What are the pros and cons of open source LLMs?
How should I get started?
I answer these questions, and offer some practical advice, in Episode 15 of Prolego's Generative AI & LLM Strategy Series on YouTube.
We ran a head-to-head comparison between the proprietary giant GPT-4 and an open-source darling called Phind-CodeLlama-34B-v2 (let's just call it 'Phind') on the same Unified Natural Language Query problem we covered in Episode 4. Here are the key takeaways:
1. GPT-4 vs. Open Source: GPT-4 is still the kingpin for general-purpose tasks, but open-source models like Phind aren't far behind. They can even be more efficient at generating complex SQL queries.
2. Pros and Cons: With open source, you retain full data control and gain operational flexibility. This optionality comes at the expense of additional engineering effort.
3. How to Start: Build a prototype with GPT-4 to check viability. If the results look promising, you can then switch to an open-source model to optimize for speed, cost, or data privacy.
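To make the third takeaway concrete, here is a minimal sketch of one way to structure a prototype so the model can be swapped later. It is an illustration under stated assumptions, not the code from the episode: the function names (`build_prompt`, `answer`, `compare`) are hypothetical, and the two backends are stand-in stubs that return canned SQL rather than calling GPT-4 or Phind.

```python
from typing import Callable, Dict

# A backend is any function that maps a prompt string to model output text.
# (Hypothetical interface, chosen for illustration.)
Backend = Callable[[str], str]


def build_prompt(question: str, schema: str) -> str:
    """Build one prompt that every backend receives, so comparisons are fair."""
    return (
        f"Schema:\n{schema}\n\n"
        f"Write one SQL query that answers: {question}\n"
        "SQL:"
    )


def stub_gpt4(prompt: str) -> str:
    # Stand-in only: a real version would call the hosted GPT-4 API here.
    return "SELECT COUNT(*) FROM orders;"


def stub_phind(prompt: str) -> str:
    # Stand-in only: a real version would call a self-hosted Phind model here.
    return "SELECT COUNT(*) FROM orders;\n"


# Switching models becomes a one-word change in the caller.
BACKENDS: Dict[str, Backend] = {"gpt-4": stub_gpt4, "phind": stub_phind}


def answer(question: str, schema: str, model: str = "gpt-4") -> str:
    """Run one question through the chosen backend and tidy the output."""
    return BACKENDS[model](build_prompt(question, schema)).strip()


def compare(question: str, schema: str) -> Dict[str, str]:
    """Send the same question to every backend for a head-to-head check."""
    return {name: answer(question, schema, model=name) for name in BACKENDS}


if __name__ == "__main__":
    print(compare("How many orders are there?", "orders(id, total)"))
```

Because the rest of the application only ever calls `answer`, you could start with the GPT-4 backend, then point the same code at an open-source model once the prototype looks viable.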
These examples will help you accelerate your AI program by avoiding the most common mistakes in model selection.
Enjoy!
Kevin