LLM app dev sounds scarier than it is. Luckily, open source frameworks exist with excellent guides, great documentation, and active communities. Find below a series of toys, tasks, and videos that will take you from “aware that LLMs exist” to “building an LLM-based app”. Wait, what are LLMs?
Some of this article can be practiced at a computer. Other parts can be consumed at your leisure via podcast or YouTube. Most of the article will be practiced in your head, while you go about your everyday tasks.
Risk
The data that SEP works with is proprietary. Luckily, there is a lot of support for self-hosting the services that will handle sensitive and proprietary data. It is essential that you understand not only the security constraints of your data and use case but also the usage policy of any service (training, inference, chat, data prep) that you use.
Play with LLMs
Use a remote chat application
ChatGPT
Tasks
- Make an account with OpenAI
- Add a payment method
- Make an API key. We will use this key in the "Build a context-aware app" section below. (A quick smoke test follows this list.)
- Sign up for the GPT-4 beta. There is a waiting list for their premium service. Sign up today so that next week you can play with GPT-4 and multimodal input/output
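To confirm your key works before the app-building sections, you can run a quick smoke test against the API. This is a minimal sketch, assuming the official openai Python package (v1.x) and an `OPENAI_API_KEY` environment variable; the model name and prompt are just illustrative choices:

```python
from openai import OpenAI

# Assumes OPENAI_API_KEY is set in your environment, e.g. `export OPENAI_API_KEY=sk-...`
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # an inexpensive model is plenty for a smoke test
    messages=[{"role": "user", "content": "Reply with the single word: pong"}],
)
print(response.choices[0].message.content)
```

If this prints a reply, your key and payment method are wired up correctly.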
Exercises
- For the next couple of days, whenever you would go to google something, ask ChatGPT instead. Then google for confirmation
- Think of an abstract concept that you have a hard time explaining. Ask ChatGPT to explain it to you. Ask questions about the answer that ChatGPT provides
- Pick any paper or article. Find a part of it that you find confusing. Paste that into ChatGPT. Ask questions about the answer that ChatGPT provides
- Ask for a language translation
Run an LLM on your personal computer
Linux or macOS (Intel or Apple Silicon (M1/M2/M3)): Ollama
If you prefer videos to bullet points, follow Google ML Developer Expert Sam Witteveen in his Ollama walkthrough.
Tasks
- Download and install Ollama
- Open a console
- Type `ollama run mistral`. This command downloads Mistral 7B, an open-source model released under the Apache 2.0 license. Here's the Ollama info page for Mistral 7B
- Review the models available for use in the Ollama Library
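While `ollama run` (or `ollama serve`) is active, Ollama also listens on a local REST API at port 11434, which becomes useful as soon as you want to script against your local model rather than chat in the console. A minimal sketch using only the Python standard library; the prompt is illustrative:

```python
import json
import urllib.request

# Ollama's local REST API; available while `ollama run` or `ollama serve` is active.
payload = {"model": "mistral", "prompt": "Why is the sky blue?", "stream": False}
request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])
```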
Exercises
- Read the model card on HuggingFace for the models you tried. Here's Mistral 7B
- If you have 48GB of RAM, download Mixtral 8x7B, one of the most powerful permissively licensed open-source models available at the time of writing. Type `ollama run mixtral` into a terminal. This command downloads Mixtral.
Windows or Apple Silicon Mac (M1/M2/M3): LMStudio
Tasks
- Download and install LMStudio
- Use the in-app search to find a model, download it, and start a chat
Exercises
- Scroll through LMStudio's "New and Noteworthy" section. Pick another model to chat with, maybe Llama 2.
- Read the model card on HuggingFace for the models you tried. Here is Llama 2's model card.
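LMStudio can also expose whatever model you have loaded through an OpenAI-compatible local server, so the same client code works against both remote and local models. A minimal sketch, assuming the openai Python package (v1.x) and LMStudio's default local server address; the prompt is illustrative:

```python
from openai import OpenAI

# Point the OpenAI client at LMStudio's local server instead of api.openai.com.
# The api_key is a placeholder; the local server does not check it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # LMStudio answers with whichever model is loaded
    messages=[{"role": "user", "content": "What is a model card?"}],
)
print(response.choices[0].message.content)
```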
Get Learnt
What is HuggingFace? Think of it as the GitHub for deep learning models.
My two favorite YouTube resources for LLM content are theoretical physicist Stephen Weingartner and Google ML Developer Expert Sam Witteveen. Stephen provides simple explanations of complex concepts (research paper walkthroughs, industry landscape), and Sam provides practical examples and Google Colab notebooks with code.
- Stephen describes Mistral 7B
- Sam describes Mixtral 8x7B
- Sam describes RWKV
- Stephen recommends specific LLMs in this playlist
Articles to gain an intuition for LLMs
Videos to gain an intuition for LLMs
- Generative AI landscape (April 2023 but not really out of date)
- Generative AI tutorial
- How to START with AI
Build a context-aware app
Harrison Chase of LangChain uses the phrase “Context-Aware Reasoning Application” to describe LLM-powered apps. This label describes any app that leverages its awareness of the user’s context to provide value. We are moving up one layer of abstraction. Rather than rely solely on an LLM’s knowledge to answer questions in the context of user-driven chat, we are going to wrap the LLM with supporting tools.
LlamaIndex for LLM App Dev
A RAG-centric approach to context-aware app development
Get Learnt
I suggest you Get Learnt before playing with LlamaIndex. Listen with your favorite podcast provider on your way to work.
- RAG Is a Hack: LlamaIndex creator Jerry Liu is interviewed. Listen for the sorts of problems that Jerry solves with RAG.
- Consider reading this in-depth article titled A Complete LlamaIndex Guide
Tasks
- Play with SEC Insights, an open-source sample app that the LlamaIndex team built to teach devs how to use LlamaIndex.
- Skim “How to read these docs”
- Complete their "five lines of code" Starter Tutorial. You will need the OpenAI key that you generated in the "Use a Remote Chat Application" section above. (A sketch of the tutorial's core flow follows this list.)
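For orientation, the heart of the Starter Tutorial looks roughly like this. It is a sketch, not a substitute for the tutorial: it assumes the pre-0.10 llama_index package layout, an `OPENAI_API_KEY` in your environment, and a `data/` directory of documents to index; the query string is illustrative:

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load every document under ./data, embed it, and build an in-memory vector index.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Ask a question; retrieval pulls the relevant chunks into the LLM's context window.
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))
```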
Exercises
- Skim the LlamaIndex documentation intro for a conceptual grasp of LlamaIndex.
- Use LlamaIndex with an LLM hosted locally. By default, LlamaIndex uses an OpenAI-hosted model. This Colab notebook provides an example. It downloads a specified model from HuggingFace and runs it on your device. (See the sketch after this list.)
- The example above uses the HuggingFaceLLM class to wrap interactions with the local model. If you instead want to use a model hosted by OpenAI, Azure, or a server on your network, you can use one of the wrapper classes in llama_index/llms. Read the first section of the "Using LLMs" LlamaIndex doc to understand the multiple roles that an LLM can play in your context-aware reasoning application. The Model Guide has good information too.
- If LlamaIndex has captured your heart, put a ring on it and use it as the framework for your app. Watch and emulate their "Building SEC Insights" end-to-end tutorial.
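As promised above, here is a minimal sketch of the local-model pattern the Colab notebook demonstrates, assuming the pre-0.10 llama_index package layout with transformers, torch, and sentence-transformers installed; the Mistral checkpoint is just one reasonable choice:

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import HuggingFaceLLM

# Wrap a HuggingFace checkpoint so LlamaIndex can call it like any other LLM.
llm = HuggingFaceLLM(
    model_name="mistralai/Mistral-7B-Instruct-v0.1",
    tokenizer_name="mistralai/Mistral-7B-Instruct-v0.1",
    context_window=2048,
    max_new_tokens=256,
    device_map="auto",  # let accelerate spread layers across GPU/CPU as available
)

# embed_model="local" keeps embeddings on-device too, so no data leaves your machine.
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
print(index.as_query_engine().query("What is this document about?"))
```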
LangChain for LLM App Dev
You can use LangChain with either Python or JavaScript/TypeScript. Pick your favorite.
Get Learnt
- The Point of LangChain: LangChain creator Harrison Chase is interviewed.
- Sam Witteveen's LangChain YouTube playlist. Skip through to the videos that seem relevant to what you want to build. Start with LLMs & PromptTemplates with Colab
Tasks
- The LangChain Quickstart is a delight. They provide two versions: one that uses your OpenAI access token from the "Use a Remote Chat Application" section above and another that uses a locally-hosted model. Take the OpenAI path for immediate joy. They prescribe the use of a Jupyter notebook. If you haven't worked with a Jupyter notebook before and you want instant gratification, I recommend a Google Colab instance instead. Google Colab is a Jupyter notebook hosted on Google Cloud with neat features like free GPUs and TPUs, a visual directory navigator, and an in-browser document editor with code highlighting. Also, Corgi mode. (A sketch of the Quickstart's core chain follows this list.)
- Sign up for a LangSmith account here. From their docs, "It lets you debug, test, evaluate, and monitor chains and intelligent agents built on any LLM framework and seamlessly integrates with LangChain, the go-to open source framework for building with LLMs." You will be placed on a waitlist that you can escape by following a link on their GitHub README. 🎸DEVELOPER PRIVILEGE! guitar solo noises!!!🎸
- Run a LangChain Template from the Exercises section below until you get LangSmith access
- Play with LangSmith
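For a taste of what the Quickstart builds, here is a minimal sketch of a prompt-model-parser chain in LangChain Expression Language, assuming the late-2023 langchain package layout and an `OPENAI_API_KEY` in your environment; the joke prompt is illustrative:

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser

# The | operator composes runnables: prompt -> chat model -> string output parser.
prompt = ChatPromptTemplate.from_template("Tell me a short joke about {topic}")
chain = prompt | ChatOpenAI() | StrOutputParser()

print(chain.invoke({"topic": "corgis"}))
```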
Exercises
I think LangChain Templates and the langchain-cli are amazing! You can serve a template, browse to a frontend at `localhost:8000/[template_name]/playground`, and inspect each step in the chain. If your chain is ChatUI Input -> VectorDB -> LLM -> ChatUI Output and your ChatUI Output seems borked, you can debug the chain by viewing the input and output of every stage. This becomes especially necessary when your chain includes logic for leveraging multiple LLMs, as in the Skeleton of Thought template. (A sketch of a retrieval chain with this shape follows.)
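A minimal sketch of a chain with that Input -> VectorDB -> LLM -> Output shape, assuming the late-2023 langchain package layout, chromadb installed, and an `OPENAI_API_KEY` set; the seed text and question exist only to make the example self-contained:

```python
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.vectorstores import Chroma

# Embed a tiny corpus into an in-memory Chroma collection (the VectorDB stage).
vectorstore = Chroma.from_texts(
    ["LangChain Templates expose a playground UI at /playground."],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

# Input -> VectorDB -> LLM -> Output, mirroring the chain described above.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI()
    | StrOutputParser()
)
print(chain.invoke("Where do LangChain Templates expose their playground?"))
```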
- Skim the LangChain Templates announcement article.
- Run through the LangChain Template Quickstart. Harrison Chase walks through the Quickstart in this 5-minute video.
- Run a COMPLETELY OFFLINE LangChain app with the rag-chroma-private template. RAG didn't click for me until I saw the debug output for communication with Chroma in the playground.
Promises and Wild Claims about LLM App Dev
- You will be confused. Find support on the Discord channels or the GitHub Discussions sections of these projects. I can help you directly on LinkedIn
- This article will be out of date by July 2024
- As was the case for VR/AR and Web3.0, the LLM App Dev train will slow when large companies discover that the bold investments they made without understanding the tech, accounting for risk, or defining an exit plan did not pan out
- To a greater degree than VR/AR and Web3.0, the LLM App Dev train carries tools that, in their present state and not some future state, create value in education, data retrieval, content curation, and product development
- More impressive LLMs are released every month
- This is easier than manual memory management
This is a non-exhaustive list.