Building an Internal LLM Lab with Streamlit
Recently, I have been in conversations with several companies that are building internal LLM labs and tools to test LLMs against their specific use cases. Intrigued by this, I decided to spend a few hours exploring Streamlit and building my own personal LLM lab for a few use cases. In this article, I will share my experience and provide a step-by-step guide on how to build your own LLM lab using Streamlit.
Installation
To begin, let’s set up a new virtual environment for our project. Open your terminal and enter the following commands:
```shell
mkdir llm-streamlit
cd llm-streamlit
python3 -m venv env
source env/bin/activate
pip install streamlit
```
Next, let’s verify the installation by running a “Hello World” script. Enter the following command in your terminal:
```shell
streamlit hello
```
This command should open a Streamlit page in your browser, confirming that Streamlit has been successfully installed.
Creating a Simple Streamlit Page
Now that we have Streamlit set up, let’s create a simple Streamlit page. We will start by importing the necessary libraries and creating a pandas DataFrame. Paste the following code into a new Python file (main.py):
```python
import streamlit as st
import pandas as pd

df = pd.DataFrame({'first column': [1, 2, 3, 4], 'second column': [10, 20, 30, 40]})
df
```
To run the script and see the output in your browser, use the following command:
```shell
streamlit run main.py
```
This serves the app at http://localhost:8501, displaying the DataFrame in your browser.
Adding Interactive Elements
Let’s now experiment with adding an interactive widget to our Streamlit page. We will use the `st.text_input` function to create a text input field for the user’s name. Update the code in main.py as follows:
```python
import streamlit as st
import pandas as pd

df = pd.DataFrame({'first column': [1, 2, 3, 4], 'second column': [10, 20, 30, 40]})

st.text_input("Your name", key="name")
name = st.session_state.name
st.write(f"This table is created by {name}:")
df
```
After saving the changes, run the script again using the same command as before. You will now see a text input field where you can enter your name; the caption above the table updates with the name you enter.
Creating Multiple Pages
In order to create multiple pages within our Streamlit app, we will modify main.py to include content in the sidebar. We will also create a second page in a separate Python file (pages/pdf.py).
Update main.py as follows:
```python
import streamlit as st
import pandas as pd

st.markdown("# LLM Notebook")
st.sidebar.markdown("# LLM Notebook")

df = pd.DataFrame({'first column': [1, 2, 3, 4], 'second column': [10, 20, 30, 40]})

st.text_input("Your name", key="name")
name = st.session_state.name
st.write(f"This table is created by {name}:")
st.dataframe(df)
```
Create a new Python file at pages/pdf.py with the following code:
```python
import streamlit as st

st.markdown("# PDF Extraction")
st.sidebar.markdown("# PDF Extraction")
```
Now, when you run the main.py script, you will see a sidebar with “LLM Notebook” and “PDF Extraction” sections. This allows you to navigate between different pages within your Streamlit app.
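Streamlit builds the page list from the files it finds in a `pages/` directory next to the entry script, so at this point the project layout should look like this:

```
llm-streamlit/
├── main.py
└── pages/
    └── pdf.py
```

Each file in `pages/` becomes its own page, and the sidebar navigation is generated automatically.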
Interfacing with LLMs
To interact with LLMs, we will follow Streamlit's LangChain quickstart. First, install the openai and langchain packages:
```shell
pip install openai langchain
```
Next, we will load our OpenAI API key. There are two options for doing this: either create a file containing your key at ~/.openai-api-key.txt, or set the OPENAI_API_KEY environment variable when running Streamlit. Create a helper file named oai_utils.py with the following code to load the API key:
```python
import os

def get_openai_api_key():
    # Prefer the environment variable, then fall back to the key file.
    api_key = os.environ.get('OPENAI_API_KEY')
    if not api_key:
        key_file = os.path.join(os.path.expanduser("~"), ".openai-api-key.txt")
        with open(key_file) as f:
            api_key = f.read().strip()
    return api_key

openai_api_key = get_openai_api_key()
```
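As a quick sanity check of the fallback order, the standalone sketch below repeats the helper and shows that the environment variable wins whenever it is set (the "env-key" value is just a placeholder, not a real key):

```python
import os

def get_openai_api_key():
    # Prefer the environment variable, then fall back to ~/.openai-api-key.txt.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        key_file = os.path.join(os.path.expanduser("~"), ".openai-api-key.txt")
        with open(key_file) as f:
            api_key = f.read().strip()
    return api_key

# With the environment variable set, the file is never read.
os.environ["OPENAI_API_KEY"] = "env-key"
print(get_openai_api_key())  # -> env-key
```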
Now we can add the ability to query OpenAI in our Streamlit app. Update main.py as follows:
```python
import streamlit as st
from langchain import OpenAI
from oai_utils import get_openai_api_key

openai_api_key = get_openai_api_key()

def generate_response(input_text, model_name):
    llm = OpenAI(temperature=0.7, openai_api_key=openai_api_key, model_name=model_name)
    st.info(llm(input_text))

st.markdown("# LLM Notebook")
st.sidebar.markdown("# LLM Notebook")

with st.form('my_form'):
    oai_model = st.selectbox('Which OpenAI model should we use?', ('gpt-3.5-turbo', 'gpt-4', 'ada', 'babbage', 'curie', 'davinci'))
    text = st.text_area('Prompt:', '', placeholder="How do I use Streamlit to query OpenAI?")
    submitted = st.form_submit_button('Submit')
    if submitted:
        generate_response(text, oai_model)
```
This code allows us to select an OpenAI model and enter a prompt. Clicking the “Submit” button will display the response from the LLM.
Adding Functionality
The `st.selectbox` call inside the form above is what gives users the ability to choose which model to query against:
```python
oai_model = st.selectbox(
    'Which OpenAI model should we use?',
    ('gpt-3.5-turbo', 'gpt-4', 'ada', 'babbage', 'curie', 'davinci')
)
```
With this widget in place, users can select the OpenAI model that executes their query.
LLMs and Spreadsheet (aka Editable Dataframe)
To create a spreadsheet interface for LLMs, we will allow users to upload a CSV file and run an LLM against each row individually. Create a new Python file at pages/spreadsheet.py and paste the following code:
```python
import streamlit as st
import pandas as pd

st.markdown("# Spreadsheet Interface")
st.sidebar.markdown("# Spreadsheet Interface")

uploaded_file = st.file_uploader("Upload CSV", type="csv")
if uploaded_file is not None:
    df = pd.read_csv(uploaded_file)
    st.dataframe(df)
    # Additional code for running LLM against each row
```
This code sets up a Streamlit page with a file uploader widget for users to upload a CSV file. The uploaded file is then displayed as a DataFrame.
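One way to fill in the per-row step is to format each cell of a chosen column into a prompt template, query the model, and write the answers back into a new column. The sketch below stubs the model call with a hypothetical `run_llm` function; in the app it would wrap the LangChain `OpenAI` call from main.py:

```python
import pandas as pd

def run_llm(prompt: str) -> str:
    # Hypothetical stand-in for the real LLM call, e.g. llm(prompt).
    return f"response to: {prompt}"

def run_llm_per_row(df: pd.DataFrame, column: str, template: str) -> pd.DataFrame:
    # Format each cell of `column` into the prompt template, query the
    # model, and store the answers in a new "llm_output" column.
    out = df.copy()
    out["llm_output"] = out[column].map(lambda cell: run_llm(template.format(cell)))
    return out

df = pd.DataFrame({"question": ["What is Streamlit?", "What is LangChain?"]})
result = run_llm_per_row(df, "question", "Answer briefly: {}")
print(result["llm_output"].tolist())
```

In the Streamlit page, `st.dataframe(result)` would then show the original rows alongside the model's answers.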
Recording Previous Prompts
To store previous prompts and display them in the sidebar, we will use a local JSON file. Create a new Python file named prompts.py and add the following code:
```python
import os
import json
import streamlit as st

PROMPTS_FILE = "prompts.json"

def get_prompts():
    # Return the stored prompt list, or an empty list on first run.
    if os.path.isfile(PROMPTS_FILE):
        with open(PROMPTS_FILE, "r") as f:
            return json.load(f)
    return []

def render_prompt(prompt):
    st.write(prompt)

def add_prompt(prompt):
    # Append the prompt and persist the whole list back to disk.
    prompts = get_prompts()
    prompts.append(prompt)
    with open(PROMPTS_FILE, "w") as f:
        json.dump(prompts, f)
```
This code includes functions to retrieve previous prompts, render a prompt, and add a new prompt to the existing list.
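The storage helpers can be exercised in isolation (`render_prompt` is omitted here since it needs a running Streamlit session). A minimal round trip, run in a temporary directory so any real prompts.json is untouched:

```python
import json
import os
import tempfile

PROMPTS_FILE = "prompts.json"

def get_prompts():
    # Return the stored prompt list, or an empty list on first run.
    if os.path.isfile(PROMPTS_FILE):
        with open(PROMPTS_FILE) as f:
            return json.load(f)
    return []

def add_prompt(prompt):
    # Append the prompt and persist the whole list back to disk.
    prompts = get_prompts()
    prompts.append(prompt)
    with open(PROMPTS_FILE, "w") as f:
        json.dump(prompts, f)

os.chdir(tempfile.mkdtemp())  # keep the demo away from a real prompts.json
add_prompt("Summarize this PDF")
add_prompt("Translate to French")
print(get_prompts())  # -> ['Summarize this PDF', 'Translate to French']
```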
In main.py, update the code as follows to record each submitted prompt and display previous prompts in the sidebar:
```python
import streamlit as st
from langchain import OpenAI
from oai_utils import get_openai_api_key
from prompts import get_prompts, render_prompt, add_prompt

openai_api_key = get_openai_api_key()

def generate_response(input_text, model_name):
    llm = OpenAI(temperature=0.7, openai_api_key=openai_api_key, model_name=model_name)
    st.info(llm(input_text))

st.markdown("# LLM Notebook")
st.sidebar.markdown("# LLM Notebook")

# Show previously recorded prompts in the sidebar.
with st.sidebar:
    for prompt in get_prompts():
        render_prompt(prompt)

with st.form('my_form'):
    oai_model = st.selectbox(
        'Which OpenAI model should we use?',
        ('gpt-3.5-turbo', 'gpt-4', 'ada', 'babbage', 'curie', 'davinci')
    )
    text = st.text_area('Prompt:', '', placeholder="How do I use Streamlit to query OpenAI?")
    submitted = st.form_submit_button('Submit')
    if submitted:
        add_prompt(text)  # record the prompt before querying
        generate_response(text, oai_model)
```