Introduction to Streamlit and Installation
Streamlit is an open-source Python library that makes it easy to create and share custom web applications for machine learning and data science. With Streamlit, you can quickly build interactive tools and dashboards. This guide will walk you through installing Streamlit and creating a basic application.
Prerequisites
- Ensure you have Python installed on your system. You can download it from python.org.
Installation
To get started with Streamlit, you need to install it. Follow the steps below:
Create a Virtual Environment:
It’s a good practice to create a virtual environment for your projects to manage dependencies.
python -m venv myenv
source myenv/bin/activate # On Windows, use `myenv\Scripts\activate`
Install Streamlit:
Use pip, Python’s package installer, to install Streamlit.
pip install streamlit
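To confirm the install worked before moving on, a small check can be run with the Python interpreter from the activated environment. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of Streamlit.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("streamlit installed:", is_installed("streamlit"))
```

Streamlit also ships a demo app that can be launched with `streamlit hello`, which is another quick way to see that the command-line tool is on your path.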
Creating a Basic Streamlit Application
Once Streamlit is installed, you can create a simple application. Follow these steps:
Create a New Python File:
Create a new file named app.py.
# app.py
import streamlit as st
# Title of the application
st.title('Basic Streamlit Application')
# Add a header
st.header('Welcome to your first Streamlit app')
# Add some text
st.write('Streamlit is awesome!')
# Add a simple data display
data = {
    'First Column': [1, 2, 3, 4],
    'Second Column': [10, 20, 30, 40]
}
st.write(data)
Running the Application:
To run your Streamlit application, use the Streamlit CLI command:
streamlit run app.py
This command will start a local web server. Open your web browser and navigate to the URL provided in the terminal (usually http://localhost:8501).
Exploring the Web Interface
- Title and Header: You will see the title and header at the top of the page.
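- Text: Below them, the text ‘Streamlit is awesome!’ will be displayed.
- Data: Finally, you will see the data. Because data is a plain Python dictionary, st.write renders it as an expandable JSON-style view rather than a table; wrapping it in a pandas DataFrame would produce a table.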
With these steps, you have successfully set up a basic Streamlit application. In the next sections of this guide, we will explore more advanced features and customization options available in Streamlit.
Setting Up the Development Environment
Prerequisites
Before we proceed, ensure you have completed the following:
- Installed Streamlit as per the instructions provided in the previous units.
- Familiarized yourself with the basic concepts of Streamlit from the “Introduction to Streamlit” unit.
Directory Structure
Set up your project directory structure as follows:
my_streamlit_app/
├── app.py
├── data/
│   └── sample_data.csv
└── requirements.txt
Creating requirements.txt
Ensure requirements.txt exists with the necessary dependencies:
streamlit
pandas
numpy
Developing the Application
app.py
Create and open the app.py file in your project directory. Populate it with the basic structure of a Streamlit app:
import streamlit as st
import pandas as pd
import numpy as np
# Title of the application
st.title("Basic Streamlit Application")
# Load data
def load_data():
    data = pd.read_csv('data/sample_data.csv')
    return data
# Main function
def main():
    data = load_data()
    # Display a header
    st.header("Data Overview")
    # Display the dataframe
    st.write(data)
    # Statistics
    st.subheader("Statistics")
    st.write(data.describe())
    # Add more functionality as needed for your data-related tasks
if __name__ == "__main__":
    main()
Sample Data
Add a CSV file (sample_data.csv) inside the data directory to work with. Ensure it contains some sample data like this:
index,value
0,10
1,20
2,30
3,40
4,50
5,60
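If you prefer not to type the file by hand, the same rows can be generated with a short script. This is a sketch using only the standard library; `write_sample_data` is a hypothetical helper, and it assumes the values follow the pattern shown above (10, 20, 30, ...).

```python
import csv
from pathlib import Path


def write_sample_data(path: Path, rows: int = 6) -> None:
    """Write an index,value CSV where value = 10 * (index + 1)."""
    # Create the data/ directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["index", "value"])
        for i in range(rows):
            writer.writerow([i, 10 * (i + 1)])


if __name__ == "__main__":
    write_sample_data(Path("data/sample_data.csv"))
```

Run it once from the project directory so that the relative path data/sample_data.csv matches the one app.py reads.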
Running the Application
With everything set up, navigate to your project directory in your command-line interface and run the following command:
streamlit run app.py
Your web browser should open the basic Streamlit application, displaying your dataset, its basic statistics, and any additional functionality you have included.
At this point, your development environment should be properly set up, allowing you to expand and customize your Streamlit app as needed.
Loading and Displaying Data in Streamlit
Step 1: Import Required Libraries
import streamlit as st
import pandas as pd
Step 2: Load Data
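# st.cache is deprecated since Streamlit 1.18; st.cache_data is its replacement for data.
@st.cache_data
def load_data(filepath):
    # filepath may be a path or the file-like object returned by st.file_uploader
    data = pd.read_csv(filepath)
    return data
uploaded_file = st.file_uploader("Choose a CSV file", type="csv")
if uploaded_file is not None:
    data = load_data(uploaded_file)
    st.write("Data Loaded Successfully!")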
Step 3: Display Data
if uploaded_file is not None:
    st.dataframe(data)
Step 4: Putting It All Together
import streamlit as st
import pandas as pd
# st.cache is deprecated since Streamlit 1.18; st.cache_data is its replacement for data.
@st.cache_data
def load_data(filepath):
    data = pd.read_csv(filepath)
    return data
uploaded_file = st.file_uploader("Choose a CSV file", type="csv")
if uploaded_file is not None:
    data = load_data(uploaded_file)
    st.write("Data Loaded Successfully!")
    st.dataframe(data)
Explanation
Import Required Libraries:
- The Streamlit (st) and pandas (pd) libraries are imported for creating the application and handling data, respectively.
Load Data:
- The load_data function loads the data from a CSV file using pandas.
- The caching decorator (@st.cache in older releases, @st.cache_data from Streamlit 1.18 onward) stores the result so that reruns with the same input skip the loading step.
- The file_uploader widget allows users to upload a CSV file.
- If a file is uploaded, the load_data function is called to load the data.
Display Data:
- The dataframe method of Streamlit displays the data in a tabular format.
- This is executed only if a file has been uploaded and loaded.
Putting It All Together:
- Combine all steps in a single script.
- This script, when run in a Streamlit environment, will present a file uploader, load the file into a pandas DataFrame when provided, and display it within the Streamlit app.
Copy this code and run it in your Streamlit environment to load and display your data interactively.
Basic Data Visualization in Streamlit
Streamlit allows you to create interactive and informative visualizations with ease. Below is a practical implementation for adding basic data visualizations to your Streamlit application. This assumes that you have already loaded your data and that it is available as a Pandas DataFrame named df.
Implementing Basic Data Visualizations in Streamlit
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Assuming 'df' is already loaded in previous steps
# Sample DataFrame for demonstration
df = pd.DataFrame({
    'Category': ['A', 'B', 'C', 'D'],
    'Values': [23, 45, 56, 78]
})
st.title("Basic Data Visualization")
st.write("This section will display various basic data visualizations using Streamlit's built-in charts, Matplotlib, and Seaborn.")
# Bar Chart
st.subheader("Bar Chart")
st.bar_chart(df[['Category', 'Values']].set_index('Category'))
# Line Chart
st.subheader("Line Chart")
st.line_chart(df[['Category', 'Values']].set_index('Category'))
# Area Chart
st.subheader("Area Chart")
st.area_chart(df[['Category', 'Values']].set_index('Category'))
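# Matplotlib Figure - Scatter Plot
st.subheader("Scatter Plot")
# Build an explicit Figure and pass it to st.pyplot instead of the pyplot module
scatter_fig, scatter_ax = plt.subplots(figsize=(10, 6))
scatter_ax.scatter(df['Category'], df['Values'], color='red')
scatter_ax.set_title('Scatter Plot')
scatter_ax.set_xlabel('Category')
scatter_ax.set_ylabel('Values')
st.pyplot(scatter_fig)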
# Seaborn Plot
st.subheader("Seaborn Plot")
fig, ax = plt.subplots()
sns.barplot(x='Category', y='Values', data=df, ax=ax)
ax.set(title='Seaborn Barplot', xlabel='Category', ylabel='Values')
st.pyplot(fig)
Explanation
- Bar Chart: Uses Streamlit’s built-in st.bar_chart() to display a bar chart with the data.
- Line Chart: Uses Streamlit’s built-in st.line_chart() to display a line chart with the data.
- Area Chart: Uses Streamlit’s built-in st.area_chart() to display an area chart with the data.
- Scatter Plot: Uses Matplotlib to create a scatter plot. The plot is displayed using st.pyplot().
- Seaborn Plot: Uses Seaborn to create a bar plot, and st.pyplot() is used to display the figure created by Seaborn.
With this code, you can enhance your Streamlit application with basic data visualizations. Adapt the data and visualizations according to your project requirements.
Interactive Widgets in Streamlit
In this section, we will add interactive widgets to your Streamlit application. These widgets will help your users to interact with the data dynamically. We will cover a few essential interactive widgets, including sliders, select boxes, and text inputs, to enhance user engagement.
Example Implementation
import streamlit as st
import pandas as pd
import numpy as np
# Load sample data (assuming it's a DataFrame)
data = pd.DataFrame({
    'Category': ['A', 'B', 'C', 'D'],
    'Value': [10, 20, 30, 40]
})
# Display DataFrame
st.write("Dataset:")
st.dataframe(data)
# Interactive Widget 1: Slider
st.write("Adjust the sliders to filter data based on value.")
min_val = st.slider('Select minimum value', min_value=int(data['Value'].min()), max_value=int(data['Value'].max()), value=int(data['Value'].min()))
max_val = st.slider('Select maximum value', min_value=int(data['Value'].min()), max_value=int(data['Value'].max()), value=int(data['Value'].max()))
filtered_data = data[(data['Value'] >= min_val) & (data['Value'] <= max_val)]
st.write("Filtered Data:")
st.dataframe(filtered_data)
# Interactive Widget 2: Select Box
st.write("Choose a category to filter the data.")
category = st.selectbox('Select category', options=data['Category'].unique())
category_filtered_data = data[data['Category'] == category]
st.write("Category Filtered Data:")
st.dataframe(category_filtered_data)
# Interactive Widget 3: Text Input
st.write("Input a custom message.")
custom_message = st.text_input('Enter message', 'Hello, Streamlit!')
st.write(f"Your message: {custom_message}")
# Additional example for combining widgets
st.write("Combine Slider and Select Box to filter data.")
combined_min_val = st.slider('Select minimum value for combined filter', min_value=int(data['Value'].min()), max_value=int(data['Value'].max()), value=int(data['Value'].min()))
combined_max_val = st.slider('Select maximum value for combined filter', min_value=int(data['Value'].min()), max_value=int(data['Value'].max()), value=int(data['Value'].max()))
combined_category = st.selectbox('Select category for combined filter', options=data['Category'].unique())
combined_filtered_data = data[(data['Value'] >= combined_min_val) & (data['Value'] <= combined_max_val) & (data['Category'] == combined_category)]
st.write("Combined Filtered Data:")
st.dataframe(combined_filtered_data)
Explanation
Loading Data:
- A sample DataFrame is created for demonstration purposes.
- The data is displayed using st.dataframe().
Sliders:
- Two sliders are used to select minimum and maximum values.
- The data is filtered based on these values.
Select Box:
- A select box is used to filter the data based on category.
- The filtered data is displayed accordingly.
Text Input:
- A text input widget is provided for users to input a custom message which is then displayed.
Combining Widgets:
- A combined filtering approach leveraging both sliders and a select box to filter data is demonstrated.
This implementation incorporates interactive components to make the Streamlit application dynamic and user-friendly.
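The filtering that the sliders and select box drive can be kept in a plain function, which makes it easy to check without launching the app. The sketch below works on a list of dictionaries rather than a DataFrame so that it needs nothing beyond the standard library; `filter_rows` is a hypothetical helper, not a Streamlit or pandas function.

```python
from typing import Optional


def filter_rows(rows, min_val, max_val, category: Optional[str] = None):
    """Keep rows whose Value lies in [min_val, max_val] and, if given, match category."""
    return [
        row for row in rows
        if min_val <= row["Value"] <= max_val
        and (category is None or row["Category"] == category)
    ]


rows = [
    {"Category": "A", "Value": 10},
    {"Category": "B", "Value": 20},
    {"Category": "C", "Value": 30},
    {"Category": "D", "Value": 40},
]

print(filter_rows(rows, 15, 35))       # value range only
print(filter_rows(rows, 10, 40, "B"))  # value range plus category
print(filter_rows(rows, 30, 20))       # minimum above maximum
```

The last call shows a case worth knowing about: because the minimum and maximum come from two independent sliders, a user can set the minimum above the maximum, and the result is then simply empty.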
Advanced Data Visualization Techniques
For this part of the project, we will dive into creating advanced data visualizations within a Streamlit application using more complex plots and interactivity.
Streamlit Code for Advanced Visualizations
import streamlit as st
import pandas as pd
import numpy as np
import altair as alt
# Load the data
df = pd.read_csv('your_data.csv')
# Create a sidebar for user inputs
st.sidebar.header('User Inputs')
# Example user input: Selecting a column for visualization
feature = st.sidebar.selectbox('Select a feature for visualization', df.columns)
# Plot 1: Histogram with interactive bin size
bins = st.sidebar.slider('Select number of bins for histogram', min_value=10, max_value=100, value=30, step=10)
hist = alt.Chart(df).mark_bar().encode(
    alt.X(f'{feature}:Q', bin=alt.Bin(maxbins=bins)),
    y='count()'
).properties(
    title=f'Histogram of {feature}'
)
st.altair_chart(hist, use_container_width=True)
# Plot 2: Scatter plot with color grouping
color_feature = st.sidebar.selectbox('Select a feature for color grouping in scatter plot', df.columns, index=1)
scatter = alt.Chart(df).mark_circle(size=60).encode(
    x=alt.X(f'{feature}:Q'),
    y=alt.Y(f'{color_feature}:Q'),
    color=f'{color_feature}:N',
    tooltip=[feature, color_feature]
).interactive().properties(
    title=f'Scatter plot of {feature} vs {color_feature}'
)
st.altair_chart(scatter, use_container_width=True)
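# Plot 3: Line chart with interactive date range
# read_csv does not parse dates on its own, so datetime columns exist only if
# they were parsed at load time (for example with the parse_dates argument).
date_columns = df.select_dtypes(include='datetime').columns
if len(date_columns) == 0:
    st.info('No datetime columns found; skipping the line chart.')
else:
    date_feature = st.sidebar.selectbox('Select a date feature for line chart', date_columns)
    # Convert pandas Timestamps to plain datetime objects for the slider
    start = df[date_feature].min().to_pydatetime()
    end = df[date_feature].max().to_pydatetime()
    date_range = st.sidebar.slider('Select date range', min_value=start, max_value=end, value=(start, end))
    filtered_df = df[(df[date_feature] >= date_range[0]) & (df[date_feature] <= date_range[1])]
    line_chart = alt.Chart(filtered_df).mark_line().encode(
        x=alt.X(f'{date_feature}:T'),
        y=f'{feature}:Q'
    ).properties(
        title=f'Line Chart of {feature} over time'
    )
    st.altair_chart(line_chart, use_container_width=True)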
Key Features Showcased:
Histogram with Interactive Bin Size:
Adjust the maximum number of bins via a slider to see how it affects the distribution of the selected feature.
Scatter Plot with Interactive Color Grouping:
Select the feature for the x axis and a second feature, which the code uses for both the y axis and the color grouping. The plot updates based on your selections.
Line Chart with Date Range Filter:
Select a date range to filter the data shown in the line chart, allowing for dynamic analysis of trends over time.
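To see why the bin slider changes the shape of the histogram, it can help to compute the counts by hand. The sketch below uses exact equal-width bins in plain Python; `histogram_counts` is a hypothetical helper. Note that Altair's maxbins is an upper bound and the library chooses rounded bin edges, so the chart may show fewer bins than this function does for the same number.

```python
def histogram_counts(values, bins):
    """Count values into `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical, so a single bin holds everything.
        return [len(values)]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land in bin index `bins`; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]
print(histogram_counts(values, 3))  # → [6, 3, 1]
print(histogram_counts(values, 9))  # → [1, 2, 3, 2, 1, 0, 0, 0, 1]
```

With three bins the outlier at 10 is barely visible as a separate group; with nine bins the gap between 5 and 10 becomes obvious.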
Running the Application
To display this advanced visualization Streamlit app, save the above code into a file named app.py and run it using the Streamlit command:
streamlit run app.py
This command will launch your browser and show the advanced visualization techniques using your specific dataset.
Adding Interactivity to Visualizations in Streamlit
Import Libraries
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Load Data
# Placeholder for data loading - replace with actual data loading
df = pd.read_csv('your_dataset.csv')
Interactive Visualization
Let’s create an interactive scatter plot which allows users to select different variables for the x and y axes.
# Sidebar for user input
st.sidebar.header('User Input Features')
# Get the column names for user selection
columns = df.columns.tolist()
x_axis = st.sidebar.selectbox('Select X-axis variable', columns)
y_axis = st.sidebar.selectbox('Select Y-axis variable', columns)
# Plotting the interactive scatter plot
fig, ax = plt.subplots()
sns.scatterplot(data=df, x=x_axis, y=y_axis, ax=ax)
ax.set_title(f'Scatter Plot of {x_axis} vs {y_axis}')
# Display the plot
st.pyplot(fig)
Add More Interactivity – Filter Data
We’ll add sliders to filter the data based on a chosen numerical column.
# Sidebar for filtering data
# include='number' covers every numeric dtype, not just float64 and int64
filter_column = st.sidebar.selectbox('Select column to filter', df.select_dtypes(include='number').columns)
min_value = float(df[filter_column].min())
max_value = float(df[filter_column].max())
filter_values = st.sidebar.slider(f'Select range of {filter_column}', min_value, max_value, (min_value, max_value))
# Filter the dataframe based on the user's selection
filtered_df = df[(df[filter_column] >= filter_values[0]) & (df[filter_column] <= filter_values[1])]
# Create an updated scatter plot based on filtered data
fig, ax = plt.subplots()
sns.scatterplot(data=filtered_df, x=x_axis, y=y_axis, ax=ax)
ax.set_title(f'Scatter Plot of {x_axis} vs {y_axis} (Filtered)')
# Display the updated plot
st.pyplot(fig)
Conclusion
These steps demonstrate how to add interactivity to visualizations in Streamlit using widgets like selectbox and slider, allowing users to dynamically choose variables for the axes and filter the dataset. You can extend these techniques to other types of plots and interactivity features.
Building a Complete Data Dashboard with Streamlit
To build a complete data dashboard using Streamlit, you will combine the functionalities of data loading, visualization, and user interactivity to present a cohesive and interactive data analysis tool. Below is the practical implementation with Streamlit:
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load and Display Data
# st.cache is deprecated since Streamlit 1.18; st.cache_data is its replacement for data.
@st.cache_data
def load_data(filepath):
    return pd.read_csv(filepath)
data = load_data('path/to/your/data.csv')
st.title("Data Dashboard")
st.write("A complete data dashboard using Streamlit")
st.header("Data Preview")
st.dataframe(data.head())
# Basic Data Visualization
st.header("Basic Data Visualization")
st.subheader("Choose feature for histogram")
feature = st.selectbox("Select feature", data.columns)
fig, ax = plt.subplots()
sns.histplot(data[feature], kde=True, ax=ax)
st.pyplot(fig)
# Advanced Data Visualization
st.header("Advanced Data Visualization")
st.subheader("Correlation Heatmap")
if st.checkbox("Show Heatmap"):
    # numeric_only=True skips text columns (requires pandas 1.5 or later)
    corr = data.corr(numeric_only=True)
    fig, ax = plt.subplots()
    sns.heatmap(corr, annot=True, ax=ax, cmap='coolwarm')
    st.pyplot(fig)
# Adding Interactivity to Visualizations
st.header("Interactive Visualizations")
st.subheader("Scatter Plot with Filters")
# The sliders below need numeric bounds, so only numeric columns are offered for the X-axis
numeric_columns = data.select_dtypes(include='number').columns.tolist()
x_axis = st.selectbox("Choose X-axis", numeric_columns)
y_axis = st.selectbox("Choose Y-axis", data.columns)
hue = st.selectbox("Choose hue", [None] + data.columns.tolist())
st.write("Choose filters for the data")
# Use float bounds so that fractional values at the edges are not cut off
x_min, x_max = float(data[x_axis].min()), float(data[x_axis].max())
min_value = st.slider("Min Value", min_value=x_min, max_value=x_max, value=x_min)
max_value = st.slider("Max Value", min_value=x_min, max_value=x_max, value=x_max)
filtered_data = data[(data[x_axis] >= min_value) & (data[x_axis] <= max_value)]
fig, ax = plt.subplots()
sns.scatterplot(data=filtered_data, x=x_axis, y=y_axis, hue=hue, ax=ax)
st.pyplot(fig)
# Summary Statistics and Metrics
st.header("Summary Statistics")
summary_metrics = data.describe().T
st.write(summary_metrics)
if st.checkbox("Show metrics for a specific feature"):
    metric = st.selectbox("Select feature for metrics", data.columns)
    st.write(data[metric].describe())
# Adding a Download Button
st.header("Download Filtered Data")
csv = filtered_data.to_csv(index=False)
st.download_button(
    label="Download filtered data as CSV",
    data=csv,
    file_name='filtered_data.csv',
    mime='text/csv',
)
# Run the Streamlit app
# To run, use command in terminal: streamlit run <name_of_this_script.py>
This practical implementation integrates the essential components of a usable, interactive dashboard in Streamlit. It provides a data preview, basic and advanced visualizations, interactive filters for the plots, and summary statistics. The code also includes a download button so that users can download the filtered dataset.
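The download button receives the filtered data as CSV text. What that text looks like can be sketched with the standard library alone; `rows_to_csv` is a hypothetical helper that plays the role of the DataFrame's to_csv(index=False) call for a list of dictionaries.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text, header first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


filtered = [{"index": 1, "value": 20}, {"index": 2, "value": 30}]
csv_text = rows_to_csv(filtered, ["index", "value"])
print(csv_text)
```

A string like csv_text is the kind of value that is passed as the data argument of the download button in the dashboard code.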
Testing and Debugging Your Streamlit App
When working on a Streamlit application, effective testing and debugging are vital to ensure the app’s functionality and performance. Here are some practical steps and code examples to help you test and debug your Streamlit app.
1. Unit Testing with unittest
To ensure that individual pieces of your Streamlit app function correctly, you can use Python’s built-in unittest framework.
Example: Unit Testing a Data Processing Function
import unittest
def process_data(data):
    # Simple function to demonstrate testing
    return [i**2 for i in data]
class TestProcessData(unittest.TestCase):
    def test_process_data(self):
        self.assertEqual(process_data([1, 2, 3]), [1, 4, 9])
        self.assertEqual(process_data([]), [])
if __name__ == '__main__':
    unittest.main()
Running Unit Tests
To execute the tests, save the code to a file called test_example.py and run:
python test_example.py
2. Adding Debug Statements
Inserting debug statements within your application helps you understand the flow and identify issues.
Example: Using Debug Statements in Streamlit
import streamlit as st
def main():
    st.title("Debugging Example App")
    data = [1, 2, 3, 4]
    st.write(f"Initial data: {data}") # Debug statement
    st.write(f"Processed data: {process_data(data)}") # Debug statement
def process_data(data):
    st.write(f"Processing data: {data}") # Debug statement
    return [i**2 for i in data]
if __name__ == "__main__":
    main()
3. Using Logging
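Implement logging to capture runtime information. With the basicConfig call below, messages go to the terminal where Streamlit is running; keeping them persistently requires adding a file handler.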
Example: Implementing Logging
import streamlit as st
import logging
# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def main():
    st.title("Logging Example App")
    data = [1, 2, 3, 4]
    logger.info(f"Initial data: {data}") # Logger statement
    processed_data = process_data(data)
    logger.info(f"Processed data: {processed_data}") # Logger statement
    st.write(f"Processed data: {processed_data}")
def process_data(data):
    logger.info(f"Processing data: {data}") # Logger statement
    return [i**2 for i in data]
if __name__ == "__main__":
    main()
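If the log messages should survive after the terminal is closed, a file handler can be attached to the logger. The following is a minimal sketch using the standard logging module; `build_file_logger`, the logger name my_app, and the file name app.log are all hypothetical choices.

```python
import logging


def build_file_logger(name: str, log_path: str) -> logging.Logger:
    """Return a logger that appends INFO-and-above records to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Streamlit reruns the script on every interaction, so avoid adding
    # a second handler if this logger was already configured.
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    logger = build_file_logger("my_app", "app.log")
    logger.info("Processing data: %s", [1, 2, 3, 4])
```

The handler check matters in a Streamlit app: loggers are shared across reruns of the script within the same process, so adding a handler unconditionally would write each message several times.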
4. Using st.cache_data for Debugging Long Computations
Streamlit’s caching decorator helps optimize performance but can also introduce bugs if not used correctly. The original st.cache is deprecated since Streamlit 1.18; st.cache_data is its replacement for functions that return data. A log line inside the cached function appears only when the function body actually runs, which shows whether a call was served from the cache.
Example: Implementing st.cache_data with Debugging
import streamlit as st
import logging
# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@st.cache_data
def process_data(data):
    logger.info("Running cached process_data function") # Logged only on a cache miss
    return [i**2 for i in data]
def main():
    st.title("Cache Example App")
    data = [1, 2, 3, 4]
    logger.info(f"Initial data: {data}")
    processed_data = process_data(data)
    logger.info(f"Processed data: {processed_data}")
    st.write(f"Processed data: {processed_data}")
if __name__ == "__main__":
    main()
5. Using Streamlit’s st.write for Debugging
st.write can be used throughout your Streamlit app for simple, quick debugging.
Example: Debugging with st.write
import streamlit as st
def main():
    st.title("Simple Debugging App")
    data = [1, 2, 3, 4]
    st.write("Initial data:", data) # Debug output
    processed_data = process_data(data)
    st.write("Processed data:", processed_data) # Debug output
def process_data(data):
    st.write("Processing data...") # Debug output
    return [i**2 for i in data]
if __name__ == "__main__":
    main()
Applying these practices will help you test and debug your Streamlit application effectively, ensuring it works as intended and is free of bugs.
Deploying Your Streamlit Application
To deploy your Streamlit application, you can use Streamlit Sharing or another cloud platform such as Heroku. Below is a practical guide to deploying with each of them.
Streamlit Sharing
Sign Up for Streamlit Sharing
Head to Streamlit Sharing (since renamed Streamlit Community Cloud) and sign up. It is a free platform provided by Streamlit for hosting applications.
Prepare Your Repository
Ensure your project is hosted on GitHub and includes:
- app.py or the main Python file for your Streamlit app.
- requirements.txt listing all your dependencies.
Example of requirements.txt:
streamlit
pandas
matplotlib
Deploy the App
a. Go to your Streamlit Sharing account.
b. Click “New App” and link your GitHub repository.
c. Follow the instructions on-screen to fill in details like branch and the main file path.
Run the App
After configuration, hit ‘Deploy’ and your app will be live with a shareable link provided by Streamlit.
Deploy using Heroku
Prepare Your Project
Ensure your project includes:
- app.py or the main Python file for your Streamlit app.
- requirements.txt listing all your dependencies.
- Procfile to specify the command to run.
Example of requirements.txt:
streamlit
pandas
matplotlib
Example of Procfile:
web: sh setup.sh && streamlit run app.py
Create a setup.sh
The setup.sh script ensures the runtime environment is ready. Example:
mkdir -p ~/.streamlit/
echo "\
[server]\n\
headless = true\n\
port = $PORT\n\
enableCORS = false\n\
\n\
" > ~/.streamlit/config.toml
Deploy to Heroku
a. Install Heroku CLI and login:
heroku login
b. Create a new Heroku app:
heroku create your-app-name
c. Push your code to Heroku:
git add .
git commit -m "Initial commit"
git push heroku master
Scale the App
Make sure at least one dyno is running:
heroku ps:scale web=1
Open Your Deployed App
Open your app using:
heroku open
Your Streamlit app should now be live using the provided Heroku domain or Streamlit Sharing link.
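The configuration file that setup.sh writes in the Heroku steps can also be produced from Python, which may be easier to test than a shell script. This is a sketch under assumptions: `render_config` and `write_config` are hypothetical helpers, and the fallback port 8501 is only used when the PORT environment variable is not set.

```python
import os
from pathlib import Path


def render_config(port: int) -> str:
    """Build the same [server] settings that setup.sh echoes."""
    return (
        "[server]\n"
        "headless = true\n"
        f"port = {port}\n"
        "enableCORS = false\n"
    )


def write_config(config_dir: Path, port: int) -> Path:
    """Write config.toml into config_dir, creating the directory if needed."""
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.toml"
    config_path.write_text(render_config(port))
    return config_path


if __name__ == "__main__":
    # Heroku provides PORT; 8501 is an assumed fallback for local runs.
    port = int(os.environ.get("PORT", "8501"))
    print(render_config(port))
    # To write the file for real: write_config(Path.home() / ".streamlit", port)
```

Printing the rendered text first makes it easy to compare against what setup.sh would have written before touching the real configuration directory.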