Writing SQL queries can be time-consuming and challenging. If you are tasked with creating solutions for specific business scenarios, you need to be sure that your queries are both efficient and accurate.
ChatGPT is an AI tool that has been trained on a huge set of technical text including SQL. It can answer questions about SQL syntax, generate SQL queries, or explain the output of a given SQL query.
By using ChatGPT for SQL query generation, you can focus on higher-level tasks such as designing reports and data flows. Not only does this save you time, but it also ensures the quality and effectiveness of your data analysis.
By the end of this article, you’ll have a firm understanding of how to use ChatGPT to assist you in writing SQL queries.
Advantages of Using ChatGPT For SQL Queries
ChatGPT is a powerful AI tool that understands natural language. It can serve as a valuable tool for both beginners and seasoned developers.
Beginners can find SQL’s syntax to be complex and challenging to remember. ChatGPT can help by turning natural language requests into SQL queries. This makes learning SQL more interactive and less daunting, as you can see the translation of plain English into SQL in real-time.
Experienced developers can find that formulating complex SQL queries involving multiple tables and conditions can be time-consuming. ChatGPT can speed up the process by generating query skeletons or even complete queries based on the task described in natural language.
This also reduces the chances of syntax errors. Speaking of errors, ChatGPT can also help debug SQL queries by providing possible solutions or explanations. It can also suggest different ways to optimize a query for better performance.
If you want a more general introduction to the tool, check out these articles:
If you want to get started with using it for SQL queries, read on.
How to Get Started With ChatGPT For SQL Queries
If you don’t already have an Open AI account (it’s free!), follow these steps.
Launch a browser and go to https://chat.openai.com/.
Click the signup button.
Sign up with an email address, a Microsoft account, or a Google account.
Provide a phone number for verification.
When you sign in, you can immediately start sending messages to be answered. Your message is known as a ChatGPT prompt.
This article shows you plenty of examples to get you familiar with using the tool as you work with SQL.
How to Start Your ChatGPT Sessions For SQL Queries
Different database vendors have extended the standard SQL code with proprietary syntax.
For example, Microsoft has Transact-SQL (abbreviated as T-SQL) for interacting with a Microsoft SQL Server instance. T-SQL has syntax that will not run on another type of SQL database, such as MySQL or PostgreSQL.
PostgreSQL is an open-source extension that has extra commands and syntax that will not run on SQL Server or Oracle.
This means that when you are asking ChatGTP to provide a SQL query using the extended features of the database system, you should start your Chat session by making clear what implementation of SQL you are using.
You can start your session with a statement like “I’m working with Microsoft SQL Server.” and ask your question in the next sentence.
To make it simpler, you could use use “With SQL Server, how do I …?” when asking about a concept or term.
For the rest of this article, I will focus on standard SQL queries that work on all the major database management systems.
How to Phrase Your Intentions to ChatGPT
It’s important to clearly phrase your intentions to ChatGPT.
Start by describing the structure of your tables so that ChatGPT has an understanding of the data model you are working with. Mention the names and relationships of the tables, as well as the columns and their data types. For example:
I have two tables: orders and customers. The orders table has columns id, customer_id, product, and price. The customers table has columns id, name, and email. The customer_id in the orders table is a foreign key referencing the id column in the customers table.
Next, simply explain the specific query you want to generate using natural language (human-like text instead of code).
Be as precise as possible to help ChatGPT understand what you are looking for. For example:
Write a SQL query that finds the total revenue of each customer by calculating the sum of their orders’ prices.
ChatGPT returns a SQL query with a breakdown of the different parts. Take a look at the following example of a select statement:
The AI technology has generated a SQL query based on the table names and request you provided.
But the tool doesn’t just stop there. It points out a couple of extra things:
When you would want to use a LEFT JOIN instead.
You may want to use the ID instead of the NAME if customer names aren’t unique.
Formulating Requests Based on Query Results
To achieve better query results with ChatGPT, you can break complex requests into smaller chunks to maintain a conversational flow.
For example, suppose you wanted a more complex query that filters customers based on their city.
You can continue the chat session by providing this new information.
I have another table called customer_address with columns id, customer_id, street, city, state. The customer_id is a foreign key to the customer table.
Amend the previous SQL query to filter for customers from Chicago.
ChatGPT returns the amended query with a new WHERE clause.
If you’re inexperienced with SQL, this is a great way of building up your knowledge.
Remember to keep your requests concise and within the relevant context to ensure a smooth experience working with ChatGPT.
4 Best Practices For Query Generation
Here are some tips to ensure that the SQL Code ChatGPT produces is the correct solution:
Be specific with table and column names to avoid ambiguity.
Clearly state any conditions, filters, or sorting requirements.
Specify the desired output format, such as columns or aggregations.
If necessary, provide examples of similar complex queries.
The more precise you are in your description, the better the generated query will be.
2 Ways to Optimize For Performance
When using ChatGPT to generate SQL queries, it’s crucial to consider the performance aspect of the queries. Here are two ways to optimize SQL queries for performance:
use appropriate indexes
avoiding overly complex SQL queries
Use Appropriate Indexes
It’s possible in some extensions of SQL to specify that the query should use a specific index, however this is not considered good practice. If the database schema or the data profile changes over time, you could end up with a query that is forced to use an inefficient index.
However, you can ask ChatGPT for advice on optimum indexes for your query. Here’s an example.
I want to create one or more indexes that make this query most efficient. Please assist.
ChatGPT will recommend indexes and give the reasons behind the choices.
Its important that SQL developers don’t blindly create all suggested indexes. They may not be appropriate for the actual data. I advise that you try and test each one.
Avoid Overly Complex Queries
One of the advantages of building up your queries as we suggested in the previous section is that you are less likely to get a long complex query that is difficult to read and to maintain.
If ChatGPT gives you a query with multiple subqueries that are difficult to follow, you can tell it to provide an alternative script that uses multiple queries and one or more temporary tables.
Suppose I decided that the query filtering on “Chicago” was too complex (it’s not!). I would instruct ChatGPT like this:
I want to break the previous query into two separate queries that use a temporary table. The script must achieve the same results.
The ever-helpful ChatGPT nearly falls over itself to do your bidding. It says:
Sure, we can achieve the same results by first creating a temporary table to store the total revenue for each customer, and then querying this temporary table to filter for customers from Chicago.
The tool continues by providing two separate code sections that produce the same results as the previous examples.
How to Present Data to ChatGPT
When you want to troubleshoot your queries, you can provide sample data to ChatGPT and ask it to run the SQL.
An earlier section showed you how to describe tables and relationships. You can also provide the CREATE TABLE statements to ChatGPT.
The next step is to provide the data in a summarized or tabular format to help ChatGPT understand your database and interpret the results.
For example, you can provide the data like this:
Here is the data in these tables.
Customers
1, “John Smith”, “john@smith.com”
2, “Mary Doe”, “mary@doe.com”
Orders
1, 1, “Hoodie”, 14.99
2, 1, “T-Shirt”, 5.99
3, 2, “Jeans”, 24.99
Customer_Address
1, 1, “Hope Street”, “Chicago”, “Illinois”
2, 2, “Cherry Lane”, “Philadelphia”, “Pennsylvania”
Note that only John Smith lives in Chicago. Having provided the data, you can ask ChatGPT to provide the result of the specific query.
If you’ve asked the tool to try several variations in the session, I advise that you copy and paste the exact query you want it to “run” so that there is no ambiguity.
Enter this as the prompt: “Provide the result of this query:”
ChatGPT will use the query to calculate the total revenue of John Smith’s orders. Here is an example, and it is the correct answer!
2 Disadvantages Of Using ChatGPT
The two main disadvantages to watch out for are:
incorrect syntax
inefficient queries
Incorrect Syntax
Sometimes ChatGPT will generate SQL that does not compile or produces the wrong results. This is more likely when you have asked it to work with proprietary extensions to standard SQL.
If you are wrapping the generated queries within stored procedures, be sure to test them before you deliver the results to a data analyst or business users.
If you’re interested in a more curated way to use natural language queries for your data, check out Microsoft’s implementation in Power BI:
Inefficient Queries
When you want to ensure that your SQL queries are efficient, you examine the query execution plan.
This is the sequence of operations performed by the database management system. It’s a bit like opening the machine and looking at what’s running side.
The problem with ChatGPT is that it can’t see how your database management system is processing your SQL query.
For example, does it filter the customers by state first or perform that part last? That decision could make a big difference to the performance.
If you observe that your query is slow to product output, you will need to roll up your sleeves and examine the execution plan yourself.
Final Thoughts
You have seen real-life examples of using ChatGPT to write and amend typical SQL queries used in business.
It’s important to note that while the AI tool can provide a strong foundation and assist with generating SQL queries, it doesn’t replace the need for a robust understanding of SQL.
However, when used correctly, ChatGPT can be an invaluable assistant to SQL developers.