ChatGPT for Regular Expressions: Is This a Game Changer?

Tired of going cross-eyed writing regular expressions (regex)? Regex programming can be powerful, but it’s no secret that it can be a pain in the b*tt to master. Fortunately, AI tools like ChatGPT can significantly simplify the process.

ChatGPT excels at generating human-like text and can help you write, test, and troubleshoot regex patterns with ease. The language model is capable of understanding regular expressions just like an expert programmer.

In this article, we’ll explore how this new technology can be used to simplify and enhance the process of working with regular expressions, making it more accessible and efficient for developers. We’ll also consider the limitations and challenges of using ChatGPT for regular expressions.

Let’s go!

How to Use ChatGPT for Regular Expressions

ChatGPT is an advanced AI language model developed by OpenAI that was unveiled to the public in November 2022.

It’s a valuable tool for a wide range of applications, including content generation, question-answering, and assisting with various programming tasks.

You can leverage ChatGPT’s natural language processing capabilities to simplify complex tasks like writing and optimizing regular expressions. You can give it all kinds of parameters, including where to put a decimal place, which character class you want, which replace operations to use, the type of response to output, and more.

In this section, we’ll guide you through the process of using ChatGPT for various regular expression tasks, from generating patterns to testing, validating, and optimizing them. Follow these steps to effectively harness the power of ChatGPT for your regex needs.

Step 1. Describe Your RegEx Requirement

Begin by providing a clear and specific description of the output you need.

If possible, include examples of both desired matches and non-matches to help ChatGPT better understand your requirements.

Example: “Generate an expression that matches email addresses. The result should match ‘[email protected]‘ and ‘[email protected]‘, but not ‘[email protected]‘ or ‘john@example’.”

Step 2. Generate the Regex Pattern

ChatGPT will process your request and output a regular expression based on your description.

It’s essential to carefully review the provided result to ensure it meets your requirements.

Example: ChatGPT’s response: “Here’s an expression for email addresses: `^\w+([.-]?\w+)*@\w+([.-]?\w+)*(\.\w{2,})+$`”

Step 3. Test and Validate the Output

To ensure the accuracy and reliability of the generated expression, test it against a comprehensive set of examples, including both positive (matching) and negative (non-matching) cases.
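
A minimal sketch of such a test in Python. The pattern is the one from Step 2 with its backslashes written out; the sample addresses and the helper name `check_pattern` are illustrative assumptions, not part of ChatGPT's output.

```python
import re

# Pattern from Step 2, written as a Python raw string.
EMAIL_PATTERN = re.compile(r"^\w+([.-]?\w+)*@\w+([.-]?\w+)*(\.\w{2,})+$")

# Hypothetical sample data: replace with cases from your own requirements.
SHOULD_MATCH = ["john.doe@example.com", "jane_doe@mail.example.org"]
SHOULD_NOT_MATCH = ["john@example", "john.doe@", "@example.com"]


def check_pattern(pattern, should_match, should_not_match):
    """Return a list of human-readable failures (empty if every case passes)."""
    failures = []
    for text in should_match:
        if not pattern.match(text):
            failures.append(f"expected a match: {text!r}")
    for text in should_not_match:
        if pattern.match(text):
            failures.append(f"expected no match: {text!r}")
    return failures


failures = check_pattern(EMAIL_PATTERN, SHOULD_MATCH, SHOULD_NOT_MATCH)
print(failures or "all cases passed")
```

Keeping the positive and negative cases in plain lists makes it easy to add every new edge case you discover and rerun the check after each revision of the pattern.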

As you can see, it’s easy to use ChatGPT to generate regex patterns for a wide array of input formats.

Gone are the days of laboring over intricate syntax. Simply provide ChatGPT with a clear and precise description of your desired criteria and watch it work its magic.

Intrigued? Keep reading to explore a variety of example prompts that demonstrate just how easy and effective it is to harness the power of ChatGPT for your regular expression needs!

7 Examples of ChatGPT Prompts for Regex Patterns

Now that we’ve gone over how you can use ChatGPT to create regular expressions, let’s take a look at some prompts we gave it.

We present seven examples of good prompts, showcasing the versatility of the AI chatbot across a range of formats and use cases.

Example 1: Regex for Phone Numbers

Prompt: “Create a Regex that matches valid US phone numbers, including area codes. It should match numbers in the following digit formats: (123) 456-7890, 123-456-7890, and 123.456.7890, but not 1234567890 or 1234-567-890.”

Result: `^\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})$`
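
It is worth running a result like this against the examples from the prompt before relying on it. The Python sketch below does that, with the pattern's backslashes written out; the variable names are our own.

```python
import re

PHONE_PATTERN = re.compile(r"^\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})$")

# Examples taken from the prompt.
wanted = ["(123) 456-7890", "123-456-7890", "123.456.7890"]
unwanted = ["1234567890", "1234-567-890"]

results = {text: bool(PHONE_PATTERN.match(text)) for text in wanted + unwanted}
for text, matched in results.items():
    print(f"{text!r}: {'match' if matched else 'no match'}")

# Inputs that matched even though the prompt asked for them to be rejected.
false_positives = [text for text in unwanted if results[text]]
```

Here `false_positives` contains `1234567890`: every separator in the pattern is optional, so the bare ten-digit form slips through. That is exactly the kind of gap to report back to ChatGPT in a follow-up prompt.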

Example 2: Regex for URLs

Prompt: “Make a regular expression to match URLs, including both HTTP and HTTPS protocols. It should match all the URLs like ‘https://www.example.com‘ and ‘http://example.org‘, but not ‘ftp://example.net‘ or ‘www.example.com‘.”

Result: `^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$`

Example 3: Regex for Date Extraction

Prompt: “Provide a regular expression to extract dates in the format ‘YYYY-MM-DD’ from a text. The digits should match dates like ‘2023-05-04’ and ‘2021-12-31’, but not ‘05-04-2023’ or ‘2023/05/04’.”

Result: `^(\d{4})-(\d{2})-(\d{2})$`
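
Note that the `^` and `$` anchors make this pattern match only when the whole string is a date. To extract dates from longer text, as the prompt asks, one possible adjustment (our assumption, not ChatGPT's output) swaps the anchors for word boundaries:

```python
import re

# Same shape as the result above, but with \b word boundaries instead of ^ and $.
DATE_IN_TEXT = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Hypothetical sample text.
text = "Released 2023-05-04 (not 05-04-2023 or 2023/05/04); retired 2021-12-31."

dates = DATE_IN_TEXT.findall(text)
print(dates)
```

The pattern still checks only the shape of the date, so a string such as `2023-99-99` would also be picked up.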

Example 4: Regex for IPv4 Address Validation

Prompt: “Create a regular expression to validate IPv4 addresses. The digits should match addresses like ‘192.168.1.1’ and ‘8.8.8.8’, but not ‘256.0.0.0’ or ‘192.168.1.256’.”

Result: `^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`
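
One way to gain confidence in a validation pattern like this is to compare it with an independent check. The sketch below uses Python's standard `ipaddress` module as that reference; the sample addresses are the ones from the prompt, and the helper name is our own.

```python
import ipaddress
import re

IPV4_PATTERN = re.compile(
    r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)


def is_ipv4_by_library(text):
    """Reference check that uses the standard library instead of a regex."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


samples = ["192.168.1.1", "8.8.8.8", "256.0.0.0", "192.168.1.256"]
disagreements = [
    text for text in samples
    if bool(IPV4_PATTERN.match(text)) != is_ipv4_by_library(text)
]
print(disagreements or "regex and ipaddress agree on all samples")
```

Any input on which the two checks disagree deserves a closer look. Octets with leading zeros, such as `01.2.3.4`, are one place where they may differ depending on the Python version.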

Example 5: RegEx for Credit Card Number Matching

Prompt: “Write an expression that matches credit card numbers from Visa, Mastercard, and American Express. It should match numbers like ‘4111 1111 1111 1111’, ‘5555 5555 5555 4444’, and ‘3782 822463 10005’, but not ‘1234 5678 9012 3456’.”

Result: `^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})$`

Example 6: Regex for Matching Social Security Numbers

Prompt: “Provide an expression to match social security numbers (SSNs) in the format ‘XXX-XX-XXXX’. It should match numbers like ‘123-45-6789’, but not ‘123456789’ or ‘12-3456-789’.”

Result: `^(?!000)(?!666)[0-8]\d{2}-(?!00)\d{2}-(?!0000)\d{4}$`

Example 7: Regex for Validating Usernames

Prompt: “Create a regular expression to validate usernames with the following rules: must start with a letter, can contain letters, numbers, dashes, and special characters like underscores, and must be between 3 and 16 characters long. The Regex should match usernames like ‘user_123’, ‘Jane-Doe’, and ‘a3b_c’, but not ‘123_user’, ‘_user123’, or ‘user@123’.”

Result: `^[a-zA-Z][\w\-_]{2,15}$`

These examples demonstrate how ChatGPT can simplify and enhance regular expression tasks if you give it a good prompt, making it a powerful ally for developers working with various formatting, character class, and validation requirements.

Next, we’re going to take a look at how you can evaluate what ChatGPT generates for you.

Steps to Evaluate ChatGPT-Generated Regex

It’s important to evaluate and test ChatGPT responses to ensure they meet your requirements and avoid potential issues.

This section will guide you through the process of assessing regular expressions provided by ChatGPT and identifying potential problems.

1. Validate the Output Against Requirements

Check the output against your initial requirements and ensure it accurately captures the desired matches while excluding non-matches. Test the pattern using a variety of examples that cover different scenarios and edge cases.

Example

Let’s consider the requirement of validating email addresses. Suppose ChatGPT provides you with the following:

`^[\w-]+(\.[\w-]+)*@[A-Za-z0-9-]+(\.[A-Za-z0-9]+)*(\.[A-Za-z]{2,})$`

To validate this pattern against your requirements, you’ll need to test it against a variety of email addresses, including valid and invalid examples, to ensure it works as expected.

Taking the time to check the result will help you confirm that the output meets your requirements for validating email addresses.

2. Assess the Pattern for Readability and Maintainability

Examine the output for readability and maintainability. A well-structured and understandable pattern is easier to debug, modify, and maintain over time.

Example

Let’s take an example that matches URLs, including both HTTP and HTTPS protocols. Suppose ChatGPT provides you with the following output:

`^(https?://)?([-\w]+(\.[-\w]+)*\.)[a-z]{2,}(/[^\s]*)?$`

First, examine the Regex to ensure it’s readable and maintainable. Here’s a breakdown of the components:

  • ^: Start of the line

  • (https?://)?: Optional “http://” or “https://”

  • ([-\w]+(\.[-\w]+)*\.): Matches domain name and subdomains, allowing for hyphens and dots

  • [a-z]{2,}: Matches top-level domain, allowing for at least two lowercase letters

  • (/[^\s]*)?: Optional path, matching any non-whitespace characters after a forward slash

  • $: End of the line

Upon reviewing the Regex, you can see that it uses character classes, groups, and quantifiers to create a well-structured and understandable rule.

It doesn’t contain unnecessary character classes, excessive escape characters, or sophisticated groupings that would make it difficult to read or maintain.

The Regex is organized in a way that makes it easier to debug, modify, and maintain over time, fulfilling the criterion of assessing the Regex for readability and maintainability.
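
If the target engine supports a free-spacing mode, the breakdown above can live next to the pattern itself. A minimal sketch using Python's `re.VERBOSE` flag, with backslashes written out; the constant name and the sample URLs in the usage line are illustrative.

```python
import re

URL_PATTERN = re.compile(
    r"""
    ^
    (https?://)?            # optional "http://" or "https://"
    ([-\w]+(\.[-\w]+)*\.)   # domain name and subdomains, ending in a dot
    [a-z]{2,}               # top-level domain, at least two lowercase letters
    (/[^\s]*)?              # optional path: a slash, then non-whitespace
    $
    """,
    re.VERBOSE,
)

for url in ["https://www.example.com/path?q=1", "example.org", "https://example"]:
    print(url, "->", bool(URL_PATTERN.match(url)))
```

In verbose mode, whitespace in the pattern is ignored and `#` starts a comment, so each component can carry its own explanation without changing what the pattern matches.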

3. Evaluate Performance and Efficiency

Analyze the generated output for potential performance bottlenecks, such as excessive backtracking or inefficient character classes, particularly when processing large volumes of data or in performance-sensitive environments.

Example

Consider an output to match valid IPv4 addresses, and let’s say ChatGPT provides you with the following output:

`^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$`

To evaluate the performance and efficiency of this output, you’ll want to ensure it doesn’t cause excessive backtracking or take too long to process a variety of IPv4 addresses.

That means checking the result with various IPv4 addresses, including valid and invalid examples, and edge cases like very long invalid IP addresses.

After testing the output with valid, invalid, and edge-case IPv4 addresses, you’ll want to analyze its performance and efficiency. If it returns results quickly and without causing excessive backtracking, it can be considered efficient.
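
A rough timing sketch in Python, assuming the pattern above with its backslashes written out. The helper name `worst_match_time` and the inputs are illustrative. Absolute numbers depend on your machine, so watch how the time grows with input length rather than reading much into a single figure.

```python
import re
import time

IPV4_PATTERN = re.compile(
    r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}"
    r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
)


def worst_match_time(pattern, inputs, repeat=100):
    """Return the slowest per-input average time (seconds) of pattern.match()."""
    worst = 0.0
    for text in inputs:
        start = time.perf_counter()
        for _ in range(repeat):
            pattern.match(text)
        elapsed = (time.perf_counter() - start) / repeat
        worst = max(worst, elapsed)
    return worst


inputs = [
    "192.168.1.1",          # valid
    "256.0.0.0",            # invalid octet
    "1." * 10_000 + "1",    # very long invalid address
]
print(f"slowest average match: {worst_match_time(IPV4_PATTERN, inputs):.6f}s")
```

If the time for a failing input climbs steeply as that input gets longer, the pattern is probably backtracking excessively and is worth restructuring.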

4. Check for Engine-Specific Differences and Compatibility

Ensure the Regex is compatible with the target engine in your programming language. Be aware of any engine-specific differences or features that might impact the rule’s behavior.

Example

Let’s assume you’re using ChatGPT to make a Regex for validating dates in the format MM/DD/YYYY. The output is:

`^(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/(19|20)\d\d$`

This rule works well on engines like JavaScript and Python. However, when you attempt to use it in a different Regex engine like POSIX, you might notice differences in regular expression syntax and features.

For example, the POSIX engine uses Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE) that don’t support some modern features like the shorthand character class “\d” for digits or non-capturing groups.

To ensure compatibility across different Regex engines, you’ll need to modify the rule according to the specific engine’s syntax and capabilities. In the case of POSIX ERE, you could rewrite it as:

`^(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/(19|20)[0-9][0-9]$`

By checking for engine-specific differences and compatibility, you can ensure the generated Regex will work correctly across various platforms and languages.
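
Engine differences can be subtle even when a pattern compiles. As one illustration: in Python 3, `\d` in a text pattern also matches non-ASCII decimal digits unless the `re.ASCII` flag is set. The two versions of the date pattern therefore agree on ordinary input but differ on a year written partly in Arabic-Indic digits. The sample strings are assumptions chosen for the demonstration.

```python
import re

WITH_SHORTHAND = r"^(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/(19|20)\d\d$"
PORTABLE = r"^(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/(19|20)[0-9][0-9]$"

ascii_date = "12/31/2023"
mixed_date = "12/31/20\u0662\u0663"  # last two digits are Arabic-Indic "23"

for label, text in [("ASCII digits", ascii_date), ("Arabic-Indic digits", mixed_date)]:
    print(
        label,
        bool(re.match(WITH_SHORTHAND, text)),            # \d, default flags
        bool(re.match(PORTABLE, text)),                  # explicit [0-9]
        bool(re.match(WITH_SHORTHAND, text, re.ASCII)),  # \d restricted to ASCII
    )
```

Whether the broader match is acceptable depends on your requirements. Either way, it is the kind of behavior to confirm in the engine you actually deploy to.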

5. Handle Domain-Specific Concepts

For patterns involving intricate or domain-specific concepts, ensure that ChatGPT’s understanding of the requirements is accurate and up-to-date. This might involve additional research or consultation with domain experts to verify that the generated rule meets the necessary criteria and captures all relevant nuances.

Example

For a pattern that extracts specific data from log files or parses a domain-specific language, verify that it accurately handles all relevant syntax, edge cases, and variations.

6. Iterate and Refine the Result

If you identify any issues or areas for improvement in the generated Regex, work with ChatGPT to iterate and refine the rule. Provide clear and specific feedback on the problems you encountered or the aspects you’d like to improve.

Example

For this example, let’s say you’re writing a regular expression for matching phone numbers in the format `(XXX) XXX-XXXX` using ChatGPT. The generated rule is:

`^\(\d{3}\)\s\d{3}-\d{4}$`

Upon testing it, you discover that it doesn’t account for phone numbers that include the optional international prefix. To refine the rule, you’ll need to iterate and include the country code format:

`^(\+\d{1,3}\s)?\(\d{3}\)\s\d{3}-\d{4}$`

Now, it includes the optional international prefix (e.g., `+1` for the United States) at the beginning of the phone number, followed by a space. The revised rule matches both phone numbers with and without country codes:

– With country code: `+1 (123) 456-7890`

– Without country code: `(123) 456-7890`

By iterating and refining the pattern, you can ensure that it accurately matches the desired format and accounts for any additional variations you might encounter.
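
Each refinement is a good moment to rerun earlier test cases, so that a fix for one format does not break another. A small Python sketch with both phone patterns, backslashes written out; the third case is an extra negative example of our own.

```python
import re

ORIGINAL = re.compile(r"^\(\d{3}\)\s\d{3}-\d{4}$")
REFINED = re.compile(r"^(\+\d{1,3}\s)?\(\d{3}\)\s\d{3}-\d{4}$")

# (input, expected result for ORIGINAL, expected result for REFINED)
CASES = [
    ("(123) 456-7890", True, True),
    ("+1 (123) 456-7890", False, True),
    ("123-456-7890", False, False),
]

for text, expect_original, expect_refined in CASES:
    assert bool(ORIGINAL.match(text)) == expect_original, text
    assert bool(REFINED.match(text)) == expect_refined, text
print("refinement behaves as expected")
```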

Use these steps to evaluate and assess the regular expressions generated by ChatGPT and ensure that they meet your requirements, avoid potential issues, and maintain high-quality and reliable outputs for your projects.

In the next section, we’ll take a look at how you can ask ChatGPT for suggestions and best practices.

Asking ChatGPT for Suggestions and Best Practices

ChatGPT can offer suggestions for optimizing your Regex patterns, including best practices and alternative approaches to achieve the desired results. For example:

“How can I optimize this regex for matching dates in the format YYYY-MM-DD? `^(\d{4})-(\d{2})-(\d{2})$`”
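
One direction such a conversation might take is tightening the pattern and letting ordinary code handle what a regex cannot express well. The sketch below is our own illustration, not ChatGPT's reply. It limits months and days to plausible ranges, then leaves the calendar check (month lengths, leap years) to Python's `datetime.date`.

```python
import re
from datetime import date

# Month limited to 01-12 and day to 01-31; re.ASCII keeps \d to the digits 0-9.
DATE_PATTERN = re.compile(
    r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$",
    re.ASCII,
)


def is_valid_date(text):
    """True if text has the shape YYYY-MM-DD and names a real calendar day."""
    match = DATE_PATTERN.match(text)
    if match is None:
        return False
    year, month, day = map(int, match.groups())
    try:
        date(year, month, day)
    except ValueError:  # e.g. 30 February, or year 0000
        return False
    return True


for candidate in ["2023-05-04", "2023-02-30", "2023-13-01", "05-04-2023"]:
    print(candidate, is_valid_date(candidate))
```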

It’s not as robust as Trados Studio, but ChatGPT’s natural language processing capabilities can streamline the process of working with regular expressions and make it more efficient and enjoyable.

To ensure that ChatGPT generates the most accurate and useful Regex, consider the following tips:

  • Be clear and specific in your description.

  • Include examples of the desired matches and non-matches.

  • If necessary, specify any unique requirements or variations to consider.

These tips will come in handy, especially when using ChatGPT for advanced use cases, as you’ll see in the next section.

4 Advanced Use Cases of ChatGPT for Regular Expressions

While ChatGPT can be an invaluable tool for simplifying and enhancing the regular expression development process, its capabilities extend beyond generating and validating basic Regex.

In this section, we’ll explore some advanced use cases where ChatGPT can provide valuable assistance when working with intricate regular expressions.

1. Handling Complex, Multi-Pattern Regular Expressions

In many situations, developers may need to work with complicated patterns that involve multiple subpatterns, conditional expressions, or nested groups. ChatGPT can help:

  1. Write complex regular expressions based on specific patterns, context, and constraints.

  2. Assist in decomposing intricate patterns. It can explain each component and point out syntax errors.

  3. Suggest alternative approaches or simplifications to reduce complexity and improve readability and maintainability.

2. Regular Expression Performance Optimization

The performance of regular expressions can be a critical factor in various applications, particularly when processing large volumes of data or when used in performance-sensitive environments. ChatGPT can help optimize your patterns by:

  1. Identifying potential performance bottlenecks, such as excessive backtracking or inefficient character classes.

  2. Suggesting alternative patterns or techniques to improve performance, such as using atomic groups or possessive quantifiers.

  3. Recommending best practices for efficient Regex design.

3. Cross-Language and Cross-Engine Compatibility

Regular expression engines and syntax can differ slightly across programming languages and tools. ChatGPT can assist developers in navigating these differences by:

  1. Adapting patterns to specific programming languages, such as JavaScript, Python, or Ruby.

  2. Identifying potential compatibility issues and offering solutions to ensure consistent behavior across different Regex engines.

  3. Providing guidance on using language-specific Regex features, such as named capture groups, Unicode support, or inline modifiers.
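
As a small illustration of the third point, here is how a named-group version of a YYYY-MM-DD pattern might look in Python, which spells named groups `(?P<name>...)`. Other engines use different spellings, so the syntax needs adapting when a pattern moves between languages. The group names and the sample string are our own.

```python
import re

ISO_DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

match = ISO_DATE.search("Invoice issued on 2023-05-04.")
if match:
    parts = match.groupdict()  # maps each group name to the text it captured
    print(parts["year"], parts["month"], parts["day"])
```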

4. Extracting and Transforming Data Using Regular Expressions

Regular expressions are often used in data extraction and transformation tasks, such as parsing log files, cleaning up data, or converting data between formats. ChatGPT can provide valuable assistance in these scenarios by:

  1. Writing patterns to extract specific data elements or attributes from text.

  2. Suggesting suitable replacement patterns or functions for transforming extracted data.

  3. Advising on best practices for efficient and reliable data extraction and transformation using regular expressions.
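
A brief Python sketch of both tasks. The log format, the sample lines, and the variable names are assumptions made for illustration.

```python
import re

# Hypothetical log lines: date, time, level, message.
log = """\
2023-05-04 12:00:01 ERROR disk full on /dev/sda1
2023-05-04 12:00:05 INFO retrying write
2023-05-05 08:30:00 ERROR connection reset
"""

LINE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$", re.MULTILINE)

# Extraction: keep only the date and message of ERROR entries.
errors = [
    (day, message)
    for day, _time, level, message in LINE.findall(log)
    if level == "ERROR"
]

# Transformation: rewrite YYYY-MM-DD as MM/DD/YYYY using backreferences.
us_style = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1", log)

print(errors)
print(us_style)
```

`findall` returns one tuple per line because the pattern has several capture groups, and `\1`, `\2`, `\3` in the replacement string refer back to the year, month, and day groups.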

By leveraging ChatGPT’s advanced capabilities, you can tackle a wide range of challenging and complex regular expression tasks, leading to more robust and efficient solutions.

However, ChatGPT also has limitations, which is what we’re going to take a look at in the next section.

4 Limitations of Using ChatGPT for Regular Expressions

While ChatGPT can be a powerful tool for generating, testing, and optimizing Regex, it’s important to be aware of its limitations and potential challenges.

In this section, we’ll discuss some of the constraints and difficulties users might face when employing ChatGPT for regular expression tasks.

1. Incomplete or inaccurate outputs

Due to the complex nature of regular expressions and the wide range of potential use cases, ChatGPT might occasionally produce a rule that doesn’t fully capture the desired input format or misses specific edge cases.

Some examples include:

  1. Misinterpretation of requirements: ChatGPT might not always give you accurate expressions that match your intent or specific requirements, especially if the description provided is vague or ambiguous.

  2. Unhandled edge cases: ChatGPT outputs might not capture all possible edge cases or exclude all unwanted matches, which could lead to incorrect or unexpected results.

  3. Suboptimal patterns: In some cases, ChatGPT might give you a regex that works but is not the most efficient or maintainable solution.

As a developer, you must thoroughly validate the generated patterns to ensure their accuracy and effectiveness before implementing them in real-world scenarios.

2. Inability to Understand Complex or Domain-Specific Concepts

ChatGPT has an extensive knowledge base, but there may be instances where it struggles to comprehend intricate or specialized requirements for a regex. This can lead to less effective or even incorrect patterns for certain use cases.

To mitigate this issue, combine your domain expertise with ChatGPT’s capabilities to refine and adjust the generated regex patterns so they meet the specific needs of your projects.

3. Language and Engine Compatibility Issues

Language and engine compatibility issues are another challenge when using ChatGPT for generating regular expressions. This can manifest as:

  1. Engine-specific differences: ChatGPT might not always account for subtle differences between regex engines in various programming languages or tools, which could result in patterns that do not work as intended or exhibit unexpected behavior.

  2. Unsupported features: ChatGPT’s output might use features or syntax not supported by the user’s target language or regex engine, leading to compatibility issues.

To address these compatibility issues, you should have a basic understanding of the nuances of the target language or engine and be prepared to adapt ChatGPT’s output accordingly.

Thorough testing and validation across different environments are essential to ensure the patterns function as intended and maintain the desired level of accuracy and efficiency.
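
A quick first check is simply to compile the pattern in the target engine. The sketch below does this for Python's `re` module; the helper name and the candidate patterns are illustrative. At the time of writing, `re` rejects both a named group written without the `P` and the `\p{L}` Unicode property escape.

```python
import re


def compiles_in_python(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


candidates = {
    "Python-style named group": r"(?P<year>\d{4})",
    "named group without the P": r"(?<year>\d{4})",
    "Unicode property escape": r"\p{L}+",
}
for label, pattern in candidates.items():
    print(f"{label}: {'ok' if compiles_in_python(pattern) else 'rejected'}")
```

Compiling only proves the syntax is accepted. The behavior still needs testing against real inputs.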

4. Dependence on Clear and Precise User Input

The effectiveness of ChatGPT-generated regular expressions is highly dependent on the clarity and precision of the user input. As a user, you may face:

  1. Communication challenges: The quality of the regex patterns generated by ChatGPT is heavily dependent on the clarity and specificity of your input. Vague, ambiguous, or incomplete descriptions can lead to unsatisfactory results.

  2. Requirement iteration: You might need to refine your descriptions or provide additional examples and constraints to achieve the desired regex pattern, which could be a time-consuming process.

To maximize the utility of ChatGPT for regex generation, you should take the time to craft detailed and unambiguous prompts that clearly outline the intended format and requirements.

In cases where the initial output is not satisfactory, you may need to refine your inputs or provide additional context to help guide ChatGPT toward a more accurate Regex.

By understanding ChatGPT’s limitations and challenges, you can employ ChatGPT more effectively and efficiently for your regular expression tasks.

You’re going to have to check the regex patterns generated by ChatGPT and be prepared to iterate on requirements or seek alternative solutions when necessary.

Final Thoughts

ChatGPT has emerged as a valuable tool for simplifying and enhancing the process of working with regular expressions.

Its advanced natural language processing capabilities enable users to make and optimize regex patterns with ease, transforming what can often be a complex and time-consuming task into a more fun and efficient experience.

However, you should be aware of the challenges associated with the technology and be prepared to iterate on your requirements, test the generated regex patterns thoroughly, and be mindful of potential compatibility issues across different languages and engines.

If you use ChatGPT while acknowledging its constraints, you can harness the power of AI to master the art of regular expressions, which will lead to more efficient, robust, and maintainable solutions in your projects!

Sam McKay, CFA
Sam is Enterprise DNA's CEO & Founder. He helps individuals and organizations develop data driven cultures and create enterprise value by delivering business intelligence training and education.