Welcome back, intrepid data explorer! So far, we’ve learned how to set up LangExtract, define schemas, and perform extractions with various LLM providers. You’re getting good at asking LLMs to do your bidding!

But here’s a little secret: even the smartest LLMs and the most robust libraries aren’t perfect. In the real world, things can go wrong. Network glitches, API rate limits, unexpected model behavior, or even a moment of LLM “confusion” can lead to failed extractions or malformed output. If we’re building applications that rely on these extractions, we need them to be as reliable as possible.

This chapter is all about making your LangExtract pipelines resilient. We’ll dive into understanding common failure modes, how LangExtract helps by default, and how you can implement custom error handling and retry mechanisms to ensure your applications keep running smoothly, even when the unexpected happens. By the end, you’ll be equipped to build extraction workflows that are not just smart, but also tough and reliable.

Before we begin, make sure you’re comfortable with:

  • Setting up LangExtract and configuring an LLM provider (from Chapter 2).
  • Defining extraction schemas (from Chapter 5).
  • Performing basic extractions (from Chapter 6).

Let’s get ready to fortify our code!


Understanding Errors in LLM Extraction

When you’re working with Large Language Models, especially through APIs, several types of errors can pop up. Knowing what they are helps you anticipate and handle them effectively.

Common Sources of Extraction Failures

  1. API Errors (LLM Provider Side):

    • Rate Limits: You’ve made too many requests in a short period. The LLM provider might respond with a 429 Too Many Requests status code.
    • Authentication Issues: Your API key is missing, invalid, or expired, leading to 401 Unauthorized errors.
    • Server Errors: The LLM provider’s servers are experiencing issues, resulting in 5xx errors (e.g., 500 Internal Server Error, 503 Service Unavailable).
    • Billing Issues: Your account might have hit its spending limit, or there’s a payment problem.
  2. Network Errors:

    • Your internet connection dropped, or there’s a transient issue between your machine and the LLM provider’s servers. This often manifests as connection timeouts or DNS resolution failures.
  3. Model Errors (LLM Behavior):

    • Malformed Output: The LLM might generate JSON that doesn’t strictly adhere to your defined schema, or it might produce invalid JSON altogether. This is a common challenge, as LLMs are generative and not always perfectly precise.
    • Hallucination/Incorrect Extraction: The LLM might extract incorrect information or invent data, even if the output format is correct. This isn’t strictly an error in the programmatic sense, but a quality issue that robust systems need to account for.
    • Context Window Limits: For very long documents, if not handled correctly (e.g., with chunking), the input might exceed the LLM’s maximum token limit, leading to an error.
  4. LangExtract Internal Errors:

    • While LangExtract is designed to be robust, rare internal issues or unexpected states could theoretically occur.

LangExtract’s Built-in Robustness

LangExtract is designed with some resilience in mind. When you call lx.extract(), it doesn’t just return your structured data; it returns an ExtractionResult object. This object is your window into what happened during the extraction process.

The ExtractionResult object has a crucial attribute: errors. This is a list that will contain details about any issues encountered during the extraction. If the list is empty, it usually means the extraction was successful (at least from LangExtract’s perspective of communicating with the LLM and receiving a response).

Let’s visualize a typical extraction flow with error handling:

flowchart TD
    A["Start Extraction Process"] --> B{"Call lx.extract()"}
    B -->|Success| C["Receive ExtractionResult"]
    C --> D{"Check ExtractionResult.errors"}
    D -->|Errors exist| E["Handle Errors"]
    D -->|No errors| F["Process Extracted Data"]
    B -->|API/Network Failure| G["Catch Exception"]
    G --> E
    E --> H{"Retry?"}
    H -->|Yes| B
    H -->|No| I["Log & Fail"]

This diagram illustrates that failures can happen at the lx.extract() call level (API/Network) or after a result is received, requiring inspection of ExtractionResult.errors.


Step-by-Step Implementation: Adding Error Handling and Retries

Let’s build a more robust extraction function step by step. We’ll start with a basic extraction, then add try-except blocks, and finally implement a retry mechanism with exponential backoff.

For this example, let’s assume we want to extract a simple Person object.

First, make sure you have LangExtract installed and your LLM provider API key set up as an environment variable (e.g., OPENAI_API_KEY for OpenAI, GOOGLE_API_KEY for Google models).

# Install LangExtract and an LLM provider SDK
pip install "langextract>=0.1.0" "openai>=1.0.0"  # or google-generativeai for Google models

Step 1: Basic Extraction (Review)

Let’s define our schema and a simple extraction function.

Create a new Python file, e.g., robust_extractor.py.

# robust_extractor.py
import os
import langextract as lx
from typing import List, Dict, Any

# Define a simple schema for a Person
person_schema = {
    "name": "string",
    "age": "integer",
    "city": "string"
}

def extract_person_info(text: str) -> Dict[str, Any] | None:
    """
    Extracts person information from text using LangExtract.
    """
    print(f"Attempting extraction for: '{text}'")
    try:
        # Use the default LLM provider configured via environment variables.
        # For example, if OPENAI_API_KEY is set, OpenAI will be used;
        # the provider is inferred from the environment.
        result = lx.extract(
            text_or_document=text,
            instructions="Extract information about a person.",
            schema=person_schema
        )

        if result.errors:
            print("Extraction completed with errors:")
            for error in result.errors:
                print(f"  - {error}")
            return None
        
        if result.data:
            print("Extraction successful!")
            return result.data[0] # Assuming one person for simplicity
        else:
            print("Extraction returned no data.")
            return None

    except Exception as e:
        print(f"An unexpected error occurred during extraction: {e}")
        return None

if __name__ == "__main__":
    sample_text = "Alice is 30 years old and lives in New York."
    person_data = extract_person_info(sample_text)
    if person_data:
        print(f"Extracted Data: {person_data}")

    print("\n--- Testing with problematic text ---")
    problem_text = "This text has no person information at all."
    person_data_problem = extract_person_info(problem_text)
    if person_data_problem:
        print(f"Extracted Data: {person_data_problem}")
    else:
        print("No person data extracted (as expected).")

Explanation:

  • We import langextract as lx and define our person_schema.
  • The extract_person_info function wraps the lx.extract() call in a try-except block to catch general exceptions.
  • Crucially, after lx.extract(), we check result.errors. This is where LangExtract reports issues like the LLM failing to produce valid JSON or other internal problems.
  • We then check result.data to see if any structured data was actually extracted.
  • The if __name__ == "__main__": block allows us to run this script directly to test.

Run this script: python robust_extractor.py

You should see output indicating successful extraction for “Alice” and likely “no data extracted” for the problematic text (which is a valid outcome, not necessarily an error).

Step 2: Adding Robust Retries with Exponential Backoff

What if the LLM API is temporarily unavailable or hits a rate limit? Our current try-except just fails immediately. Let’s add retry logic. We’ll use a while loop, a retry counter, and time.sleep() for a simple exponential backoff.

Modify robust_extractor.py:

# robust_extractor.py (continued)
import os
import langextract as lx
import time
from typing import List, Dict, Any

# ... (person_schema remains the same) ...

def extract_person_info_robust(text: str, max_retries: int = 3, initial_delay: float = 1.0) -> Dict[str, Any] | None:
    """
    Extracts person information from text with retry logic using LangExtract.
    Implements a simple exponential backoff.
    """
    print(f"Attempting robust extraction for: '{text}'")
    retries = 0
    while retries < max_retries:
        try:
            print(f"  - Extraction attempt {retries + 1}...")
            result = lx.extract(
                text_or_document=text,
                instructions="Extract information about a person.",
                schema=person_schema
            )

            if result.errors:
                print("  - Extraction completed with errors:")
                for error in result.errors:
                    print(f"    - {error}")
                
                # Errors reported here come from the LLM side (e.g., malformed
                # output). Network/API-level failures are handled by the
                # 'except' block below. We treat LLM-reported errors as
                # potentially transient and retry them, on the assumption
                # that they are not caused by a fundamental prompt issue.
                
                retries += 1
                if retries < max_retries:
                    delay = initial_delay * (2 ** retries) # Exponential backoff
                    print(f"  - Retrying in {delay} seconds...")
                    time.sleep(delay)
                continue # Go to the next retry attempt

            if result.data:
                print("  - Robust extraction successful!")
                return result.data[0]
            else:
                print("  - Robust extraction returned no data.")
                return None

        except Exception as e:
            # This catches network errors, authentication errors, etc.,
            # that prevent LangExtract from even getting an ExtractionResult.
            print(f"  - An unexpected error occurred: {e}")
            retries += 1
            if retries < max_retries:
                delay = initial_delay * (2 ** retries)
                print(f"  - Retrying in {delay} seconds due to exception...")
                time.sleep(delay)
            else:
                print(f"  - Max retries ({max_retries}) reached. Failing.")
                return None
    
    print(f"  - All {max_retries} attempts failed for text: '{text}'")
    return None

if __name__ == "__main__":
    # ... (previous test cases) ...

    print("\n--- Testing Robust Extraction ---")
    # To simulate a failure, you could temporarily disable your internet,
    # or use an invalid API key, or point to a non-existent LLM provider.
    # For a realistic test without actually breaking things, we'll just run it.
    # If a real transient error occurs, you'll see the retry logic in action.
    sample_text_robust = "Bob is 25 and hails from London."
    person_data_robust = extract_person_info_robust(sample_text_robust, max_retries=4, initial_delay=0.5)
    if person_data_robust:
        print(f"Final Extracted Data: {person_data_robust}")
    else:
        print("Failed to extract person data after multiple attempts.")

    # Example of a text that might cause LLM to struggle or return no data
    print("\n--- Testing Robust Extraction with difficult text ---")
    difficult_text = "The quick brown fox jumps over the lazy dog."
    person_data_difficult = extract_person_info_robust(difficult_text, max_retries=2)
    if person_data_difficult:
        print(f"Final Extracted Data: {person_data_difficult}")
    else:
        print("Failed to extract (or no data found) after multiple attempts.")

Explanation of Changes:

  • We introduced extract_person_info_robust with max_retries and initial_delay parameters.
  • A while loop controls the retry attempts.
  • Inside the try block, we attempt the extraction.
  • If result.errors is not empty, we log the errors and increment retries. While retries is still below max_retries, we compute an exponential delay (initial_delay * (2 ** retries)) and time.sleep() before the loop continues with the next attempt.
  • The except Exception as e: block now catches broader issues (network, API key) and applies the same retry logic.
  • If max_retries is reached, the function gives up and returns None.

Run this updated script: python robust_extractor.py

You might not see the retry logic in action unless a real, transient error occurs. However, the code is now prepared for such scenarios. For production systems, you’d typically use a dedicated library like tenacity for more sophisticated retry policies, but this manual implementation clearly demonstrates the underlying principles.
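To see the shape of what a library like tenacity gives you, here is a minimal, stdlib-only sketch of the same idea: a reusable decorator with exponential backoff, a delay cap, and jitter. The names (with_retries and its parameters) are illustrative, not part of LangExtract or tenacity:

```python
# Minimal retry decorator (illustrative names): exponential backoff with a
# cap and jitter. A sketch of what libraries like tenacity provide.
import random
import time
from functools import wraps

def with_retries(max_attempts: int = 3, base_delay: float = 1.0, max_delay: float = 30.0):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: propagate the last error
                    # Exponential backoff capped at max_delay, plus jitter.
                    delay = min(max_delay, base_delay * (2 ** attempt))
                    time.sleep(delay + random.uniform(0, base_delay))
        return wrapper
    return decorator
```

With this in place, you could decorate a thin wrapper around lx.extract() with @with_retries(max_attempts=4) instead of hand-rolling the while loop each time.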

Step 3: Post-Extraction Schema Validation

Even if LangExtract returns data without result.errors, the LLM might have hallucinated or slightly deviated from the schema’s intent (e.g., returned a string for an integer field, but LangExtract’s internal parsing might have coerced it). For critical applications, adding an explicit post-extraction validation step is a best practice. Pydantic is excellent for this.

First, install Pydantic:

pip install "pydantic>=2.0"  # Pydantic v2 syntax (model_dump, etc.) is used below

Now, let’s update our robust_extractor.py to use Pydantic for validation.

# robust_extractor.py (continued)
import os
import langextract as lx
import time
from typing import List, Dict, Any
from pydantic import BaseModel, ValidationError, Field

# Define our schema using Pydantic
class Person(BaseModel):
    name: str = Field(description="The full name of the person.")
    age: int = Field(description="The age of the person in years.")
    city: str = Field(description="The city where the person lives.")

# Update the LangExtract schema to match Pydantic's structure
# LangExtract can often infer from Pydantic, but explicit is clear.
# For LangExtract's schema parameter, we typically provide a dictionary.
person_lx_schema = {
    "name": "string",
    "age": "integer",
    "city": "string"
}

def extract_person_info_validated(text: str, max_retries: int = 3, initial_delay: float = 1.0) -> Person | None:
    """
    Extracts person information with retry logic and post-extraction Pydantic validation.
    """
    print(f"Attempting validated extraction for: '{text}'")
    retries = 0
    while retries < max_retries:
        try:
            print(f"  - Extraction attempt {retries + 1}...")
            result = lx.extract(
                text_or_document=text,
                instructions="Extract information about a person.",
                schema=person_lx_schema # Use the dictionary schema for lx.extract
            )

            if result.errors:
                print("  - Extraction completed with errors from LangExtract:")
                for error in result.errors:
                    print(f"    - {error}")
                # We still retry if LangExtract itself reports errors
                retries += 1
                if retries < max_retries:
                    delay = initial_delay * (2 ** retries)
                    print(f"  - Retrying in {delay} seconds...")
                    time.sleep(delay)
                continue

            if result.data:
                extracted_raw_data = result.data[0] # Get the first extracted item
                try:
                    # Attempt to validate with Pydantic
                    validated_person = Person(**extracted_raw_data)
                    print("  - Robust extraction successful and validated!")
                    return validated_person
                except ValidationError as ve:
                    print(f"  - Pydantic validation failed after extraction: {ve}")
                    # Validation failure might indicate LLM gave malformed data.
                    # We can retry, assuming it might do better next time.
                    retries += 1
                    if retries < max_retries:
                        delay = initial_delay * (2 ** retries)
                        print(f"  - Retrying in {delay} seconds due to validation error...")
                        time.sleep(delay)
                    continue # Go to next retry attempt
            else:
                print("  - Robust extraction returned no data.")
                return None

        except Exception as e:
            print(f"  - An unexpected general error occurred: {e}")
            retries += 1
            if retries < max_retries:
                delay = initial_delay * (2 ** retries)
                print(f"  - Retrying in {delay} seconds due to exception...")
                time.sleep(delay)
            else:
                print(f"  - Max retries ({max_retries}) reached. Failing.")
                return None
    
    print(f"  - All {max_retries} attempts failed for text: '{text}'")
    return None

if __name__ == "__main__":
    # ... (previous test cases) ...

    print("\n--- Testing Validated Extraction ---")
    validated_text = "Charlie is 45 years old and lives in Paris."
    validated_person_data = extract_person_info_validated(validated_text, max_retries=4, initial_delay=0.5)
    if validated_person_data:
        print(f"Final Extracted and Validated Data (Type: {type(validated_person_data)}): {validated_person_data.model_dump()}")
    else:
        print("Failed to extract and validate person data after multiple attempts.")

    # Example of text that might cause a validation error (e.g., age as string)
    # This is hard to force with a good LLM, but imagine if LLM returned {"age": "forty five"}
    print("\n--- Testing Validated Extraction with potentially tricky data ---")
    tricky_text = "David, who is twenty-eight, works in Berlin."
    tricky_person_data = extract_person_info_validated(tricky_text, max_retries=2)
    if tricky_person_data:
        print(f"Final Extracted and Validated Data: {tricky_person_data.model_dump()}")
    else:
        print("Failed to extract and validate (or no data found) after multiple attempts.")

Explanation of Validation Changes:

  • We define a Pydantic Person model, which provides strong typing and validation capabilities.
  • Inside extract_person_info_validated, after result.data is received, we attempt to instantiate our Person Pydantic model with the raw extracted data.
  • If ValidationError occurs, it means the LLM’s output, though potentially valid JSON, didn’t conform to our strict Pydantic model. We catch this, log it, and can choose to retry or handle it differently.
  • The function now returns a Person object (if successful) or None.

This layered approach to error handling and validation provides a much more robust extraction pipeline, ready for the demands of production environments.
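To make the validation layer concrete, here is a small standalone sketch (assuming Pydantic v2 is installed) showing what it catches, and what it quietly fixes via type coercion:

```python
# Standalone demo of the Pydantic validation layer (assumes pydantic>=2.0).
from pydantic import BaseModel, ValidationError

class Person(BaseModel):
    name: str
    age: int
    city: str

# Coercible input: in lax mode, Pydantic v2 converts the string "45" to int.
ok = Person(name="Charlie", age="45", city="Paris")
assert ok.age == 45

# Non-coercible input: "forty five" cannot become an int, so validation fails.
try:
    Person(name="David", age="forty five", city="Berlin")
except ValidationError as ve:
    print(f"Caught {ve.error_count()} validation error(s)")
```

This is exactly the failure mode the tricky_text test case above is probing for: an LLM that writes out the age in words instead of digits.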


Mini-Challenge: Customizing Retry Behavior

You’ve seen how to implement a basic retry mechanism. Now, let’s customize it.

Challenge: Modify the extract_person_info_validated function to:

  1. Only retry on specific types of errors. For instance, if result.errors contains a message indicating a rate limit error (you’d need to inspect the error string for keywords), retry. If it’s a ValidationError from Pydantic, retry. But if LangExtract reports a fundamental schema mismatch that isn’t transient, perhaps you don’t retry, or you log it as a critical failure.
  2. Add a maximum total time limit for all retries, not just a maximum number of attempts. If the total time spent retrying exceeds, say, 30 seconds, the function should give up.

Hint:

  • For specific error types, you’ll need to check the error string within result.errors or the type of the Exception caught.
  • For the time limit, use time.time() to record the start time and check against time.time() - start_time.

What to Observe/Learn: You’ll learn how to make your retry logic smarter and more responsive to different types of failures, preventing endless retries for non-transient issues and ensuring your application doesn’t get stuck waiting indefinitely.
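If you get stuck on the time-limit part of the challenge, here is a bare skeleton of the pattern (all names are illustrative, and operation stands in for your extraction call):

```python
# Skeleton for a retry loop bounded by a total time budget (illustrative,
# not part of LangExtract). `operation` is any zero-argument callable that
# raises on failure and returns a result on success.
import time

def run_with_time_budget(operation, max_total_seconds: float = 30.0,
                         max_retries: int = 5, initial_delay: float = 0.5):
    start_time = time.time()
    retries = 0
    while retries < max_retries:
        if time.time() - start_time > max_total_seconds:
            print("Total time budget exceeded; giving up.")
            return None
        try:
            return operation()
        except Exception:
            retries += 1
            delay = initial_delay * (2 ** retries)
            # Never sleep past the remaining budget.
            remaining = max_total_seconds - (time.time() - start_time)
            if remaining <= 0 or retries >= max_retries:
                break
            time.sleep(min(delay, remaining))
    return None
```

Note that both limits are checked: the loop gives up on whichever is hit first, the attempt count or the wall-clock budget.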


Common Pitfalls & Troubleshooting

Building robust systems means anticipating problems. Here are a few common pitfalls when dealing with LangExtract and LLM extraction, along with troubleshooting tips:

  1. Ignoring ExtractionResult.errors:

    • Pitfall: Many developers only check if result.data is None or empty. However, result.data might be empty because of an error that result.errors explicitly details (e.g., “LLM failed to produce valid JSON”). Ignoring result.errors means you miss valuable diagnostic information.
    • Troubleshooting: Always inspect result.errors. Log its contents whenever it’s not empty. This is your primary source of truth for what went wrong on the LangExtract/LLM side.
  2. Over-retrying or Under-retrying:

    • Pitfall: Retrying too aggressively (e.g., no delay, too many attempts) can hit rate limits faster or prolong issues. Under-retrying (e.g., no retries at all) makes your system brittle to transient network or API hiccups.
    • Troubleshooting:
      • Implement exponential backoff with jitter (randomness added to delay) to avoid “thundering herd” problems if many processes retry simultaneously. Libraries like tenacity handle this elegantly.
      • Set a reasonable max_retries (e.g., 3-5 for transient issues).
      • Consider a total timeout for all retry attempts, as in the mini-challenge.
      • Distinguish between retryable errors (network, rate limits, transient LLM misfires) and non-retryable errors (invalid schema definition, authentication failure that persists).
  3. Lack of Post-Extraction Validation:

    • Pitfall: Assuming the LLM will always return data that perfectly matches your schema’s types and constraints, even if LangExtract reports no errors. LLMs can be creative! They might return a string when an integer is expected, or miss a required field.
    • Troubleshooting: Use a robust validation library like Pydantic for critical data. This adds an extra layer of defense, converting raw LLM output into strongly typed, validated objects. If validation fails, you can log the specific errors, retry, or flag the item for human review.
  4. Misunderstanding LLM Provider Errors:

    • Pitfall: Not knowing what different HTTP status codes or error messages from your LLM provider mean.
    • Troubleshooting: Refer to the official documentation of your specific LLM provider (e.g., OpenAI API Reference, Google Cloud Generative AI Docs) for a detailed breakdown of their error codes and recommended handling strategies.
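The exponential-backoff-with-jitter idea from pitfall 2 can be sketched in a few lines. This is the "full jitter" variant, where the entire interval is randomized (the function name is illustrative):

```python
# "Full jitter" backoff (illustrative): pick a random delay in
# [0, min(cap, base * 2**attempt)]. Randomizing the whole interval spreads
# out simultaneous retries and avoids the thundering-herd effect.
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Compare this with the deterministic `initial_delay * (2 ** retries)` used in the step-by-step code: with many workers, the deterministic version makes them all retry at the same instants, while full jitter scatters them.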

Summary

Phew! We’ve covered a lot of ground in making your LangExtract pipelines truly robust. Here’s a quick recap of the key takeaways:

  • Anticipate Failures: LLM interactions are inherently prone to various errors, including API issues, network problems, and model-specific output quirks.
  • Inspect ExtractionResult.errors: This is your first stop for understanding what went wrong on the LangExtract/LLM side. Never ignore it!
  • Implement Retry Logic: For transient errors, a well-designed retry mechanism with exponential backoff significantly improves reliability. Libraries like tenacity can simplify this.
  • Validate Post-Extraction: Even if LangExtract reports success, always validate the LLM’s output against your expected schema using tools like Pydantic for critical applications. This guards against subtle LLM “creativity.”
  • Distinguish Error Types: Learn to differentiate between retryable and non-retryable errors to build smarter, more efficient error handling.

By applying these principles, you’re not just extracting data; you’re building a reliable, production-ready system that can gracefully handle the challenges of the real world.

What’s Next?

With a solid foundation in robustness, our next adventure will take us into Chapter 12: Performance Tuning and Optimization. We’ll explore how to make your LangExtract pipelines not just reliable, but also blazing fast and cost-effective!

