Introduction to Robust Error Handling

Welcome back, future AI architect! In the previous chapters, we’ve explored the fascinating world of any-llm – Mozilla’s unified interface for Large Language Models. You’ve learned how to set up your environment, make basic completion calls, and configure different LLM providers. But what happens when things don’t go as planned? What if an API key is wrong, the network flickers, or a model is overloaded?

That’s where robust error handling comes into play! Just like a sturdy bridge needs to withstand unexpected winds and tremors, your AI applications need to gracefully handle errors and exceptions. Ignoring errors can lead to brittle applications that crash unexpectedly, provide poor user experiences, or even incur unnecessary costs.

In this chapter, we’ll dive deep into any-llm’s approach to error handling. We’ll learn how to anticipate, catch, and intelligently respond to various issues that can arise when interacting with LLMs. By the end, you’ll be equipped to build more resilient and production-ready AI systems.

Before we begin, make sure you have a basic understanding of Python’s try...except blocks and how to make a simple any-llm completion call, as covered in Chapters 3 and 4.

Understanding any-llm’s Exception Hierarchy

When you interact with various LLM providers, each might return errors in its own unique format. This is precisely where any-llm shines: it abstracts these differences and presents a unified, consistent set of exceptions. This consistency makes your code cleaner and easier to maintain, as you don’t need to write provider-specific error logic.

At its core, any-llm provides a hierarchy of exception classes, all inheriting from a base AnyLLMError. This structure allows you to catch specific errors for granular control or broader errors for general handling.

Let’s visualize a conceptual any-llm exception hierarchy using a class diagram:

classDiagram
    class AnyLLMError {
        +message: str
        +status_code: int
        +provider_error_code: str
    }
    AnyLLMError <|-- AnyLLMConnectionError : "Child Class"
    AnyLLMError <|-- AnyLLMAuthenticationError : "Child Class"
    AnyLLMError <|-- AnyLLMRateLimitError : "Child Class"
    AnyLLMError <|-- AnyLLMInvalidRequestError : "Child Class"
    AnyLLMError <|-- AnyLLMAPIError : "Child Class"
    AnyLLMAPIError <|-- AnyLLMTimeoutError : "Child Class"

Conceptual any-llm Exception Hierarchy

As you can see, AnyLLMError is the parent, with more specific errors branching off. AnyLLMAPIError often carries details like status_code and provider_error_code, which are crucial for debugging.
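
Since the hierarchy above is conceptual and the exact names may evolve, here is a minimal Python sketch of how such a hierarchy could be expressed. The class names and attributes mirror the diagram, not a confirmed any-llm API:

```python
# Sketch of the conceptual hierarchy above -- illustrative only,
# not the library's published exception classes.

class AnyLLMError(Exception):
    """Base class for the conceptual any-llm error hierarchy."""
    def __init__(self, message, status_code=None, provider_error_code=None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.provider_error_code = provider_error_code

class AnyLLMConnectionError(AnyLLMError): ...
class AnyLLMAuthenticationError(AnyLLMError): ...
class AnyLLMRateLimitError(AnyLLMError): ...
class AnyLLMInvalidRequestError(AnyLLMError): ...
class AnyLLMAPIError(AnyLLMError): ...
class AnyLLMTimeoutError(AnyLLMAPIError): ...  # timeouts are a kind of API error
```

Because every subclass inherits from AnyLLMError, a single `except AnyLLMError:` clause catches all of them, while `except AnyLLMTimeoutError:` catches only timeouts.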

Common Error Types in any-llm

While the exact exception names might evolve, here are common categories you’ll encounter and how any-llm typically represents them:

  1. AnyLLMConnectionError:

    • What it is: Occurs when there’s a problem establishing or maintaining a network connection with the LLM provider. This could be due to your internet connection, the provider’s servers being down, or DNS issues.
    • Why it’s important: These are often transient errors, meaning they might resolve themselves if you retry the request after a short delay.
  2. AnyLLMAuthenticationError:

    • What it is: Indicates that your API key is missing, invalid, or expired. The LLM provider refused your request because it couldn’t verify your identity.
    • Why it’s important: This usually requires user intervention (e.g., setting the correct API key) and is not typically resolved by retries.
  3. AnyLLMRateLimitError:

    • What it is: The LLM provider has rejected your request because you’ve exceeded the allowed number of requests within a given timeframe. Providers implement rate limits to prevent abuse and ensure fair usage.
    • Why it’s important: Like connection errors, these are often transient. Implementing a retry strategy with exponential backoff is crucial here.
  4. AnyLLMInvalidRequestError:

    • What it is: Your request payload (e.g., the prompt, model name, temperature setting) contains an error that the LLM provider cannot process. This means your input is malformed or violates the provider’s rules.
    • Why it’s important: This usually points to a bug in your application’s logic or prompt engineering. Retrying without fixing the request won’t help.
  5. AnyLLMAPIError (General API Error):

    • What it is: A broad category for various issues on the LLM provider’s side that don’t fit into the more specific categories above. This could include internal server errors, model failures, or other unexpected issues.
    • Why it’s important: Sometimes transient, sometimes indicative of a larger issue with the provider or model. It often contains more detailed error messages from the provider.
  6. AnyLLMTimeoutError:

    • What it is: A specific type of AnyLLMAPIError (or sometimes a direct subclass of AnyLLMError) that occurs when the LLM provider takes too long to respond, exceeding a predefined timeout period.
    • Why it’s important: Can be transient due to network latency or model overload. Retries can be effective.

By understanding these distinctions, you can write more intelligent and robust error handling logic.

Strategies for Handling Errors

Now that we know the types of errors, how do we handle them effectively?

  1. Graceful Degradation: When an LLM call fails, can your application still function, perhaps with reduced capabilities? For example, if a creative generation fails, can you fall back to a simpler, cached response or inform the user politely?
  2. Retries with Exponential Backoff: For transient errors like AnyLLMConnectionError or AnyLLMRateLimitError, simply retrying the request after a short delay can often resolve the issue. Exponential backoff means you increase the delay between retries exponentially (e.g., 1s, then 2s, then 4s, 8s…). This prevents you from overwhelming the service and gives it time to recover.
  3. Logging: Always log errors! Detailed logs are your best friend for debugging in development and monitoring in production. Include timestamps, error types, messages, and any relevant request details (but be mindful of sensitive information).
  4. Alerting: For critical errors in production, integrate with an alerting system (e.g., PagerDuty, Slack, email) to notify your team immediately.
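
Strategy 2 can be sketched in a few lines. This is the delay calculation we'll use later in the chapter, with a cap added so delays don't grow without bound (the cap is our own addition, not an any-llm feature):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with jitter: ~1s, ~2s, ~4s, ... capped at `cap`.

    `attempt` starts at 1. The random jitter (0-1s) spreads out clients
    so they don't all retry at the same instant.
    """
    delay = min(cap, base * (2 ** (attempt - 1)))
    return delay + random.uniform(0, 1)
```

So the first retry waits roughly 1-2 seconds, the second 2-3 seconds, the third 4-5 seconds, and so on until the cap.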

Step-by-Step Implementation: Handling any-llm Exceptions

Let’s put these concepts into practice. We’ll start with a basic any-llm completion call and then progressively add error handling.

First, ensure you have any-llm-sdk installed. We’ll install it with support for a common provider, for example mistral. As of December 2025, the installation command is:

pip install 'any-llm-sdk[mistral]'

And set your API key as an environment variable (replace YOUR_MISTRAL_API_KEY with your actual key):

export MISTRAL_API_KEY="YOUR_MISTRAL_API_KEY"

Or, if you’re using a different provider, ensure its respective API key environment variable (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY) is set.

Now, let’s create a Python file named error_handling_example.py.

Step 1: Basic try...except for AnyLLMError

We’ll start by catching the most generic any-llm exception. This is a good first step to prevent your application from crashing due to any issue from any-llm.

# error_handling_example.py
import os
import logging
from any_llm import completion, AnyLLMError

# Configure logging for better visibility
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_llm_response(prompt: str, provider: str = "mistral") -> str | None:
    """
    Attempts to get a completion from an LLM provider,
    handling generic any-llm errors.
    """
    logging.info(f"Attempting to get LLM response for provider: {provider}")
    try:
        response = completion(
            model="mistral-large", # Or "gpt-4o", "claude-3-opus-20240229", etc.
            messages=[{"role": "user", "content": prompt}],
            provider=provider
        )
        return response.choices[0].message.content
    except AnyLLMError as e:
        logging.error(f"An any-llm error occurred: {e}")
        return None
    except Exception as e:
        # Catch any other unexpected Python errors
        logging.critical(f"An unexpected non-any-llm error occurred: {e}")
        return None

if __name__ == "__main__":
    test_prompt = "What is the capital of France?"
    response_content = get_llm_response(test_prompt)

    if response_content:
        print(f"\nLLM Response: {response_content}")
    else:
        print("\nFailed to get a response from the LLM.")

Explanation:

  1. We import logging, completion, and AnyLLMError.
  2. logging.basicConfig sets up basic logging to show INFO messages and above.
  3. The get_llm_response function wraps the completion call in a try...except block.
  4. except AnyLLMError as e: catches any error specifically raised by any-llm. We log it as an error.
  5. except Exception as e: is a fallback for any other Python error that might occur, which is logged as critical. While useful as a catch-all, it’s generally better to catch more specific exceptions when possible.

To Test:

  1. Run the script: python error_handling_example.py
  2. It should print the capital of France.
  3. To simulate an error: Temporarily unset or mess up your MISTRAL_API_KEY environment variable (e.g., export MISTRAL_API_KEY="INVALID_KEY" in your terminal, then run the script). You should see an AnyLLMError logged. Remember to set it back to a valid key afterward!

Step 2: Handling Specific any-llm Exceptions

Now, let’s refine our error handling to differentiate between common problems. This allows us to provide more targeted feedback or implement specific recovery strategies.

Modify your error_handling_example.py to include more specific except blocks:

# error_handling_example.py (continued)
import os
import logging
import time
from any_llm import completion
from any_llm import (
    AnyLLMError,
    AnyLLMConnectionError,
    AnyLLMAuthenticationError,
    AnyLLMRateLimitError,
    AnyLLMInvalidRequestError,
    AnyLLMAPIError # Catches general API errors, including timeouts if not explicitly caught
)

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_llm_response_specific_errors(prompt: str, provider: str = "mistral") -> str | None:
    """
    Attempts to get a completion from an LLM provider,
    handling specific any-llm errors.
    """
    logging.info(f"Attempting to get LLM response for provider: {provider}")
    try:
        response = completion(
            model="mistral-large",
            messages=[{"role": "user", "content": prompt}],
            provider=provider
        )
        return response.choices[0].message.content
    except AnyLLMAuthenticationError:
        logging.error("Authentication failed. Please check your API key for the provider.")
        return None
    except AnyLLMRateLimitError:
        logging.warning("Rate limit exceeded. Please wait and try again later.")
        return None
    except AnyLLMConnectionError:
        logging.error("Network connection issue. Please check your internet connection or provider status.")
        return None
    except AnyLLMInvalidRequestError as e:
        logging.error(f"Invalid request parameters: {e}. Check your prompt or model configuration.")
        return None
    except AnyLLMAPIError as e:
        # Use getattr throughout: these attributes may be absent on some errors
        logging.error(f"An API error occurred with the LLM provider: "
                      f"{getattr(e, 'status_code', 'N/A')} - {e}. "
                      f"Provider error code: {getattr(e, 'provider_error_code', 'N/A')}")
        return None
    except AnyLLMError as e: # Catch any other general any-llm error not specifically handled
        logging.error(f"An unexpected any-llm error occurred: {e}")
        return None
    except Exception as e:
        logging.critical(f"An entirely unexpected Python error occurred: {e}")
        return None

if __name__ == "__main__":
    # ... (keep the previous if __name__ block for testing)
    print("\n--- Testing with specific error handling ---")
    test_prompt = "Explain the concept of quantum entanglement in simple terms."
    response_content = get_llm_response_specific_errors(test_prompt)

    if response_content:
        print(f"\nLLM Response (specific errors): {response_content}")
    else:
        print("\nFailed to get a response from the LLM with specific error handling.")

Explanation:

  1. We imported specific any-llm exception classes.
  2. The except blocks are now ordered from most specific to most general. This ordering matters because Python uses the first except clause that matches. For instance, AnyLLMAuthenticationError is caught before the broader AnyLLMError.
  3. Each except block provides a more informative log message tailored to the specific error. For AnyLLMAPIError, we try to extract status_code and message for better debugging.

To Test:

  1. Authentication Error: Again, temporarily invalidate your MISTRAL_API_KEY. You should now see the “Authentication failed…” message.
  2. Invalid Request Error: Try passing a model name that doesn’t exist; depending on the provider, this may raise an AnyLLMInvalidRequestError or an AnyLLMAPIError, though any-llm may abstract some of these cases. A more direct simulation would be a prompt that is malformed or exceeds the provider’s length limits, if the provider enforces strict ones.

Step 3: Implementing Retries with Exponential Backoff

For transient errors like AnyLLMConnectionError and AnyLLMRateLimitError, retries are essential. Let’s build a simple retry mechanism into our function.

# error_handling_example.py (continued)
import os
import logging
import time
import random # For adding jitter to backoff
from any_llm import completion
from any_llm import (
    AnyLLMError,
    AnyLLMConnectionError,
    AnyLLMAuthenticationError,
    AnyLLMRateLimitError,
    AnyLLMInvalidRequestError,
    AnyLLMAPIError
)

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_llm_response_with_retries(prompt: str, provider: str = "mistral", max_retries: int = 3) -> str | None:
    """
    Attempts to get a completion from an LLM provider,
    handling specific any-llm errors with retries for transient issues.
    """
    logging.info(f"Attempting to get LLM response for provider: {provider}")
    
    for attempt in range(1, max_retries + 1):
        try:
            logging.info(f"Attempt {attempt}/{max_retries} for LLM completion.")
            response = completion(
                model="mistral-large",
                messages=[{"role": "user", "content": prompt}],
                provider=provider
            )
            return response.choices[0].message.content
        except (AnyLLMConnectionError, AnyLLMRateLimitError) as e:
            # These are transient errors, so we'll retry
            wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1) # Exponential backoff with jitter
            logging.warning(f"Transient error encountered: {e}. Waiting {wait_time:.2f} seconds before retrying...")
            time.sleep(wait_time)
        except AnyLLMAuthenticationError:
            logging.error("Authentication failed. Please check your API key for the provider. Not retrying.")
            return None
        except AnyLLMInvalidRequestError as e:
            logging.error(f"Invalid request parameters: {e}. Check your prompt or model configuration. Not retrying.")
            return None
        except AnyLLMAPIError as e:
            # For general API errors, check if it's potentially transient or a server error.
            # status_code may be absent, so read it defensively with getattr.
            status = getattr(e, 'status_code', None)
            if status is None or status >= 500: # Server-side errors might be transient
                wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)
                logging.warning(f"Server-side API error ({status}): {e}. Waiting {wait_time:.2f} seconds before retrying...")
                time.sleep(wait_time)
            else:
                logging.error(f"Non-retryable API error occurred: {status} - {e}. Not retrying.")
                return None
        except AnyLLMError as e:
            logging.error(f"An unexpected any-llm error occurred: {e}. Not retrying.")
            return None
        except Exception as e:
            logging.critical(f"An entirely unexpected Python error occurred: {e}. Not retrying.")
            return None
    
    logging.error(f"Failed to get LLM response after {max_retries} attempts.")
    return None

if __name__ == "__main__":
    # ... (keep the previous if __name__ blocks for testing)
    print("\n--- Testing with retries and exponential backoff ---")
    test_prompt = "What are the benefits of learning Python for AI development?"
    response_content = get_llm_response_with_retries(test_prompt, max_retries=3)

    if response_content:
        print(f"\nLLM Response (with retries): {response_content}")
    else:
        print("\nFailed to get a response from the LLM after multiple retries.")

Explanation:

  1. We added import random for “jitter” (a small random component) to the backoff time. This helps prevent many clients from retrying at the exact same moment, which can cause a “thundering herd” problem.
  2. The for attempt in range(1, max_retries + 1): loop manages our retry attempts.
  3. AnyLLMConnectionError and AnyLLMRateLimitError are grouped together, as they both benefit from retries.
  4. wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1) calculates the exponential backoff. For attempt 1, it’s 2^0 + jitter (1s + jitter); for attempt 2, it’s 2^1 + jitter (2s + jitter), and so on.
  5. time.sleep(wait_time) pauses execution.
  6. For AnyLLMAPIError, we add a check: if the status_code is 5xx (server error), we consider it potentially transient and retry. Otherwise, it’s treated as a non-retryable error (e.g., a 4xx client error).
  7. Non-retryable errors (authentication, invalid request, or truly unexpected errors) cause the function to return None immediately.
  8. If all retries fail, a final error message is logged, and None is returned.

This robust function is now much more resilient to temporary network glitches or provider-side load.

Mini-Challenge: Simulate and Handle a Rate Limit

Let’s put your new knowledge to the test!

Challenge: Modify the get_llm_response_with_retries function (or create a new one based on it) to simulate an AnyLLMRateLimitError on the first attempt, but then succeed on a subsequent retry.

Hint: Inside your get_llm_response_with_retries function, at the very beginning of the try block, you can temporarily add a condition like if attempt == 1: raise AnyLLMRateLimitError("Simulated rate limit"). Observe how your exponential backoff and retry logic handles this. Make sure to remove this simulation code after you’ve completed the challenge!

What to observe/learn:

  • How the warning log for the rate limit appears.
  • How the time.sleep correctly pauses execution.
  • That the function eventually succeeds after the simulated failure.
  • The importance of random.uniform(0, 1) (jitter) in making the wait times slightly unpredictable.
# Mini-Challenge: Add this to your error_handling_example.py
if __name__ == "__main__":
    # ... previous test blocks ...

    print("\n--- Mini-Challenge: Simulating Rate Limit ---")

    # Define a new function or modify the existing one temporarily
    def get_llm_response_simulate_rate_limit(prompt: str, provider: str = "mistral", max_retries: int = 3) -> str | None:
        """
        Simulates a rate limit error on the first attempt and retries.
        """
        logging.info(f"Attempting LLM response for provider: {provider} with rate limit simulation.")
        
        for attempt in range(1, max_retries + 1):
            try:
                logging.info(f"Simulation Attempt {attempt}/{max_retries} for LLM completion.")
                
                # --- SIMULATION CODE START ---
                if attempt == 1:
                    logging.warning("Simulating AnyLLMRateLimitError on first attempt!")
                    raise AnyLLMRateLimitError("Simulated rate limit exceeded for testing purposes.")
                # --- SIMULATION CODE END ---

                response = completion(
                    model="mistral-large",
                    messages=[{"role": "user", "content": prompt}],
                    provider=provider
                )
                return response.choices[0].message.content
            except (AnyLLMConnectionError, AnyLLMRateLimitError) as e:
                wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)
                logging.warning(f"Transient error encountered: {e}. Waiting {wait_time:.2f} seconds before retrying...")
                time.sleep(wait_time)
            except AnyLLMAuthenticationError:
                logging.error("Authentication failed. Not retrying.")
                return None
            except AnyLLMInvalidRequestError as e:
                logging.error(f"Invalid request parameters: {e}. Not retrying.")
                return None
            except AnyLLMAPIError as e:
                # Read status_code defensively: it may be absent on some errors
                status = getattr(e, 'status_code', None)
                if status is None or status >= 500:
                    wait_time = (2 ** (attempt - 1)) + random.uniform(0, 1)
                    logging.warning(f"Server-side API error ({status}): {e}. Waiting {wait_time:.2f} seconds before retrying...")
                    time.sleep(wait_time)
                else:
                    logging.error(f"Non-retryable API error occurred: {status} - {e}. Not retrying.")
                    return None
            except AnyLLMError as e:
                logging.error(f"An unexpected any-llm error occurred: {e}. Not retrying.")
                return None
            except Exception as e:
                logging.critical(f"An entirely unexpected Python error occurred: {e}. Not retrying.")
                return None
        
        logging.error(f"Failed to get LLM response after {max_retries} attempts.")
        return None

    challenge_prompt = "Tell me a short story about a brave squirrel."
    challenge_response = get_llm_response_simulate_rate_limit(challenge_prompt, max_retries=3)

    if challenge_response:
        print(f"\nLLM Response (Challenge Success): {challenge_response}")
    else:
        print("\nLLM Response (Challenge Failed) after retries.")

Run your script and observe the logs! You should see the simulated rate limit, a pause, and then a successful completion.

Common Pitfalls & Troubleshooting

Even with robust error handling, developers can encounter issues. Here are a few common pitfalls:

  1. Catching Exception too broadly: While except Exception as e: acts as a safety net, relying on it too heavily can hide specific problems. Always try to catch specific any-llm exceptions first. If you catch Exception too high up, you might be retrying for an AnyLLMAuthenticationError (which won’t help) or missing crucial debugging info.
  2. Not implementing retries for transient errors: Forgetting exponential backoff for AnyLLMRateLimitError or AnyLLMConnectionError will make your application brittle and prone to failure under moderate load or network instability.
  3. Ignoring error messages: AnyLLMAPIError and AnyLLMInvalidRequestError often contain valuable message and status_code attributes. Log these details! They are crucial for understanding why an error occurred.
  4. Infinite retries or too many retries: While retries are good, an infinite loop or too many attempts can overwhelm the API provider, consume your credits, or make your application unresponsive. Always set a max_retries limit.
  5. Hardcoding API keys: Never embed your API keys directly in your code. Always use environment variables or a secure configuration management system. This is a security best practice that prevents accidental exposure in version control.
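
As a guard against pitfall 5, a small helper that fails fast when a key is missing can save significant debugging time. This is a generic sketch, not an any-llm feature; the environment variable name is just the one used earlier in this chapter:

```python
import os

def require_api_key(env_var: str = "MISTRAL_API_KEY") -> str:
    """Read an API key from the environment, failing fast with a clear message."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it before running, e.g.: "
            f"export {env_var}=\"your-key-here\""
        )
    return key
```

Calling `require_api_key()` at startup turns a confusing mid-request authentication failure into an immediate, self-explanatory error.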

Summary

Phew! You’ve just mastered a crucial aspect of building reliable AI applications. Let’s recap what we’ve covered:

  • Why error handling matters: It makes your applications resilient, user-friendly, and easier to debug.
  • any-llm’s unified exception hierarchy: It simplifies handling errors across different LLM providers, with a base AnyLLMError and specific subclasses.
  • Common any-llm error types: We explored AnyLLMConnectionError, AnyLLMAuthenticationError, AnyLLMRateLimitError, AnyLLMInvalidRequestError, and AnyLLMAPIError, understanding their causes and implications.
  • Effective error handling strategies: Including graceful degradation, retries with exponential backoff (and jitter!), and comprehensive logging.
  • Hands-on implementation: You built Python code demonstrating basic, specific, and retry-enabled error handling for any-llm calls.
  • Mini-Challenge: You practiced simulating and handling a rate limit error.
  • Common pitfalls: You learned to avoid broad Exception catches, neglecting retries, ignoring error details, excessive retries, and hardcoding API keys.

By applying these principles, you’re well on your way to developing robust and professional AI solutions with any-llm.

What’s next? In the upcoming chapter, we’ll explore how to handle multiple LLM requests efficiently using asynchronous programming with any-llm, allowing your applications to perform many tasks concurrently without getting bogged down!

