Why does my Python alert_api.list_alerts call take longer and longer to return, and eventually never return?

Scott Han March 23, 2025

 

Why does my Python alert_api.list_alerts call take longer and longer to return? Eventually it never returns and blocks forever, and I have to restart my 24/7 Python script to reset it.

1 answer

1 vote
Egor
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
March 25, 2025

Hey Scott, 
Thanks for reaching out to Atlassian Community!

Opsgenie support doesn’t troubleshoot external scripts directly, but I can share some general recommendations.

If your alert_api.list_alerts call is gradually slowing down or getting stuck over time, it may be due to:

  • Unmanaged pagination: Make sure you're correctly handling paginated results using offset and limit, especially if you're querying large volumes of alerts.

  • Unbounded loops or memory leaks in the script: Consider adding timeouts and retries to avoid the script hanging indefinitely.

  • Long polling or blocking requests: Use request timeouts in your HTTP client (e.g., timeout= parameter in requests.get()).

  • No cleanup or resource release: If the script runs 24/7, ensure you're properly closing connections or sessions.

To help with stability, consider adding:

  • Logging per request

  • Timeout and retry logic

  • Monitoring for thread/process health

If the issue persists, we recommend checking the script’s logic, optimizing the query filters to reduce load, and restarting the process on a fixed schedule.
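The timeout, retry, and per-request logging ideas above can be sketched with the standard library alone. This is a minimal sketch, not part of the Opsgenie SDK: the names call_with_timeout, call_with_retries, and CallTimedOut are hypothetical helpers, and the wrapped function is assumed to be any zero-argument callable that performs the list_alerts call.

```python
import logging
import threading
import time

log = logging.getLogger("alert_poller")  # hypothetical logger name


class CallTimedOut(Exception):
    """Raised when the wrapped call does not finish within the deadline."""


def call_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread and wait at most timeout_s seconds for it."""
    result = {}

    def worker():
        try:
            result["value"] = fn()
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The worker cannot be killed; it is abandoned and may still be blocked.
        raise CallTimedOut(f"call did not return within {timeout_s}s")
    if "error" in result:
        raise result["error"]
    return result["value"]


def call_with_retries(fn, timeout_s=30.0, attempts=3, backoff_s=2.0):
    """Call fn with a deadline, logging each attempt and retrying on failure."""
    for attempt in range(1, attempts + 1):
        started = time.monotonic()
        try:
            value = call_with_timeout(fn, timeout_s)
            log.info("attempt %d ok in %.2fs", attempt, time.monotonic() - started)
            return value
        except Exception as exc:
            log.warning("attempt %d failed after %.2fs: %r",
                        attempt, time.monotonic() - started, exc)
            if attempt == attempts:
                raise
            time.sleep(backoff_s * attempt)  # linear backoff between attempts
```

In a polling script, the existing call would be passed in as a lambda that invokes alert_api.list_alerts with the script's usual arguments. Note the design trade-off: a timed-out worker thread keeps running in the background, so this only keeps the main loop responsive; it does not release the stuck connection.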

Best Regards,
Egor

Scott Han March 25, 2025

Thanks for the reply, Egor. I am not using pagination in my Python script, since my result usually contains only one alert (two or three at most). I am using the Opsgenie Python API, not the Opsgenie REST API, and I am not sure how to set a timeout for alert_api.list_alerts. The script has also been running continuously for more than half a year without issue, so unbounded loops or memory leaks should not be a concern here. Could you also please tell me how to turn on logging for the Python API's alert_api.list_alerts, so I can see what is going on?

Many thanks again for your help.

 

Egor
Atlassian Team
March 26, 2025

Hey Scott, 

Thanks for sharing those details!

Since you're using the Opsgenie Python SDK, it's important to know that while it wraps the REST API, it may not expose full control over HTTP behavior by default (like setting timeouts or logging).

Here are a few recommendations:

1. Set a Timeout Manually
The SDK doesn't provide a built-in timeout parameter for alert_api.list_alerts(), but you can fork or extend the SDK to inject a timeout by modifying the underlying API client. Alternatively, switch to using requests directly for critical calls where you need fine control.

 

2. Enable Logging
To enable HTTP-level debugging in the Opsgenie Python SDK, you can turn on logging for the underlying urllib3 library:

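import logging
import http.client as http_client

# Print request and response headers for every HTTP connection to stdout.
http_client.HTTPConnection.debuglevel = 1

# Send log records to the console and lower the threshold to DEBUG.
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
# urllib3 logs each connection it opens and each request it sends.
logging.getLogger("urllib3").setLevel(logging.DEBUG)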

This will show each outgoing request and the response that comes back, so a call that gets stuck appears as a request with no matching response.

Even if your script has been stable for months, external network issues or rare API timeouts could still cause a hang. Adding logging and a timeout fallback is a good safety net for long-running scripts.
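For the option of bypassing the SDK for critical calls, here is a minimal sketch that queries the Opsgenie REST alerts endpoint with an explicit timeout. It uses the standard library's urllib.request rather than requests so it has no third-party dependency. The function names, the default limit, and the 15-second timeout are illustrative assumptions, and accounts hosted in the EU region would need the EU API host instead.

```python
import json
import urllib.parse
import urllib.request

# Default (US) API host; assumed here, adjust for your account's region.
OPSGENIE_ALERTS_URL = "https://api.opsgenie.com/v2/alerts"


def build_list_alerts_request(api_key, query, limit=20):
    """Build a GET request for the List Alerts endpoint (hypothetical helper)."""
    params = urllib.parse.urlencode({"query": query, "limit": limit})
    return urllib.request.Request(
        f"{OPSGENIE_ALERTS_URL}?{params}",
        headers={"Authorization": f"GenieKey {api_key}"},
        method="GET",
    )


def list_alerts(api_key, query, limit=20, timeout_s=15.0):
    """Fetch matching alerts, failing instead of blocking forever.

    timeout_s bounds each blocking socket operation (connect, read),
    not the total duration of the call.
    """
    request = build_list_alerts_request(api_key, query, limit)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        payload = json.load(response)
    # The alerts are returned under the "data" key of the JSON response.
    return payload.get("data", [])
```

A stalled connection then surfaces as an exception that the polling loop can catch, log, and retry, rather than as a call that never returns.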

I hope this helps! Please note that this is a custom solution that is not covered by Opsgenie support, so implement any changes at your own risk.

Best Regards,
Egor

Scott Han March 26, 2025

Hi Egor, many thanks for your timely and helpful reply. I will give it a try.

Best regards,

Scott 

Deployment type: Cloud
Product plan: Standard