Why does my Python alert_api.list_alerts call take longer and longer to return? Eventually it never returns and blocks forever, so I have to restart my 24/7 Python script to reset it.
Hey Scott,
Thanks for reaching out to Atlassian Community!
Opsgenie support doesn’t troubleshoot external scripts directly, but I can share some general recommendations.
If your alert_api.list_alerts call is gradually slowing down or getting stuck over time, it may be due to:
Unmanaged pagination: Make sure you're correctly handling paginated results using offset and limit, especially if you're querying large volumes of alerts.
Unbounded loops or memory leaks in the script: Consider adding timeouts and retries to avoid the script hanging indefinitely.
Long polling or blocking requests: Use request timeouts in your HTTP client (e.g., the timeout= parameter in requests.get()).
No cleanup or resource release: If the script runs 24/7, ensure you're properly closing connections or sessions.
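As a rough illustration of the pagination point above, here is a minimal sketch of an offset/limit loop. Note that fetch_page is a hypothetical callable standing in for whatever call returns one page of alerts; it is not part of the Opsgenie SDK. The max_pages guard is an assumption added so the loop can never run unbounded.

```python
from typing import Callable, Iterator, List


def iter_all(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 100,
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield every item by walking pages with offset/limit.

    fetch_page(offset, limit) is a hypothetical function that returns
    one page of results as a list.
    """
    offset = 0
    for _ in range(max_pages):
        page = fetch_page(offset, limit)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < limit:
            return
        offset += limit
    # Hard stop so a misbehaving backend cannot keep the loop alive forever.
    raise RuntimeError(f"gave up after {max_pages} pages")
```

If the result set is always only a handful of alerts, a single call with a small limit is enough and this loop is unnecessary.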
To help with stability, consider adding:
Logging per request
Timeout and retry logic
Monitoring for thread/process health
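The first two items can be sketched together as a small wrapper that logs how long each attempt takes and retries on failure. This is only an illustration using the standard library; call_with_retries, the logger name alert_poller, and the backoff values are all assumptions, not part of any Opsgenie tooling.

```python
import logging
import time

# Hypothetical logger name for the polling script.
log = logging.getLogger("alert_poller")


def call_with_retries(func, attempts=3, base_delay_s=1.0):
    """Call func(), logging the duration of each attempt.

    Retries with exponential backoff when func raises; re-raises the
    last exception once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        started = time.monotonic()
        try:
            result = func()
        except Exception:
            elapsed = time.monotonic() - started
            log.exception("attempt %d/%d failed after %.2fs",
                          attempt, attempts, elapsed)
            if attempt == attempts:
                raise
            # Back off: base, 2x base, 4x base, ...
            time.sleep(base_delay_s * 2 ** (attempt - 1))
        else:
            elapsed = time.monotonic() - started
            log.info("attempt %d/%d succeeded in %.2fs",
                     attempt, attempts, elapsed)
            return result
```

The logged durations make a gradual slowdown visible over time. Retries only help when the call raises an error; a call that never returns still needs a timeout around it.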
If the issue persists, we recommend checking the script’s logic, optimizing the query filters to reduce load, and restarting the process on a fixed schedule.
Best Regards,
Egor
Thanks, Egor, for the reply. I am not using pagination in my Python script, as my result usually contains only one alert (two or three at most). I am using the Opsgenie Python API, not the Opsgenie REST API, so I am not sure how to set a timeout for alert_api.list_alerts. The script has also been running continuously for more than half a year without issue, so unbounded loops or memory leaks should not be a concern here. Could you also please tell me how to turn on logging for the Python API's alert_api.list_alerts, so I can see what is going on?
Many thanks again for your help.
Hey Scott,
Thanks for sharing those details!
Since you're using the Opsgenie Python SDK, it's important to know that while it wraps the REST API, it may not expose full control over HTTP behavior by default (like setting timeouts or logging).
Here are a few recommendations:
1. Set a Timeout Manually
The SDK doesn't provide a built-in timeout parameter for alert_api.list_alerts(), but you can fork or extend the SDK to inject a timeout by modifying the underlying API client. Alternatively, switch to using requests directly for critical calls where you need fine control.
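If neither forking the SDK nor switching to requests is practical, another option is to bound the call from the outside. The sketch below runs any blocking function in a daemon thread and gives up waiting after a deadline. call_with_timeout is a hypothetical helper written with the standard library only; it does not depend on, or change, the SDK.

```python
import queue
import threading


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread.

    Returns its result, re-raises its exception, or raises TimeoutError
    if it has not finished within timeout_s seconds.
    """
    result_q = queue.Queue(maxsize=1)

    def worker():
        try:
            result_q.put((True, func(*args, **kwargs)))
        except Exception as exc:  # hand the error back to the caller
            result_q.put((False, exc))

    # Daemon thread: a call that never returns will not block process exit.
    threading.Thread(target=worker, daemon=True).start()

    try:
        ok, value = result_q.get(timeout=timeout_s)
    except queue.Empty:
        raise TimeoutError(
            f"call did not return within {timeout_s}s") from None
    if ok:
        return value
    raise value
```

In the script this might be used as call_with_timeout(alert_api.list_alerts, 30, query=my_query), where the keyword argument is whatever the script already passes to the call. One caveat: Python cannot kill the stuck thread, so it and its connection stay around in the background. If timeouts happen repeatedly, restarting the process is still the cleaner recovery.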
2. Enable Logging
To enable HTTP-level debugging in the Opsgenie Python SDK, you can turn on logging for the underlying urllib3 library:
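import logging
import http.client as http_client

# Print raw request/response headers from http.client to stdout.
http_client.HTTPConnection.debuglevel = 1

# Timestamps in the log format make it easier to see when a call stalls.
logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")
logging.getLogger().setLevel(logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)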
This will show raw request/response data, including when a call gets stuck.
Even if your script has been stable for months, external network issues or rare API timeouts could still cause a hang. Adding logging and a timeout fallback is a good safety net for long-running scripts.
I hope this helps! Please note that this is a custom solution that is not covered by support, so implement any changes at your own risk.
Best Regards,
Egor