I'm trying to download an image attached to a Confluence page using the Python API, but I keep getting 404 errors because the request URL is malformed.
Here's my code:
from atlassian import Confluence
# define variables here
...
# Initialize Confluence client
confluence = Confluence(
    url=base_url, password=confluence_token, username=username, cloud=True
)
# This works:
page = confluence.get_page_by_id(page_id, expand='body.storage')
# This doesn't:
downloads = confluence.download_attachments_from_page(
    page_id, path=image_dir
)
The get_page_by_id() request before the download request works, but the second request fails, apparently because it's trying to access a URL that doesn't exist.
HTTPError: HTTP error occurred while downloading attachments:
404 Client Error: Not Found for url:
https://safetymasterehs.atlassian.net/wiki/https://safetymasterehs.atlassian.net/wiki/download/attachments/8716303/Screenshot%202024-02-29%20at%2022.42.33.png?version=1&modificationDate=1743358284752&cacheVersion=1&api=v2
The base URL appears twice at the start of the URL it's trying to download the PNG from. If I make the obvious fix, deleting the first copy and pasting the result into a browser, the download starts immediately, so the attachment itself is reachable. However, I'm not sure why the REST API client is putting together an incorrect URL. I've also tried specifying a single file by passing a filename=[file name here] argument, but got the same result.
Is this a bug in the API, or is there something wrong with my setup?
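For illustration, the doubled prefix is the shape you would get if a client always prepends its base URL to a link that is already absolute. The snippet below is only a sketch of that effect, not the library's actual code; `naive_join` is a hypothetical helper, and the link is the one from the error message with its query string dropped.

```python
def naive_join(base_url: str, path: str) -> str:
    """Hypothetical helper: always prepend base_url, even if path is absolute."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base_url = "https://safetymasterehs.atlassian.net/wiki"

# Assumption: the attachment's download link already includes the base URL.
download_link = (
    base_url
    + "/download/attachments/8716303/Screenshot%202024-02-29%20at%2022.42.33.png"
)

doubled = naive_join(base_url, download_link)
print(doubled)  # the base URL shows up twice, as in the 404 message
```

The traceback below shows the link being passed to get() without the absolute argument that get() accepts, which would be consistent with this picture, though I haven't confirmed that from the library source.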
Full traceback:
JSONDecodeError                           Traceback (most recent call last)
File /usr/local/lib/python3.10/site-packages/requests/models.py:971, in Response.json(self, **kwargs)
    970 try:
--> 971     return complexjson.loads(self.text, **kwargs)
    972 except JSONDecodeError as e:
    973     # Catch JSON-related errors and raise as requests.JSONDecodeError
    974     # This aliases json.JSONDecodeError and simplejson.JSONDecodeError

File /usr/local/lib/python3.10/site-packages/simplejson/__init__.py:525, in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, use_decimal, **kw)
    521 if (cls is None and encoding is None and object_hook is None and
    522         parse_int is None and parse_float is None and
    523         parse_constant is None and object_pairs_hook is None
    524         and not use_decimal and not kw):
--> 525     return _default_decoder.decode(s)
    526 if cls is None:

File /usr/local/lib/python3.10/site-packages/simplejson/decoder.py:370, in JSONDecoder.decode(self, s, _w, _PY3)
    369     s = str(s, self.encoding)
--> 370 obj, end = self.raw_decode(s)
    371 end = _w(s, end).end()

File /usr/local/lib/python3.10/site-packages/simplejson/decoder.py:400, in JSONDecoder.raw_decode(self, s, idx, _w, _PY3)
    399     idx += 3
--> 400 return self.scan_once(s, idx=_w(s, idx).end())

JSONDecodeError: Expecting value: line 3 column 1 (char 10)

During handling of the above exception, another exception occurred:

JSONDecodeError                           Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/atlassian/confluence.py:3571, in Confluence.raise_for_status(self, response)
   3570 try:
-> 3571     j = response.json()
   3572     error_msg = j["message"]

File /usr/local/lib/python3.10/site-packages/requests/models.py:975, in Response.json(self, **kwargs)
    972 except JSONDecodeError as e:
    973     # Catch JSON-related errors and raise as requests.JSONDecodeError
    974     # This aliases json.JSONDecodeError and simplejson.JSONDecodeError
--> 975     raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)

JSONDecodeError: Expecting value: line 3 column 1 (char 10)

During handling of the above exception, another exception occurred:

HTTPError                                 Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/atlassian/confluence.py:1473, in Confluence.download_attachments_from_page(self, page_id, path, start, limit, filename, to_memory)
   1472 # Fetch the file content
-> 1473 response = self.get(str(download_link), not_json_response=True)
   1475 if to_memory:
   1476     # Store in BytesIO object

File ~/.local/lib/python3.10/site-packages/atlassian/rest_client.py:441, in AtlassianRestAPI.get(self, path, data, flags, params, headers, not_json_response, trailing, absolute, advanced_mode)
    428 """
    429 Get request based on the python-requests module. You can override headers, and also, get not json response
    430 :param path:
   (...)
    439 :return:
    440 """
--> 441 response = self.request(
    442     "GET",
    443     path=path,
    444     flags=flags,
    445     params=params,
    446     data=data,
    447     headers=headers,
    448     trailing=trailing,
    449     absolute=absolute,
    450     advanced_mode=advanced_mode,
    451 )
    452 if self.advanced_mode or advanced_mode:

File ~/.local/lib/python3.10/site-packages/atlassian/rest_client.py:413, in AtlassianRestAPI.request(self, method, path, data, json, flags, params, headers, files, trailing, absolute, advanced_mode)
    411     return response
--> 413 self.raise_for_status(response)
    414 return response

File ~/.local/lib/python3.10/site-packages/atlassian/confluence.py:3575, in Confluence.raise_for_status(self, response)
   3574     log.error(e)
-> 3575     response.raise_for_status()
   3576 else:

File /usr/local/lib/python3.10/site-packages/requests/models.py:1021, in Response.raise_for_status(self)
   1020 if http_error_msg:
-> 1021     raise HTTPError(http_error_msg, response=self)

HTTPError: 404 Client Error: Not Found for url: https://safetymasterehs.atlassian.net/wiki/https://safetymasterehs.atlassian.net/wiki/download/attachments/8716303/Screenshot%202024-02-29%20at%2022.42.33.png?version=1&modificationDate=1743358284752&cacheVersion=1&api=v2

During handling of the above exception, another exception occurred:

HTTPError                                 Traceback (most recent call last)
Input In [3], in <cell line: 9>()
      7 image_dir = 'test_confluence/images'
      8 filename = attachments_container['results'][0]['title']
----> 9 downloads = confluence.download_attachments_from_page(page_id, path=image_dir)#, filename=filename)
     10 print("Found files:", glob.glob(os.path.join([image_dir, '*'])))
     11 for image in glob.glob(os.path.join([image_dir, '*'])):

File ~/.local/lib/python3.10/site-packages/atlassian/confluence.py:1495, in Confluence.download_attachments_from_page(self, page_id, path, start, limit, filename, to_memory)
   1493     raise PermissionError(f"Permission denied when trying to save files to '{path}'.")
   1494 except requests.HTTPError as http_err:
-> 1495     raise requests.HTTPError(
   1496         f"HTTP error occurred while downloading attachments: {http_err}",
   1497         response=http_err.response,
   1498         request=http_err.request,
   1499     )
   1500 except Exception as err:
   1501     raise Exception(f"An unexpected error occurred: {err}")

HTTPError: HTTP error occurred while downloading attachments: 404 Client Error: Not Found for url: https://safetymasterehs.atlassian.net/wiki/https://safetymasterehs.atlassian.net/wiki/download/attachments/8716303/Screenshot%202024-02-29%20at%2022.42.33.png?version=1&modificationDate=1743358284752&cacheVersion=1&api=v2
I'd suggest following the approach in this Stack Overflow answer: https://stackoverflow.com/questions/60038509/how-to-download-a-confluence-page-attachment-with-python
Thank you, this works. Still no luck with download_attachments_from_page(), but the requests version is good enough.
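For anyone landing here later, a requests-based workaround could look roughly like the sketch below. It is not the code from the linked answer. `resolve_download_url` and `download_attachment` are hypothetical helper names, and the sketch assumes the attachment's download link is available as a string (for example from the `_links.download` field of an attachment entry) and that basic auth with username plus API token works for your site, as it does for the client in the question.

```python
import os
from urllib.parse import unquote, urlsplit


def resolve_download_url(base_url: str, link: str) -> str:
    """Return an absolute URL whether `link` is absolute or relative to base_url."""
    if urlsplit(link).scheme in ("http", "https"):
        return link  # already absolute: do not prepend the base URL again
    return base_url.rstrip("/") + "/" + link.lstrip("/")


def download_attachment(base_url, link, username, api_token, dest_dir):
    """Download one attachment with basic auth and return the saved file path."""
    # Imported here so the URL helper above works without requests installed.
    import requests

    url = resolve_download_url(base_url, link)
    response = requests.get(url, auth=(username, api_token), timeout=30)
    response.raise_for_status()

    # Derive a local file name from the URL path (query string excluded).
    filename = unquote(os.path.basename(urlsplit(url).path))
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, filename)
    with open(target, "wb") as fh:
        fh.write(response.content)
    return target
```

With the variables from the question, a call would be along the lines of download_attachment(base_url, link, username, confluence_token, image_dir), where link is the download link of one attachment. Checking for an existing scheme before joining is what avoids the doubled base URL.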