Seeking Assistance with Reading Indexed Data from Attachments (PDF, DOCX, XLSX) Using Jira API

f0ffaf
I'm New Here
January 5, 2024

Hello,

I am currently working on a project where I'm using the Jira API to read data indexed in attached files (such as PDF, DOCX, XLSX, etc.). I'm able to successfully retrieve data from text files. However, when I try to fetch data from other file types like DOCX, PDF, and XLSX, I'm unable to retrieve the correct data. It seems I cannot load content in any language, including English, from these files. The loaded content does not include the main text of the documents, but rather seems to only contain metadata related to these file extensions.

However, I am aware that there are ways to correctly load this data through Jira plugins. Could you advise me on how I should make API calls to correctly retrieve the indexed contents of these file types? Here is an example of the code I am using:

 

 

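import requests

# MASTERURL, headers, auth, and file are defined earlier in the script.
fileContentUrl = "https://" + MASTERURL + ".atlassian.net/rest/api/3/attachment/content/" + file['id']
response_content = requests.request(
    "GET",
    fileContentUrl,
    headers=headers,
    auth=auth
)
# Decode the response body as UTF-8 text and print it.
response_content.encoding = 'utf-8'
print(response_content.text)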

 

 

Any insights or suggestions on how to resolve this issue would be greatly appreciated. Thank you!

 

 

1 answer

Sunny Ape
Rising Star
January 8, 2024

Hello @f0ffaf 

You've misunderstood what the Get attachment content endpoint does. It returns a stream of binary data, in bytes, from the attachment, which you then 'reconstitute' back into a copy of that attachment.
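To illustrate the point about binary data, here is a minimal sketch of treating the response body as raw bytes rather than text. The helper names `sniff_attachment_type` and `save_attachment` are hypothetical, made up for this example and not part of any Jira client. The sketch relies only on the well-known facts that PDF files begin with `%PDF-` and that DOCX and XLSX files are ZIP archives beginning with `PK`.

```python
from pathlib import Path


def sniff_attachment_type(data: bytes) -> str:
    """Guess the container format from the leading magic bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        # DOCX and XLSX are ZIP archives, so they share this signature.
        return "zip"
    return "unknown"


def save_attachment(data: bytes, target: Path) -> Path:
    """Write the raw bytes to disk unchanged, recreating the attachment."""
    target.write_bytes(data)
    return target
```

With the code from the question, the bytes would come from `response_content.content` (not `response_content.text`), for example `save_attachment(response_content.content, Path("report.pdf"))`. Decoding a ZIP or PDF byte stream as UTF-8 is one plausible reason the printed output looks like metadata and garbage instead of the document's text.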

The endpoint does not have any capability to 'read' what is inside attached documents like PDFs and then translate that back into what you are calling 'language' (the words / text etc).
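Since the endpoint will not extract text, that step has to happen on the client after the download. As a rough sketch for DOCX only: a DOCX file is a ZIP archive whose main body lives in `word/document.xml`, so the standard library is enough to pull out the paragraph text. The function name `extract_docx_text` is hypothetical, and the sketch ignores headers, footers, tables-specific layout, and footnotes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> str:
    """Return the paragraph text of a DOCX file given its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

With the code from the question this would be called as `extract_docx_text(response_content.content)`. PDF and XLSX need their own parsers; third-party packages such as pypdf and openpyxl are commonly used for those, though that choice is an assumption here and not something the Jira API provides.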

If you Google "jira cloud rest api get attachment content" you will find all the times this same question has been asked before.
