Want to extract data from Confluence pages to a JSON file

Ankita Patel June 28, 2023

I tried the curl command below, but it only returns information for the main page and not for the remaining pages. It also seems to apply a limit of 1000 words. What I actually want is to extract the data from Confluence pages so that I can store it somewhere else.

This is the curl command I used:

curl -u abcd:xxxapi-tokenxxx -X GET "https://your-confluence-instance/wiki/rest/api/space/content?start=0&limit=1000" | python3 -m json.tool > spaceat.content

1 answer

1 vote
Andrii Maliuta
Rising Star
June 28, 2023

Hello Ankita Patel,

The limit parameter here is not a word limit. It is the maximum number of results returned per request (at most 500).

When you use /rest/api/space/content you get ALL the content in your Confluence instance, in portions of at most 500 items. The default is 50 items per request.

You need to know how data in Confluence is structured and how to use the REST API (https://developer.atlassian.com/cloud/confluence/rest/v1/intro/#about). With that you can get pages from one space or several spaces, from a CQL query, from a user, etc., and then get any data in JSON format with the structure you need.

/rest/api/space/content - all content
/rest/api/space/content/{pageID} - page by ID
/rest/api/space/content/{pageID}/child/page - children
/rest/api/space/content/{pageID}/child/descendant - the whole hierarchy under the root page
etc.
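
A rough sketch of the paging described above, using only the Python standard library. Several things here are assumptions rather than anything confirmed in this thread: the content collection path `/wiki/rest/api/content` with `spaceKey` and `type` query parameters (as listed in the v1 REST reference, which differs from the path quoted above), the made-up space key `DOCS`, and the illustrative function names `fetch_page`, `iter_results`, and `export_to_file`. The base URL and credentials are the placeholders from the question.

```python
import base64
import json
import urllib.parse
import urllib.request

# Placeholders, not real values: replace with your own site and credentials.
BASE_URL = "https://your-confluence-instance/wiki"
EMAIL = "abcd"
API_TOKEN = "xxxapi-tokenxxx"


def fetch_page(start, limit, space_key="DOCS"):
    """Fetch one portion of pages from a single space as a parsed dict.

    Assumes the v1 content endpoint and a hypothetical space key "DOCS".
    """
    query = urllib.parse.urlencode(
        {"spaceKey": space_key, "type": "page", "start": start, "limit": limit}
    )
    request = urllib.request.Request(f"{BASE_URL}/rest/api/content?{query}")
    credentials = base64.b64encode(f"{EMAIL}:{API_TOKEN}".encode()).decode()
    request.add_header("Authorization", f"Basic {credentials}")
    request.add_header("Accept", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_results(fetch, limit=50):
    """Yield every item by advancing `start` until a portion comes back empty."""
    start = 0
    while True:
        portion = fetch(start, limit)
        results = portion.get("results", [])
        if not results:
            break
        yield from results
        # Advance by what was actually returned, in case the server caps `limit`.
        start += len(results)


def export_to_file(path, fetch=fetch_page, limit=50):
    """Collect all results and write them to `path` as one JSON array."""
    items = list(iter_results(fetch, limit))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(items, handle, indent=2)
    return len(items)
```

Calling `export_to_file("space_content.json")` would then write everything the loop collected to one file. Stopping on the first empty portion costs one extra request, but it does not depend on the server honouring the requested limit.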

Ankita Patel July 5, 2023

It keeps giving me this error:

"Expecting value: line 1 column 1 (char 0)"

Andrii Maliuta
July 5, 2023

@Ankita Patel,

Please check the response. Most likely the response body is not valid JSON, for example because you are not authenticated or because of some other issue. Inspect the HTTP response (status code and body) to see what the problem is.
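
The message "Expecting value: line 1 column 1 (char 0)" is what Python's JSON parser reports when its input is empty or does not start with a JSON value, so it helps to look at the status code and content type before parsing. Below is a minimal sketch of that check, standard library only. The function names `parse_json_response` and `fetch_json` are illustrative, and truncating the body to 200 characters is an arbitrary choice.

```python
import json
import urllib.error
import urllib.request


def parse_json_response(status, content_type, body):
    """Return the decoded JSON, or raise ValueError describing what came back."""
    text = body.decode("utf-8", errors="replace")
    if not 200 <= status < 300:
        raise ValueError(f"HTTP {status}: {text[:200]!r}")
    if "json" not in content_type.lower():
        raise ValueError(f"Expected JSON but got {content_type!r}: {text[:200]!r}")
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        raise ValueError(f"Body is not valid JSON ({error}): {text[:200]!r}") from error


def fetch_json(request):
    """Send a urllib request and run the checks above on whatever comes back."""
    try:
        with urllib.request.urlopen(request) as response:
            content_type = response.headers.get("Content-Type", "")
            return parse_json_response(response.status, content_type, response.read())
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions; report their body as well.
        content_type = error.headers.get("Content-Type", "")
        return parse_json_response(error.code, content_type, error.read())
```

With this, a failed login or an HTML error page shows up as a readable message that includes the status code and the start of the body, instead of a bare JSON parsing error.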

Ankita Patel July 5, 2023

Based on the response I received, the API request itself appears to succeed, but the response body contains no results: the "results" field is an empty array and the "size" field is 0, which suggests that no content is available for the specified Confluence page. However, the page does exist.

Deployment type: Cloud
Product plan: Standard