How can I use `rest/api/content` to download all the 800k pages of my Confluence wiki without timing out?

Franck Dernoncourt July 20, 2022

I want to download all the 800k pages of my Confluence wiki.

I'd like to use:

curl -u wikiusername:wikipassword "https://wiki.hostname.com/rest/api/content?start=1"

and simply increase start from 1 to 800000.

However, the response time increases as start increases, and from ~80000 onward the requests begin to time out:

start      response time (seconds)
1          0.4
1,000      2.5
10,000     9
50,000     112
100,000    timeout

How can I use rest/api/content to download all the 800k pages of my Confluence wiki without timing out?

1 answer

0 votes
Nic Brough -Adaptavist-
Rising Star
July 25, 2022

This happens because, when you make the REST call, the server tries to build the whole response in memory before it can send it back. If the response is too large, the process fails.

You're going to need to page through what you're trying to download. You can't do it in one massive chunk (unless you increase the server memory to something massive).
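For illustration, paging could look roughly like the sketch below. It assumes the endpoint accepts `start` and `limit` query parameters and returns a JSON object with a `results` list, as the Confluence REST API does in the versions I'm aware of. The helper names `fetch_json` and `iter_content` are made up for this example, and the hostname and credentials are the placeholders from the question. Note that paging alone may not cure the slowdown at large `start` values reported in the question; the `start` argument only makes it possible to resume after a failure.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, Optional


def fetch_json(url: str, user: str, password: str, timeout: float = 60.0) -> Dict:
    """GET a URL with HTTP basic auth and decode the JSON body (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def iter_content(
    base_url: str,
    fetch: Callable[[str], Dict],
    limit: int = 100,
    start: int = 0,
    max_items: Optional[int] = None,
) -> Iterator[Dict]:
    """Yield content items one page at a time until an empty page comes back."""
    yielded = 0
    while True:
        query = urllib.parse.urlencode({"start": start, "limit": limit})
        page = fetch(f"{base_url}/rest/api/content?{query}")
        results = page.get("results", [])
        if not results:
            return  # assumption: an empty page means there is nothing left
        for item in results:
            yield item
            yielded += 1
            if max_items is not None and yielded >= max_items:
                return
        # Advance by what was actually returned, in case the server
        # caps the page size below the requested limit.
        start += len(results)
```

A call might look like `iter_content("https://wiki.hostname.com", lambda url: fetch_json(url, "wikiusername", "wikipassword"))`. Passing the fetch function in as an argument keeps the paging loop separate from the HTTP details, so it can be tried against a stub before pointing it at a real server.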

I'd also question why you want to do this. What is this download going to do for you? I suspect there may be a better option, such as parsing a backup.
