I want to download all 800k pages of my Confluence wiki. I'd like to use:

`curl -u wikiusername:wikipassword "https://wiki.hostname.com/rest/api/content?start=1"`

and simply increase `start` from 1 to 800,000. However, the response time increases as `start` increases, and from roughly 80,000 the request begins to time out:
start | response time (seconds)
---|---
1 | 0.4
1,000 | 2.5
10,000 | 9
50,000 | 112
100,000 | timeout
How can I use `rest/api/content` to download all 800k pages of my Confluence wiki without timing out?
This is because when you make the REST call, the server tries to build the entire response in memory before it can send anything back. If you make the request too large (or the offset too deep), the process will fail.
You're going to need to page through what you're trying to download; you can't fetch it in one massive chunk (unless you increase the server's memory to something enormous).
I'd also quickly question why. What is this download going to do for you? There may be a better option (such as parsing a backup).
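As a sketch of the paging approach: Confluence's paged REST responses include a `_links.next` relative URL while more results remain, so you can follow that cursor instead of computing `start=` offsets yourself. The hostname, credentials, and `limit` value below are placeholders from the question, and the `fetch` callable is a hypothetical seam so the loop can be shown without a live server; in practice it would wrap an authenticated HTTP call.

```python
from typing import Callable, Iterator

def iter_content(fetch: Callable[[str], dict],
                 first: str = "/rest/api/content?limit=100") -> Iterator[dict]:
    """Yield every content item, following the _links.next cursor that
    Confluence includes in each paged response."""
    path = first
    while path:
        page = fetch(path)
        for item in page.get("results", []):
            yield item
        # _links.next is only present while more pages remain
        path = page.get("_links", {}).get("next")

# In practice, fetch would be an authenticated HTTP call, e.g. (hypothetical):
#   import requests
#   fetch = lambda p: requests.get("https://wiki.hostname.com" + p,
#                                  auth=("wikiusername", "wikipassword")).json()

# Demo with a stub standing in for the server (two pages of results):
pages = {
    "/rest/api/content?limit=100": {
        "results": [{"id": "1"}, {"id": "2"}],
        "_links": {"next": "/rest/api/content?limit=100&start=2"},
    },
    "/rest/api/content?limit=100&start=2": {
        "results": [{"id": "3"}],
        "_links": {},
    },
}

if __name__ == "__main__":
    print([item["id"] for item in iter_content(pages.__getitem__)])
```

Note that very deep offsets may still be slow server-side even with a cursor, so for 800k pages you may also want to split the crawl into smaller slices (for example, one space at a time via `?spaceKey=...`) or fall back to parsing a backup as suggested above.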