Hello!
We're working on producing a tool that will help us keep our documentation up to date. In order to do this, we are crawling a space's pages to check the 'last updated' portion of each page. While testing this, we found an inconsistency with the Confluence API.
To be more specific about the inconsistencies we were seeing, we are using this endpoint:
/rest/api/content/search?cql=space = <SOME_SPACE> AND type = page&start=0&limit=51&expand=ancestors,body.storage,history.lastUpdated
To move to the next page of results, we follow the ['_links']['next'] value from each response. Once a response has no 'next' link, we stop crawling.
What we have found is that we are getting non-deterministic results from this crawl. In other words, we get different pages from one run to the next.
Additionally, we see the same page returned multiple times. Are we misusing the API in some way, or is this possibly a bug?
Thanks in advance for any help you can give us.
Hi @Kota Justin -- there is a bug filed for this issue that you can see here: https://jira.atlassian.com/browse/CONFCLOUD-69726. Visit and upvote the bug.
We've chatted with one Marketplace Partner (vendor) who may have found that adding an `ORDER BY` clause to the CQL query keeps the results in a stable order and eliminates duplicate results; however, that hasn't been fully verified.
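For anyone who wants to try that workaround, here is a minimal Python sketch of the idea, not a verified fix. It builds the same search request with an `ORDER BY` clause appended to the CQL, follows `_links.next` until it runs out, and skips any page id it has already seen. The sort field `created`, the helper names `build_search_path` and `crawl`, and the `fetch_json` callable (which you would supply to perform the authenticated GET against your site and return the decoded JSON) are all assumptions for illustration.

```python
from urllib.parse import urlencode


def build_search_path(space_key, limit=50, order_by="created"):
    """Build the content search path with an explicit ORDER BY clause.

    The sort field ("created") is an assumption; any stable, orderable
    CQL field should serve the same purpose.
    """
    cql = f"space = {space_key} AND type = page ORDER BY {order_by}"
    params = {
        "cql": cql,
        "start": 0,
        "limit": limit,
        "expand": "ancestors,body.storage,history.lastUpdated",
    }
    return "/rest/api/content/search?" + urlencode(params)


def crawl(fetch_json, first_path):
    """Follow '_links.next' until it is absent, dropping repeated ids.

    fetch_json is a hypothetical callable: it takes a request path and
    returns the decoded JSON body as a dict.
    """
    seen_ids = set()
    pages = []
    path = first_path
    while path:
        data = fetch_json(path)
        for item in data.get("results", []):
            if item["id"] not in seen_ids:
                seen_ids.add(item["id"])
                pages.append(item)
        # A missing 'next' link ends the crawl.
        path = data.get("_links", {}).get("next")
    return pages
```

The de-duplication by id is only a safety net in case the ordering alone does not remove repeats; it does not recover pages that the API skips entirely.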
Thanks for pointing that out. I've fixed it, and the link is now accessible.
Regards,
Earl
Update on this. ORDER BY fixed the issue for us. I will update again if it stops working.