How to extract a Confluence space - the whole page tree data for building a RAG chatbot

Amol Sinha
April 12, 2025

Hi guys,

I am trying to build a conversational chatbot for my org based on our Confluence space. To do that, I need to extract the whole space and feed it into a RAG pipeline, which I plan to build on Azure.

Please tell me how to effectively extract the whole page tree. I don't have admin access, so the option to export the whole space is not available to me. 

Are there any other ways that have worked for you?

1 answer

1 vote
David Nickell
April 13, 2025

If something can be done, it can usually be done with REST API calls.

It's been a while since I've done any Confluence REST calls. This chart comes from making a call for the spaces in my instance, then for the pages within each space. There is plenty of parent/child/version/author/date type information available.

My example is somewhat limited, since I don't have access to any large Confluence installations at the moment.

The Confluence REST API documentation is here:  https://developer.atlassian.com/cloud/confluence/rest/v2/intro/#about

Hope this gets you started.  You will need an API key, but I do not think Admin Rights are needed.
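
For anyone who wants a starting point, here is a rough Python sketch of the "spaces, then pages within each space" approach described above. It is not a tested production script. The endpoint paths (`/wiki/api/v2/spaces`, `/wiki/api/v2/spaces/{id}/pages`), the `keys` and `body-format` parameters, the `_links.next` pagination field, and the `parentId` field reflect my reading of the v2 documentation linked above, so please verify them against your instance. The helper names (`make_get_json`, `fetch_all`, `fetch_space_pages`, `build_tree`) and the `example.atlassian.net` host are made up for illustration.

```python
import base64
import json
import urllib.request
from urllib.parse import urlencode, urljoin


def make_get_json(email, api_token):
    """Return a function that GETs a URL with basic auth and parses the JSON reply."""
    token = base64.b64encode(f"{email}:{api_token}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}

    def get_json(url):
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    return get_json


def fetch_all(base_url, path, get_json):
    """Yield every item in 'results', following '_links.next' until it is absent."""
    url = urljoin(base_url, path)
    while url:
        data = get_json(url)
        yield from data.get("results", [])
        next_link = data.get("_links", {}).get("next")  # assumed relative URL
        url = urljoin(base_url, next_link) if next_link else None


def fetch_space_pages(base_url, space_key, get_json):
    """Look up the space id by key, then list all of its pages with bodies."""
    space_path = "/wiki/api/v2/spaces?" + urlencode({"keys": space_key})
    spaces = list(fetch_all(base_url, space_path, get_json))
    if not spaces:
        raise ValueError(f"space {space_key!r} not found or not visible to you")
    space_id = spaces[0]["id"]
    query = urlencode({"body-format": "storage", "limit": 100})
    pages_path = f"/wiki/api/v2/spaces/{space_id}/pages?{query}"
    return list(fetch_all(base_url, pages_path, get_json))


def build_tree(pages):
    """Map each parent page id (None for top-level pages) to its child page ids."""
    children = {}
    for page in pages:
        children.setdefault(page.get("parentId"), []).append(page["id"])
    return children
```

In real use you would call something like `get_json = make_get_json("you@example.com", "<your API token>")` and then `fetch_space_pages("https://<your-site>.atlassian.net", "<SPACE KEY>", get_json)`. The HTTP call is passed in as a function so the pagination and tree-building logic can be tried against canned responses before pointing it at a live site. You should only get back pages your account is allowed to view, which fits the no-admin constraint in the question.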

 

[Attached image: Confluence Report Sample.png]
