How can I use Confluence storage format to export forum data, attachments and images?

Engineers Australia October 21, 2018

Hi,

I am looking for ways to export data out of Confluence using the storage format and archive it.

I am happy to query the Confluence tables to get the data and format it. The data would include pages, blogs, attachments, forums and images.

I was hoping to find a way to use the storage format to read and export data out of Confluence, and to know whether this is possible at all.

Using the REST API and third-party tools is not an option for me right now.

Kindly advise. 

 

Thanks,

Nitin

2 answers

0 votes
Tobias Anstett _K15t_
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
October 21, 2018

Hi Nitin,

If your use case is limited to archiving, I would propose using the default space XML export, as it contains all the data and can be restored.

Anything else, no matter whether you are using a commercial app or not (e.g. to export the space to HTML and make it accessible offline), will not contain all the data. Each app on your system decides on its own how to store its data, so you will never know all the keys required and locations used.

Best, Tobias

Engineers Australia October 22, 2018

Hi Tobias, 

Thanks for your response. I have tried full (XML) space exports, but when you unzip one you see that it actually contains references to Confluence objects and attachments, so a call to the database becomes necessary to extract the full information.

I realized that the storage format is not an option either, as it too contains only object and attachment references.
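To illustrate what "references only" means here, the following is a minimal sketch that lists the attachment file names a storage-format body points to. It assumes the references use the `<ri:attachment ri:filename="..."/>` element of the storage format; the function name and the sample body are hypothetical, and a regular expression is used instead of an XML parser because a body fragment does not declare the `ac:` and `ri:` namespaces.

```python
import html
import re
from typing import List

# Matches <ri:attachment ... ri:filename="..."> inside a storage-format body.
ATTACHMENT_REF = re.compile(r'<ri:attachment\b[^>]*?\bri:filename="([^"]*)"')


def attachment_filenames(storage_body: str) -> List[str]:
    """Return referenced attachment file names, in order, without duplicates."""
    seen: List[str] = []
    for raw_name in ATTACHMENT_REF.findall(storage_body):
        name = html.unescape(raw_name)  # attribute values may be XML-escaped
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical page body: one embedded image and one link to an attachment.
sample_body = (
    '<p>Site plan:</p>'
    '<ac:image><ri:attachment ri:filename="plan.png" /></ac:image>'
    '<p><ac:link><ri:attachment ri:filename="report.pdf" />'
    '<ac:plain-text-link-body><![CDATA[Report]]></ac:plain-text-link-body>'
    '</ac:link></p>'
)

print(attachment_filenames(sample_body))  # ['plan.png', 'report.pdf']
```

The body only names the files; the binary data still has to be fetched from wherever the instance stores attachments.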

I think I will have to stick with the REST API, plus selective queries against the DB for page metadata, to manage load. SOAP RPC is also an option, but I do not have much experience with it.
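A rough sketch of what paging through the REST API in small batches could look like, to keep the load per request bounded. It assumes the `/rest/api/content` resource with the `spaceKey`, `type`, `expand`, `start` and `limit` parameters; the helper names, base URL and batch size are hypothetical, and authentication is left out.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def content_url(base_url: str, space_key: str, start: int, limit: int) -> str:
    """Build the URL for one batch of pages with their storage-format bodies."""
    query = urllib.parse.urlencode({
        "spaceKey": space_key,
        "type": "page",
        "expand": "body.storage,version",
        "start": start,
        "limit": limit,
    })
    return f"{base_url.rstrip('/')}/rest/api/content?{query}"


def http_get_json(url: str) -> Dict:
    """Fetch one batch. Credentials are omitted in this sketch."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_pages(base_url: str, space_key: str, limit: int = 25,
               fetch: Callable[[str], Dict] = http_get_json) -> Iterator[Dict]:
    """Yield pages batch by batch until the server returns an empty batch."""
    start = 0
    while True:
        batch = fetch(content_url(base_url, space_key, start, limit))
        results = batch.get("results", [])
        if not results:
            break
        yield from results
        # Advance by what was actually returned, in case the server caps the limit.
        start += len(results)


# Usage (hypothetical host and space key):
# for page in iter_pages("https://confluence.example.com", "ENG", limit=10):
#     print(page["id"], page["title"])
```

Stopping on an empty batch costs one extra request but does not depend on the server honouring the requested batch size.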

Thanks,

Nitin 

Tobias Anstett _K15t_
October 23, 2018

Can you describe your use case / goal in more detail? 

Engineers Australia October 25, 2018

Hi Tobias, 

I am working towards exporting the pages and attachments from Confluence to the cloud. The data may be archived as text files or uploaded to a new system. Owing to limited network bandwidth, I am not sure that using REST is viable, and ETL directly from the DB may not be accurate.

The use case is to extract page data and metadata, with attachments, and export it to the cloud. The business user should be able to catalog and document the information too. It is a single, external-facing instance on version 6.x, and xlarge in terms of instance size.

Thanks 

Nitin 

Tobias Anstett _K15t_
October 25, 2018

Hi Nitin,

When you talk about exporting from Confluence to the cloud, what does "cloud" mean?

a) Confluence Cloud by Atlassian

b) Confluence Server hosted in the cloud by yourself

c) static HTML

d) something totally different

Best, Tobias

Engineers Australia October 28, 2018

Hi Tobias, 

 

It is to be exported to Microsoft Azure or AWS. I do not have further information on how it will be used in the cloud.

 

Thanks 

Nitin 

0 votes
AhmadDanial
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
October 21, 2018

Hi, Nitin.

As mentioned in the official documentation, the storage format refers to the XHTML (more accurately, XML) representation of a Confluence page. It is the markup of the page body, so it does not contain the actual data itself (attachment files, for example) for you to export. To extract the information from the source, you can get the title of each page and its storage format with the query below:

SELECT c.title AS Title, b.body AS Body
FROM content AS c
RIGHT JOIN bodycontent AS b
  ON c.contentid = b.contentid AND c.contenttype = 'PAGE'
WHERE title IS NOT NULL;

Let me know if that would work for you. If you need more details, feel free to share them here so we can discuss this further.
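As a hedged illustration of turning the query above into an archive, the sketch below runs an equivalent join and writes each page body to its own file. Everything around the query is an assumption: SQLite stands in for the real database, the two tables are cut down to the columns the query uses, the sample rows are made up, and `export_pages` is a hypothetical helper. An inner join replaces the `RIGHT JOIN`, which returns the same rows once `title IS NOT NULL` is applied.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import List, Tuple

# Inner-join form of the query above, with the content ID added for file naming.
QUERY = """
SELECT c.contentid, c.title, b.body
FROM content AS c
JOIN bodycontent AS b ON c.contentid = b.contentid
WHERE c.contenttype = 'PAGE' AND c.title IS NOT NULL
ORDER BY c.contentid
"""


def export_pages(conn: sqlite3.Connection, out_dir: Path) -> List[Tuple[int, str, Path]]:
    """Write each page body to <contentid>.xml and return (id, title, path) rows."""
    out_dir.mkdir(parents=True, exist_ok=True)
    exported = []
    for content_id, title, body in conn.execute(QUERY):
        path = out_dir / f"{content_id}.xml"
        path.write_text(body, encoding="utf-8")
        exported.append((content_id, title, path))
    return exported


# Stand-in database with simplified tables and made-up rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (contentid INTEGER PRIMARY KEY, title TEXT, contenttype TEXT);
CREATE TABLE bodycontent (bodycontentid INTEGER PRIMARY KEY, contentid INTEGER, body TEXT);
INSERT INTO content VALUES (1, 'Home', 'PAGE'), (2, 'News', 'BLOGPOST'), (3, NULL, 'COMMENT');
INSERT INTO bodycontent VALUES (10, 1, '<p>Welcome</p>'), (11, 2, '<p>Post</p>'), (12, 3, '<p>Nice</p>');
""")

with tempfile.TemporaryDirectory() as tmp:
    for content_id, title, path in export_pages(conn, Path(tmp)):
        print(content_id, title, path.name)  # 1 Home 1.xml
```

Only the page row is exported; the blog post and the untitled comment are filtered out by the `WHERE` clause.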

Engineers Australia October 22, 2018

Hi Ahmad, 

Thanks for your response.

Yes, I am thinking along those lines to extract page metadata. Atlassian Support once helped me diagnose an attachment problem using the storage format, so I thought there might be more to it than just metadata.

Could you advise whether we can achieve something using the SOAP API, although I know you promote the use of REST?

Thanks,

Nitin  

AhmadDanial
October 25, 2018

Hello, Nitin.

You are welcome, and apologies for the radio silence. The SOAP API has been deprecated since Confluence 5.5, as mentioned in the Confluence XML-RPC and SOAP APIs documentation. As that article notes, it may not work properly.

May I know if you have specific restrictions that require SOAP rather than REST in this case?

Engineers Australia October 28, 2018

Hi Ahmad,

 

Not really. So far I have only been given a vague idea of what the actual requirement will be, so I am just doing my homework on the available options.

My initial architecture diagram needs to include all the available options for implementation and rollback, with a comparison of the options and their merits. I know for sure that the data to be migrated is huge, so a single fixed approach may not work.

Thanks,

Nitin 
