Export All Attachments from a Wiki

Keith Stephens June 5, 2018

I need to decommission a wiki. The users would like to extract all attachments from the wiki beforehand. I've seen this post:

https://confluence.atlassian.com/doc/retrieving-file-attachments-from-a-backup-187758.html?_ga=2.173187171.1969030308.1528258905-1702516973.1458178821

but the approach described would be very labour intensive for an entire wiki. I was wondering if there is any way to extract PDFs as PDF, Word files as Docx, PowerPoints as Pptx, etc?

Confluence Server V5.5.2 (I know...)

Thanks - Keith Stephens

3 answers

0 votes
Tony Rice
Contributor
June 2, 2023

Try this python script: https://github.com/rtphokie/confluence_attachment_extract

It takes an XML export, parses it, and copies out the attachment files under their original filenames, placing them in directories named for the pages each file was attached to.

0 votes
Keith Stephens June 26, 2018

OK, raised a support ticket with Confluence (https://getsupport.atlassian.com/servicedesk/customer/portal/14/CSP-230358) and got a very helpful response.

The long & the short of it is that the SQL at the bottom of this post can be used to get the full path of each attachment and generate a series of copy statements that can be pasted into a DOS prompt to get the attachments out of the wiki.

I hadn't appreciated that the file system of the wiki will show just a numbered file for each attachment (and version of the attachment) but doesn't show the type of file. The query extracts the type of file as well.

Remember, my wiki is V5.5.2 and the query will need to be adjusted for later versions of Confluence, but it should provide a start.

Needless to say, I copied the database & attachment folder from the production wiki to a dev SQL Server....

Sample copy statement generated:

copy /y C:\ewiki\confluence\data\attachments\ver003\0\137\3637250\204\158\3408454\10158083\1 "C:\ewiki\Attachments\FWADecisions25May11.doc"
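
The folder layout behind that sample path can be checked with a small Python sketch. It assumes the ver003 layout implied by the query in this post (ID modulo 250, then the thousands part of the ID modulo 250, then the full ID, first for the space and then for the page). The function name and its arguments are illustrative only and are not part of Confluence.

```python
from pathlib import PureWindowsPath

def attachment_path(root, space_id, page_id, attachment_id, version):
    """Build the on-disk path of one attachment version (ver003 layout, assumed)."""
    def buckets(n):
        # Two bucket folders per ID: n mod 250, then (n // 1000) mod 250.
        return str(n % 250), str((n // 1000) % 250)

    return PureWindowsPath(
        root, "ver003",
        *buckets(space_id), str(space_id),
        *buckets(page_id), str(page_id),
        str(attachment_id), str(version),
    )

# IDs taken from the sample copy statement above.
path = attachment_path(
    r"C:\ewiki\confluence\data\attachments",
    space_id=3637250, page_id=3408454, attachment_id=10158083, version=1,
)
print(path)
```

With the IDs from the sample statement, this reproduces the same source path, which is a quick way to sanity-check the arithmetic before running the query against a whole wiki.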

SQL:

Select
    Concat(
        'copy /y ', 'C:\ewiki\confluence\data\attachments\ver003\',
        spc.SPACEID % 250, '\',
        FLOOR(spc.SPACEID / 1000) % 250, '\',
        spc.SPACEID, '\',
        att.PAGEID % 250, '\',
        FLOOR(att.PAGEID / 1000) % 250, '\',
        att.PAGEID, '\',
        Case
            When att.PREVVER is NULL Then att.ATTACHMENTID
            Else att.PREVVER
        End, '\',
        att.ATTVERSION,
        ' "', 'C:\ewiki\Attachments\', att.TITLE, '"'
    ) As DiskLoc
From
    usr_externalwiki.SPACES As spc
    Inner Join usr_externalwiki.CONTENT As cnt
        On spc.SPACEID = cnt.SPACEID
    Inner Join usr_externalwiki.ATTACHMENTS As att
        On cnt.CONTENTID = att.PAGEID
Where
    spc.SPACEID is not null
    And spc.SPACESTATUS = 'CURRENT'
    And cnt.CONTENTTYPE = 'PAGE'
    And cnt.PREVVER is null
    And cnt.CONTENT_STATUS = 'current'
    And att.PREVVER is null --includes only the most recent version of each attachment; comment out to include older versions too
Order by 1
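
One thing to watch with the generated statements: every file is copied into a single folder and copy /y overwrites without prompting, so two attachments with the same title on different pages would collide. Below is a minimal Python sketch of one way around that, prefixing the page ID whenever a title occurs more than once. The function name is hypothetical, and the second and third rows in the example use made-up IDs and titles purely for illustration.

```python
from collections import Counter

def unique_names(rows):
    """rows: iterable of (page_id, title). Returns one destination file name per row."""
    rows = list(rows)
    # Compare case-insensitively, since Windows file names ignore case.
    counts = Counter(title.lower() for _, title in rows)
    names = []
    for page_id, title in rows:
        if counts[title.lower()] > 1:
            # Title is used on more than one page: prefix the page ID.
            names.append(f"{page_id}_{title}")
        else:
            names.append(title)
    return names

names = unique_names([
    (3408454, "FWADecisions25May11.doc"),  # from the sample statement
    (100, "notes.pdf"),                    # made-up row
    (200, "Notes.pdf"),                    # made-up row, same title in a different case
])
print(names)
```

The same idea could be applied in the SQL itself by concatenating att.PAGEID into the destination name, at the cost of slightly less readable file names.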

0 votes
Fabienne Gerhard
Community Champion
June 5, 2018

Hi @Keith Stephens

It's not perfect, but did you think about using a WebDAV solution like the one mentioned in this ticket?

I know it would be great if you could just use the space attachments macro and have a simple download button there...

Keith Stephens June 5, 2018

Thanks for the response. I believe WebDAV is deprecated.

The space attachments macro may be worth a look, but in this wiki it's fairly obvious where the attachments are. I'm just trying to save the drudge work by having a nice simple way of getting them all out.

Fabienne Gerhard
Community Champion
June 5, 2018

Totally agree with you regarding WebDAV, but as there is no download button in the space attachments macro (as there is in the site attachments macro...), it may be the fastest way to get them out.

If you find a good solution, it would be great if you could let me know.

Keith Stephens June 26, 2018

Fabienne - found the answer - check the Atlassian Community.
