As a user of our company wiki, which is based on Confluence, I want to find and identify pages that are empty apart from a title, so that I can decide whether to trash them or ping peers so that they can fill them with our knowledge.
Here's a macro that will do it. Not the most efficient macro, but it sounds like you just need to run this once to get a list. I can't vouch for it being perfect, but I did a quick test and it seems to work.
## Macro title: Find Empty Pages
## Macro has a body: N
## Body processing: n/a
## Output: HTML
##
## Developed by: Matthew J. Horn
## Date created: 07/31/2013
## @noparams
#set ($pageListArray = [])
#set ($spaceHome = $space.getHomePage())
#macro ( process $rp )
  #set ($pagelist = $rp.getSortedChildren() ) ## returns List<Page>
  #foreach( $child in $pagelist )
    #set($p = $pageListArray.add( $child ) )
    #if( $child.hasChildren() )
      #process ( $child )
    #end
  #end
#end
#process ( $spaceHome )
<table class="confluenceTable">
  <tbody>
    <tr>
      <th class="confluenceTh">Title</th>
      <th class="confluenceTh">Size</th>
    </tr>
    #foreach( $child in $pageListArray) ## child is of type Page
    <tr>
      <td class="confluenceTd">$child.getTitle()</td>
      <td class="confluenceTd">$child.getBodyAsStringWithoutMarkup().length()</td>
    </tr>
    #end
  </tbody>
</table>
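For readers who want to follow the logic outside Velocity, here is a minimal Python sketch of the same depth-first walk that the recursive process macro performs. The Page class, its fields, and the sample tree are hypothetical stand-ins invented for illustration; they are not the Confluence API. As in the macro, the space home page itself is not added to the list, only its descendants.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """Hypothetical stand-in for a Confluence page (not the real API)."""
    title: str
    body_text: str = ""  # body with markup already stripped
    children: List["Page"] = field(default_factory=list)


def collect_descendants(root: Page) -> List[Page]:
    """Return every page below root, depth-first, parents before children."""
    found: List[Page] = []
    for child in root.children:
        found.append(child)
        # Recurse into the subtree, mirroring the nested #process call.
        found.extend(collect_descendants(child))
    return found


# Made-up sample tree standing in for a space.
home = Page("Home", "Welcome", [
    Page("Guides", "", [Page("Setup", "Install steps"), Page("FAQ", "")]),
    Page("Drafts", "   "),
])

# Print title and body length, like the two table columns in the macro.
for page in collect_descendants(home):
    print(f"{page.title}: {len(page.body_text)}")
```

The whole tree is collected first and only then reported, which is also why the macro is described as not the most efficient: every page in the space is loaded before any row is written.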
Matthew, this is working well. However, how can I see this at the level of the complete instance? I have some 20-odd spaces and want to know how many empty pages I have across all of them.
Hello,
This may be a late answer, but we had the same problem and the same requests from our users. I will post my solution in case somebody can use it.
I checked Matthew J. Horn's great answer, but it has some problems:
* it lists all the pages in a space, not only the empty ones
* only the page title is shown; links to the pages would be more convenient
I did some tests and found that it is hard to identify empty pages with 100% certainty when using the size of the content as a String. I didn't find a better way to check for an empty page, though, so that indeed seems to be the best tool at our disposal.
So I tested with empty pages that have some layout (like sections), pages that use macros without other text (with or without output), pages with very little text, very small images, etc.
I found that with a threshold of 10 (the length of the String), almost all of the non-empty pages are filtered out, although some false positives can remain.
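To make the threshold idea concrete, here is a small Python sketch of the check. It is an approximation under stated assumptions: visible_text strips tags with a crude regular expression, which is only a rough stand-in for whatever getBodyAsStringWithoutMarkup() does inside Confluence, and the function names and sample bodies are invented for illustration.

```python
import re

EMPTY_THRESHOLD = 10  # the cut-off suggested in the post above


def visible_text(storage_body: str) -> str:
    """Crudely strip tags and collapse whitespace (an approximation only)."""
    without_tags = re.sub(r"<[^>]+>", " ", storage_body)
    return re.sub(r"\s+", " ", without_tags).strip()


def looks_empty(storage_body: str, threshold: int = EMPTY_THRESHOLD) -> bool:
    """Treat a page as empty when its visible text is at most threshold chars."""
    return len(visible_text(storage_body)) <= threshold


# Made-up page bodies: a blank page, a layout-only page, a stub, real content.
samples = {
    "blank": "",
    "layout only": "<ac:layout><ac:layout-section><ac:layout-cell><p></p>"
                   "</ac:layout-cell></ac:layout-section></ac:layout>",
    "stub": "<p>TODO</p>",
    "real": "<p>This page describes the release process.</p>",
}

for name, body in samples.items():
    print(name, looks_empty(body))
```

The stub page is reported as empty even though it contains a word, which is the kind of false positive mentioned above. Raising or lowering the threshold trades false positives against missed pages.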
So, starting from Matthew J. Horn's solution, I made this:
## Macro title: Find Empty Pages
## Macro has a body: N
## Body processing: n/a
## Output: HTML
##
## Original by: Matthew J. Horn : https://community.atlassian.com/t5/Confluence-questions/Find-and-identify-all-empty-pages-in-Confluence/qaq-p/131649
## Updated by: Loïc Dewerchin
## Date created: 07/31/2013
## @noparams
#set ($pageListArray = [])
#set ($spaceHome = $space.getHomePage())
#macro ( process $rp )
  #set ($pagelist = $rp.getSortedChildren() ) ## returns List<Page>
  #foreach( $child in $pagelist )
    #set($p = $pageListArray.add( $child ) )
    #if( $child.hasChildren() )
      #process ( $child )
    #end
  #end
#end
#process ( $spaceHome )
<ac:macro ac:name="note">
  <ac:rich-text-body>
    <p>Add a warning about possible false positives</p>
  </ac:rich-text-body>
</ac:macro>
<table class="confluenceTable">
  <tbody>
    <tr>
      <th class="confluenceTh">Page</th>
      <th class="confluenceTh">Author</th>
      <th class="confluenceTh">Creation date</th>
      <th class="confluenceTh">Update date</th>
    </tr>
    #foreach( $child in $pageListArray) ## child is of type Page
      #if( $child.getBodyAsStringWithoutMarkup().length() <= 10 )
      <tr>
        <td class="confluenceTd"><a href="$child.getUrlPath()">$child.getTitle()</a></td>
        <td class="confluenceTd">$child.getCreatorName()</td>
        <td class="confluenceTd">$child.getCreationDate()</td>
        <td class="confluenceTd">$child.getLastModificationDate()</td>
      </tr>
      #end
    #end
  </tbody>
</table>
This user macro shows only the empty pages and provides a link to each page plus some additional info.
Great Macro...
Is it possible to limit the search by labels?
Hi Stefan
Did you try a simple select on the database?
"select contentid from bodycontent where body is NULL" will show you all contentids that don't have a body.
"select * from content where contentid = XYZ" should list some more information about those pages.
Sure, you can combine those SQL queries with some joins or subselects, but that's something I'm not into :-)
Kind regards
André
EDIT:
Hmm, Confluence is tricky...
The body column is a CLOB and can't be combined out of the box...
I searched around, did some trial and error, and found:
SQL: select contentid from bodycontent where to_char(substr(body,0,100)) is NULL;
That should list all pages/contentids where the first 100 chars are NULL :-)
Hi Andre, thank you for your reply. I forgot to say that I do not have database access at the moment. This could be a solution anyway, but I would prefer one integrated into, for example, the advanced page of a given space. It seems this does not exist yet?
I'm admittedly rather late to the party, but this would be a good use case for ScriptRunner for Confluence's Search Extractors.
A custom search extractor with the following code:
import com.atlassian.confluence.pages.Page
import org.apache.lucene.document.Field
import org.apache.lucene.document.StringField
if (searchable instanceof Page) {
    Page page = searchable as Page
    if (page.bodyAsStringWithoutMarkup.isEmpty() || page.bodyAsStringWithoutMarkup.isAllWhitespace()) {
        document.add(new StringField("empty", "true", Field.Store.YES))
    }
}
Will find all pages where the body is either empty or all whitespace. Of course, you can tweak the above script to match your own ideas about what constitutes an "empty" page.
You'll need to rebuild Confluence's indexes afterward, but then a simple Confluence search for empty:true should find any empty pages.
I have modified the code, and this gives me a list of all the spaces with the names of their empty pages.
## Macro title: Find Empty Pages
## Macro has a body: N
## Body processing: n/a
## Output: HTML
##
## Developed by: Matthew J. Horn
## Date created: 07/31/2013
## @noparams
## Modified by: Pranjal Shukla on 13/1/2016
#set ($spaces = $spaceManager.getAllSpaces())
#foreach( $space in $spaces )
  #set ($spaceHome = $space.getHomePage())
  #set ($pageListArray = [])
  #macro ( process $rp )
    #set ($pagelist = $rp.getSortedChildren() ) ## returns List<Page>
    #foreach( $child in $pagelist )
      #set($p = $pageListArray.add( $child ) )
      #if( $child.hasChildren() )
        #process ( $child )
      #end
    #end
  #end
  #process ( $spaceHome )
  <h1>$space.getName()</h1>
  <table class="confluenceTable">
    <tbody>
      <tr>
        <th class="confluenceTh">Title</th>
        <th class="confluenceTh">Size</th>
      </tr>
      #foreach( $child in $pageListArray) ## child is of type Page
        #if( $child.getBodyAsStringWithoutMarkup().length()==0 )
        <tr>
          <td class="confluenceTd">$child.getTitle()</td>
          <td class="confluenceTd">$child.getBodyAsStringWithoutMarkup().length()</td>
        </tr>
        #end
      #end
    </tbody>
  </table>
#end