
How do I need to configure an external search engine to scan a Confluence installation?

Sorin Sbarnea (Citrix)
June 13, 2012


2 answers

2 votes
Dennis Kromhout van der Meer
June 13, 2012

Be sure that the pages you want indexed by an external search engine (such as Google) are accessible to anonymous users. You can then use Google Webmaster Tools to add your Confluence instance to the Google search index.
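As a quick check (a sketch, using a hypothetical page URL), request a page without any credentials, for example with curl, and confirm you get the page itself rather than a redirect to the login screen:

    curl -I http://confluence.example.com/display/DOCS/Home

A 200 response suggests the page is visible to anonymous users; a redirect to the login page means it is not.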

If I misinterpreted your question, please elaborate on what you're trying to achieve.

1 vote
Matthew J. Horn
June 13, 2012

When you say "external search engine", are you referring to a site like google.com, or are you talking about a search appliance that resides on another server within your organization?

Sorin Sbarnea (Citrix)
June 14, 2012

Exactly. I was trying to configure SearchBlox to crawl Jira.

Matthew J. Horn
June 14, 2012

You should just be able to point it at the server root. As long as there is no robots.txt file blocking access, it should be able to index the Confluence site.
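For reference (a minimal sketch, with a hypothetical hostname), a robots.txt that blocks nothing looks like this, and it has to be served from the root of the site, e.g. http://confluence.example.com/robots.txt:

    User-agent: *
    Disallow:

An empty Disallow line means no URLs are excluded from crawling.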

Sorin Sbarnea (Citrix)
June 14, 2012

It is not so easy: I do not want the spider to index all the previous versions of the documents. The default robots.txt allows this, and it pollutes the index.

Matthew J. Horn
June 14, 2012

Can you configure robots.txt to exclude the paths to previous versions of the doc? For example, our site uses space names to differentiate between versions, so we could exclude URLs that match the older versions' space names, as sketched below.
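Something along these lines (a sketch with made-up space keys and paths; check the URLs your own instance actually generates) would exclude the older versions' spaces as well as Confluence's page-history views by URL prefix:

    User-agent: *
    # hypothetical space keys used for older documentation versions
    Disallow: /display/DOCS35/
    Disallow: /display/DOCS40/
    # page history and version-diff views
    Disallow: /pages/viewpreviousversions.action
    Disallow: /pages/diffpagesbyversion.action

Disallow rules are plain prefix matches, so a rule like /display/DOCS35/ covers every page in that space.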
