Separate out authentic spaces from duplicate spaces

sudhakar185 June 18, 2020

Within our Confluence instance, we are struggling to separate the official corporate spaces/pages from duplicate versions that other users have made using the same content. I was wondering whether we could use watermarking to mark the official spaces/pages. If we take that approach, can search also be configured to return those pages with a higher ranking/priority? Today, users searching for content get hundreds of results that contain the keyword, and no one is sure which page is the official version and which is a duplicate. Please help with some suggestions.

1 answer

Nic Brough -Adaptavist-
Rising Star
June 18, 2020

There are a few things we might be able to do here, but I'd like to know a bit more of the background before I start talking about them, because I might otherwise suggest something that your setup or processes rule out.

The wider question is probably "what are your processes", but there are really three questions inside that which will probably give me the context I'm looking for:

  • How do you currently know what is "authentic"?  (Please don't hesitate to give us more than one thing here - examples might be technical like "it's in a particular space" or "it has a label of x", right the way through to a very human and equally useful-to-know "I just know it is")
  • Why is content being duplicated by people?
  • Do you have usage guidelines for content creators, and/or a process or team for cleaning and maintaining your content?  (You may hear the phrase "gardening" and the best wikis I've seen had a team whose responsibilities included "wiki gardening")

sudhakar185 June 18, 2020

Hi Nic,

Appreciate you taking time to check and respond.

Confluence in my company is used both as an intranet platform and as a content collaboration tool. People from all departments have been adding content without any guidelines for almost 4 years. All the spaces/pages are at the same level, and search returns thousands of results, which makes it a nightmare for users to choose between them.

I am currently putting together the IA to create individual sites for each department and give them their own categories and subcategories, under which their spaces will be moved in an organized way. All of these sites tie back to the main home/landing page, which will be more or less an intranet-style page. I am also putting together content curation workflows (creator/editor/reviewer/approver) for the various departments to post content to this intranet-style page. As part of this, I am trying to remove duplicate content as a one-time effort, but I also need to come up with a way to prevent future duplication, archive stale spaces as needed, and refine search so that users have the option to limit results to their own site.

Marking the authentic content is the pending exercise, and it has to be done with help from the content owners in each department. That is where I was thinking about a watermark option to mark authentic pages appropriately, but random users should not be able to apply that watermark themselves. I am not sure if this is possible, or you may be able to suggest better options.

Content duplication example/scenario: currently, if a user searches for "sick leave policy", they see results from the HR department spaces and also from similar spaces created by random users under each department. Those users created their own versions because they were either unaware that the HR space already existed or had access limitations that prevented them from seeing it. HR is unaware of this mess.

Currently, there are no usage guidelines for content creators, nor a process or team for cleaning and maintaining the content. I am just starting to come up with a gardening strategy, with rules and recommended practices to organize the current content and prevent future clutter.

I think this lengthy explanation answers all of your questions. My apologies if it is not clear enough. Please let me know if you have more questions.

Nic Brough -Adaptavist-
June 19, 2020

That's great background.  It answers the questions I asked, gives us the context and tells us what your current thinking about it all is.  On top of that, I think you're doing exactly the right thing - looking at how to fix the mess you have (innocently and with good intentions) arrived at, at the same time as working out how best to not only stop making it worse, but also change the way people think about and use it.

Believe me, I've seen this so many times with wikis.  I find it's worse with document stores, as it's even harder to identify duplication and remove it there.

So, watermarks - I see where you're going with this, you want something that indicates a page is "canon" (or "official").  In a very healthy wiki, all pages are canon, because people don't duplicate and there is an active culture of curation/gardening, but when you don't have that (yet!), something to indicate whether a page is canon or not is very very useful, especially when trying to get going on the curation!

But a watermark is just a way to show information or status; what we need here is a solid way to categorise pages.  If we can do that, then a watermark is quite a good way of indicating what the organisation thinks of a page, but not the only one.

There are several things I would be looking at here, but a LOT of it is based on one standard function - labels. 

It would be a long essay if I were to go over everything in detail, so I'm going to start with a brief set of headline ideas:

  • Train your authors/editors (not a huge, intensive course covering all the things, we're talking 15 minutes).  Specifically:
    • Teach them about the "include" macro (and excerpt if that could help)
    • Train them to use labels and understand your labelling guidelines and rules, and not just on their own pages - ask them to label other people's pages if they see fit.  More on that later
    • Train them to search before writing.  There is a problem with this one - I may be an Atlassian fan-boy, but when people tell me "Confluence search is rubbish", I totally agree.  Again, a bit more on that later
  • Come up with a clean and obvious guideline for the structure of content and avoid "dumping ground" spaces.  It sounds like you already have that - even the mention of an "HR Space" tells me you're thinking of that.
  • One hard rule about content structure and classification I would have, though - "if it's in a personal space, it is NOT official".  (Barring maybe a personal profile)
  • Encourage collaboration and editing.  There's a full history of pages, so you can always go back.  The reason I learned to love wikis was that when I read a page and thought "that's badly phrased" or "I have a better example", or even when I was just hurt by the grammar, spelling or punctuation, I was positively encouraged to just go in and fix it.  None of this "ask the author/team who owns it" or "raise an issue to get it fixed".  Just edit and fix the problem.
  • Labels
    • Come up with a short list of classification labels, and educate and encourage people to use and add them.  For example, Official, should-be-merged, duplicate-for-review
    • If you find your labels are still being misused, consider something like https://marketplace.atlassian.com/apps/1211162/labels-cage?hosting=server&tab=overview to try to keep a lid on it.  (Ideally, I'd want something that still lets people use any label they want, but have a set of named official labels that only certain people can add or amend)
    • Look for something that can highlight pages based on their labels.  The best system I saw imposed a small informational panel across the top of every page (the way that worked could easily do "watermarking css" instead).  The panel did not appear on pages with the label "official".  A howling "this page is probably not valid" warning appeared across all pages in personal spaces and any page without one of the classification labels.  And, you can guess that the classification labels drove the output of the panel when they were used.
  • Consider usage.
    • Look at the analytics and tracking apps that can tell you what people are reading, referring to and editing and updating.
    • Think about asking for simple ratings from users (yes, the "rate macro" is from Adaptavist and I work here, so a plug.  But I'll say it's not a good general solution because it relies on people remembering to add it to pages.  YMMV)
    • Is "gamification" something that might help?  One place I worked gave out cake to the team that merged the most pages every month, at least in the earlier phases of the clean up.  That gradually changed to "if you have no public pages in your personal space, you get cake" as we didn't want to overdo the merges.
  • Search
    • Up until Confluence 7, we actually had a way to bias Confluence searches properly, overriding the usually terrible weightings Atlassian inflicted on us.  https://developer.atlassian.com/server/confluence/search-decorator-module/ is far less useful, but can help.
    • Seriously, it's still bad, consider the search apps that might help in the marketplace, or even just pointing a decent search-engine at it and making that available to your users
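
To make the label-driven panel idea above a bit more concrete, here is a minimal sketch of the decision logic in Python. It is not Confluence code or the system described above: the label names come from the examples in the list, while the message texts and the function names `banner_for_page` and `official_only_cql` are hypothetical. The second helper assumes that a CQL query with `type`, `label` and `text ~` clauses is acceptable for your instance, as one possible way to let users limit a search to pages labelled official.

```python
from typing import Iterable, Optional

# Label names taken from the examples above; Confluence stores labels in
# lowercase, so "Official" is treated as "official" here.
OFFICIAL = "official"

# Hypothetical banner texts for the other classification labels.
REVIEW_MESSAGES = {
    "should-be-merged": "This page should be merged into an official page.",
    "duplicate-for-review": "This page may be a duplicate and is awaiting review.",
}
UNCLASSIFIED_WARNING = "This page is probably not valid."


def banner_for_page(labels: Iterable[str], in_personal_space: bool) -> Optional[str]:
    """Return the banner text for a page, or None when no banner is shown."""
    normalised = {label.strip().lower() for label in labels}
    # Hard rule: anything in a personal space is never official.
    if in_personal_space:
        return UNCLASSIFIED_WARNING
    # Official pages get no panel at all.
    if OFFICIAL in normalised:
        return None
    # Other classification labels drive the panel text.
    for label, message in REVIEW_MESSAGES.items():
        if label in normalised:
            return message
    # No classification label: show the loud warning.
    return UNCLASSIFIED_WARNING


def official_only_cql(search_text: str) -> str:
    """Build a CQL string that restricts a text search to official pages."""
    # Escape backslashes and double quotes so the text stays inside the quotes.
    escaped = search_text.replace("\\", "\\\\").replace('"', '\\"')
    return f'type = page AND label = "{OFFICIAL}" AND text ~ "{escaped}"'
```

For example, `banner_for_page(["official"], in_personal_space=False)` returns `None`, while the same label on a page in a personal space still gets the warning. How the banner is actually rendered (a theme, a user macro, an app) is left open here.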

sudhakar185 June 22, 2020

Thank you, Nic. This is truly very helpful. Do you still recommend watermarking? Are there any good plug-ins for it that can be used with Confluence? I will strive to work in the directions you have suggested. Do you have any general rules followed by content gardeners in Confluence that you may be able to share?

I would like to stay connected with you via LinkedIn, if that is OK with you. If not, I respect your preference.

I may bother you again if I need advice. Stay safe.
