Hi community,
We have many MS Word documents (most in the 10-30 page range) that we'd like to convert to Confluence pages, but the resulting pages lose quite a bit of the formatting and would create a huge amount of manual cleanup for us. These Word files include tables (with and without merged cells), images, numbered lists, bullet lists with hierarchies, and more. Specifically, the loss of text alignment and tabs will likely cause us the most aggravation, since the spacing in most of these documents is critical. Copy/pasting tab characters seems to work well enough, but copy/pasting has other limitations that might be just as bad.
I'm not a programmer, but I thought we might get better results if, rather than importing or copy/pasting, we converted the Word files to code (HTML or XML or ???) and then used a source code editor plugin to copy it into a Confluence page's source code. That keeps throwing errors when we try it, I guess because the languages don't exactly match up. Maybe this could work if we did it the right way or made a couple of tweaks to the process?
So in a nutshell, my question is: what's the best way nowadays to convert Word documents into Confluence pages so that you lose as little content/formatting as possible? If initial setup takes a while but creates a repeatable process, it would be worth it because we have many documents.
Thanks so much for any help!
Confluence stores content in what they call Confluence storage format - it is 'XHTML-based', but not pure XML or normal HTML, as it contains special tags related to Confluence functionality.
Some of the issues could be CSS related too, particularly for text alignments and tabs.
What do your documents look like if you use a standalone Word to HTML converter? Try using one that is designed to help people publish content drafted in Word so it can be copied into a generic Web Content Management System - they'll strip some of the incompatible formatting from Word.
But if this is causing you enough pain and retaining the formatting is important to your business, I would consider engaging a developer to help solve this.
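To illustrate the point above about the storage format being XHTML-based: one likely reason pasted HTML throws errors is that ordinary HTML tolerates things XML does not, such as an unclosed `<br>`. The following is only a rough sketch, assuming plain converted HTML with no Confluence macros in it; `find_xml_error` is a made-up helper name, not a Confluence function. A strict XML parser is also pickier than Confluence may be (for example, it rejects named entities such as `&nbsp;` and prefixed tags such as `ac:` ones), so treat a reported error as a hint, not a verdict.

```python
import xml.etree.ElementTree as ET


def find_xml_error(fragment: str):
    """Return None if the fragment is well-formed XML, else the parser's message.

    Hypothetical helper for illustration only; it knows nothing about
    Confluence-specific tags or entities.
    """
    # Wrap in a dummy root so a fragment with several top-level elements parses.
    wrapped = f"<root>{fragment}</root>"
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as exc:
        return str(exc)
    return None


html_style = "<p>Line one<br>Line two</p>"    # unclosed <br>: fine in HTML, not in XML
xhtml_style = "<p>Line one<br/>Line two</p>"  # self-closed <br/>: well-formed

print(find_xml_error(html_style))   # prints a parser error message
print(find_xml_error(xhtml_style))  # prints None
```

Running converter output through a check like this before pasting can at least tell you where the markup stops being well-formed.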
Thanks James. I did only some limited research into converters thinking that the Save As feature in Word basically did the same thing, but that is not at all true as I'm finding out!
Today I tried https://documentconverter.pro/ and both the desktop app and online app are working much better than other methods so far. The desktop app also has the option to convert multiple files at once, which will certainly come in handy. If anyone has recommendations for the most useful Word to HTML converters, I'm all ears.
Thanks James!
I know this is an older discussion, but we have the same issue. Seems it should be easier to copy from Word and paste into Confluence without having to jump through hoops. Has anyone come up with a good solution?
The first thing to keep in mind is that this is HTML, so you have to think in terms of what HTML can do. For example, tabs are a foreign concept in HTML. And you shouldn't really be using them much in Word either (too many people use Word like a typewriter).
And even if you try to paste HTML into the source editor, and even if it doesn't throw errors, it will often strip out most manual formatting.
The editor in Confluence is limited by design. There are things you can replicate with custom CSS and user macros, but it will take some work. And if you want to control the format tightly, then you need to become a power user.
Bottom line, you will have to stop thinking in Word and move to thinking in Confluence and using its macros and methods for formatting content.
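As one illustration of the point about tabs: text that relies on tab stops for column alignment usually maps more naturally to a table in HTML. Below is a minimal sketch, assuming the tabbed text has already been copied out of Word as plain text; `tabbed_text_to_table` is a hypothetical helper name, and whether a table is the right target for a given document is a judgment call.

```python
from html import escape


def tabbed_text_to_table(text: str) -> str:
    """Turn tab-separated lines into a plain HTML table (illustrative only)."""
    # One row per non-blank line, one cell per tab-separated field.
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    out = ["<table>", "<tbody>"]
    for row in rows:
        # Pad short rows so every row has the same number of cells.
        cells = row + [""] * (width - len(row))
        tds = "".join(f"<td>{escape(cell.strip())}</td>" for cell in cells)
        out.append(f"<tr>{tds}</tr>")
    out += ["</tbody>", "</table>"]
    return "\n".join(out)


sample = "Name\tRole\tStart\nAna\tEditor\t2019\nLee\tReviewer"
print(tabbed_text_to_table(sample))
```

The short last row in the sample is padded with an empty cell, so the columns stay aligned without any tab characters.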
Yes, that's the idea. It's just the conversion to Confluence that is the challenge now. Once we're up and running, we plan on making full use of Confluence's macros and other formatting features. Thanks Bill for the input.
Generally my process for Word docs is to import them, then use Regex in the Source Editor to clean out all the low-level formatting, then go from there.
Bill_Bailey.
I find myself in the same situation as Miguel.
We have a very large number of existing documents that we are looking to import into Confluence, but format is very important.
Could you please elaborate on your comment about your Word import process using Regex?
Thanks in advance.
Bill_Bailey.
I find myself in the same situation as Miguel or Jaime.
We too have a large number of existing documents that we are looking to import into Confluence, but format is very important.
Could you please elaborate on your comment about your Word import process using Regex?
Or suggest any other workaround, please.
Thanks in advance.
There is a source editor plugin you can install. I use that source editor with Regex to clean up the imported HTML. It is best to start with clean HTML when working with imported content.
Once you have clean HTML, you can adjust the formatting using Confluence tools.
Hi @Bill Bailey
How can we clean out low-level formatting with Regex? Can you give us an example or point us to a guidance page?
Thanks a lot!
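Not Bill's actual process (the thread doesn't show it), but here is a minimal sketch of the kind of Regex cleanup described above, written in Python so the patterns can be tried outside Confluence first. The patterns and the `strip_word_formatting` name are assumptions for illustration. They target markup that Word-exported HTML commonly contains (inline `style`/`class`/`lang` attributes, `<span>` and `<font>` wrappers, Office tags such as `<o:p>`, empty paragraphs). Regexes are not an HTML parser, so check the result on a copy of the page before trusting it.

```python
import re


def strip_word_formatting(html: str) -> str:
    """Remove common low-level Word formatting from HTML (illustrative sketch)."""
    # Drop comments, including Word's conditional comments.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Drop Office-namespaced tags such as <o:p>, but keep their text content.
    # Limited to a few Office prefixes so Confluence's own ac:/ri: tags survive.
    html = re.sub(r"</?(?:o|v|w|m|st1):[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop inline style, class, and lang attributes.
    html = re.sub(
        r"""\s+(?:style|class|lang)\s*=\s*(?:"[^"]*"|'[^']*')""",
        "",
        html,
        flags=re.IGNORECASE,
    )
    # Unwrap <span> and <font> tags, keeping whatever they contain.
    html = re.sub(r"</?(?:span|font)\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove paragraphs that contain only whitespace or &nbsp;.
    html = re.sub(r"<p>(?:\s|&nbsp;)*</p>", "", html, flags=re.IGNORECASE)
    return html.strip()


sample = (
    '<p class="MsoNormal" style=\'margin-left:36.0pt\'>'
    '<span lang="EN-US" style="font-size:11.0pt">Hello <b>world</b></span>'
    "<o:p></o:p></p>\n"
    '<p class="MsoNormal"><o:p>&nbsp;</o:p></p>'
)
print(strip_word_formatting(sample))  # <p>Hello <b>world</b></p>
```

The same patterns can be adapted to a find-and-replace dialog that supports regular expressions, one pattern at a time with an empty replacement. Structural markup such as `<b>`, lists, and tables is left alone, so the Confluence-side formatting can be rebuilt from a clean base.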