Convert HTML to Atlassian Document Format (ADF)

Venkat A January 27, 2022

Is there an API or a C# DLL to convert JSON to ADF format?

2 answers

0 votes
Alan Bushey
I'm New Here
August 22, 2023

I developed this in PowerShell v5.1 (it uses hashtables since PS does not have a JSON type). It only performs a basic conversion (paragraphs, bold, italic, sub/superscript, bullet lists, and tables):

function Convert-HTML2ADF {
    PARAM (
        [Parameter(ValueFromPipeline = $true)][string]$HTMLString
    )

    Begin {
        # Serializes the nested hashtables built below into an ADF JSON string
        function ConvertTo-ADF {
            PARAM (
                [Parameter(ValueFromPipeline = $true)][hashtable]$source
            )

            Process {
                $Dest = "{"
                $destarr = @();
                foreach($key in $source.Keys) {
                    if ($source[$key] -is [hashtable]) {
                        # Nested object: serialize recursively
                        $destarr += "`"$key`": $($source[$key] | ConvertTo-ADF)";
                    } else {
                        if ($source[$key] -is [System.Array]) {
                            $arrconc = "`"$key`": ["
                            $arr = @();
                            foreach($member in $source[$key]) {
                                if ($member -is [hashtable]) {
                                    $arr += $member | ConvertTo-ADF;
                                } else {
                                    $arr += "$member"; #should never be used
                                }
                            }
                            $arrconc += ($arr -join ',');
                            $arrconc += ']';
                            $destarr += $arrconc
                        } else {
                            # Quote strings; emit 'true'/'false' and non-string values (e.g. version = 1) bare
                            if ($source[$key] -is [string] -and 'true','false' -notcontains $source[$key]) {
                                $destarr += "`"$key`": `"$($source[$key])`"";
                            } else {
                                $destarr += "`"$key`": $($source[$key])";
                            }
                        }
                    }
                }
                $Dest += ($destarr -join ',');
                $Dest += '}';
[char[]]'Â‪¬Ž‰¤Î¼¥' | ForEach-Object { $Dest = $Dest -replace "$_", '';}
                # Drop any literal \uXXXX escape sequences left in the output
                return $Dest -replace "\\u([0-9a-fA-F]{4})", "";
            }
        }

        function HTML2ADF {
            PARAM (
                [Parameter(ValueFromPipeline = $true)]$HTML
            )
            Process {
                [hashtable]$ADF = @{
                    "version"= 1;
                    "type" = "doc";
                    "content" = @();
                }
                # The input must be well-formed XML; the root element only acts as a container
                [xml]$x = $HTML;
                $out = $null;
                Foreach ($n in $x.ChildNodes) {
                    if ($n -is [System.Xml.XmlElement]) {
                        $out = RecursiveHTML2ADF -pn $n -Parent $ADF
                    }
                }

                return $out #| ConvertTo-ADF
            }
        }

        function RecursiveHTML2ADF([System.Xml.XmlElement]$pn, [hashtable]$Parent, [hashtable[]]$marks = @(), [hashtable[]]$attrs = @()) {

            Foreach ($n in $pn.ChildNodes) {
                if ($n -is [System.Xml.XmlElement] -or $n -is [System.Xml.XmlText]) {
                    $childObj = CheckNode -n $n -marks $marks -attrs $attrs -parenttype $Parent["type"];
                    $child = $childObj.Child;
                    # Keep this node's marks and attrs in separate variables so they reach its
                    # descendants only and do not carry over to later siblings (e.g. text after </b>)
                    $nodeMarks = $childObj.Marks;
                    $nodeAttrs = $childObj.Attrs;

                    if ($n.HasChildNodes) {
                        if ($null -eq $child) {
                            # Element with no ADF node of its own (e.g. <b>): attach its children to the current parent
                            $child = RecursiveHTML2ADF -pn $n -Parent $Parent -marks $nodeMarks -attrs $nodeAttrs;
                        } else {
                            $child = RecursiveHTML2ADF -pn $n -Parent $child -marks $nodeMarks -attrs $nodeAttrs;
                        }
                    }

                    # Skip $null and the case where the recursion handed back the parent itself
                    if ($null -ne $child -and ($Parent | ConvertTo-ADF) -ne ($child | ConvertTo-ADF)) {
                        $Parent["content"] += $child;
                    }
                }
            }
            return $Parent;
        }

        # Maps one XML node to an ADF node (Child) and/or updates the active text marks
        function CheckNode([System.Xml.XmlNode]$n, [hashtable[]]$marks, [hashtable[]]$attrs, [string]$parenttype) {
            [hashtable]$out = $null;
            switch ($n.Name.ToLower()) {
                'p' {
                    $out = @{
                        "type"= "paragraph";
                        "content" = [hashtable[]]@(); #insert text
                    }; #new paragraph
                    $marks = @();
                    break;
                }
                {'strong','b' -contains $_} {
                    $marks += @{"type" = "strong"};
                    break;
                }
                {'em','i' -contains $_} {
                    $marks += @{"type" = "em"};
                    break;
                }
                {'sub','sup' -contains $_} {
                    $marks += @{
                        "type" = "subsup";
                        "attrs" = @{
                            "type" = $_;
                        }
                    }
                }
                '#text'{
                    if ($marks.length -gt 0) {
                        $out = @{
                            "type" = "text";
                            "text" = $n.Value;
                            "marks" = $marks;
                        }
                    } else {
                        $out = @{
                            "type" = "text";
                            "text" = $n.Value;
                        }
                    }
                    # Text directly inside a table cell or list item gets wrapped in a paragraph
                    if ("tableCell", "listItem" -contains $parenttype) {
                        $out = @{
                            "type"= "paragraph";
                            "content" = [hashtable[]]@($out); #insert text
                        }; #new paragraph
                    }
                    break;
                }
                'ul' {
                    $out = @{
                        "type" = "bulletList";
                        "content" = @(); #insert listitem
                    };
                    break;
                }
                'li' {
                    $out = @{
                        "type" = "listItem";
                        "content" = @(); #insert bulletList, codeBlock with no marks, mediaSingle, orderedList, paragraph with no marks
                    }
                }
                'table' {
                    $out = @{
                        "type" = "table";
                        "attrs" = @{}
                        "content" = [hashtable[]]@();
                    }
                }
                'tr' {
                    $out = @{
                        "type" = "tableRow";
                        "content" = @();
                    }
                }
                {'td','th' -contains $_} {
                    $out = @{
                        "type" = "tableCell"; # or "tableHeader"
                        "attrs" = @{}
                        "content" = @(); #insert: blockquote, bulletList, codeBlock, heading, mediaGroup, orderedList, panel, paragraph, rule
                    }

                }
            }
            return [pscustomobject]@{"Child" = $out; "Marks" = $marks; "Attrs" = $attrs;};
        }
    }

    Process {
        return $HTMLString | HTML2ADF | ConvertTo-ADF;
    }
}
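
A note on input, going by the code above: the string is cast with `[xml]`, so the markup has to be well-formed XML (every tag closed, no undeclared entities such as `&nbsp;`), and the root element is only walked for its children, so a fragment is best wrapped in a single container element. Below is a minimal sketch of a pre-check. `Test-WellFormedXml` is a hypothetical helper name and is not part of the function above; the final call only runs if `Convert-HTML2ADF` has already been loaded into the session.

```powershell
# Hypothetical helper (not part of the answer's code): returns $true when the
# markup loads into System.Xml.XmlDocument, the type behind the [xml] accelerator.
function Test-WellFormedXml {
    param([string]$Markup)
    $doc = New-Object System.Xml.XmlDocument
    try {
        $doc.LoadXml($Markup)
        return $true
    } catch {
        return $false
    }
}

$fragment = '<p>Hello <b>world</b></p><ul><li>one</li><li>two</li></ul>'
$wrapped  = "<div>$fragment</div>"   # single root; its children become the ADF content

$okWrapped  = Test-WellFormedXml $wrapped                            # well-formed
$okUnclosed = Test-WellFormedXml '<div><p>line<br>break</p></div>'   # <br> is never closed
$okEntity   = Test-WellFormedXml '<div><p>a&nbsp;b</p></div>'        # &nbsp; is not declared in XML

# Only call the converter if it has been loaded into the session
if ($okWrapped -and (Get-Command Convert-HTML2ADF -ErrorAction SilentlyContinue)) {
    $wrapped | Convert-HTML2ADF
}
```

Markup that fails the check (for example `<br>` instead of `<br/>`) would need to be cleaned up before it is passed to the converter.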

Michael Russo May 30, 2024

Alan,

This is much awesome-ness. Is this maintained/available anywhere on GitHub?

The only change I had to make in PowerShell 5 to import this as a module was this line:

[char[]]'Â‪¬Ž‰¤Î¼¥' | ForEach-Object { $Dest = $Dest -replace "$_", '';}

 

Changed to:

[char[]]@([char]0x00C2, [char]0x00E2, [char]0x202A, [char]0x00AC, [char]0x017D, [char]0x02C6, [char]0x00A4, [char]0x00CE, [char]0x00BC, [char]0x00A5) | ForEach-Object { $Dest = $Dest -replace "$_", '';}

 

Thanks!!!
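
The code-point form above keeps the script file plain ASCII. A possible variant, sketched here as an assumption rather than something posted in the thread, strips the same ten characters in one pass with a regex character class; `\uXXXX` escapes are standard .NET regex syntax. `Remove-StrayChars` is a hypothetical name.

```powershell
# Hypothetical helper: removes the same ten code points listed in the reply above
# with a single -replace. The pattern itself contains only ASCII characters.
# Like the original loop, -replace matches case-insensitively.
function Remove-StrayChars {
    param([string]$Text)
    $pattern = '[\u00C2\u00E2\u202A\u00AC\u017D\u02C6\u00A4\u00CE\u00BC\u00A5]'
    return $Text -replace $pattern, ''
}

# Build a sample without typing non-ASCII characters into the script
$sample  = 'caf' + [char]0x00C2 + [char]0x00A4 + 'e' + [char]0x202A
$cleaned = Remove-StrayChars $sample
$cleaned   # cafe
```

Inside `ConvertTo-ADF` this would stand in for the `ForEach-Object` loop, e.g. `$Dest = Remove-StrayChars $Dest`.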

Venkat A January 27, 2022

Hi @Pramodh M 

It looks like this enhancement still has not been done. Also, I am looking specifically for something in .NET. Please let me know if you have anything.
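
Since ADF is plain JSON, one .NET-friendly route is to build the document tree as nested dictionaries and let a JSON serializer write it out. This is offered as an assumption, not something confirmed in this thread. Here is a minimal sketch in PowerShell (which runs on .NET) using the built-in `ConvertTo-Json`; `New-AdfText` is a hypothetical helper, and the node shapes follow the ones used in the PowerShell answer (`doc`, `paragraph`, `text`, `strong`).

```powershell
# Hypothetical helper: builds an ADF text node, with optional marks such as 'strong' or 'em'
function New-AdfText {
    param([string]$Text, [string[]]$Marks = @())
    $node = [ordered]@{ type = 'text'; text = $Text }
    if ($Marks.Count -gt 0) {
        # @(...) keeps a single mark as an array so it serializes as [ ... ]
        $node['marks'] = @($Marks | ForEach-Object { @{ type = $_ } })
    }
    return $node
}

$doc = [ordered]@{
    version = 1
    type    = 'doc'
    content = @(
        [ordered]@{
            type    = 'paragraph'
            content = @(
                (New-AdfText -Text 'Hello '),
                (New-AdfText -Text 'world' -Marks 'strong')
            )
        }
    )
}

# The default -Depth is 2, which would truncate the nested content arrays
$json = ConvertTo-Json -InputObject $doc -Depth 20 -Compress
$json
```

`ConvertTo-Json` also escapes quotes and backslashes in text values, which a serializer that interpolates strings directly into the output does not do.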

Customer Portal
Contributor
November 24, 2022

@Venkat A Did you find a solution for this problem?

DEPLOYMENT TYPE
CLOUD
PRODUCT PLAN
STANDARD