Forums

Articles
Create
cancel
Showing results for 
Search instead for 
Did you mean: 

Find Excel XLS or CSV file attachments within Jira and confluence- to check for PI.

andrew.edwards
Contributor
May 18, 2023

Hi, our organisation wants to find and remove any PI - personal information uploaded into Jira or confluence.  This is hard, as anyone can load anything in descriptive fields.  The typical method we have seen our staff do, is to attach a CSV or XLS file to a Jira task.  I'm looking for a method for finding all issue types with a CSV or XLS file attached.  Any extra APPs will have to allow data residency to be stored in Australia.  I got excited about 'scriptrunner' but their data residency options are limited to EU or US locations.  Would appreciate any suggestions.  thank you

2 answers

1 accepted

2 votes
Answer accepted
Ken McClean
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 19, 2023

For anyone who finds this and is curious, here's how we'd do it with ScriptRunner:



def maxResults = 100
def startAt = 0
def attachmentArr = []
def stillPaginating = true

//Define the global variables

while (stillPaginating == true){
   
    //We need to paginate through the results
    //We're iterating by 100 results at a time
   
    def getIssueKeys = get("/rest/api/3/search?jql=project%20is%20not%20EMPTY&maxResults=${maxResults}&startAt=${startAt}")
        .header('Content-Type', 'application/json')
        .asObject(Map)
    //We start by returning all of the issues in the instance with a JQL search
   
    getIssueKeys.body.issues.each{ issueKey ->
    //Next we're iterating through each result (issue) that was returned
   
            def issueObject = get("/rest/api/3/issue/${issueKey.key}/")
            .header('Content-Type', 'application/json')
            .asObject(Map)
            //We query the system for more information about the issue in question
           
                issueObject.body.fields.attachment.each { attachmentID ->
                //Specifically, we're after the ID of any attachment on the issue
               
                         def attachment = get("/rest/api/3/attachment/${attachmentID.id}")
                        .header('Content-Type', 'application/json')
                        .asObject(Map)
                        //Once we have the ID of the attachment, we can use that ID to get more information about the attachment
                       
                            if(attachment.body.filename.toString.contains(".csv") || attachment.body.filename.toString.contains(".xlsx")){
                           attachmentArr.add("Issue ${issueKey.key} has an attachment: ${attachment.body.filename}")
                            }
                                 
                }//End attachment-each loop
        }//End getIssueKeys loop
   
            if(getIssueKeys.body.total < maxResults){
            stillPaginating = false;
        }
        //Finally, we check to see if we're still paginating
        //If the total number of results is less than the total number of maximum possible results,
        //We must be at the end of the line and can stop paginating by terminating the loop
       
}

return attachmentArr
//Return the results
1 vote
Ken McClean
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 18, 2023

Hi Andrew:

Is this jira server or jira cloud? I can write you a Python script do it, if you're just looking for a report.

- Ken 

andrew.edwards
Contributor
May 18, 2023

We are using jira cloud.  Yes we are just looking for a report, then will do a manual review of XLS and CSV attachments for PI

Ken McClean
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 19, 2023

Here's a Python script that will do it.

Please note: the steps required to generate the authorization token:

1.  Go to https://id.atlassian.com/manage-profile/security/api-tokens and generate a token
2. Go to https://www.base64encode.net/ and encode it in this style:
YourLoginEmail@domain.com:APIToken


You can read more about that process here:
https://developer.atlassian.com/cloud/jira/platform/basic-auth-for-rest-apis/


Here is the script:


 

import requests

max_results = 100
start_at = 0
attachment_arr = []
still_paginating = True

yourDomain = "<domain>"

headers = {
    'Authorization': 'Basic <base-64 encoded string>',
    'Content-Type': 'application/json',
    'Accept': 'application/json',
}


# Define the global variables

while still_paginating:
    # We need to paginate through the results
    # We're iterating by 100 results at a time
   
    response = requests.get(f"https://{yourDomain}.atlassian.net/rest/api/3/search?jql=project%20is%20not%20EMPTY&maxResults={max_results}&startAt={start_at}",
                            headers=headers)
                           
   #print(response.content)
    issue_keys = response.json().get("issues")
    # We start by returning all of the issues in the instance with a JQL search
   
    for issue in issue_keys:
        # Next, we're iterating through each result (issue) that was returned
       
        issue_key = issue.get("key")
        issue_response = requests.get(f"https://{yourDomain}.atlassian.net/rest/api/3/issue/{issue_key}",
                                      headers=headers)
                                     
        #print(issue_response.content)
        issue_data = issue_response.json()
        # We query the system for more information about the issue in question
       
        attachments = issue_data.get("fields", {}).get("attachment", [])
        for attachment in attachments:
            # Specifically, we're after the ID of any attachment on the issue
           
            attachment_id = attachment.get("id")
            attachment_response = requests.get(f"https://{yourDomain}.atlassian.net/rest/api/3/attachment/{attachment_id}",
                                               headers=headers)
                                               
            #print(attachment_response.content)
            attachment_data = attachment_response.json()
            # Once we have the ID of the attachment, we can use that ID to get more information about the attachment
           
            filename = attachment_data.get("filename")
            if filename and (".csv" in filename or ".xlsx" in filename):
                attachment_arr.append(f"Issue {issue_key} has an attachment: {filename}")
   
    if len(issue_keys) < max_results:
        still_paginating = False
    # Finally, we check to see if we're still paginating
    # If the total number of results is less than the total number of maximum possible results,
    # we must be at the end of the line and can stop paginating by terminating the loop

print(attachment_arr)
# Print the results
Like # people like this
andrew.edwards
Contributor
May 21, 2023

Thank you Ken, much appreciated, I will give it a run.

Like Susan Waldrip likes this

Suggest an answer

Log in or Sign up to answer
DEPLOYMENT TYPE
CLOUD
PRODUCT PLAN
PREMIUM
PERMISSIONS LEVEL
Product Admin
TAGS
AUG Leaders

Atlassian Community Events