Hello Atlassian Community,
I am working on a SQL query in PostgreSQL to evaluate document data across four rolling 12-month periods. The objective is to use the ouput to create a U chart that illustrates the defectiveness of documents based on specific criteria. Each document (identified by contentid) is evaluated for defectiveness within these periods based on three criteria:
A document is deemed defective if it fails to meet one of the above criteria.
A document passes if it meets two or more of the criteria above. The challenge I’m facing involves aggregating this data correctly over each rolling period and calculating the defectiveness based on the criteria mentioned.
The Specific Challenges Are:
The Query I’ve Attempted:
WITH Periods AS (
SELECT 'Q1 2022 to Q4 2022' AS Period, DATE '2022-01-01' AS StartDate, DATE '2022-12-31' AS EndDate
UNION ALL
SELECT 'Q2 2022 to Q1 2023', DATE '2022-04-01', DATE '2023-03-31'
UNION ALL
SELECT 'Q3 2022 to Q2 2023', DATE '2022-07-01', DATE '2023-06-30'
UNION ALL
SELECT 'Q4 2022 to Q3 2023', DATE '2022-10-01', DATE '2023-09-30'
),
DocumentEvaluation AS (
SELECT
p.Period,
c.contentid,
-- Determine if the document version is greater than 3
(MAX(c.version) > 3) AS VersionPass,
-- Determine if the document has been updated within the last 12 months of the period end date
BOOL_OR(MAX(c.lastmoddate) > (p.EndDate - INTERVAL '1 year')) AS UpdatePass,
-- Determine if the document has 70 or more views within the last 12 months of the period end date
COUNT(CASE WHEN ao."LAST_VIEW_DATE" BETWEEN (p.EndDate - INTERVAL '1 year') AND p.EndDate THEN 1 ELSE 0 END) >= 70 AS ViewPass
FROM
public.content c
LEFT JOIN public."AO_92296B_AORECENTLY_VIEWED" ao ON c.contentid = ao."CONTENT_ID",
Periods p
WHERE
-- Include documents created at any time up to the end date of the period
c.creationdate <= p.EndDate
GROUP BY
p.Period, c.contentid
),
DefectiveDocuments AS (
SELECT
Period,
contentid,
-- A document is defective if it fails any of the criteria
(NOT VersionPass OR NOT UpdatePass OR NOT ViewPass) AS IsDefective
FROM
DocumentEvaluation
)
SELECT
Period,
-- Count the total number of documents evaluated in the period as SampleSize
COUNT(*) AS SampleSize,
-- Sum up the documents marked as defective
SUM(CASE WHEN IsDefective THEN 1 ELSE 0 END) AS Defects
FROM
DefectiveDocuments
GROUP BY
Period
ORDER BY
Period;
My Questions Are:
Useful link: Confluence Data Model
I appreciate any guidance or examples you can provide to help address these challenges. Thank you in advance for your assistance!