Compliance for Confluence DC to Cloud Migration Guide


Who is this guide for?

This guide is for Confluence administrators who are migrating from Compliance for Confluence Data Center to Compliance for Confluence Cloud as part of a broader move from Confluence Data Center to Confluence Cloud.

By the end of this guide you will have:

  • Exported your existing page classifications from Data Center

  • Installed Compliance for Confluence Cloud

  • Migrated your Confluence content to Cloud (using the CCMA or another method)

  • Restored page classifications in Cloud using bulk-classification


Important

Compliance classifications are not automatically carried over when you migrate Confluence content. You will need to export classification data from Data Center and re-apply it in Cloud. This guide walks you through how to do that.

 

This migration involves three phases. The list below summarises what each phase covers.

  • Before migration (Steps 1–2): Review your DC setup and export classification data.

  • During migration (Step 3): Migrate Confluence content using Atlassian’s CCMA tool.

  • After migration (Steps 4–7): Install Compliance Cloud, verify data, restore classifications and validate.



Phase 1: Before you migrate


Step 1: Review your current Compliance DC setup

Take stock of your current Compliance configuration in Data Center. You can find your classification levels under Settings cog → Compliance → Classification. The Statistics section gives you a count of classified pages per space.

Work through the following checklist:

  • Classification levels: Note the levels, their names and order.

  • Defaults and Sensitive Data Settings: Note any default classifications, sensitive data detectors and enforced classification settings.

  • Automation rules: List all rules you have configured in DC so you can rebuild them in Cloud.

  • Timeline: Plan for a test migration before your full migration.

Tip: Run a Sensitive Data Detection scan before migrating to identify your most critical spaces and highly sensitive pages. This helps you prioritise validation after the migration. Find out more here.


Step 2: Export page classification data from Data Center

Compliance classifications are stored as page content properties in Confluence. Before you migrate, export this data so you have a reliable snapshot to restore from in Cloud.

2.1 Choose your export method. There are two ways to export your classification data:

  1. REST API (recommended for most migrations)

  • Handles large instances with thousands of pages

  • Can be run per space or across all spaces at once

  • Outputs a clean CSV or JSON file you can use to reapply classifications later

  2. Manual CSV export (for smaller instances or spot-checking)

  • Use the Export to CSV option from Compliance Search

  • Best suited to smaller instances or if you only need to check a handful of spaces

The rest of this step assumes you are using the Compliance REST API. You may wish to find out more about this here before starting: https://docs.appfox.io/confluence-compliance/rest-api-dc

 

2.2 Identify the Compliance content property key

Compliance stores the classification for each page in a content property with the key classification. Your export script will need to use this key to fetch the correct data.
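To make the nested structure concrete, here is a minimal sketch of how the classification name can be pulled out of a page payload. The payload shape shown is illustrative (it mirrors the parsing used by the export script in this guide); verify it against a real response from your own instance before relying on it.

```python
import json

# Illustrative payload for a page fetched with
# expand=metadata.properties.classification (shape assumed; check
# a real response from your instance).
sample = json.loads("""
{
  "id": "12345",
  "title": "HR Policies and Procedures",
  "space": {"key": "HR"},
  "metadata": {
    "properties": {
      "classification": {
        "key": "classification",
        "value": {"name": {"value": "Confidential"}}
      }
    }
  }
}
""")

def classification_name(page):
    """Drill into the nested property; return "" if the page is unclassified."""
    return (
        page.get("metadata", {})
            .get("properties", {})
            .get("classification", {})
            .get("value", {})
            .get("name", {})
            .get("value", "")
    )

print(classification_name(sample))  # Confidential
```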

2.3 Run the export script

The script below retrieves the classification for every page in your instance and writes the results to a CSV file.

While we hope this script is useful, we cannot provide our usual level of support for it. You can raise a ticket with our friendly Service Desk, but the script is provided without warranty and is not actively maintained.

 

Before running the script, update the configuration section with your own values:


Python script to export classifications
import asyncio
import csv
import json
import time
from urllib.request import Request, urlopen
from urllib.parse import urlencode
from urllib.error import HTTPError

# ===== Configuration =====
BASE_URL = "https://your-confluence-instance"
PAT = "your-personal-access-token"
OUTPUT_FILE = "confluence_pages.csv"
LIMIT = 1
MAX_CONCURRENT = 5
MAX_RETRIES = 3
# Optional: limit to a single space. Set to None for all spaces.
SPACE_KEY = None
# =========================

HEADERS = {
    "Authorization": f"Bearer {PAT}",
    "Accept": "application/json",
}

def _fetch_pages(start):
    params = {
        "type": "page",
        "limit": LIMIT,
        "start": start,
        "expand": "space,metadata.properties.classification",
    }
    if SPACE_KEY:
        params["spaceKey"] = SPACE_KEY
    url = f"{BASE_URL}/rest/api/content?{urlencode(params)}"
    req = Request(url, headers=HEADERS)
    for attempt in range(MAX_RETRIES):
        try:
            with urlopen(req) as resp:
                return json.loads(resp.read())
        except HTTPError as e:
            if e.code >= 500 or e.code == 429:
                wait = 2 ** attempt
                print(f"Retry {attempt + 1}/{MAX_RETRIES} for start={start} (HTTP {e.code}), waiting {wait}s")
                time.sleep(wait)
            else:
                raise
    raise RuntimeError(f"Failed after {MAX_RETRIES} retries for start={start}")
def parse_page(page):
    classification = (
        page.get("metadata", {})
            .get("properties", {})
            .get("classification", {})
            .get("value", {})
            .get("name", {})
            .get("value", "")
    )
    return [page["id"], page["space"]["key"], page["title"], classification]

async def main():
    total = 0
    start = 0
    with open(OUTPUT_FILE, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["pageid", "spacekey", "title", "classification"])
        while True:
            offsets = list(range(start, start + LIMIT * MAX_CONCURRENT, LIMIT))
            results = await asyncio.gather(
                *(asyncio.to_thread(_fetch_pages, s) for s in offsets)
            )
            done = False
            for data in results:
                pages = data.get("results", [])
                if not pages:
                    done = True
                    break
                for page in pages:
                    writer.writerow(parse_page(page))
                total += len(pages)
            print(f"Fetched {total} pages so far...")
            if done:
                break
            start += LIMIT * MAX_CONCURRENT
    print(f"Wrote {total} pages to {OUTPUT_FILE}")

if __name__ == "__main__":
    asyncio.run(main())

The output will be a CSV file with one row per page (the classification column is empty for unclassified pages). You will use this file in Step 5 when restoring classifications in Cloud.

pageid,spacekey,title,classification
12345,HR,HR Policies and Procedures,Confidential
67890,FIN,Q4 Financial Forecast,Restricted
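As a quick sanity check before migrating, you can summarise the export by classification level and compare the counts against the Statistics section you reviewed in Step 1. A minimal sketch, assuming the CSV columns shown above:

```python
import csv
import io
from collections import Counter

# Stand-in for the confluence_pages.csv produced by the export script.
sample_csv = """pageid,spacekey,title,classification
12345,HR,HR Policies and Procedures,Confidential
67890,FIN,Q4 Financial Forecast,Restricted
24680,FIN,Expense Policy,
"""

def summarise(csv_text):
    """Count pages per classification; a blank value means unclassified."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["classification"] or "(unclassified)"] += 1
    return dict(counts)

print(summarise(sample_csv))
```

To run this against your real export, read the file with `open(OUTPUT_FILE)` instead of the in-memory sample.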


Phase 2: During the migration


Step 3: Migrate Compliance for Confluence DC content to Compliance Cloud

Migrate your Confluence spaces and pages from Data Center to Cloud using your chosen method. The Confluence Cloud Migration Assistant (CCMA) is the most common approach. Detailed instructions are out of scope for this document, but you will need to install the app from the Atlassian Marketplace; the Atlassian documentation is here: Confluence Cloud Migration Assistant | Atlassian Support. We recommend running a test migration before the full migration, and scheduling a migration window.


The CCMA app prompts you to install marketplace apps before migration. When you open the app, you will be asked to ‘Assess your apps’:


Once you have selected Compliance for Confluence from the list, you will then be asked to ‘Prepare your apps’. You’ll be able to click install from here:


The final stage of this is to ‘Agree to app migration’. A message will appear saying ‘This app does not have an automated migration path’. This simply means that Compliance data will not carry across automatically; you will need to complete Steps 4 and 5 below.



Once your Confluence content is in Cloud, you may wish to check your page properties to ensure that the classification data was preserved during the migration. The quickest way is to open the following URL in your browser, substituting your domain and the page ID:

https://{your-domain}.atlassian.net/wiki/rest/api/content/{page_id}/property/classification

You can find a page’s ID by opening the page in Confluence and copying the number from the URL.
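If you are checking several pages, a small helper can save some copy-and-paste. This sketch assumes the usual Cloud page URL shape (/wiki/spaces/&lt;KEY&gt;/pages/&lt;id&gt;/&lt;title&gt;); the domain and page ID shown are examples.

```python
import re

def page_id_from_url(page_url):
    """Pull the numeric page ID out of a Cloud page URL
    (assumes the /wiki/spaces/<KEY>/pages/<id>/<title> shape)."""
    match = re.search(r"/pages/(\d+)", page_url)
    if not match:
        raise ValueError(f"No page ID found in {page_url!r}")
    return match.group(1)

def property_url(domain, page_id):
    """Build the classification property URL to open in your browser."""
    return (f"https://{domain}.atlassian.net/wiki/rest/api/content/"
            f"{page_id}/property/classification")

pid = page_id_from_url("https://example.atlassian.net/wiki/spaces/HR/pages/12345/HR+Policies")
print(property_url("example", pid))
```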

 

Phase 3: After your migration

Step 4: Configure Compliance for Confluence Cloud

You may wish to browse through our Getting Started documentation to get an introduction to navigating our Cloud version before starting this step.


4.1 Recreate classification levels (Cloud configuration)

Recreate your classification levels in Cloud to match your Data Center setup as closely as possible. This makes it much easier to bulk-reapply classifications in the next step.

For each level, configure its name, description, colour and display order.

Once you have created your levels, you will need to configure your level schemes, which are a new addition in Compliance Cloud. Edit the default level scheme to ensure all levels are present and ordered from least to most sensitive.

4.2 Configure global settings and any space-level settings

Recreate any global settings and space-specific configuration you noted in Step 1. You may find this page helpful: Configuring Global Options.

4.3 Recreate your Sensitive Data Detectors and automation rules

Recreate your Sensitive Data Detectors and configure your detector schemes to apply them. Detector schemes are a Cloud feature that lets you group different detectors together, replacing the on/off toggles used in Data Center. Once you have done this, you can start rebuilding your automation rules.

Tip: There are some additional default detectors in Cloud that you might find useful, but some detectors that existed in Data Center are not provided by default in Cloud. You will need to recreate any detectors you still need by copying the RegEx from your DC detector into a new Cloud detector.
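Before recreating a detector in Cloud, it can be worth testing its RegEx locally against some sample page text. The pattern below is a hypothetical example (a UK National Insurance number format), not one shipped with the product:

```python
import re

# Hypothetical detector pattern copied from a DC detector
# (example UK National Insurance number format).
pattern = re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b")

page_text = "Employee NI number: AB123456C. Contact HR for details."

# Pages containing matches are the ones the detector should flag.
matches = pattern.findall(page_text)
print(matches)  # ['AB123456C']
```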


Step 5: Restore page classifications in Cloud

With your classification levels and rules in place, you can now restore classifications to your migrated pages. There are two approaches: using the bulk classification tools in the Compliance admin UI, or running a script against the Cloud REST API.


Option A: Bulk classification via the admin UI

The Compliance admin interface includes bulk classification tools that let you apply classifications to many pages at once without writing any code. You can approach this in two ways:

  • By space: Open the Compliance admin view, select a space, and apply a default classification to all pages.

  • By filter: Filter pages by space, label, existing classification, or other metadata, then apply a classification to all filtered results at once.

Tip: A useful pattern for large spaces is to apply a default classification to the entire space first (for example, all HR pages → Confidential), then use filters to adjust any pages that need a different level.

Screenshot: Setting up your classification


Option B: Scripted bulk classification via REST API

For larger instances a scripted approach is preferable. This is especially useful when you want to restore classifications exactly as they were in Data Center. The script below reads the classification content property that was carried across with each migrated page and reapplies the matching level through the Compliance Cloud API; the CSV you exported in Step 2 then serves as a snapshot to validate the results against.

You may use the following script for this approach. Note that it will overwrite any existing page levels on your Cloud instance, so we recommend running it immediately after the CCMA has migrated your page content.

While we hope this script is useful, we cannot provide our usual level of support for it. You can raise a ticket with our friendly Service Desk, but the script is provided without warranty and is not actively maintained.

Before running the script, update the configuration section with your own values:

Python script to apply DC classifications to Compliance Cloud
import json
import time
import urllib.request
import urllib.parse
import urllib.error
import base64
# ============================================================================
# Configuration
# ============================================================================
CONFLUENCE_BASE_URL = "https://your-domain.atlassian.net"
CONFLUENCE_EMAIL = "email@example.com"
# The documentation for getting a Confluence API key is here:
# https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/
CONFLUENCE_API_TOKEN = "your_confluence_api_token"
# This should not be changed
COMPLIANCE_BASE_URL = "https://ac-cloud.com"
# Get the Compliance API key from your instance. Documented here: 
# https://docs.appfox.io/confluence-compliance/rest-api
COMPLIANCE_API_KEY = "your_compliance_api_key"
# Optional: set to a space key like "ENG" to only process pages in that space.
# Set to None to process all current pages.
SPACE_KEY_FILTER = None
# If True, do not apply the level. Just print what would happen.
DRY_RUN = True
# Retry settings for transient errors
MAX_RETRIES = 3
RETRY_DELAY_SECONDS = 2
# ============================================================================
# HTTP helpers
# ============================================================================
def basic_auth_header(email, api_token):
    raw = f"{email}:{api_token}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("utf-8")
    return f"Basic {encoded}"
def make_request(method, url, headers=None, data=None, retries=MAX_RETRIES):
    headers = headers or {}
    if data is not None:
        data = json.dumps(data).encode("utf-8")
        headers["Content-Type"] = "application/json"
    attempt = 0
    while True:
        attempt += 1
        req = urllib.request.Request(url=url, method=method, headers=headers, data=data)
        try:
            with urllib.request.urlopen(req) as response:
                body = response.read().decode("utf-8")
                if body:
                    return response.getcode(), json.loads(body)
                return response.getcode(), None
        except urllib.error.HTTPError as e:
            status = e.code
            body = e.read().decode("utf-8", errors="replace")
            if status in (429, 500, 502, 503, 504) and attempt <= retries:
                print(f"Retrying {method} {url} after HTTP {status} (attempt {attempt}/{retries})")
                time.sleep(RETRY_DELAY_SECONDS * attempt)
                continue
            print(f"HTTP error {status} for {method} {url}")
            print(body)
            raise
        except urllib.error.URLError as e:
            if attempt <= retries:
                print(f"Retrying {method} {url} after network error: {e} (attempt {attempt}/{retries})")
                time.sleep(RETRY_DELAY_SECONDS * attempt)
                continue
            raise
# ============================================================================
# Compliance API
# ============================================================================
def get_compliance_levels():
    url = f"{COMPLIANCE_BASE_URL}/compliance/api/v1/level"
    headers = {
        "accept": "application/json",
        "x-api-key": COMPLIANCE_API_KEY,
    }
    status, data = make_request("GET", url, headers=headers)
    if status != 200:
        raise RuntimeError(f"Failed to get compliance levels. HTTP {status}")
    levels_by_name = {}
    for level in data:
        name = level.get("name", "").strip().lower()
        if name:
            levels_by_name[name] = level
    print(f"Loaded {len(levels_by_name)} compliance levels")
    return levels_by_name
def apply_page_level(page_id, level_id):
    url = f"{COMPLIANCE_BASE_URL}/compliance/api/v1/page-level"
    headers = {
        "accept": "application/json",
        "x-api-key": COMPLIANCE_API_KEY,
    }
    body = {
        "pageId": str(page_id),
        "levelId": level_id,
    }
    if DRY_RUN:
        print(f"DRY RUN: would apply levelId={level_id} to pageId={page_id}")
        return
    status, data = make_request("POST", url, headers=headers, data=body)
    print(f"Applied page level to page {page_id} (HTTP {status})")
    if data is not None:
        print(json.dumps(data, indent=2))
# ============================================================================
# Confluence API
# ============================================================================
def confluence_headers():
    return {
        "Accept": "application/json",
        "Authorization": basic_auth_header(CONFLUENCE_EMAIL, CONFLUENCE_API_TOKEN),
    }
def get_space_id_from_key(space_key):
    """
    Looks up a Confluence space ID from a space key using the v2 spaces endpoint.
    """
    url = f"{CONFLUENCE_BASE_URL}/wiki/api/v2/spaces?keys={urllib.parse.quote(space_key)}"
    status, data = make_request("GET", url, headers=confluence_headers())
    if status != 200:
        raise RuntimeError(f"Failed to look up space '{space_key}'. HTTP {status}")
    results = data.get("results", [])
    if not results:
        raise RuntimeError(f"No space found for key '{space_key}'")
    if len(results) > 1:
        print(f"Warning: multiple spaces found for key '{space_key}', using the first one")
    space_id = results[0].get("id")
    if not space_id:
        raise RuntimeError(f"Space '{space_key}' did not include an id")
    return str(space_id)
def get_pages(space_id=None):
    """
    Yields current Confluence pages using cursor-based pagination.
    Uses _links.next and follows the cursor from that URL.
    """
    if space_id:
        url = f"{CONFLUENCE_BASE_URL}/wiki/api/v2/pages?space-id={urllib.parse.quote(space_id)}"
    else:
        url = f"{CONFLUENCE_BASE_URL}/wiki/api/v2/pages"
    while url:
        status, data = make_request("GET", url, headers=confluence_headers())
        if status != 200:
            raise RuntimeError(f"Failed to get pages. HTTP {status}")
        results = data.get("results", [])
        for page in results:
            yield page
        next_link = data.get("_links", {}).get("next")
        if next_link:
            if next_link.startswith("http://") or next_link.startswith("https://"):
                url = next_link
            else:
                url = CONFLUENCE_BASE_URL + next_link
        else:
            url = None
def get_page_properties(page_id):
    url = f"{CONFLUENCE_BASE_URL}/wiki/api/v2/pages/{page_id}/properties"
    status, data = make_request("GET", url, headers=confluence_headers())
    if status != 200:
        raise RuntimeError(f"Failed to get properties for page {page_id}. HTTP {status}")
    return data.get("results", [])
def get_page_classification(page_id):
    properties = get_page_properties(page_id)
    for prop in properties:
        if prop.get("key") == "classification":
            value = prop.get("value", {})
            name_value = value.get("name", {}).get("value")
            if name_value:
                return name_value.strip()
    return None
# ============================================================================
# Main processing
# ============================================================================
def main():
    print("Starting page classification sync")
    levels_by_name = get_compliance_levels()
    space_id = None
    if SPACE_KEY_FILTER:
        print(f"Looking up space ID for space key: {SPACE_KEY_FILTER}")
        space_id = get_space_id_from_key(SPACE_KEY_FILTER)
        print(f"Using space ID: {space_id}")
    total_pages = 0
    pages_with_classification = 0
    pages_skipped_no_classification = 0
    pages_skipped_no_matching_level = 0
    pages_applied = 0
    pages_failed = 0
    for page in get_pages(space_id=space_id):
        total_pages += 1
        page_id = page.get("id")
        title = page.get("title", "(untitled)")
        print(f"\nProcessing page {page_id}: {title}")
        try:
            classification = get_page_classification(page_id)
            if not classification:
                print(f"SKIP page {page_id}: no classification property found")
                pages_skipped_no_classification += 1
                continue
            pages_with_classification += 1
            print(f"Found classification: {classification}")
            normalized = classification.strip().lower()
            level = levels_by_name.get(normalized)
            if not level:
                print(f"SKIP page {page_id}: no matching compliance level found for '{classification}'")
                pages_skipped_no_matching_level += 1
                continue
            level_id = level["id"]
            level_name = level["name"]
            print(f"Matched compliance level: {level_name} ({level_id})")
            apply_page_level(page_id, level_id)
            pages_applied += 1
        except Exception as e:
            print(f"FAILED page {page_id}: {e}")
            pages_failed += 1
    print("\nFinished")
    print(f"Total pages seen: {total_pages}")
    print(f"Pages with classification: {pages_with_classification}")
    print(f"Pages skipped (no classification): {pages_skipped_no_classification}")
    print(f"Pages skipped (no matching level): {pages_skipped_no_matching_level}")
    print(f"Pages applied: {pages_applied}")
    print(f"Pages failed: {pages_failed}")
if __name__ == "__main__":
    main()
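Once the script has run, you may wish to cross-check the applied levels against the Step 2 snapshot. A minimal sketch: it matches on space key and title rather than page ID, since page IDs are not guaranteed to survive the migration.

```python
def diff_classifications(expected, actual):
    """Compare the Step 2 snapshot against what is now set in Cloud.
    Both dicts map (spacekey, title) -> classification name; page IDs
    are deliberately not used as keys because they may change during
    migration. Returns the pages that still differ."""
    return {
        key: (level, actual.get(key))
        for key, level in expected.items()
        if actual.get(key) != level
    }

# Example data in the shape of the Step 2 CSV (spacekey, title -> level).
expected = {("HR", "HR Policies and Procedures"): "Confidential",
            ("FIN", "Q4 Financial Forecast"): "Restricted"}
actual = {("HR", "HR Policies and Procedures"): "Confidential",
          ("FIN", "Q4 Financial Forecast"): None}

print(diff_classifications(expected, actual))
```

In practice you would build `expected` from the CSV and `actual` from the restore script's output (or a fresh pass over the Cloud pages).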

 

Step 6: Validate and communicate

Before you wrap up, verify that classifications have been applied correctly and let your users know about the changes.

6.1 Functional checks

  • Labels on pages: Open a sample of pages and confirm the expected classification label is visible.

  • Enforcement rules: Try creating a new page in a space where classification is required. Confirm the user is prompted to classify it.

You can get a handy overview of this information through the Compliance Search and Audit Logs.

Screenshot: The Compliance Search page in Cloud

 

Tip: Compliance for Confluence Cloud now includes AI classification, which can help you identify and suggest classifications for pages that were missed or difficult to categorise manually. Find out more here: Classifying Pages with AI.

 

6.2 Communicate to users

You can now let your users know that Compliance for Confluence Cloud is active!

 

Need help?

If you run into issues at any point during this migration, please raise a ticket with our helpful Service Desk team, who are always happy to help.

You may also find it useful to explore our full documentation pages.