This article is a highlight of my work at FocusGTS.
Abstract: In the realm of Adobe Experience Manager (AEM), nested layout components can often lead to complex scenarios that are not typically addressed in standard documentation. This article delves into a real-world case encountered at a client’s site, discussing the necessary queries, data manipulation, and the engineering decisions that followed.
Nested Nightmares: Tackling AEM Migration Challenges
There are numerous sites that explain how to run queries in AEM, but many do not address real-world use cases or what you may still need to do to manipulate your data afterward. In this article, I’m going to talk about a situation I encountered while working with a client, the queries we needed, and the data manipulation required to help us make business decisions and execute our engineering work.
Note: I have changed the names of all paths, components, etc.
Why Nested Layout Components in AEM Cause Headaches
Many AEM authors don’t care about the underlying technology; they want a CMS that makes it easy to build pages out of the box. However, every company I’ve worked for that uses AEM has customized it heavily, with unique features that create their own specific challenges when it comes to upgrading versions of AEM or migrating to the Adobe Cloud Services.
For this client, authors relied heavily on nesting layout components within each other to add extra space on a page, to control column width, or even to add padding around components. Yes, there are and were better ways to build these pages. That is a discussion of content strategy and author training that I won’t get into here. These sites consisted of hundreds of thousands of pages, with no ability to proactively reauthor them in a timely or cost-effective manner.
The bottom line was that the client’s engineering team had a challenge where their migration scripts would fail on certain pages with excessive nested elements. So we needed a way to identify which pages would be impacted and flag those separately to handle outside of our standard scripts.
To do this we would need to:
- Identify our sites for migration
- Identify what components were used in each site
- Identify which of these layout components were nested
- Flag those that are nested as potentially problematic.
Step One: Identify Component Paths in AEM Instance
To do this, we go into AEM’s CRXDE Lite, select Tools, then Query from the dropdown menu. Change the Type to SQL2 instead of XPath and use a query like the one below:
SELECT *
FROM [nt:unstructured] AS comp
INNER JOIN [cq:Page] AS page ON ISDESCENDANTNODE(comp, page)
INNER JOIN [cq:PageContent] AS jcrcontent ON ISCHILDNODE(jcrcontent, page)
WHERE ISDESCENDANTNODE(page, '/content/my/site/path')
AND jcrcontent.[cq:lastReplicationAction] = 'Activate'
AND (
comp.[sling:resourceType] = 'foundation/components/content/name1'
OR comp.[sling:resourceType] = 'foundation/components/content/name2'
OR comp.[sling:resourceType] = 'foundation/components/content/name3'
-- Continue adding all components
)
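With dozens of components per site, typing every OR line by hand gets tedious. As a minimal sketch, assuming you keep the resource types in a plain Python list, the query text can be generated and then pasted into the Query tool. The function names here are hypothetical and not part of AEM.

```python
def quote_literal(value):
    """Wrap a value in single quotes, doubling any embedded single quote
    (the usual SQL-style escape)."""
    return "'" + value.replace("'", "''") + "'"


def build_component_query(site_path, resource_types):
    """Return a query string shaped like the JCR-SQL2 query above."""
    if not resource_types:
        raise ValueError("at least one resource type is required")
    # One equality test per component, joined with OR.
    type_tests = "\n    OR ".join(
        "comp.[sling:resourceType] = " + quote_literal(rt)
        for rt in resource_types
    )
    return "\n".join([
        "SELECT *",
        "FROM [nt:unstructured] AS comp",
        "INNER JOIN [cq:Page] AS page ON ISDESCENDANTNODE(comp, page)",
        "INNER JOIN [cq:PageContent] AS jcrcontent ON ISCHILDNODE(jcrcontent, page)",
        "WHERE ISDESCENDANTNODE(page, " + quote_literal(site_path) + ")",
        "AND jcrcontent.[cq:lastReplicationAction] = 'Activate'",
        "AND (",
        "    " + type_tests,
        ")",
    ])


if __name__ == "__main__":
    print(build_component_query(
        "/content/my/site/path",
        [
            "foundation/components/content/name1",
            "foundation/components/content/name2",
        ],
    ))
```

This only builds the text of the query; running it still happens in the Query tool as described above.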
For each site path, this returned anywhere from tens of thousands to hundreds of thousands (or more) of component paths, which looked like the following nested and non-nested examples.
Not Nested Example:
/content/my/site/some/page/about/queries/jcr:content/content-par/component-frame/component-column-par/column-2/difcomponent3/difcomponent2/other/form
Nested Example:
/content/my/site/some/page/about/queries/jcr:content/content-par/component-frame/component-column-par/column-1/difcomponent2/difcomponent2/component-frame/component-column-par/column-2/othercomponent
Step Two: Setup Data Template
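From here we exported this data to Excel, though you can use Google Sheets as well. Note that the formulas in Step Three (SPLIT, QUERY, REGEXMATCH, and so on) are written in Google Sheets syntax.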
Each site report was kept in its own specific document to segment its results as well as ensure we stayed within manageable file and cell size limits.
We set up each spreadsheet with the following cell headers:
A: Raw Path | B: Components | C: Unique List | D: Final | E: Total | F: Is it nested? | G: How Many Nested? | H: Paths no JCR | I: Unique Paths with NLC | J: Count per Page | K: Total With Nest | L: Paths with NO NLC | M: Unique Paths with No NLC | N: Total No Nest
Step Three: Formulas and Data Manipulation
Now, with our data in column A of the spreadsheet, we needed to identify which components were used in each site. In column B we used the formula:
=REGEXREPLACE(INDEX(SPLIT(A2, "/"), COLUMNS(SPLIT(A2, "/"))), "\d", "")
This takes the last segment of each path and strips any digits from it. Then, in column C, we simply used:
=UNIQUE(B2:B)
to ensure we listed each value only once.
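If you would rather script this step than use a spreadsheet, here is a rough Python sketch of the same idea: take the last path segment, strip the digits, and keep each name once. The function names are my own for illustration, and the sample paths in the comment are made up.

```python
import re


def component_name(raw_path):
    """Return the last path segment with all digits removed."""
    last_segment = raw_path.rstrip("/").rsplit("/", 1)[-1]
    return re.sub(r"\d", "", last_segment)


def unique_components(raw_paths):
    """Return each component name once, in first-seen order."""
    return list(dict.fromkeys(component_name(p) for p in raw_paths))


# Example with made-up paths:
# unique_components(["/a/jcr:content/form1", "/a/jcr:content/form2"])
# gives ["form"], because both names collapse once the digits are removed.
```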
To identify which components on pages were nested, we used the formula below, which returns TRUE or FALSE depending on whether a component-frame sits within another component-frame:
=REGEXMATCH(A2, "component-frame.*component-frame")
To count how many component paths were nested (the rows that returned TRUE), we use:
=COUNTIF(F:F, TRUE)
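The same check and count can be sketched in Python. This assumes, as the formula does, that the layout component appears in the path as the literal text component-frame; note that the pattern is a plain substring match, so a differently named component that merely contains that text would also count. The function names are hypothetical.

```python
import re

# Same pattern as the REGEXMATCH formula: one component-frame
# followed, anywhere later in the path, by another.
NESTED_FRAME = re.compile(r"component-frame.*component-frame")


def is_nested(raw_path):
    """Return True when the path contains a frame inside a frame."""
    return NESTED_FRAME.search(raw_path) is not None


def count_nested(raw_paths):
    """Count the paths flagged as nested, like COUNTIF(F:F, TRUE)."""
    return sum(1 for path in raw_paths if is_nested(path))
```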
To identify the content paths without the JCR portion, we use:
=LEFT(A2, FIND("jcr:", A2) - 1)
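In Python, the equivalent is a simple slice. As with the formula, this sketch keeps everything before the first occurrence of jcr: (including the trailing slash) and treats a path without that marker as an error. The function name is an assumption for illustration.

```python
def page_path(raw_path, marker="jcr:"):
    """Return the part of raw_path that comes before the JCR marker."""
    index = raw_path.find(marker)
    if index == -1:
        # FIND would also fail here, so surface the problem explicitly.
        raise ValueError("no " + marker + " segment in " + raw_path)
    return raw_path[:index]
```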
To identify which unique paths had Nested Layout Components, and how many each one contained, we use:
=QUERY(F2:H, "SELECT H, COUNT(F) WHERE F = TRUE GROUP BY H LABEL COUNT(F) ''")
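For anyone less familiar with QUERY, the grouping it performs can be sketched in Python with a Counter. The input below is assumed to be pairs of (nested flag, page path), mirroring columns F and H; the function name and the sample rows are hypothetical.

```python
from collections import Counter


def nested_counts_per_page(rows):
    """Count nested rows per page path.

    rows is an iterable of (is_nested, page_path) pairs.
    Returns a list of (page_path, count) tuples sorted by path.
    """
    counts = Counter(page for nested, page in rows if nested)
    return sorted(counts.items())


if __name__ == "__main__":
    sample = [
        (True, "/content/my/site/page-a/"),
        (False, "/content/my/site/page-a/"),
        (True, "/content/my/site/page-a/"),
        (True, "/content/my/site/page-b/"),
    ]
    for page, count in nested_counts_per_page(sample):
        print(page, count)
```

Pages with no nested rows do not appear in the result at all, which matches the WHERE F = TRUE filter in the formula.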
To count the total number of unique paths with Nested Layout Components, we used:
=COUNTA(I2:I)
Paths with no Nested Layout Components:
=FILTER(H2:H, ISERROR(MATCH(H2:H, I:I, 0)))
Identify the Unique Paths:
=UNIQUE(L2:L)
Then the Count:
=COUNTA(M2:M)
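These last three formulas (filter, de-duplicate, count) collapse into a few lines of Python. This is a sketch under the assumption that you already have the full list of page paths and the list of pages known to contain nesting; the function name is my own.

```python
def pages_without_nesting(all_page_paths, nested_page_paths):
    """Return unique page paths that never appear in the nested list,
    in first-seen order."""
    nested = set(nested_page_paths)
    return list(
        dict.fromkeys(p for p in all_page_paths if p not in nested)
    )


# The final count is then simply:
# len(pages_without_nesting(all_page_paths, nested_page_paths))
```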
So how did this data help us make a business decision?
We knew that the migration script would fail if a page had over a certain threshold of Nested Layout Components within it.
For example, if 30 or more Nested Layout Components on a page would cause the scripts to fail, then using the “Count per Page” column next to “Unique Paths with NLC” we could identify all pages with 30 or more Nested Layout Components and exclude those from our initial script run.
Alternatively, for pages with several hundred occurrences of nesting, we could flag those to authors so they could review the page(s) and be aware that the pages may not migrate as expected in their current state.
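To make the decision step concrete, here is a minimal sketch of splitting pages by a threshold. The value 30 is only the hypothetical figure from the example above, not the client's real limit, and the function name is an assumption.

```python
def split_by_threshold(counts_per_page, fail_threshold=30):
    """Split pages into (safe, excluded) lists.

    counts_per_page maps a page path to its nested component count.
    Pages at or above fail_threshold are excluded from the first run.
    """
    safe, excluded = [], []
    for page, count in counts_per_page.items():
        if count >= fail_threshold:
            excluded.append(page)
        else:
            safe.append(page)
    return safe, excluded
```

The excluded list is what we would hand off for separate handling or author review, while the safe list goes through the standard migration scripts.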