With the advent of SDL Web 8.5, we’ve seen many nice improvements to the software. We’re able to move items in the blueprint hierarchy, we can now set up a whole content delivery environment with a number of powershell scripts, … but in our most recent upgrade, we’ve found a big problem.
Let’s start at the beginning. A few years back, I helped a client build a website which was heavily relying on dynamic data. Each page in their website was constructed using broker queries and dynamic component presentations.
We knew that, when we load a lot of dynamic data, we needed to get our code right in order to get fast page load times (we don’t want our pages to take minutes to load now do we).
Using DD4T running on a Tridion 2013 SP1 environment, we managed to pull it off : blazing fast load times, a fully personalized website, loads of broker queries, life was good.
Over time, we migrated from DD4T 1 to DD4T 2 without too much problems, but recently, we hit some serious problems.
Since support for SDL Tridion 2013 was stopping, we wanted to move everything over to a new and shiny SDL Web 8.5 installation. We installed the new CM, upgraded our databases, installed some microservices, updated the dlls of our DD4T website, and everything compiled, pages still looked the same. Job well done (we thought)…
After some initial testing, we found that pages would load correctly, and once caching kicked in, pages loaded fast. The big problem we noticed were the initial load times. Where in our 2013 environment, initial load times were also relatively fast (with an IIS worker process startup and first .NET compilations, which takes around 15 seconds, the webpages started to show up after 20 seconds on a first load. Subsequent requests were handled in milliseconds (since then most of the content was served from cache).
In our new setup, we noticed that our initial page load for some data intensive pages were taking over a minute to load. After that first request, pages were served in milliseconds again.
Since this load time was unacceptable, and the previous setup (using Tridion 2013) proved we could get quick load times, we started our search for a bottleneck. The codebase hadn’t changed, the only big change we had at first was moving from an in-process broker API to using the Microservices to get the broker content.
When trying to find what caused such load times, we noticed that in some cases (when we were loading multiple pages at once for the first time, so little or no content was cached), our DB servers CPU spiked to 100 %, and a bunch of broker DB queries were starting to get queued, until pages started to timeout. Using SQL Profiler and the execution plans, we were able to find a number of expensive queries. The SQL activity monitor also suggested some index optimizations that would make the queries quicker.
With the approval of SDL Support, we added those indexes to the Broker DB, and after that we did notice some improvements. CPU spikes were gone, pages started to load quicker, life looked a little better again, until we tested one more page.
This page alone had again a startup time of over one minute. So something else must be wrong. Again we analysed the queries, and now also compared the queries and execution times of our old environment.
We finally found this huge broker query (basically, the query selects one or more components, based on some taxonomy value criteria and a schema id)
declare @p1 int; exec sp_prepexec @p1 output,N'@P0 int,@P1 int,@P2 int,@P3 int,@P4 int, @P5 nvarchar(4000),@P6 int,@P7 int,@P8 nvarchar(4000),@P9 int,@P10 int, @P11 nvarchar(4000),@P12 nvarchar(4000),@P13 int,@P14 int,@P15 int, @P16 int,@P17 int,@P18 nvarchar(4000),@P19 float,@P20 int,@P21 int, @P22 nvarchar(4000)',N'select distinct TOP(@P0) itemmeta0_.PUBLICATION_ID as col_0_0_, itemmeta0_.ITEM_REFERENCE_ID as col_1_0_, itemmeta0_.ITEM_TYPE as col_2_0_, itemmeta0_.LAST_PUBLISHED_DATE as col_3_0_ from ITEMS itemmeta0_ cross joinITEM_CATEGORIES_AND_KEYWORDS relatedkey1_ cross join TAXFACETS taxonomyit2_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey3_ cross join TAXFACETS taxonomyit4_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey5_ cross join TAXFACETS taxonomyit6_ cross joinCOMPONENT componentm7_ inner join ITEMS componentm7_1_ on componentm7_.PUBLICATION_ID=componentm7_1_.PUBLICATION_ID and componentm7_.NAMESPACE_ID=componentm7_1_.NAMESPACE_ID and componentm7_.ITEM_REFERENCE_ID=componentm7_1_.ITEM_REFERENCE_ID cross join TAXFACETStaxonomyit8_ cross join TAXFACETS taxonomyit9_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey10_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey11_ cross join TAXFACETS taxonomyit12_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey13_ cross join CUSTOM_METAcustommeta14_ where itemmeta0_.PUBLICATION_ID=@P1 and itemmeta0_.ITEM_REFERENCE_ID=componentm7_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=componentm7_.PUBLICATION_ID and componentm7_.SCHEMA_ID=@P2 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey5_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey5_.PUBLICATION_ID and relatedkey5_.KEYWORD_ID=taxonomyit6_.FACET_ID and relatedkey5_.PUBLICATION_ID=taxonomyit6_.PUBLICATION_ID and relatedkey5_.PUBLICATION_ID=@P3 and taxonomyit6_.TAXONOMY_ID=@P4 and taxonomyit6_.FACET_NAME=@P5 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey3_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey3_.PUBLICATION_ID and relatedkey3_.KEYWORD_ID=taxonomyit4_.FACET_ID and relatedkey3_.PUBLICATION_ID=taxonomyit4_.PUBLICATION_ID and relatedkey3_.PUBLICATION_ID=@P6 and taxonomyit4_.TAXONOMY_ID=@P7 and taxonomyit4_.FACET_KEY=@P8 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey1_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey1_.PUBLICATION_ID and relatedkey1_.KEYWORD_ID=taxonomyit2_.FACET_ID and relatedkey1_.PUBLICATION_ID=taxonomyit2_.PUBLICATION_ID and relatedkey1_.PUBLICATION_ID=@P9 and taxonomyit2_.TAXONOMY_ID=@P10 and taxonomyit2_.FACET_KEY=@P11 and taxonomyit8_.PUBLICATION_ID=itemmeta0_.PUBLICATION_ID and taxonomyit9_.TAXONOMY_ID=taxonomyit8_.TAXONOMY_ID and taxonomyit9_.PUBLICATION_ID=taxonomyit8_.PUBLICATION_ID and itemmeta0_.ITEM_REFERENCE_ID=relatedkey10_.ITEM_REFERENCE_ID and relatedkey10_.KEYWORD_ID=taxonomyit9_.FACET_ID and taxonomyit8_.FACET_KEY=@P12 and taxonomyit8_.TAXONOMY_ID=@P13 and taxonomyit8_.PUBLICATION_ID=@P14 and taxonomyit9_.FACET_LEFT>=taxonomyit8_.FACET_LEFT and taxonomyit9_.FACET_RIGHT<=taxonomyit8_.FACET_RIGHT and itemmeta0_.ITEM_REFERENCE_ID=relatedkey13_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey13_.PUBLICATION_ID and relatedkey13_.KEYWORD_ID=@P15 and relatedkey13_.TAXONOMY_ID=@P16 and relatedkey13_.PUBLICATION_ID=@P17 and itemmeta0_.PUBLICATION_ID=custommeta14_.PUBLICATION_ID and itemmeta0_.ITEM_REFERENCE_ID=custommeta14_.ITEM_ID and itemmeta0_.ITEM_TYPE=custommeta14_.ITEM_TYPE and custommeta14_.KEY_NAME=@P18 and custommeta14_.KEY_FLOAT_VALUE=@P19 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey11_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey11_.PUBLICATION_ID and relatedkey11_.KEYWORD_ID=taxonomyit12_.FACET_ID and relatedkey11_.PUBLICATION_ID=taxonomyit12_.PUBLICATION_ID and relatedkey11_.PUBLICATION_ID=@P20 and taxonomyit12_.TAXONOMY_ID=@P21 and taxonomyit12_.FACET_KEY=@P22 order by itemmeta0_.LAST_PUBLISHED_DATE DESCÂ Â Â Â Â Â Â Â ',6,20,1815,20,79,N'2017',20,121,N'09',20,85,N'XXXX',N'1',183,20,2045,77,20,N'xxxxx',1,20,76,N'xxxxx'
We could also retrieve the similar query of our old environment, doing the exact same thing.
exec sp_prepexec @p1 output,N'@P0 int,@P1 int,@P2 int,@P3 int,        @P4 nvarchar(4000),@P5 int,@P6 int,@P7 nvarchar(4000),@P8 int,@P9 int, @P10 nvarchar(4000),@P11 nvarchar(4000),@P12 int,@P13 int,@P14 int, @P15 int,@P16 int,@P17 nvarchar(4000),@P18 float,@P19 int,@P20 int, @P21 nvarchar(4000)', N'select distinct top 6   itemmeta0_.PUBLICATION_ID as col_0_0_, itemmeta0_.ITEM_REFERENCE_ID as col_1_0_, itemmeta0_.ITEM_TYPE as col_2_0_, itemmeta0_.LAST_PUBLISHED_DATE as col_3_0_ from ITEMS itemmeta0_ cross join TAXFACETS taxonomyit1_ cross join TAXFACETS taxonomyit2_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey3_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey4_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey5_ cross join TAXFACETS taxonomyit6_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey7_ cross join TAXFACETS taxonomyit8_ cross join COMPONENT componentm9_ inner join ITEMS componentm9_1_ on componentm9_.PUBLICATION_ID=componentm9_1_.PUBLICATION_ID and componentm9_.ITEM_REFERENCE_ID=componentm9_1_.ITEM_REFERENCE_ID cross join CUSTOM_META custommeta10_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey11_ cross join TAXFACETS taxonomyit12_ cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey13_ cross join TAXFACETS taxonomyit14_ where itemmeta0_.PUBLICATION_ID=@P0 and itemmeta0_.ITEM_REFERENCE_ID=componentm9_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=componentm9_.PUBLICATION_ID and componentm9_.SCHEMA_ID=@P1 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey13_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey13_.PUBLICATION_ID and relatedkey13_.KEYWORD_ID=taxonomyit14_.FACET_ID and relatedkey13_.PUBLICATION_ID=taxonomyit14_.PUBLICATION_ID and relatedkey13_.PUBLICATION_ID=@P2 and taxonomyit14_.TAXONOMY_ID=@P3 and taxonomyit14_.FACET_NAME=@P4 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey11_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey11_.PUBLICATION_ID and relatedkey11_.KEYWORD_ID=taxonomyit12_.FACET_ID and relatedkey11_.PUBLICATION_ID=taxonomyit12_.PUBLICATION_ID and relatedkey11_.PUBLICATION_ID=@P5       and taxonomyit12_.TAXONOMY_ID=@P6 and taxonomyit12_.FACET_KEY=@P7 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey7_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey7_.PUBLICATION_ID and relatedkey7_.KEYWORD_ID=taxonomyit8_.FACET_ID and relatedkey7_.PUBLICATION_ID=taxonomyit8_.PUBLICATION_ID and relatedkey7_.PUBLICATION_ID=@P8 and taxonomyit8_.TAXONOMY_ID=@P9 and taxonomyit8_.FACET_KEY=@P10 and taxonomyit1_.PUBLICATION_ID=itemmeta0_.PUBLICATION_ID and taxonomyit2_.TAXONOMY_ID=taxonomyit1_.TAXONOMY_ID and taxonomyit2_.PUBLICATION_ID=taxonomyit1_.PUBLICATION_ID and itemmeta0_.ITEM_REFERENCE_ID=relatedkey3_.ITEM_REFERENCE_ID and relatedkey3_.KEYWORD_ID=taxonomyit2_.FACET_ID and taxonomyit1_.FACET_KEY=@P11 and taxonomyit1_.TAXONOMY_ID=@P12 and taxonomyit1_.PUBLICATION_ID=@P13 and taxonomyit2_.FACET_LEFT>=taxonomyit1_.FACET_LEFT and taxonomyit2_.FACET_RIGHT<=taxonomyit1_.FACET_RIGHT and itemmeta0_.ITEM_REFERENCE_ID=relatedkey4_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey4_.PUBLICATION_ID and relatedkey4_.KEYWORD_ID=@P14 and relatedkey4_.TAXONOMY_ID=@P15 and relatedkey4_.PUBLICATION_ID=@P16 and itemmeta0_.PUBLICATION_ID=custommeta10_.PUBLICATION_ID and itemmeta0_.ITEM_REFERENCE_ID=custommeta10_.ITEM_ID and itemmeta0_.ITEM_TYPE=custommeta10_.ITEM_TYPE and custommeta10_.KEY_NAME=@P17 and custommeta10_.KEY_FLOAT_VALUE=@P18 and itemmeta0_.ITEM_REFERENCE_ID=relatedkey5_.ITEM_REFERENCE_ID and itemmeta0_.PUBLICATION_ID=relatedkey5_.PUBLICATION_ID and relatedkey5_.KEYWORD_ID=taxonomyit6_.FACET_ID and relatedkey5_.PUBLICATION_ID=taxonomyit6_.PUBLICATION_ID and relatedkey5_.PUBLICATION_ID=@P19 and taxonomyit6_.TAXONOMY_ID=@P20 and taxonomyit6_.FACET_KEY=@P21 order by itemmeta0_.LAST_PUBLISHED_DATE DESC' ,20,1815,20,79,N'2017',20,121,N'09',20,85,N'XXXX',N'1',183,20,2045, 77,20,N'XXXXX',1,20,76,N'XXXXX'
And then, we compared execution times (both queries running on the same DB server).
quick tip: to do that, just add this before and after your query
set statistics time on <your query here> set statistics time off
Using this, we got back these numbers:
SDL Web 8.5 query :
SQL Server Execution Times:
  CPU time = 2219 ms, elapsed time = 3547 ms.
SDL Tridion 2013 SP1 query :
SQL Server Execution Times:
  CPU time = 172 ms, elapsed time = 204 ms.
So, our query was performing over 10 times slower than the original one, this wasn’t right.
Analysing both queries, we found two major differences:
- The order of the joins
- The extra condition in the inner join
“…and componentm7_.NAMESPACE_ID=componentm7_1_.NAMESPACE_ID..â€
Since variations of this query were executed several times (since we needed to load different blocks of content dynamically from the broker, this explained most of the load times issues we were seeing.
So, now the task was to find out the problem with this query.
Since these queries are generated by the Broker API using Hibernate, we can’t really change them to optimize them, so we forwarded all this information to SDL support, so hopefully they could provide us with an answer.
Off course, we took it one step further.
We couldn’t change those queries, but we could still play with them to see if they can be improved. After some playing around with the query, we found that the extra condition was causing all of the delay. Removing the namespace_ID check in the inner join seemed to speed things up. Instead of 3 seconds to run, the query now also ran in milliseconds (yay!!!).
Now, this didn’t really help us, since we still couldn’t really change the broker API generating that sql statement, so we looked a bit further.
We couldn’t change those queries, but we could try to change the DB (at least the indexes). If we could add an index to optimize that query, and SDL Support would approve this fix (or come up with an alternative fix), our problems would be solved.
So, after some more testing, we found that adding this index to the Component table
CREATE NONCLUSTERED INDEX [index_name_here] ON [dbo].[COMPONENT] ( [ITEM_REFERENCE_ID] ASC, [PUBLICATION_ID] ASC, [NAMESPACE_ID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO
Adding this made the performance of the query as fast as the Tridion 2013 version.
As a result, suddenly our problematic page also loaded in 20 seconds (on first load, without anything cached, and iis starting up), instead of in a minute.
I’m not sure if this is the final fix, as we’re still waiting for SDL support to come back with a final answer. Once I get more information, i’ll update this post.
Disclaimer :
Changing any of the Tridion CM or Broker databases is not a good practice and is not recommended, and voids your support contract with SDL. Any of the techniques described in this post were done on a development environment, with a backup of the original broker DB.
After we found any improvements by adding indexes, we first consulted SDL Support to get approval to install these. If you ever want to do this as well, first contact SDL support.
Probably worth noting that whilst the indexing improves performance it doesn’t address the root cause; the root cause is _likely_ to be some rogue taxonomy sub-queries and SDL Support are looking to resolve this at the core product level.
Also I believe that we are OK to update the broker database (within reason) but the CM database is a great big no no.