Community Distributed Cache

CDC stands for Community Distributed Cache. It provides a high-performance, scalable, distributed in-memory cache, based on Hazelcast, for both CDA and Mondrian.

Credits: Webdetails Team, Lead - Pedro Alves; De Bortoli Wines

Main Features

  • CDA distributed cache support;
  • Selectively clears the cache of specific Mondrian schemas, cubes or dimensions;
  • Ability to switch between default and CDC cache for CDA and Mondrian;
  • Gracefully handles adding and removing cache nodes;
  • Mondrian distributed cache support;
  • Provides an API to clear the cache from the outside (e.g. after running an ETL job);
  • Provides a view over cluster status;
  • Supports several memory configuration options.

Documentation

Performance is a key point, not only in Business Intelligence software but in user interfaces in general. The goal of CDC is to give a Pentaho implementation based on Mondrian/CDA a distributed caching layer that keeps queries from hitting the database as much as possible.

One added functionality is the ability to clear the cache of only specific Mondrian cubes. Even though Mondrian has a very complete API to control the member cache, Pentaho only exposes a "clean all" function, which ends up being very limiting in production environments.

The cache's ability to survive server restarts is a design bonus and is supported by CDA out of the box. It will be supported by Mondrian as soon as MONDRIAN-1107 is fixed.

CDC Requirements

  • Mondrian 3.4 or later (in Pentaho 4.5);
  • CDA 12.05.15.

CDC Usage

  • Install CDC using either the installer (available soon) or ctools-installer. If you do a manual install, be sure to copy the contents of solution/system/cdc/pentaho/lib to the server's WEB-INF/lib;
  • Download the standalone cache node;
  • Run the standalone cache node (launch-hazelcast.sh) on the same machine as Pentaho or on the same internal network, optionally editing the file first to change the memory settings (the default is 1 GB; increase at will). You can launch as many nodes as you want;
  • Launch Pentaho and click on the CDC button;
  • Enable cache usage on CDA and Mondrian;
  • Restart the Pentaho server;
  • Check that the options on the settings screen are satisfactory. Usually the defaults work fine;
  • Open Analyzer, JPivot or a CDE dashboard that uses CDA and you should see the cache being populated.

Hazelcast has a very good Management Center, so reimplementing those features is outside the scope of CDC.

However, CDC does provide a simple cluster information dashboard that gives an overview of the state of the nodes.

With CDC you can selectively control the contents of the cache, clearing either specific dashboards or specific cubes.

The business case for this is simple: the cache needs to be cleared once new data is available (usually as the result of an ETL job).

CDC lets you do that, and also lets you do it from within the ETL process itself.

CDA

CDC offers a solution navigator for selecting a dashboard. When you select a dashboard, the cache is cleared for all the CDA queries used by that dashboard.

Clicking the URL button gives you a URL that can be called externally (for example from an ETL job). Be aware that you need to add the user credentials when calling it from the outside (e.g. &userid=joe&password=password).
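
As an illustration of calling that URL from an ETL step, here is a minimal Python sketch. It assumes the URL copied from the URL button can be called with a plain HTTP GET and that appending the userid and password parameters shown above is enough to authenticate. The function names and the example.invalid address are hypothetical placeholders, not part of CDC.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def with_credentials(clear_cache_url, userid, password):
    """Return the URL with userid and password added to its query string."""
    parts = urlsplit(clear_cache_url)
    # Keep the parameters already present in the URL, dropping any old credentials.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("userid", "password")
    ]
    query += [("userid", userid), ("password", password)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def clear_cache(clear_cache_url, userid, password, timeout=30):
    """Call the URL (assumed to accept a GET request) and return the HTTP status."""
    url = with_credentials(clear_cache_url, userid, password)
    with urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder address: paste the URL given by the CDC URL button instead.
    print(with_credentials("http://example.invalid/clear?x=1", "joe", "password"))
```

An ETL job could call clear_cache as its last step, once the new data has been loaded. Because the credentials travel in the query string, it is safer to keep such calls on the internal network.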

Mondrian

This works much like the CDA case, but the navigator goes through the available cubes.

You can then clear either the entire schema, a specific cube, or even the individual cell cache for a specific dimension (use this last option with care).

FAQ

What does CDC mean?

CDC stands for Community Distributed Cache.

Is CDC free?

CDC is licensed under the MPLv2 license.
