Over the past eight years, Zia has delivered many successful Alfresco health checks. Because every environment is unique, as are the desired outcomes from the analysis we perform, there is no single solution to apply across the board.
However, there are some issues we run into regularly. In this series of articles, we’ll outline the more common ones and share some tips for addressing them.
Hosting Environment
Many of our customers host Alfresco and Solr on virtual machines. Virtual infrastructure tends to provide significant cost savings because software rarely uses all of the resources available to it. The virtualization tooling provides a way to increase resource utilization per host, so less physical hardware needs to be purchased.
Some of our customers are not aware of the nuanced implications of this resource sharing, which can result in very hard-to-diagnose issues with availability, performance, and responsiveness.
One example of an issue we run into regularly is with Java garbage collection. Most of the time, it does an amazing job of simplifying memory management and Alfresco Content Services uses it to great effect. However, when Alfresco is running on virtual infrastructure which reports a fixed amount of memory to the Java Virtual Machine (JVM), and then some of that memory is borrowed by other virtual machines on the same host, all hell can break loose.
Issues we’ve seen related to this:
- System freezes, usually lasting a few seconds, though we’ve even seen them last for hours
- CPU spikes, often related to the garbage collector needing to stop the world in order to sort out memory when its model of available resources doesn’t match the actual resources provided by the virtual machine
- Alfresco process death due to out of memory conditions—this can even happen when it appears that there is 50% or more memory free!
- Cluster flapping where nodes repeatedly disconnect and reconnect with each other, resulting in widespread cache invalidations which can be a significant performance hit
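One rough way to tell a garbage collection pause from a stall imposed from outside the JVM is to time a short sleep and compare any overshoot with the collection time the JVM reports for the same window. The sketch below is only an illustration under stated assumptions: the class name `PauseProbe`, the 100 ms sampling interval, and the 500 ms threshold are hypothetical choices of ours, not Alfresco settings. It only sees garbage collection in the JVM it runs in, and the collection time reported by some collectors includes concurrent work, so treat the numbers as approximate.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class PauseProbe {

    /** Accumulated collection time in ms across all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** How much longer than requested a sleep took, in ms (never negative). */
    static long stallMillis(long requestedMs, long elapsedNanos) {
        return Math.max(0, elapsedNanos / 1_000_000 - requestedMs);
    }

    public static void main(String[] args) throws InterruptedException {
        final long intervalMs = 100;  // hypothetical sampling interval
        final long thresholdMs = 500; // hypothetical "this was a freeze" threshold
        long lastGc = totalGcMillis();

        for (int i = 0; i < 50; i++) { // roughly five seconds of sampling
            long start = System.nanoTime();
            Thread.sleep(intervalMs);
            long stall = stallMillis(intervalMs, System.nanoTime() - start);
            long gcNow = totalGcMillis();
            if (stall >= thresholdMs) {
                // A long stall with little GC time in the same window suggests
                // the pause came from outside the JVM.
                System.out.printf("stall of %d ms, GC time in window: %d ms%n",
                        stall, gcNow - lastGc);
            }
            lastGc = gcNow;
        }
    }
}
```

Run as a standalone process, this can only notice stalls that affect the whole virtual machine. To observe pauses of a specific application, the same loop would have to run on a thread inside that application's JVM.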
Fortunately, virtual infrastructure allows you to pin resources to a virtual machine so they are no longer shared with other virtual machines on the same host. For Java applications, it is critical to allocate an appropriate amount of memory to both the virtual machine and the JVM, and to pin the virtual machine’s memory.
Tip: When running Java applications on virtual infrastructure you must pin virtual machine memory!
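Pinning itself is configured in the hypervisor and cannot be verified from inside the guest, but the sizing half of this advice can be sanity-checked from Java. The following is a minimal sketch, assuming a HotSpot-style JVM that exposes `com.sun.management.OperatingSystemMXBean`. The class name `HeapHeadroom` and the 75% warning threshold are hypothetical values chosen for illustration, not a recommendation for any particular Alfresco deployment.

```java
import java.lang.management.ManagementFactory;

public class HeapHeadroom {

    /** Fraction of physical memory that the maximum heap would occupy. */
    static double heapFraction(long maxHeapBytes, long physicalBytes) {
        if (physicalBytes <= 0) {
            throw new IllegalArgumentException("physicalBytes must be positive");
        }
        return (double) maxHeapBytes / physicalBytes;
    }

    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use (the effective -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();

        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        if (!(os instanceof com.sun.management.OperatingSystemMXBean)) {
            System.out.println("Physical memory size is not available on this JVM");
            return;
        }

        // Memory the guest OS reports. This is what the JVM believes it has,
        // whether or not the hypervisor actually backs it with real RAM.
        // (Deprecated in newer JDKs in favor of getTotalMemorySize, but still present.)
        long physical = ((com.sun.management.OperatingSystemMXBean) os)
                .getTotalPhysicalMemorySize();

        double fraction = heapFraction(maxHeap, physical);
        System.out.printf("max heap uses %.0f%% of reported memory%n", fraction * 100);

        final double warnAbove = 0.75; // hypothetical threshold, tune for your setup
        if (fraction > warnAbove) {
            System.out.println("Little headroom left for non-heap memory and the OS");
        }
    }
}
```

The number it prints reflects only what the guest operating system reports, which is exactly the figure that becomes unreliable when the virtual machine's memory is not pinned.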
Some bonus tips related to the hosting environment:
- Generally, databases should be run on bare metal and should have fast storage as well as sufficient RAM to support a good level of caching for your use cases
- The Solr index is a specialized database, so the previous point applies to it as well. However, it’s often OK to run it on virtual infrastructure, and its disk should be crazy fast (ideally local SSDs)
- Database files (including the Solr index) tend to be very sensitive to tooling that actively monitors or interferes with read/write operations. So, for example, we do not recommend running antivirus software against these parts of your filesystems
In our next post, we’ll discuss JVM-provided memory management capabilities. If you have questions about a hosting environment, feel free to contact us today.