How to Make Apache Ignite Production Ready
Actionable guidelines to make Apache Ignite production-ready.
This article was originally published at https://www.bugdbug.com/post/how-to-make-apache-ignite-production-ready
In this article, we will discuss the guidelines that should be followed while deploying Apache Ignite in the production environment. It will help you to get the most out of it and will make sure it performs better without any problem. I assume you are familiar with Apache Ignite and its capabilities. If not, it is best to stop here and first go through this official documentation of Apache Ignite and get yourself familiar with it.
Apache Ignite is primarily a in-memory distributed caching and computing framework. It also provides different capabilities like service-grid, messaging etc.
As we know, Ignite uses off-heap to store the cache data. By default, Ignite acquires 20% of the available memory as Off-Heap for cache data storage and all the caches use this storage. By default,
DataStorageRegion is disabled. This will make a system to throw an exception when DataRegion is filled.
Following guidelines should be followed while configuring data storage
FULLY_REPLICATEDcaches perform better for read-operations. Thus, read-intensive caches with moderate size data should use
PARTITIONEDcaches give better performance for write operations. Thus, caches with write-intensive operations should use
- Avoid using Ignite’s Native Persistence for
- To avoid data loss for
PARTITIONEDcaches, one should use Ignite’s native persistence or third party persistence or use proper replication factor.
- Define separate
DataEvictionMode(DISABLED, LRU) and sizes for caches with different needs. Avoid using the same DataRegion for all the caches.
- For partitioned cache, one can also use swap space instead of native persistence or third party persistence.
2. Cluster Discovery
For any distributed system, cluster formation is an essential process. By default, Ignite uses TCP discovery with a multicast IP finder. In most of the production deployments, multicasting is restricted. For elastic clusters, static IpFinder based discovery is not recommended.
Thus, one could use
TcpDiscoveryJdbcIpFinder or others apart from static and multicast.
For clusters with more than 100 nodes, it is advised to use Zookeeper based discovery mechanism.
Cluster security is of utmost importance for data storage systems. Apache Ignite does not provide authentication and authorization mechanism out of the box. Authentication during cluster joining and data access authentication and authorization can be implemented by writing a custom security plugin by following this article.
4. Network partitioning
Apache Ignite, like any other distributed system, follows the CAP theorem. So, in the event of Network partitioning, it can either have Availability or Consistency. Since Apache Ignite is a system with the ACID guarantees, it needs to make sure that data consistency is maintained in the event of partitioning.
Network partitioning could occur because of a network glitch, node failure or long GC pauses. Because of network partitioning, the cluster could be divided into one or more segments with one or more nodes. These segments could act as independent clusters and may cause data inconsistency if kept working and writing data.
Thus, handling network partitioning becomes an issue of the highest priority. Segmentation can be handled by writing a custom network segmentation plugin by following this article.
5. Cache node filter
Every cache created using Apache Ignite other than Local cache is deployed on all the nodes whether it being partitioned or fully replicated.
When you have a large cluster, you do not want all the caches to be deployed on all the cluster nodes. This should specifically be avoided for
We can achieve this by setting node predicate to the caches as follows
Add node attribute through IgniteConfiguration
Use this property in a cache node filter
- Timeouts like
segmentationCheckFrequencyshould be configured properly through
- The default
failureHandlerstops the Ignite node upon failure. For embedded Ignite, this property should be changed.
- The default segmentation policy is
SegmentationPolicy.STOP. This should be changed to an appropriate one.
- IPv4: According to several discussions on Ignite user forum, different IP configuration causes network segmentation. Thus, it is advised to IPv4 over IPv6 which can be configured through java option as follows http://apache-ignite-users.70518.x6.nabble.com/Ignite-Node-failure-Node-out-of-topology-SEGMENTED-td21360.html -
- Ports: Apache Ignite uses multiple network ports for different things like node discovery, node communication, REST client, SQL client. Out of these, node discovery (47500) and node communication (47100) are mandatory for it to function properly.
- Predicate classes: Classes used in all the predicates should be present in the classpath of all the nodes. These classes are not made available through
Peerclassloading. One has to add these classes to the classpath.
- Client Mode: If you do not want any cache partitions on some cluster nodes, add those nodes are client nodes.
- EventListeners: If you have added event listeners for different events like cache events, node events, etc, make sure that user logic is not getting executed in the event thread.
You can refer to this link for more guidelines on production readiness and performance.
If you have more tips and guidelines feel free to share it in comments.
Thanks for reading. If you find this article helpful, share it with friends!