Morphogenetic Engineering at Internet Scale - Aerospike Case Study
28 Feb 2015
This is part of a continuing series of posts and case studies where we examine technologies eGloo is utilizing to develop self-programming and self-assembling information systems at Internet scale.
Technologies in Focus: AerospikeDB
Brief
Most information systems benefit from organizing the I/O stack around the structure of the stored data, its internal and external relationships, and its read/write demand within the system. We have specific business requirements for responsiveness, auto-scaling, fault tolerance and programmatic interfaces that lend themselves to morphogenetic programming. A fast, scalable NoSQL key-value store is therefore an important component of our larger information storage and delivery solution. After reviewing several options, we found Aerospike to be the clear choice for that role.
Objectives
- Internet scale key-value / NoSQL store
- Response time of <=5 ms (excluding network latency)
- Linear scaling of QPS against deployed node count
- XDR (cross-datacenter replication)
- Maximum economic efficiency per node
- Persistence
- Clustering
- Automatic Sharding
- Fault tolerance
- Python Support
- Low TCO
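The latency and linear-scaling objectives above can be turned into simple checks. The following is a minimal sketch, not a benchmark: the function names, the choice of the 99th percentile, and all figures in the usage note are our own illustrative assumptions, and real clusters rarely scale perfectly linearly.

```python
import math


def nodes_needed(target_qps, per_node_qps):
    """Node count for a target load, assuming ideal linear scaling
    (cluster QPS = node count * per-node QPS)."""
    if target_qps < 0 or per_node_qps <= 0:
        raise ValueError("target_qps must be >= 0 and per_node_qps > 0")
    return max(1, math.ceil(target_qps / per_node_qps))


def percentile(samples, percent):
    """Nearest-rank percentile; percent is an integer from 1 to 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (percent * len(ordered) + 99) // 100  # integer ceiling
    return ordered[max(rank, 1) - 1]


def meets_latency_budget(samples_ms, budget_ms=5.0, percent=99):
    """True if the chosen percentile of server-side latencies fits the budget.

    The 99th percentile is an assumption; the objective only states <=5 ms.
    """
    return percentile(samples_ms, percent) <= budget_ms
```

For example, with a hypothetical per-node throughput of 100,000 QPS, `nodes_needed(1_000_000, 100_000)` suggests ten nodes, and `meets_latency_budget` can be fed latencies measured on the server side so that network time is excluded.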
Challenges
As a startup, our challenges are both technical and existential. In short, we must:
- Engineer for the next order of magnitude
- Limit technical debt
- Optimize burn rate
Aerospike allows us to achieve our stated objectives while mitigating the very real challenges of being a startup.
Other Technologies Considered: Memcached, Redis, MongoDB
Before going into the reasons we settled on AerospikeDB, it's worth discussing the reasons we dismissed three of its primary competitors for our use case.
Memcached
Memcached is an old tried-and-true workhorse. Its biggest benefits are:
- Simplicity
- Speed
- No licensing cost
However, there are several drawbacks for our use case:
- Strict value size / data type limits
- Scaling requires tight coupling to application design decisions
- Scaling requires planning / maintenance / downtime
- XDR requires special considerations, planning and engineering; not built-in
- Not Fault-Tolerant
- Not Persistent
- No support for clustering
- Relatively high TCO when factoring in application engineering and infrastructure maintenance
- Does not utilize SSD, limiting vertical scaling and increasing need for horizontal scaling
Given our desire for a high degree of automation, low TCO and morphogenetic application engineering, memcached is not the best solution for us.
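To illustrate why scaling a client-sharded cache is coupled to application design, here is a minimal sketch of the simplest scheme, in which the application picks a server by hashing the key modulo the node count. The node names and key format are hypothetical, and SHA-1 is used only as a convenient stable hash. Many client libraries offer consistent hashing to soften this effect, but the node list still lives in the application.

```python
import hashlib


def node_for(key, nodes):
    """Pick a cache node on the client side: hash(key) mod node count."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def fraction_remapped(keys, old_nodes, new_nodes):
    """Share of keys whose owning node changes between two node lists."""
    moved = sum(
        1 for key in keys
        if node_for(key, old_nodes) != node_for(key, new_nodes)
    )
    return moved / len(keys)


# Hypothetical deployment: three cache nodes, then a fourth is added.
keys = [f"user:{i}" for i in range(10_000)]
before = ["cache-a", "cache-b", "cache-c"]
after = before + ["cache-d"]

# Most keys now map to a different node, so their cached values are
# effectively lost until the cache warms up again.
print(f"{fraction_remapped(keys, before, after):.0%} of keys remapped")
```

With modulo placement, going from three to four nodes is expected to remap roughly three quarters of the keys, which is why adding capacity has to be planned at the application level.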
Redis
Redis has a lot of really great features going for it, including:
- Simplicity
- Speed
- Flexibility in data value size / type
- Persistent
- Fault Tolerant
- No licensing cost
However, it has drawbacks similar to memcached:
- Scaling requires tight coupling to application design decisions
- Scaling requires planning / maintenance / downtime
- XDR requires special considerations, planning and engineering; not built-in
- Fault-Tolerance requires planning and maintenance, and each available replication mode has its own trade-offs
- Persisted data set limited to the size of available memory
- Relatively high TCO when factoring in application engineering and infrastructure maintenance
- Does not utilize SSD, limiting vertical scaling and increasing need for horizontal scaling
MongoDB
MongoDB has become the poster child for NoSQL databases. Its biggest benefits are:
- Simplicity (application programming)
- Flexibility in data value size / type
- Persistent
- Fault Tolerant
- Bills itself as "data center aware"
- No licensing cost
We gave MongoDB serious consideration because our engineers have extensive past experience running it in production. However, drawbacks include:
- Scaling requires tight coupling to application design decisions
- Scaling requires planning / maintenance / downtime
- XDR requires special considerations, planning and engineering; not built-in
- Fault-Tolerance requires planning and maintenance
- Relatively high TCO when factoring in application engineering and infrastructure maintenance
- Performance drops off at scale
- In our experience, MongoDB is often buggy and more prone to failure than any of the other solutions we tried
Solution Overview
After much research, experimentation and speaking with other users, we settled on AerospikeDB. Its biggest advantages are:
- Flexibility in data value size / type
- Excellent performance / low latency at scale
- Scaling does not require tight coupling to application design decisions
- Scaling requires fewer nodes than other solutions
- No Downtime
- Persistent
- Fault Tolerant (Auto-Assignment of Master-Slave)
- Auto-Clustering
- Auto-Sharding
- Auto-Balancing
- Community Edition is Open Source
- XDR Built-In (Enterprise)
- Flexible Enterprise Licensing Model
- Reliability
- Low TCO
Best of all, it's now a supported click-to-deploy service on Google Compute Engine, our preferred hosting provider.
We would be hard-pressed to identify any disadvantages. Aerospike is a fantastic product: it is well supported, offers client interfaces for many languages, has a fair licensing model, needs minimal hands-on management, and its community outreach is phenomenal.
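The Auto-Sharding and Auto-Balancing items in the list of advantages are what decouple scaling from application design: the application addresses records by key only, and the mapping from key to node is resolved underneath it. The sketch below is a simplified illustration of that idea and not Aerospike's actual algorithm. It assumes a fixed set of 4096 partitions (the figure Aerospike's documentation gives per namespace), uses SHA-1 as a stand-in hash, and assigns partition owners with rendezvous hashing; node names are hypothetical.

```python
import hashlib

PARTITIONS = 4096  # fixed partition count; keys never map to nodes directly


def _hash(text):
    """Stable 64-bit hash (SHA-1 here is a stand-in, chosen for illustration)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest()[:8], "big")


def partition_of(key):
    """Keys map to partitions; this never changes when nodes come or go."""
    return _hash(key) % PARTITIONS


def owner_of(partition, nodes):
    """Rendezvous hashing: the node with the highest score owns the partition."""
    return max(nodes, key=lambda node: _hash(f"{node}:{partition}"))


def partition_map(nodes):
    """Owner of every partition for a given cluster membership."""
    return [owner_of(p, nodes) for p in range(PARTITIONS)]


def fraction_moved(old_nodes, new_nodes):
    """Share of partitions whose owner changes between two memberships."""
    old_map, new_map = partition_map(old_nodes), partition_map(new_nodes)
    return sum(a != b for a, b in zip(old_map, new_map)) / PARTITIONS


# Hypothetical cluster growing from three nodes to four.
three = ["node-1", "node-2", "node-3"]
four = three + ["node-4"]
print(f"{fraction_moved(three, four):.0%} of partitions moved")
```

In this sketch, adding a fourth node moves only about a quarter of the partitions, all of them to the new node, and application code that calls `partition_of` is unaffected by the change in membership.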
If, like us, you're developing Internet scale platforms that depend on information storage and delivery with strict requirements for performance, scaling, reliability and loose coupling for self-programming/self-assembly, Aerospike is the best solution we've found.