
Unleashing the Potential of JunoDB: PayPal’s Key-Value Store Liberated and Open-Source | Written by Yaping Shi | The PayPal Technology Blog | May 2023



JunoDB: PayPal’s Highly Reliable NoSQL Infrastructure

PayPal’s JunoDB is an open-source, distributed key-value store that provides a highly scalable, secure, and highly available NoSQL infrastructure. It powers a wide range of PayPal applications, from login to risk prevention to final transaction processing. With JunoDB, applications can store and cache data for fast access, which reduces load on relational databases and other services. JunoDB was built specifically for PayPal’s requirements: security, consistency, and high availability at low latency, while scaling to hundreds of thousands of connections.

JunoDB: The Unmatched NoSQL Solution

While other NoSQL solutions may perform well in certain use cases, JunoDB is unmatched when it comes to meeting PayPal’s extreme scale, security, and availability needs. It is also designed to be cost-effective, so PayPal can maintain its standards of quality and operational excellence while keeping costs manageable. JunoDB has evolved from a single-threaded C++ program into a highly concurrent, multi-core-friendly Golang program. It has also evolved from an in-memory data store with short TTLs (time to live) into a persistent data store that supports long TTLs, with data protected by on-disk encryption and, by default, TLS in transit. Along the way, JunoDB gained the ability to scale out quickly through data redistribution, which lets it handle an ever-increasing volume of requests.

Common Use Cases of JunoDB

JunoDB is used to store and cache data in several scenarios. The most prevalent are:

Caching: JunoDB is often used as a temporary cache for data that doesn’t change often, with TTLs ranging from a few seconds to a few days. This can include user preferences, account details, API responses, access tokens, and more. Using JunoDB for caching reduces calls to expensive databases and downstream services across all domains.

Idempotency: JunoDB is used as a short-TTL, highly available store to make operations idempotent and eliminate duplicate processing. For example, it ensures that payments are not reprocessed during retries or when the notification platforms resend messages. In the distributed locking variation, JunoDB ensures that only one process at a time executes a given operation.

Counters: JunoDB provides limit-type counters that can be used when certain resources are unavailable, which helps PayPal remain available and compliant.

SoR: JunoDB serves as the System of Record (SoR) for a limited set of long-term (multi-year) storage needs.

Latency Bridging: JunoDB’s quick inter-cluster replication helps compensate for replication delays in Oracle processing, enabling near-instant, consistent reads everywhere.
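The caching and idempotency use cases above can be sketched with a small in-memory stand-in. This is only an illustration under stated assumptions: `TTLStore`, `cached_lookup`, and `process_once` are hypothetical names, not the JunoDB client API, and a real deployment would rely on the store itself to make the create-if-absent step atomic across processes.

```python
import time


class TTLStore:
    """In-memory stand-in for a key-value store with per-record TTLs.

    Hypothetical helper for illustration; this is not the JunoDB client API.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired record
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def create(self, key, value, ttl_seconds):
        """Insert only if the key is absent; return True if this call created it."""
        if self.get(key) is not None:
            return False
        self.set(key, value, ttl_seconds)
        return True


def cached_lookup(store, key, load_from_database, ttl_seconds=60):
    """Cache-aside: serve from the store, fall back to the slow source on a miss."""
    value = store.get(key)
    if value is None:
        value = load_from_database(key)
        store.set(key, value, ttl_seconds)
    return value


def process_once(store, request_id, handler, ttl_seconds=3600):
    """Idempotency: run the handler only for the first arrival of request_id."""
    if not store.create(request_id, "seen", ttl_seconds):
        return "duplicate"
    handler()
    return "processed"
```

A retried payment request that carries the same `request_id` would get `"duplicate"` back from `process_once` until the record's TTL expires, which is the behavior the idempotency use case relies on.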

JunoDB Architecture: A High-Level Overview

The JunoDB architecture was designed with simplicity, scalability, security, and adaptability in mind. It uses a proxy-based design that enables linear horizontal scaling of connections, and it uses consistent hashing to partition data and minimize data movement when clusters are expanded or shrunk. The architecture comprises three key components: the JunoDB client library, JunoDB proxy instances, and JunoDB storage server instances.

JunoDB Client Library: The JunoDB client library resides in applications and provides an API that allows for easy storage, retrieval, and updating of application data through the JunoDB proxy. The JunoDB thin client library is implemented in several programming languages, such as Java, Golang, C++, Node, and Python, making it easy to integrate with applications written in different programming languages.

JunoDB Proxy Instances: The JunoDB proxy instances sit behind a load balancer and accept client requests as well as replication traffic from other sites. Each proxy connects to all JunoDB storage server instances and forwards each request to a group of storage server instances based on the shard mapping maintained in etcd, the data store that holds JunoDB cluster configurations.

JunoDB Storage Server Instances: The JunoDB storage server instances accept operation requests from the proxy and store data in memory or in persistent storage using RocksDB. Each storage server instance is responsible for a set of shards.
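The request path through these three components can be sketched roughly as follows. Everything here is an assumption made for illustration: the shard count, the SHA-256 based hash, and the `Proxy` and `StorageServer` classes are hypothetical, and the sketch writes to every instance in a group for simplicity, since the article does not describe the actual replication protocol.

```python
import hashlib

NUM_SHARDS = 8  # assumed, kept small for illustration


def shard_for_key(key, num_shards=NUM_SHARDS):
    """Map a key to a shard id with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class StorageServer:
    """Stand-in for a storage server instance; a dict replaces RocksDB."""

    def __init__(self, name):
        self.name = name
        self.records = {}


class Proxy:
    """Stand-in for a proxy instance that routes by shard map."""

    def __init__(self, shard_map, servers):
        # shard id -> list of server names; in JunoDB this mapping lives in etcd
        self.shard_map = shard_map
        self.servers = servers  # server name -> StorageServer

    def _group(self, key):
        shard = shard_for_key(key, len(self.shard_map))
        return [self.servers[name] for name in self.shard_map[shard]]

    def set(self, key, value):
        # Simplification: write to every instance in the shard's group.
        for server in self._group(key):
            server.records[key] = value

    def get(self, key):
        # Return the first copy found in the shard's group, or None.
        for server in self._group(key):
            if key in server.records:
                return server.records[key]
        return None
```

The client only ever talks to a `Proxy`, so it never needs to know how many storage servers exist or which one holds a given key.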

Achieving Scalability

To facilitate horizontal scaling, JunoDB uses a proxy-based architecture in which clients are lightweight and do not need to establish connections with all storage nodes. JunoDB leverages consistent hashing to partition data and assigns shards to physical storage nodes using a shard map. It replicates data both within and across data centers to ensure zero downtime and data consistency. JunoDB also employs micro-shards within each primary shard, which serve as building blocks for data redistribution.
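The effect of a shard map on data movement can be illustrated with a rough sketch. It assumes a fixed number of shards and a simple rebalancing rule, which approximates the minimal-movement property described above rather than reproducing JunoDB's actual algorithm; micro-shards are not modeled, and `build_shard_map`, `add_node`, and `moved_shards` are hypothetical helpers.

```python
import hashlib
from collections import Counter


def shard_for_key(key, num_shards):
    """Key -> shard id; depends only on the key and the fixed shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def build_shard_map(num_shards, nodes):
    """Assign shards to nodes round-robin (assumed initial layout)."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


def add_node(shard_map, new_node):
    """Return a new map in which new_node takes over its fair share of shards.

    Shards are taken one at a time from the currently most loaded node, so
    every shard that is not reassigned stays exactly where it was.
    """
    new_map = dict(shard_map)
    load = Counter(new_map.values())  # existing nodes only
    fair_share = len(new_map) // (len(load) + 1)
    for _ in range(fair_share):
        donor = max(load, key=load.get)
        shard = next(s for s in sorted(new_map) if new_map[s] == donor)
        new_map[shard] = new_node
        load[donor] -= 1
    return new_map


def moved_shards(old_map, new_map):
    """List the shard ids whose owning node changed."""
    return sorted(s for s in old_map if old_map[s] != new_map[s])
```

Because a key's shard id never changes, growing the cluster only requires copying the reassigned shards to the new node. With 12 shards on three nodes, adding a fourth node in this sketch moves 3 shards and leaves the other 9 untouched.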

Conclusion

JunoDB is an open-source, distributed key-value store designed to meet PayPal’s unique scale, security, and availability needs. It is not just another NoSQL solution; it was built from the ground up to provide security, consistency, and high availability with low latency. JunoDB has evolved to handle increased traffic while providing efficient storage and retrieval of data. With JunoDB available on GitHub, the industry can leverage PayPal’s highly scalable, secure, and available NoSQL infrastructure to improve application performance.


