Pika Introduction

# What is Pika?

Pika is a Redis-compatible storage system jointly developed by the DBA and Infrastructure teams. It fully supports the Redis protocol, so users can migrate their services to Pika without any code modifications. Pika is a persistent, high-capacity storage service compatible with the Redis string, hash, list, zset, and set interfaces. It addresses the capacity bottleneck that Redis hits when massive datasets no longer fit in memory. Like Redis, Pika supports master-slave backup through the 'slaveof' command, with both full and partial synchronization. The DBA team also provides migration tools so that the transition is smooth and imperceptible to users.
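Because Pika speaks the Redis protocol, a replication request reaches it as an ordinary RESP command. The following is a minimal sketch, using only the Python standard library, of how a `SLAVEOF` command would be serialized on the wire. The helper name `encode_resp_command`, the master address `10.0.0.1`, and the port `9221` are illustrative assumptions, not values taken from this document.

```python
def encode_resp_command(*args: str) -> bytes:
    """Serialize a command as a RESP array of bulk strings."""
    # Array header: '*' followed by the number of arguments.
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is a bulk string: '$<byte length>', then the bytes.
        parts.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(parts)


# Hypothetical master address and port; replace with real deployment values.
payload = encode_resp_command("SLAVEOF", "10.0.0.1", "9221")
print(payload)
```

Redis client libraries typically build the same byte sequence internally, which is consistent with existing drivers working against Pika unchanged.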

# Comparison with Redis

The most significant difference between Pika and Redis is that Pika stores data persistently on disk, while Redis stores data in memory. This difference gives Pika both advantages and disadvantages relative to Redis.

Advantages:

Larger Capacity: Pika does not have the memory limitations of Redis; its maximum usable space equals the size of the disk.
Faster Database Loading: Pika writes data to disk, allowing for quick recovery without the need for RDB or OPLOG. Upon restart, Pika can restore previous data without loading all data into memory, eliminating the need for replaying data operations.
Swift Backup Speed: Pika's backup speed is approximately equivalent to the speed of 'cp' (copying data files), making it efficient for backing up large databases (e.g., hundreds of gigabytes). Faster backup speeds effectively address the issues associated with full synchronization in master-slave configurations.

Disadvantages:

Performance: Pika is somewhat slower than Redis because it stores data in both memory and files. Storing the data on SSDs narrows this gap, with the aim of approaching Redis performance.

# Use Cases

As the comparison above shows, Pika is a good fit when your dataset is large, especially beyond 50GB, where Redis may struggle. It is also a good fit when data integrity is crucial and you cannot afford to lose data in a power interruption. Note, however, that in practical use Pika delivers approximately 50% of the performance of Redis.

# Pika's Characteristics

Large Capacity: Supports storage of datasets exceeding 100GB.
Redis Compatibility: Seamless transition from Redis to Pika without code modifications.
Support for Master-Slave Replication (slaveof).
Robust Operations Commands.

# Current Deployment Status

At present, Pika is deployed and running in over 20 clusters that are large in terms of data capacity compared with Redis. By rough estimate, the total daily request volume exceeds 10 billion, and the data capacity currently under management is approximately 3TB.
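To put these figures in perspective, the daily volume can be converted into an average request rate. The sketch below treats "10 billion" and "20 clusters" as exact values and assumes traffic is spread evenly over the day and across clusters, which real workloads rarely are, so peak rates will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_requests = 10_000_000_000  # "exceeds 10 billion" in the text; treated as exact
clusters = 20                    # "over 20" in the text; treated as exact

# Average rate over the whole fleet, then per cluster (even split assumed).
avg_qps_total = daily_requests / SECONDS_PER_DAY
avg_qps_per_cluster = avg_qps_total / clusters

print(f"average total requests/s:       {avg_qps_total:,.0f}")
print(f"average requests/s per cluster: {avg_qps_per_cluster:,.0f}")
```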

# Performance Comparison with Redis

Configuration:

CPU: 24 cores, Intel® Xeon® CPU E5-2630 v2 @ 2.60GHz
Memory: 165157944 kB
Operating System: CentOS release 6.2 (Final)
Network Card: Intel Corporation I350 Gigabit Network Connection

# Testing Process

Write 150GB of data into Pika, distributed across 50 hash keys with on the order of 10 million fields. For comparison, write 5GB of data into Redis. Pika runs with 18 threads, while Redis runs single-threaded.
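The test description implies a rough average payload per hash field. The sketch below works it out under loudly stated assumptions: "150GB" is read as 150 GiB, "10 million" is read as the number of fields per hash key (the text does not say whether it is per key or in total), and storage overhead is ignored. The helper name `avg_bytes_per_field` is hypothetical.

```python
GIB = 2**30


def avg_bytes_per_field(total_bytes: int, keys: int, fields_per_key: int) -> float:
    """Average payload per field, ignoring any storage overhead."""
    return total_bytes / (keys * fields_per_key)


# Assumption: each of the 50 hash keys holds about 10 million fields.
per_field = avg_bytes_per_field(150 * GIB, keys=50, fields_per_key=10_000_000)
print(f"about {per_field:.0f} bytes per field")
```

If the 10 million fields are instead the total across all keys, the same function applies with `fields_per_key` set to the per-key share, and the average payload grows accordingly.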

Conclusion: With a single thread, Pika is slower than Redis. However, because Pika is multi-threaded, with a larger number of threads its performance on certain data structures can surpass that of Redis.

# Overview of Pika Performance in Specific Scenarios

# Pika vs SSDB

# Pika vs Redis

# How to Migrate from Redis to Pika

# Tasks Required for Development

Developers don't need to do anything. No code changes, no driver replacement (Pika uses the native Redis driver), nothing at all. Just watch the DBA at work.

# Tasks Required for DBA

The DBA migrates the existing Redis data to Pika.
The DBA synchronizes data from Redis to Pika in real time, keeping the two datasets consistent.
The DBA switches the LVS backend IP, replacing Redis with Pika.
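The consistency requirement in the steps above can be illustrated with a small check that would run before the switch. This is a hypothetical sketch, not the DBA team's tooling: the function names `diff_snapshots` and `safe_to_switch` are invented for illustration, and plain dictionaries stand in for sampled key/value data from Redis and Pika.

```python
from typing import Dict, List


def diff_snapshots(source: Dict[str, str], target: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two key/value snapshots and report the discrepancies."""
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    mismatched = sorted(k for k in source if k in target and source[k] != target[k])
    return {"missing": missing, "extra": extra, "mismatched": mismatched}


def safe_to_switch(report: Dict[str, List[str]]) -> bool:
    """The switch is safe only when no discrepancy of any kind was found."""
    return not any(report.values())


# Stand-in samples; a real check would read these from the two services.
redis_sample = {"user:1": "alice", "user:2": "bob"}
pika_sample = {"user:1": "alice", "user:2": "bob"}

report = diff_snapshots(redis_sample, pika_sample)
print(report, safe_to_switch(report))
```

Because writes continue during real-time synchronization, a check like this only makes sense on a sample taken at a consistent point, for example after writes to Redis are briefly paused.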
