# Overview

Opatomic is a crash-safe persistent data structure server. It allows multiple clients to efficiently access and modify data at the same time using a network connection. It is similar to Redis. Opatomic has been designed to overcome some limitations of Redis without sacrificing the simplicity and joy of using Redis. It will perform best if data access is not random.

Opatomic has bugs and will corrupt or lose your data. It will crash, it will leak memory. Some important features have not been implemented. Over time, the server should stabilize and become more reliable. However, you should not plan to use Opatomic if:

- your data access patterns are random
- you do not need to exceed memory
- you do not need data to be sorted

Contact Info:

- [Report bugs here](https://github.com/opatomic/opad/issues)
- email: j@opatomic.com

# Download

[Latest release](https://opatomic.com/releases/latest.html)

# Installation

There is no installation provided. Simply run the "opad" binary and your database will be stored in the current directory.

# Running

    ./opad

To see a list of all arg options, run

    ./opad --help

# Administration

An open source admin tool has been included. Run the server with extra args:

    ./opad --wsport 8080

Open a web browser to http://localhost:8080. On the webpage, use host ws://localhost:8080 and click the connect button. Your web browser will communicate with the opatomic server via websockets.

# Clients

Several open source client libraries are available to connect to the database.

- Java
  - [opac-java](https://github.com/opatomic/opac-java)
- Javascript/node.js
  - [opac-js](https://github.com/opatomic/opac-js)
  - [npm](https://www.npmjs.com/package/opatomic-client)
- C
  - [opac-c](https://github.com/opatomic/opac-c)

# Implementation details

- [Serialization specs](https://github.com/opatomic/opaspec/blob/master/serialization.md)
- [RPC protocol specs](https://github.com/opatomic/opaspec/blob/master/rpc-protocol.md)

## Concepts

- keys are sorted
- most commands are O(log(N)) unless ranges are returned which is typically O(log(N)+M)
- sets are sorted
- keys and values have a type rather than being a binary string
- can store an array as a key. ie, compound/composite key
- undefined and sortmax can be used to specify ranges: ```KEYS START [2018, undefined] END [2019, SORTMAX]```
- cannot store undefined or sortmax in db
- Sort order:
  1. undefined
  2. null
  3. false
  4. true
  5. numbers
  6. blobs/strings
  7. arrays
  8. sortmax

# Comparison with Redis

## Major improvements over Redis

- gracefully degrades performance when storage requirements exceed memory capacity (assuming non-random data access)
- operations are truly atomic (all or nothing; rollback occurs on error)
- persistence: startup is instant, shutdown requires minimal effort (mainly msync + fflush)
- typically requires less memory per stored value, especially for small values (fewer pointers)
- maps and sets are sorted and easily iterable
- numbers do not lose precision and can exceed 64 bits
- lists are stored as a tree; accessing a value at any index is O(log(N)) rather than O(N)
- read only iterators, range queries, etc do not block other operations and large iterations do not typically increase memory usage by much (a full copy of the requested data is not allocated; data is copied on write if needed)
- expiration is exact (not random)
- no fork required: currently runs on 64-bit versions of linux, windows, mac
- can subscribe and continue running commands on same connection
- fast without
  requiring jemalloc or any external allocators
- SRANDMEMBER, SPOP, RANDOMKEY have uniform distribution
- (not yet available) perform multi-key read-only ops without blocking other ops (ie, long running read only script)

## Problems with Redis

- Since data is stored using hashing and lots of pointers, exceeding memory can slow things down
- Redis says that its transactions are *atomic*. However, they are not *atomic* in terms of [ACID](https://en.wikipedia.org/wiki/ACID_(computer_science)); they are actually *isolated*. If an error occurs, ops do not rollback:

      SET a 9223372036854775807
      MULTI
      SET b 1
      INCR a
      SET c 2
      EXEC

      # this script fails to copy a key to itself and deletes the whole key
      curl -L -o "copykey.lua" "https://raw.githubusercontent.com/itamarhaber/redis-lua-scripts/018559b7f36f1c83c89e6c2fec57e624b4c77109/copykey.lua"
      ./redis-cli SET a dont_delete_me_bro
      ./redis-cli --eval copykey.lua a a
      # script fails with error and key "a" has now been deleted
      ./redis-cli GET a

      curl -L -o "INCREXPIRE.lua" "https://raw.githubusercontent.com/alexanderscott/redis-lua-samples/master/INCREXPIRE.lua"
      ./redis-cli --eval INCREXPIRE.lua keyThatWillNotExpireBecauseArgIsMissing

      curl -L -o "geo.lua" "https://raw.githubusercontent.com/RedisLabs/geo.lua/57a6c89f8d396f905ca21815701e0dcba47eb113/geo.lua"
      REDTXT=""
      ./redis-cli FLUSHDB
      for i in $(seq 1 7000); do REDTXT="$REDTXT $i"; done
      ./redis-cli --eval geo.lua key1 key2 , GEOZADD $REDTXT
      # The previous command will fail and if redis ops were atomic, then key1 and key2 should not exist
      ./redis-cli keys "*"

      REDTXT=""
      for i in $(seq 1 7000); do REDTXT="$REDTXT 1.$i $i"; done
      ./redis-cli --eval geo.lua key1 key2 , GEOZADD $REDTXT

      # The following transaction fails in the middle and does not rollback; leaving g1 key in the database
      FLUSHDB
      MULTI
      GEOADD g1 85 85 m1
      GEOADD g1 86 86 m2
      EXEC

- If using persistence, Redis must load all data from disk into memory during startup (and shutdown).
  If using snapshot persistence then Redis must fork and save all data to disk at regular intervals. These are slow and resource-consuming operations.
- Redis uses lots of pointers which requires lots of extra memory, especially for small values
- Iterating using the KEYS command is slow; using SCAN/HSCAN/etc is not sorted (and slightly awkward)
- Numbers can hit a limit (64 bits) or lose precision

      127.0.0.1:6379> SET a 9223372036854775807
      OK
      127.0.0.1:6379> INCR a
      (error) ERR increment or decrement would overflow

  https://github.com/antirez/redis/issues/3695
  https://github.com/antirez/redis/issues/4663

      127.0.0.1:6379> SET a 1000
      OK
      127.0.0.1:6379> INCRBYFLOAT a 1.8
      "1001.79999999999999999"

- Accessing list elements not at head/tail is O(n)
- Retrieving a large chunk of a data structure requires allocating memory to store all the data [link](https://redislabs.com/blog/top-redis-headaches-for-devops-client-buffers/)
- [fork is slow](https://redis.io/topics/latency#latency-generated-by-fork), not supported on windows

## Speed

Opatomic may be slower than Redis for some ops. This is unavoidable because all data is sorted in Opatomic. However, the speed difference is often small.

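
To illustrate the tradeoff described above, here is a minimal Python sketch (not Opatomic or Redis code) comparing a hash-based store with a sorted one. The key names and helper functions are hypothetical, and the sorted list with `bisect` only stands in for whatever sorted structure a server might actually use. A point lookup costs O(log N) on the sorted store versus O(1) on average for the hash, but a range query on sorted data is O(log N + M) while the hash has to scan every key.

```python
import bisect

# Hypothetical toy data; real servers do not store keys this way.
keys = ["user:%04d" % i for i in range(1000)]

# Hash-based store: O(1) average point lookup, but keys have no order.
hashed = {k: i for i, k in enumerate(keys)}

# Sorted store: point lookups need a binary search, O(log N).
sorted_keys = sorted(keys)


def sorted_contains(key):
    """Binary search for an exact key in the sorted store."""
    i = bisect.bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key


def sorted_range(start, end):
    """Keys in [start, end): two binary searches plus a slice, O(log N + M)."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]


def hashed_range(start, end):
    """Same range from the hash store: every key must be scanned, then sorted."""
    return sorted(k for k in hashed if start <= k < end)


print(sorted_contains("user:0042"))
print(sorted_range("user:0010", "user:0013"))
```

Both range helpers return the same keys; the difference is how much of the data set each one has to touch, which is the cost the benchmark numbers reflect for random point operations.
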

Very simple benchmark results on a dev machine (Intel i3 dual core 2.6GHz; 8GB RAM; Linux 5.0.0-23-generic):

    $ ./redis-server --version
    Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=624ff421d708d69b
    $ ./redis-benchmark -q -r 50000
    PING_INLINE: 75414.78 requests per second
    PING_BULK: 75414.78 requests per second
    SET: 76452.60 requests per second
    GET: 75700.23 requests per second
    INCR: 76219.51 requests per second
    LPUSH: 74460.16 requests per second
    RPUSH: 77821.02 requests per second
    LPOP: 73046.02 requests per second
    RPOP: 78369.91 requests per second
    SADD: 71275.84 requests per second
    HSET: 77760.50 requests per second
    SPOP: 77881.62 requests per second
    LPUSH (needed to benchmark LRANGE): 78308.54 requests per second
    LRANGE_100 (first 100 elements): 44782.80 requests per second
    LRANGE_300 (first 300 elements): 14823.60 requests per second
    LRANGE_500 (first 450 elements): 11515.43 requests per second
    LRANGE_600 (first 600 elements): 8499.79 requests per second
    MSET (10 keys): 67567.57 requests per second

    # run opad single threaded and pinned to cpu 0:
    $ ./opad --cpua 0 --allowResp 1
    # benchmark with no pipelining:
    $ ./redis-benchmark -q -p 4567 -r 50000
    PING_INLINE: 78926.60 requests per second
    PING_BULK: 75987.84 requests per second
    SET: 79808.46 requests per second
    GET: 79365.08 requests per second
    INCR: 71839.09 requests per second
    LPUSH: 80645.16 requests per second
    RPUSH: 81037.28 requests per second
    LPOP: 80385.85 requests per second
    RPOP: 80256.82 requests per second
    SADD: 80515.30 requests per second
    HSET: 58823.53 requests per second
    SPOP: 78554.59 requests per second
    LPUSH (needed to benchmark LRANGE): 80645.16 requests per second
    LRANGE_100 (first 100 elements): 34626.04 requests per second
    LRANGE_300 (first 300 elements): 16852.04 requests per second
    LRANGE_500 (first 450 elements): 8992.00 requests per second
    LRANGE_600 (first 600 elements): 6756.76 requests per second
    MSET (10 keys): 24485.80 requests per second

Notes about
these results:

1. These ops are testing random access, which will be slower for Opatomic.
2. Opatomic is at a disadvantage because every op must be translated between Redis/RESP types and Opatomic supported data/types.
3. The Opatomic results are using redis-benchmark, which uses the RESP protocol and is slower when an iterator is returned (e.g., LRANGE). This is because the RESP protocol requires an array length, which forces Opatomic to serialize the iterator to memory to count the number of elements before sending the result back to the client.
4. Pipelining isn't used; enabling it would improve results for both systems.
5. The last test (MSET of 10 random keys) illustrates the speed difference for random ops. Setting 10 *random* keys in Opatomic is definitely slower than Redis.

## Major features missing but planned

- scripting (EVAL, EVALSHA, etc)
- replication
- oplog compaction

## Other missing features (may be implemented eventually)

- LRU key eviction
- GEO family of commands
- hyperloglog (PFADD/PFCOUNT/PFMERGE)
- streams (XADD, etc)
- BLPOP/BRPOP/BRPOPLPUSH/BZPOPMIN/BZPOPMAX
- keyspace notifications
- transactions (MULTI/EXEC/WATCH/UNWATCH/DISCARD)
- clustering
- modules

## Missing features that are unlikely to be added

- SELECT/SWAPDB/MOVE
  - only 1 db per server (SELECT 0 works when using Redis client)