# Overview

Opatomic is a crash-safe persistent data structure server. It allows multiple clients
to efficiently access and modify data at the same time using a network connection.
It is similar to Redis.

Opatomic has been designed to overcome some limitations of Redis without sacrificing
the simplicity and joy of using Redis. It will perform best if data access is not random.

Opatomic has bugs and will corrupt or lose your data. It will crash, it will leak memory. Some
important features have not been implemented. Over time, the server should stabilize and become
more reliable. However, you should not plan to use Opatomic if:
 - your data access patterns are random
 - you do not need to exceed memory
 - you do not need data to be sorted

Contact Info:
 - [Report bugs here](https://github.com/opatomic/opad/issues)
 - email: j@opatomic.com

# Download

[Latest release](https://opatomic.com/releases/latest.html)

# Installation

There is no installation provided. Simply run the "opad" binary and your database will be
stored in the current directory.

# Running

	./opad

To see a list of all arg options, run

	./opad --help

# Administration

An open source admin tool has been included. Run the server with extra args:

    ./opad --wsport 8080

Open a web browser to http://localhost:8080. On the webpage, use host ws://localhost:8080 and click
the connect button. Your web browser will communicate with the opatomic server via websockets.

# Clients

Several open source client libraries are available to connect to the database.

 - Java - [opac-java](https://github.com/opatomic/opac-java)
 - Javascript/node.js
   - [opac-js](https://github.com/opatomic/opac-js)
   - [npm](https://www.npmjs.com/package/opatomic-client)
 - C - [opac-c](https://github.com/opatomic/opac-c)

# Implementation details
 - [Serialization specs](https://github.com/opatomic/opaspec/blob/master/serialization.md)
 - [RPC protocol specs](https://github.com/opatomic/opaspec/blob/master/rpc-protocol.md)

## Concepts
 - keys are sorted
 - most commands are O(log(N)) unless ranges are returned which is typically O(log(N)+M)
 - sets are sorted
 - keys and values have a type rather than being a binary string
 - can store an array as a key. ie, compound/composite key
 - undefined and sortmax can be used to specify ranges: ```KEYS START [2018, undefined] END [2019, SORTMAX]```
 - cannot store undefined or sortmax in db
 - Sort order:
   1. undefined
   2. null
   3. false
   4. true
   5. numbers
   6. blobs/strings
   7. arrays
   8. sortmax


# Comparison with Redis
## Major improvements over Redis
 - gracefully degrades performance when storage requirements exceed memory capacity (assuming non-random data access)
 - operations are truly atomic (all or nothing; rollback occurs on error)
 - persistence: startup is instant, shutdown requires minimal effort (mainly msync + fflush)
 - typically requires less memory per stored value, especially for small values (fewer pointers)
 - maps and sets are sorted and easily iterable
 - numbers do not lose precision and can exceed 64 bits
 - lists are stored as a tree; accessing a value at any index is O(log(N)) rather than O(N)
 - read only iterators, range queries, etc do not block other operations and large iterations do not typically
   increase memory usage by much (a full copy of the requested data is not allocated; data is copied on write
   if needed)
 - expiration is exact (not random)
 - no fork required: currently runs on 64-bit versions of linux, windows, mac
 - can subscribe and continue running commands on same connection
 - fast without requiring jemalloc or any external allocators
 - SRANDMEMBER, SPOP, RANDOMKEY have uniform distribution
 - (not yet available) perform multi-key read-only ops without blocking other ops (ie, long running read only script)

## Problems with Redis
 - Since data is stored using hashing and lots of pointers, exceeding memory can slow things down
 - Redis says that its transactions are *atomic*. However, they are not *atomic* in terms of
   [ACID](https://en.wikipedia.org/wiki/ACID_(computer_science)); they are actually *isolated*.
   If an error occurs, ops do not rollback:

		SET a 9223372036854775807
		MULTI
		SET b 1
		INCR a
		SET c 2
		EXEC

		# this script fails to copy a key to itself and deletes the whole key
		curl -L -o "copykey.lua" "https://raw.githubusercontent.com/itamarhaber/redis-lua-scripts/018559b7f36f1c83c89e6c2fec57e624b4c77109/copykey.lua"
		./redis-cli SET a dont_delete_me_bro
		./redis-cli --eval copykey.lua a a
		# script fails with error and key "a" has now been deleted
		./redis-cli GET a

		curl -L -o "INCREXPIRE.lua" "https://raw.githubusercontent.com/alexanderscott/redis-lua-samples/master/INCREXPIRE.lua"
		./redis-cli --eval INCREXPIRE.lua keyThatWillNotExpireBecauseArgIsMissing

		curl -L -o "geo.lua" "https://raw.githubusercontent.com/RedisLabs/geo.lua/57a6c89f8d396f905ca21815701e0dcba47eb113/geo.lua"
		REDTXT=""
		./redis-cli FLUSHDB
		for i in $(seq 1 7000); do REDTXT="$REDTXT $i"; done
		./redis-cli --eval geo.lua key1 key2 , GEOZADD $REDTXT
		# The previous command will fail and if redis ops were atomic, then key1 and key2 should not exist
		./redis-cli keys "*"
		REDTXT=""
		for i in $(seq 1 7000); do REDTXT="$REDTXT 1.$i $i"; done
		./redis-cli --eval geo.lua key1 key2 , GEOZADD $REDTXT

		# The following transaction fails in the middle and does not rollback; leaving g1 key in the database
		FLUSHDB
		MULTI
		GEOADD g1 85 85 m1
		GEOADD g1 86 86 m2
		EXEC

 - If using persistence, Redis must load all data from disk into memory during startup (and shutdown). If
   using snapshot persistence then Redis must fork and save all data to disk at regular intervals. These
   are slow and resource consuming operations.
 - Redis uses lots of pointers which requires lots of extra memory, especially for small values
 - Iterating using KEYS command is slow, using SCAN/HSCAN/etc is not sorted (and slightly awkward)
 - Numbers can hit a limit (64 bits) or lose precision

		127.0.0.1:6379> SET a 9223372036854775807
		OK
		127.0.0.1:6379> INCR a
		(error) ERR increment or decrement would overflow

		https://github.com/antirez/redis/issues/3695
		https://github.com/antirez/redis/issues/4663
		127.0.0.1:6379> SET a 1000
		OK
		127.0.0.1:6379> INCRBYFLOAT a 1.8
		"1001.79999999999999999"

 - Accessing list elements not at head/tail is O(n)
 - Retrieving a large chunk of a data structure requires allocating memory to store all the data
   [link](https://redislabs.com/blog/top-redis-headaches-for-devops-client-buffers/)
 - [fork is slow](https://redis.io/topics/latency#latency-generated-by-fork), not supported on windows

## Speed
Opatomic may be slower than Redis for some ops. This is unavoidable because all data is
sorted in Opatomic. However, the speed difference is often not very much.

Very simple benchmark results on a dev machine (Intel i3 dual core 2.6Ghz; 8GB ram; Linux 5.0.0-23-generic):

	$ ./redis-server --version
	Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=624ff421d708d69b
	$ ./redis-benchmark -q -r 50000
	PING_INLINE: 75414.78 requests per second
	PING_BULK: 75414.78 requests per second
	SET: 76452.60 requests per second
	GET: 75700.23 requests per second
	INCR: 76219.51 requests per second
	LPUSH: 74460.16 requests per second
	RPUSH: 77821.02 requests per second
	LPOP: 73046.02 requests per second
	RPOP: 78369.91 requests per second
	SADD: 71275.84 requests per second
	HSET: 77760.50 requests per second
	SPOP: 77881.62 requests per second
	LPUSH (needed to benchmark LRANGE): 78308.54 requests per second
	LRANGE_100 (first 100 elements): 44782.80 requests per second
	LRANGE_300 (first 300 elements): 14823.60 requests per second
	LRANGE_500 (first 450 elements): 11515.43 requests per second
	LRANGE_600 (first 600 elements): 8499.79 requests per second
	MSET (10 keys): 67567.57 requests per second


	# run opad single threaded and pinned to cpu 0:
	$ ./opad --cpua 0 --allowResp 1
	# benchmark with no pipelining:
	$ ./redis-benchmark -q -p 4567 -r 50000
	PING_INLINE: 78926.60 requests per second
	PING_BULK: 75987.84 requests per second
	SET: 79808.46 requests per second
	GET: 79365.08 requests per second
	INCR: 71839.09 requests per second
	LPUSH: 80645.16 requests per second
	RPUSH: 81037.28 requests per second
	LPOP: 80385.85 requests per second
	RPOP: 80256.82 requests per second
	SADD: 80515.30 requests per second
	HSET: 58823.53 requests per second
	SPOP: 78554.59 requests per second
	LPUSH (needed to benchmark LRANGE): 80645.16 requests per second
	LRANGE_100 (first 100 elements): 34626.04 requests per second
	LRANGE_300 (first 300 elements): 16852.04 requests per second
	LRANGE_500 (first 450 elements): 8992.00 requests per second
	LRANGE_600 (first 600 elements): 6756.76 requests per second
	MSET (10 keys): 24485.80 requests per second

notes about these results:

 1. These ops are testing random access which will be slower for Opatomic.
 2. Opatomic is at a disadvantage because every op must be translated between
    Redis/RESP types and Opatomic supported data/types.
 3. The Opatomic results are using redis-benchmark which uses the RESP protocol and is
    slower when an iterator is returned (ie, LRANGE). This is because the RESP protocol
    requires an array length, which forces Opatomic to serialize the iterator to memory
    to count the number of elements before sending the result back to the client.
 4. Pipelining isn't used which would improve results for both systems
 5. The last test (MSET of 10 random keys) illustrates the speed difference for random ops.
    Setting 10 *random* keys in Opatomic is definitely slower than Redis.

## Major features missing but planned
 - scripting (EVAL, EVALSHA, etc)
 - replication
 - oplog compaction

## Other missing features (may be implemented eventually)
 - LRU key eviction
 - GEO family of commands
 - hyperloglog (PFADD/PFCOUNT/PFMERGE)
 - streams (XADD, etc)
 - BLPOP/BRPOP/BRPOPLPUSH/BZPOPMIN/BZPOPMAX
 - keyspace notifications
 - transactions (MULTI/EXEC/WATCH/UNWATCH/DISCARD)
 - clustering
 - modules

## Missing features that are unlikely to be added
 - SELECT/SWAPDB/MOVE - only 1 db per server (SELECT 0 works when using Redis client)