The logic for reconstructing changes from the compressed document format
records operations which set a key in an object so that it can later
reconstruct delete operations from the successor list of the document
format operations. The logic to do this was only recording set
operations and not `make*` operations. This meant that delete operations
targeting `make*` operations could not be loaded correctly.
Correctly record `make*` operations for later use in constructing delete
operations.
Occasionally one needs to debug problems in a document with a large
number of objects. In this case it is unhelpful to print a graphviz of
the whole opset because there are too many objects. Add a
`Option<Vec<ObjId>>` argument to `OpSet::visualise` to filter the
objects which are visualised.
The compressed document format includes at the end of the document chunk
the indicies of the heads of the document. Older versions of the
javascript implementation do not include these indicies so we allow them
to be omitted when decoding.
Whilst we're here add some tracing::trace logs to make it easier to
understand where parsing is failing.
The latest clippy (90.1.65 for me) added a lint which checks for types
that implement `PartialEq` and could implement `Eq`
(`derive_partial_eq_without_eq`). Add a `derive(Eq)` in a bunch of
places to satisfy this lint.
For some usecases the overhead of compressed columns in the document
format is not worth it. Add `Automerge::save_nocompress` to save without
compressing columns.
Signed-off-by: Alex Good <alex@memoryandthought.me>
This is achieved by liberal use of feature flags. Main additions are:
* Build the OpSet more efficiently when loading from compressed
document storage using a DocObserver as implemented in
`automerge::op_tree::load`
* Reimplement the parsing login in the various types in
`automerge::sync`
There are numerous other small changes required to get the types to line
up.
Signed-off-by: Alex Good <alex@memoryandthought.me>
It is useful to be able to generate a `serde::Value` representation of
an automerge document. We can do this without an intermediate type by
iterating over the keys of the document recursively. Add
`autoeserde::AutoSerde` to implement this.
Signed-off-by: Alex Good <alex@memoryandthought.me>
Implement parsing the binary format using the new parser library and the
new encoding types. This is superior to the previous parsing
implementation in that invalid data should never cause panics and it
exposes and interface to construct an OpSet from a saved document much
more efficiently.
Signed-off-by: Alex Good <alex@memoryandthought.me>
The representation of changes in storage-v2 is different to the existing
representation so add accessor methods to the fields of `Change` and
make all accesses go through them. This allows the change representation
in storage-v2 to be a drop-in.
Signed-off-by: Alex Good <alex@memoryandthought.me>
The existing implementation of the columnar format elides a lot of error
handling (by converting `Err` to `None`) and doesn't allow writing to a
single chunk of memory when encoding. Implement a new set of encoding and
decoding primitives which handle errors more robustly and allow us to
use a single chunk of memory when reading and writing.
Signed-off-by: Alex Good <alex@memoryandthought.me>
Op IDs in the OpSet are represented using an index into a set of actor
IDs. This is efficient but requires conversion when reading and
writing from storage (where the set of actors might be different from
ths in the OpSet). Add a trait for converting between different
representations of an OpID.
Signed-off-by: Alex Good <alex@memoryandthought.me>
We have parsing needs which are slightly more complex than just reading
stuff from a buffer, but not complex enough to justify a dependency on a
parsing library. Implement a simple parser combinator library for use in
parsing the binary storage format.
Signed-off-by: Alex Good <alex@memoryandthought.me>
The colunar storage format allows for values which we do not know the
type of. In order that we can handle these types in a forward compatible
way we add ScalarValue::Unknown.
Signed-off-by: Alex Good <alex@memoryandthought.me>
The new storage implementation is sufficiently large a change that it
warrants a period of testing. To facilitate testing the new and old
implementations side by side we slightly abuse cargo's feature flags and
add a storage-v2 feature which enables the new storage and disables the
old storage.
Note that this commit doesn't use `--all-features` when building the
workspace in scripts/ci/build-test. This will be rectified in a later
commit once the storage-v2 feature is integrated into the other crates
in the workspace.
Signed-off-by: Alex Good <alex@memoryandthought.me>
as `AMgetChangeByHash()`.
Add the `AM_CHANGE_HASH_SIZE` macro define constant for
`AMgetChangeByHash()`.
Replace the literal `32` with the `automerge::types::HASH_SIZE` constant.
Expose `automerge::AutoCommit::splice()` as `AMsplice()`.
Add the `automerge::error::AutomergeError::InvalidValueType` variant for
`AMsplice()`.
Add push functionality to `AMspliceText()`.
Fix some documentation content bugs.
Fix some documentation formatting bugs.
The ordering of opids in the successor and predecessors of an op is
relevant when encoding because inconsistent ordering changes the
hashgraph. This means we must maintain the invariant that opids are
encoded in ascending lamport order. We have been maintaining this
invariant in the encoding implementation - however, this is not ideal
because it requires allocating for every op in the change when we commit
a transaction.
Add `types::OpIds` and use it in place of `Vec<OpId>` for `Op::succ` and
`Op::pred`. `OpIds` maintains the invariant that the IDs it contains
must be ordered with respect to some comparator function - which is
always `OpSetMetadata::lamport_cmp`. Remove the sorting of opids in
SuccEncoder::append.
Makes SuccEncoder sort successors in Lamport clock order.
Such an ordering is expected by automerge js when loading documents,
otherwise some documents fail to load with a "operation IDs are not in
ascending order" error.
There is easy confusion when calling parents with the id of a scalar,
wanting it to get the parent object first but that is not implemented.
To get the parent object of a scalar id would mean searching every
object for the OpId which may get too expensive when lots of objects are
around, this may be reconsidered later but the result would still be
useful to indicate when the id doesn't exist in the document vs has no
parents.