Commit graph

91 commits

Author SHA1 Message Date
Conrad Irwin
b34b46fa16
smaller automerge c (#545)
* Fix automerge-c tests on mac

* Generate significantly smaller automerge-c builds

This cuts the size of libautomerge_core.a from 25 MB to 1.6 MB on macOS
and 53 MB to 2.7 MB on Linux.

As a side-effect of setting codegen-units = 1 for all release builds the
optimized wasm files are also 100 kB smaller.
2023-03-09 15:09:43 +00:00
Conrad Irwin
7b747b8341
Error instead of corrupt large op counters (#543)
Since b78211ca6, OpIds have been silently truncated to 2**32. This
causes corruption in the case the op id overflows.

This change converts the silent error to a panic, and guards against the
panic on the codepath found by the fuzzer.
2023-03-07 16:49:04 +00:00
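The difference between silent truncation and an explicit failure, as described in 7b747b8341, can be sketched as follows. This is a minimal illustration and not the code from the commit: `OpCounterOverflow`, `truncating_counter` and `checked_counter` are hypothetical names, and the sketch returns a `Result` where the commit message describes a panic.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch; not part of automerge.
#[derive(Debug, PartialEq)]
pub struct OpCounterOverflow(pub u64);

/// The failure mode: an `as` cast keeps only the low 32 bits.
pub fn truncating_counter(counter: u64) -> u32 {
    counter as u32
}

/// A checked conversion reports counters that do not fit in a u32.
pub fn checked_counter(counter: u64) -> Result<u32, OpCounterOverflow> {
    u32::try_from(counter).map_err(|_| OpCounterOverflow(counter))
}
```

With these definitions a counter of 2^32 + 5 truncates to 5, while the checked version returns an error carrying the original value.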
Conrad Irwin
2c1970f664
Fix panic on invalid action (#541)
We make the validation on parsing operations in the encoded changes stricter to avoid a possible panic when applying changes.
2023-03-04 12:09:08 +00:00
christine betts
63b761c0d1
Suppress clippy warning in parse.rs + bump toolchain (#542)
* Fix rust error in parse.rs
* Bump toolchain to 1.67.0
2023-03-03 22:42:40 +00:00
Conrad Irwin
44fa7ac416
Don't panic on missing deps of change chunks (#538)
* Fix doubly-reported ops in load of change chunks

Since c3c04128f5, observers have been
called twice when calling Automerge::load() with change chunks.

* Better handle change chunks with missing deps

Before this change Automerge::load would panic if you passed a change
chunk that was missing a dependency, or multiple change chunks not in
strict dependency order. After this change these cases will error
instead.
2023-02-27 20:12:09 +00:00
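The behaviour described in 44fa7ac416 (error on a missing dependency, accept chunks that are not in dependency order) can be sketched roughly like this. It is an assumption-laden illustration, not the logic of `Automerge::load`: `Change`, `LoadError` and `order_changes` are hypothetical, and changes are identified by small integers rather than hashes.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a change: an id plus the ids it depends on.
#[derive(Debug, Clone, PartialEq)]
pub struct Change {
    pub hash: u32,
    pub deps: Vec<u32>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    MissingDependency { change: u32, dep: u32 },
    Cycle,
}

/// Return the change ids ordered so that every dependency precedes its
/// dependents, or an error if a dependency was not supplied at all.
pub fn order_changes(changes: &[Change]) -> Result<Vec<u32>, LoadError> {
    let known: HashSet<u32> = changes.iter().map(|c| c.hash).collect();
    for c in changes {
        for d in &c.deps {
            if !known.contains(d) {
                return Err(LoadError::MissingDependency { change: c.hash, dep: *d });
            }
        }
    }
    let mut applied: HashSet<u32> = HashSet::new();
    let mut order = Vec::with_capacity(changes.len());
    let mut pending: Vec<&Change> = changes.iter().collect();
    while !pending.is_empty() {
        let before = pending.len();
        // Apply every pending change whose dependencies are already applied.
        pending.retain(|c| {
            if c.deps.iter().all(|d| applied.contains(d)) {
                applied.insert(c.hash);
                order.push(c.hash);
                false
            } else {
                true
            }
        });
        if pending.len() == before {
            // No progress: the remaining changes depend on each other.
            return Err(LoadError::Cycle);
        }
    }
    Ok(order)
}
```

The repeated `retain` pass is quadratic in the worst case; it is chosen here for brevity rather than speed.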
Jason Kankiewicz
8de2fa9bd4
C API 2 (#530)
The AMvalue union, AMlistItem struct, AMmapItem struct, and AMobjItem struct are gone, replaced by the AMitem struct.

The AMchangeHashes, AMchanges, AMlistItems, AMmapItems, AMobjItems, AMstrs, and AMsyncHaves iterators are gone, replaced by the AMitems iterator.

The AMitem struct is opaque; getting and setting values is now achieved exclusively through function calls.

The AMitemsNext(), AMitemsPrev(), and AMresultItem() functions return a pointer to an AMitem struct so you ultimately get the same thing whether you're iterating over a sequence or calling AMmapGet() or AMlistGet().

Calling AMitemResult() on an AMitem struct will produce a new AMresult struct referencing its storage so now the AMresult struct for an iterator can be subsequently freed without affecting the AMitem structs that were filtered out of it.

The storage for a set of AMitem structs can be recombined into a single AMresult struct by passing pointers to their corresponding AMresult structs to AMresultCat().

For C/C++ programmers, I've added AMstrCmp(), AMstrdup(), AM{idxType,objType,status,valType}ToString() and AM{idxType,objType,status,valType}FromString(). It's also now possible to pass arbitrary parameters through AMstack{Item,Items,Result}() to a callback function.
2023-02-25 18:47:00 +00:00
Philip Schatz
407faefa6e
A few setup fixes (#529)
* include deno in dependencies

* install javascript dependencies

* remove redundant operation
2023-02-15 09:23:02 +00:00
Alex Good
c92d042c87 @automerge/automerge-wasm@0.1.24 and @automerge/automerge@2.0.2-alpha.2 2023-02-14 17:59:23 +00:00
Alex Good
9271b20cf5 Correct logic when skip = B and fix formatting
A few tests were failing which exposed the fact that if skip is `B` (the
out factor of the OpTree) then we set `skip = None` and this causes us
to attempt to return `Skip` in a non root node. I ported the failing
test from JS to Rust and fixed the problem.

I also fixed the formatting issues.
2023-02-14 17:21:59 +00:00
Orion Henry
5e82dbc3c8 rework how skip works to push the logic into node 2023-02-14 17:21:59 +00:00
Conrad Irwin
2cd7427f35 Use our leb128 parser for values
This ensures that values in automerge documents are encoded correctly,
and that no extra data is smuggled in any LEB fields.
2023-02-09 15:46:22 +00:00
Alex Good
a24d536d16 Move automerge::SequenceTree to automerge_wasm::SequenceTree
The `SequenceTree` is only ever used in `automerge_wasm` so move it
there.
2023-02-05 11:08:33 +00:00
Alex Good
c5fde2802f @automerge/automerge-wasm@0.1.24 and @automerge/automerge@2.0.2-alpha.1 2023-02-03 16:31:46 +00:00
Alex Good
13a775ed9a Speed up loading by generating clocks on demand
Context: currently we store a mapping from ChangeHash -> Clock, where
`Clock` is the set of (ActorId, (Sequence number, max Op)) pairs derived
from the given change and its dependencies. This clock is used to
determine what operations are visible at a given set of heads.

Problem: populating this mapping for documents with large histories
containing many actors can be very slow as for each change we have to
allocate and merge a bunch of hashmaps.

Solution: instead of creating the clocks on load, create an adjacency
list based representation of the change graph and then derive the clock
from this graph when it is needed. Traversing even large graphs is still
almost as fast as looking up the clock in a hashmap.
2023-02-03 16:15:15 +00:00
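The idea in 13a775ed9a, deriving a clock from an adjacency-list change graph only when it is needed, can be sketched as below. This is a simplified illustration under stated assumptions: `ChangeNode`, `Clock` and `clock_at` are hypothetical, actors are plain integers, and changes are referred to by their index in a slice rather than by `ChangeHash`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical node in the change graph; field names are illustrative.
pub struct ChangeNode {
    pub actor: u32,
    pub seq: u64,
    pub max_op: u64,
    /// Indices of the changes this change depends on.
    pub deps: Vec<usize>,
}

/// For each actor, the highest (sequence number, max op) that is reachable.
pub type Clock = HashMap<u32, (u64, u64)>;

/// Walk the graph backwards from `heads` and build the clock on demand.
pub fn clock_at(graph: &[ChangeNode], heads: &[usize]) -> Clock {
    let mut clock = Clock::new();
    let mut seen: HashSet<usize> = HashSet::new();
    let mut stack: Vec<usize> = heads.to_vec();
    while let Some(idx) = stack.pop() {
        if !seen.insert(idx) {
            continue; // already visited via another path
        }
        let node = &graph[idx];
        let entry = clock.entry(node.actor).or_insert((node.seq, node.max_op));
        if node.seq > entry.0 {
            *entry = (node.seq, node.max_op);
        }
        stack.extend(node.deps.iter().copied());
    }
    clock
}
```

Nothing is stored per change beyond the graph itself, so loading only pays for building the adjacency lists; the traversal cost is deferred to the callers that actually ask for a clock.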
Alex Good
1e33c9d9e0 Use Automerge::load instead of load_incremental if empty
Problem: when running the sync protocol for a new document the API
requires that the user create an empty document and then call
`receive_sync_message` on that document. This results in the OpObserver
for the new document being called with every single op in the document
history. For documents with a large history this can be extremely time
consuming, but the OpObserver doesn't need to know about all the hidden
states.

Solution: Modify `Automerge::load_with` and
`Automerge::apply_changes_with` to check if the document is empty before
applying changes. If the document _is_ empty then we don't call the
observer for every change, but instead use
`automerge::observe_current_state` to notify the observer of the new
state once all the changes have been applied.
2023-02-03 10:01:12 +00:00
Alex Good
c3c04128f5 Only observe the current state on load
Problem: When loading a document whilst passing an `OpObserver` we call
the OpObserver for every change in the loaded document. This slows down
the loading process for two reasons: 1) we have to make a call to the
observer for every op 2) we cannot just stream the ops into the OpSet in
topological order but must instead buffer them to pass to the observer.

Solution: Construct the OpSet first, then only traverse the visible ops
in the OpSet, calling the observer. For documents with a deep history
this results in vastly fewer calls to the observer and also allows us to
construct the OpSet much more quickly. It is slightly different
semantically because the observer never gets notified of changes which
are not visible, but that shouldn't matter to most observers.
2023-02-03 10:01:12 +00:00
Alex Good
da55dfac7a refactor: make fields of Automerge private
The fields of `automerge::Automerge` were crate public, which made it
hard to change the structure of `Automerge` with confidence. Make all
fields private and put them behind accessors where necessary to allow
for easy internal changes.
2023-02-03 10:01:12 +00:00
alexjg
9195e9cb76
Fix deny errors (#518)
* Ignore deny errors on duplicate windows-sys

* Delete spurious lockfile in automerge-cli
2023-02-02 15:02:53 +00:00
Conrad Irwin
a6959e70e8
More robust leb128 parsing (#515)
Before this change i64 decoding did not work for negative numbers (not a
real problem because it is only used for the timestamp of a change),
and both u64 and i64 would allow overlong LEB encodings.
2023-01-31 17:54:54 +00:00
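To illustrate the two issues named in a6959e70e8 (negative i64 values and overlong encodings), here is a standalone LEB128 decoder sketch. It is not the parser from the commit: `LebError`, `decode_u64` and `decode_i64` are hypothetical names, and the overlong rule used is the usual one that the final byte must not be pure padding.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum LebError {
    Truncated,
    Overlong,
    Overflow,
}

/// Decode an unsigned LEB128 value, returning it with the bytes consumed.
pub fn decode_u64(input: &[u8]) -> Result<(u64, usize), LebError> {
    let mut result: u64 = 0;
    let mut shift = 0u32;
    for (i, &byte) in input.iter().enumerate() {
        let low = (byte & 0x7f) as u64;
        if shift >= 64 || (shift == 63 && low > 1) {
            return Err(LebError::Overflow);
        }
        result |= low << shift;
        if byte & 0x80 == 0 {
            // A trailing zero byte after the first adds no information.
            if byte == 0 && i > 0 {
                return Err(LebError::Overlong);
            }
            return Ok((result, i + 1));
        }
        shift += 7;
    }
    Err(LebError::Truncated)
}

/// Decode a signed LEB128 value, returning it with the bytes consumed.
pub fn decode_i64(input: &[u8]) -> Result<(i64, usize), LebError> {
    let mut result: i64 = 0;
    let mut shift = 0u32;
    let mut prev: u8 = 0;
    for (i, &byte) in input.iter().enumerate() {
        let low = (byte & 0x7f) as i64;
        if shift >= 64 || (shift == 63 && low != 0 && low != 0x7f) {
            return Err(LebError::Overflow);
        }
        result |= low << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            // Sign-extend when the sign bit of the final group is set.
            if shift < 64 && byte & 0x40 != 0 {
                result |= -1i64 << shift;
            }
            // The final byte is redundant if it only repeats the sign
            // already carried by bit 6 of the previous byte.
            let prev_negative = prev & 0x40 != 0;
            if i > 0 && ((byte == 0x00 && !prev_negative) || (byte == 0x7f && prev_negative)) {
                return Err(LebError::Overlong);
            }
            return Ok((result, i + 1));
        }
        prev = byte;
    }
    Err(LebError::Truncated)
}
```

For example, `[0x7f]` decodes to -1 as a signed value, while `[0xff, 0x7f]` encodes the same number with a redundant byte and is rejected.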
alexjg
de5af2fffa
automerge-rs 0.3.0 and automerge-test 0.2.0 (#512) 2023-01-30 19:58:35 +00:00
alexjg
08801ab580
automerge-rs: Introduce ReadDoc and SyncDoc traits and add documentation (#511)
The Rust API has so far grown somewhat organically driven by the needs of the
javascript implementation. This has led to an API which is quite awkward and
unfamiliar to Rust programmers. Additionally there is no documentation to speak
of. This commit is the first movement towards cleaning things up a bit. We touch
a lot of files but the changes are all very mechanical. We introduce a few
traits to abstract over the common operations between `Automerge` and
`AutoCommit`, and add a whole bunch of documentation.

* Add a `ReadDoc` trait to describe methods which read values from a document.
  Make `Transactable` extend `ReadDoc`
* Add a `SyncDoc` trait to describe methods necessary for synchronizing
  documents.
* Put the `SyncDoc` implementation for `AutoCommit` behind `AutoCommit::sync` to
  ensure that any open transactions are closed before taking part in the sync
  protocol
* Split `OpObserver` into two traits: `OpObserver` + `BranchableObserver`.
  `BranchableObserver` captures the methods which are only needed for observing
  transactions.
* Add a whole bunch of documentation.

The main changes Rust users will need to make are:

* Import the `ReadDoc` trait wherever you are using the methods which have been
  moved to it. Optionally change concrete parameters on functions to `ReadDoc`
  constraints.
* Likewise import the `SyncDoc` trait wherever you are doing synchronisation
  work
* If you are using the `AutoCommit::*_sync_message` methods you will need to add
  a call to `AutoCommit::sync()` first. E.g. `doc.generate_sync_message` becomes
  `doc.sync().generate_sync_message`
* If you have an implementation of `OpObserver` which you are using in an
  `AutoCommit` then split it into an implementation of `OpObserver` and
  `BranchableObserver`
2023-01-30 19:37:03 +00:00
alexjg
58a7a06b75
@automerge/automerge-wasm@0.1.23 and @automerge/automerge@2.0.1-alpha.6 (#509) 2023-01-27 20:27:11 +00:00
Conrad Irwin
931ee7e77b
Add Fuzz Testing (#498)
* Add fuzz testing for document load

* Fix fuzz crashers and add to test suite
2023-01-25 16:03:05 +00:00
alexjg
819767cc33
fix: use saturating_sub when updating cached text width (#505)
Problem: In `automerge::query::Index::change_vis` we use `-=` to
subtract the width of an operation which is being hidden from the text
widths which we store on the index of each node in the optree. This
index represents the width of all the visible text operations in this
node and below. This was causing an integer underflow error when
encountering some list operations. More specifically, when a
`ScalarValue::Str` in a list was made invisible by a later operation
which contained a _shorter_ string, the width subtracted from the indexed
text widths could be longer than the current index.

Solution: use `saturating_sub` instead. This is technically papering
over the problem because really the width should never go below zero,
but the text widths are only relevant for text objects where the
existing logic works as advertised because we don't have a `set`
operation for text indices. A more robust solution would be to track the
type of the Index (and consequently of the `OpTree`) at the type level,
but time is limited and problems are infinite.

Also, add a lengthy description of the reason we are using
`saturating_sub` so that when I read it in about a month I don't have
to redo the painful debugging process that got me to this commit.
2023-01-23 19:19:55 +00:00
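The arithmetic behind 819767cc33 can be shown in isolation. The sketch below is not the `Index` type from automerge; `TextIndex` and `hide` are hypothetical, and only the standard-library behaviour of `saturating_sub` and `checked_sub` is relied on.

```rust
/// Hypothetical cached width of the visible text at and below a node.
pub struct TextIndex {
    pub visible_width: usize,
}

impl TextIndex {
    /// Remove `width` from the cached total when an op becomes invisible.
    pub fn hide(&mut self, width: usize) {
        // `self.visible_width -= width` would panic in a debug build
        // whenever `width` exceeds the cached total; saturating_sub
        // clamps the result at zero instead.
        self.visible_width = self.visible_width.saturating_sub(width);
    }
}
```

`checked_sub` is the stricter alternative: it returns `None` on underflow, which would surface the inconsistency rather than clamp it.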
Alex Currie-Clark
78adbc4ff9
Update patch types (#499)
* Update `Patch` types

* Clarify that the splice patch applies to text

* Add Splice patch type to exports

* Add new patches to javascript
2023-01-23 17:02:02 +00:00
Andrew Jeffery
1f7b109dcd
Add From<SmolStr> for ScalarValue::Str (#506) 2023-01-23 17:01:41 +00:00
Conrad Irwin
98e755106f
Fix and simplify lebsize calculations (#503)
Before this change numbits_i64() was incorrect for every value of the
form 0 - 2^x. This only manifested in a visible error if x%7 == 6 (so
for -64, -8192, etc.) at which point `lebsize` would return a value one
too large, causing a panic in commit().
2023-01-23 11:01:05 +00:00
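The boundary cases mentioned in 98e755106f (-64, -8192 and other values of the form 0 - 2^x) can be checked against a small size calculation. This is an independent sketch, not the fixed code from the commit: `signed_bits` and `sleb128_len` are hypothetical names for the bit-count and byte-length helpers.

```rust
/// Bits needed to hold `v` in two's complement, including the sign bit.
pub fn signed_bits(v: i64) -> u32 {
    // For negative values, count the bits of the complement so that
    // -2^x needs exactly x + 1 bits.
    let magnitude = if v < 0 { !v } else { v };
    64 - magnitude.leading_zeros() + 1
}

/// Number of bytes a signed LEB128 encoding of `v` occupies.
pub fn sleb128_len(v: i64) -> u32 {
    // Each byte carries 7 payload bits; round up.
    (signed_bits(v) + 6) / 7
}
```

Here -64 fits in one byte (7 bits including the sign) while 64 and -65 each need two, which is the boundary the commit message describes.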
alexjg
6b0ee6da2e
Bump js to 2.0.1-alpha.5 and automerge-wasm to 0.1.22 (#497) 2023-01-19 22:15:06 +00:00
alexjg
9b44a75f69
fix: don't panic when generating parents for hidden objects (#500)
Problem: the `OpSet::export_key` method uses `query::ElemIdPos` to
determine the index of sequence elements when exporting a key. This
query returned `None` for invisible elements. The `Parents` iterator
which is used to generate paths to objects in patches in
`automerge-wasm` used `export_key`. The end result is that applying a
remote change which deletes an object in a sequence would panic as it
tries to generate a path for an invisible object.

Solution: modify `query::ElemIdPos` to include invisible objects. This
does mean that the path generated will refer to the previous visible
object in the sequence as its index, but this is probably fine as for
an invisible object the path shouldn't be used anyway.

While we're here also change the return value of `OpSet::export_key` to
an `Option` and make `query::Index::ops` private as obeisance to the
Lady of the Golden Blade.
2023-01-19 21:11:36 +00:00
alexjg
d8baa116e7
automerge-rs: Add ExId::to_bytes (#491)
The `ExId` structure has some internal details which make lookups for
object IDs which were produced by the document doing the looking up
faster. These internal details are quite specific to the implementation
so we don't want to expose them as a public API. On the other hand, we
need to be able to serialize `ExId`s so that FFI clients can hold on to
them without referencing memory which is owned by the document (ahem,
looking at you Java).

Introduce `ExId::to_bytes` and `TryFrom<&[u8]> for ExId` implementing a
canonical serialization which includes a version tag, giving us
compatibility options if we decide to change the implementation.
2023-01-19 17:02:47 +00:00
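The pattern described in d8baa116e7, a byte serialization that leads with a version tag so the format can change later, might look like the sketch below. The layout is an assumption made for illustration and is not the actual `ExId` encoding: `ObjectId`, `ParseError`, the `VERSION` constant and the fixed 8-byte big-endian counter are all hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical format version written as the first byte.
const VERSION: u8 = 0;

/// Hypothetical stand-in for an object id: a counter plus actor bytes.
#[derive(Debug, PartialEq, Clone)]
pub struct ObjectId {
    pub counter: u64,
    pub actor: Vec<u8>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    UnknownVersion(u8),
    TooShort,
}

impl ObjectId {
    /// Serialize as: version tag, 8-byte big-endian counter, actor bytes.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(1 + 8 + self.actor.len());
        out.push(VERSION);
        out.extend_from_slice(&self.counter.to_be_bytes());
        out.extend_from_slice(&self.actor);
        out
    }
}

impl TryFrom<&[u8]> for ObjectId {
    type Error = ParseError;

    fn try_from(bytes: &[u8]) -> Result<Self, Self::Error> {
        let (&version, rest) = bytes.split_first().ok_or(ParseError::Empty)?;
        // Unknown versions are rejected rather than guessed at.
        if version != VERSION {
            return Err(ParseError::UnknownVersion(version));
        }
        if rest.len() < 8 {
            return Err(ParseError::TooShort);
        }
        let (counter, actor) = rest.split_at(8);
        let mut buf = [0u8; 8];
        buf.copy_from_slice(counter);
        Ok(ObjectId {
            counter: u64::from_be_bytes(buf),
            actor: actor.to_vec(),
        })
    }
}
```

Because the bytes are owned by the caller, an FFI client can hold on to them without referencing memory inside the document, and a future version can be introduced by adding another branch on the tag.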
alexjg
964ae2bd81
Fix SeekOpWithPatch on optrees with only internal optrees (#496)
In #480 we fixed an issue where `SeekOp` calculated an incorrect
insertion index on optrees where the only visible ops were on internal
nodes. We forgot to port this fix to `SeekOpWithPatch`, which has almost
the same logic just with additional work done in order to notify an
`OpObserver` of changes. Add a test and fix to `SeekOpWithPatch`
2023-01-14 11:27:48 +00:00
Alex Good
22e9915fac automerge-wasm: publish release build in Github Action 2023-01-12 12:42:19 +00:00
Alex Good
5c02445bee
Bump automerge-wasm, again
In order to re-trigger the release action we are testing we bump the
version which was de-bumped in the last commit.
2023-01-12 10:39:11 +00:00
Alex Good
3ef60747f4
Roll back automerge-wasm to test release action
The release action we are working on conditionally executes based on the
version of `automerge-wasm` in the previous commit. We need to trigger
it even though the version has not changed so we roll back the version
in this commit and the commit immediately following this will bump it
again.
2023-01-12 10:37:11 +00:00
Alex Good
a0d698dc8e
Version bump js and wasm
js: 2.0.1-alpha.3
wasm: 0.1.20
2023-01-12 09:55:12 +00:00
Alex Good
5763210b07
wasm: Allow a choice of text representations
The wasm codebase assumed that clients want to represent text as a
string of characters. This is faster, but in order to enable backwards
compatibility we add a `TextRepresentation` argument to
`automerge_wasm::Automerge::new` to allow clients to choose between a
`string` or `Array<any>` representation. The `automerge_wasm::Observer`
will consult this setting to determine what kind of diffs to generate.
2023-01-10 12:52:19 +00:00
Alex Good
18a3f61704 Update rust toolchain to 1.66 2023-01-10 12:51:56 +00:00
Alex Currie-Clark
0306ade939 Update action name on IncPatch type 2023-01-06 15:23:41 +00:00
Alex Good
8a645bb193 js: Enable typescript for the JS tests
The tsconfig.json was set up to not include the JS tests. Update the
config to include the tests when checking typescript and fix all the
consequent errors. None of this is semantically meaningful _except_ for
a few incorrect usages of the API which were leading to flaky tests.
Hooray for types!
2022-12-22 11:48:06 +00:00
Alex Good
4de0756bb4 Correctly handle ops on optree node boundaries
The `SeekOp` query can produce incorrect results when the optree it is
searching only has visible ops on the internal nodes. Add some tests to
demonstrate the issue as well as a fix.
2022-12-20 20:38:29 +00:00
Alex Good
d678280b57 automerge-cli: Add an examine-sync command
This is useful when receiving sync messages that behave in unexpected
ways
2022-12-19 16:30:14 +00:00
Alex Good
f682db3039 automerge-cli: Add a flag to skip verifying heads 2022-12-19 16:30:14 +00:00
Alex Good
6da93b6adc Correctly implement colored json
My quickly thrown together implementation had some mistakes in it which
meant that the JSON produced was malformed.
2022-12-19 16:30:14 +00:00
Alex Good
0f90fe4d02 Add a method for loading a document without verifying heads
This is primarily useful when debugging documents which have been
corrupted somehow so you would like to see the ops even if you can't
trust them. Note that this is _not_ currently useful for performance
reasons as the hash graph is still constructed, just not verified.
2022-12-19 16:30:14 +00:00
alexjg
8aff1296b9
automerge-cli: remove a bunch of bad dependencies (#478)
Automerge CLI depends transitively (via an old version of `clap` and
via `colored_json`) on `atty` and `ansi_term`. These crates are both
marked as unmaintained and this generates irritating `cargo deny`
messages. To avoid this, implement colored JSON ourselves using the
`termcolor` crate - colored JSON is pretty mechanical. Also update
criterion and cbindgen dependencies and ignore the criterion tree in
deny.toml as we only ever use it in benchmarks.

All that's left now is a warning about atty in cbindgen, we'll just have
to wait for cbindgen to fix that, it's a build time dependency anyway so
it's not really an issue.
2022-12-14 18:06:19 +00:00
Conrad Irwin
6dad2b7df1
Don't panic on invalid gzip stream (#477)
* Don't panic on invalid gzip stream

Before this change automerge-rs would panic if the gzip data in
a raw column was invalid; after this change the error is propagated
to the caller correctly.
2022-12-14 17:34:22 +00:00
Orion Henry
3229548fc7
update js dependencies and some lint errors (#474) 2022-12-11 21:26:00 +00:00
Orion Henry
b78211ca65
change opid to (u32,u32) - 10% performance uptick (#473) 2022-12-11 18:56:20 +00:00
Orion Henry
1222fc0df1
rewrite opnode to store usize instead of Op (#471) 2022-12-10 10:36:05 +00:00
Orion Henry
2db9e78f2a
Text v2. JS Api now uses text by default (#462) 2022-12-09 23:48:07 +00:00