Integrate BACKBEAT SDK and resolve KACHING license validation

Major integrations and fixes:
- Added BACKBEAT SDK integration for P2P operation timing
- Implemented beat-aware status tracking for distributed operations
- Added Docker secrets support for secure license management
- Resolved KACHING license validation via HTTPS/TLS
- Updated docker-compose configuration for clean stack deployment
- Disabled rollback policies to prevent deployment failures
- Added license credential storage (CHORUS-DEV-MULTI-001)

Technical improvements:
- BACKBEAT P2P operation tracking with phase management
- Enhanced configuration system with file-based secrets
- Improved error handling for license validation
- Clean separation of KACHING and CHORUS deployment stacks

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
anthonyrawlins
2025-09-06 07:56:26 +10:00
parent 543ab216f9
commit 9bdcbe0447
4730 changed files with 1480093 additions and 1916 deletions

vendor/github.com/ipld/go-ipld-prime/codec/README.md

@@ -0,0 +1,77 @@
Codecs
======
The `go-ipld-prime/codec` package is a grouping package.
The subpackages contain some codecs which reside in this repo.
The codecs included here are our "batteries included" codecs,
but they are not otherwise special.
It is not necessary for a codec to be a subpackage here to be a valid codec to use with go-ipld;
anything that implements the `codec.Encoder` and `codec.Decoder` interfaces is fine.
Terminology
-----------
We generally refer to "codecs" as having an "encode" function and "decode" function.
We consider "encoding" to be the process of going from {Data Model} to {serial data},
and "decoding" to be the process of going from {serial data} to {Data Model}.
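
The two directions can be sketched with plain function types. This is a minimal illustration, not the library's API: the `Encoder` and `Decoder` types below are hypothetical stand-ins that use `any` and `encoding/json` in place of the Data Model and a real IPLD codec.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// Encoder goes from an in-memory value to serial data.
// Decoder goes from serial data back to an in-memory value.
// Both are hypothetical stand-ins, not codec.Encoder and codec.Decoder.
type (
	Encoder func(v any, w io.Writer) error
	Decoder func(r io.Reader) (any, error)
)

func encodeJSON(v any, w io.Writer) error { return json.NewEncoder(w).Encode(v) }

func decodeJSON(r io.Reader) (any, error) {
	var v any
	err := json.NewDecoder(r).Decode(&v)
	return v, err
}

// roundTrip encodes v, then decodes the bytes that were produced.
func roundTrip(enc Encoder, dec Decoder, v any) (any, error) {
	var buf bytes.Buffer
	if err := enc(v, &buf); err != nil {
		return nil, err
	}
	return dec(&buf)
}

func main() {
	out, err := roundTrip(encodeJSON, decodeJSON, map[string]any{"hello": "world"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

A matched encoder and decoder pair is expected to round-trip like this; the real interfaces differ in that decoding fills a caller-supplied assembler rather than returning a value.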
### Codec vs Multicodec
A "codec" is _any_ function that goes from {Data Model} to {serial data}, or vice versa.
A "multicodec" is a function which does that and is _also_ specifically recognized and described in
the tables in https://github.com/multiformats/multicodec/ .
Multicodecs generally leave no further room for customization and configuration,
because their entire behavior is supposed to be specified by a multicodec indicator code number.
Our codecs, in the child packages of this one, usually offer configuration options.
They also usually offer exactly one function, which does *not* allow configuration,
which is supplying a multicodec-compatible behavior.
You'll see this marked in the docs on those functions.
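
As a rough sketch of that split, a package can expose a configurable options struct alongside one fixed function that always uses the defaults. Everything below is a toy built on `encoding/json`; the `Indent` field and the `encodeToString` helper are assumptions made for illustration and do not come from any codec in this repo.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// EncodeOptions is a hypothetical configuration struct for a toy JSON codec.
type EncodeOptions struct {
	Indent string // indentation per nesting level; empty means compact output
}

// Encode is the configurable entry point: callers set fields first.
func (cfg EncodeOptions) Encode(v any, w io.Writer) error {
	enc := json.NewEncoder(w)
	enc.SetIndent("", cfg.Indent)
	return enc.Encode(v)
}

// Encode is the fixed-behavior entry point. It accepts no configuration,
// so its output is fully determined by the input value.
func Encode(v any, w io.Writer) error {
	return EncodeOptions{}.Encode(v, w)
}

// encodeToString is a small helper for the demonstration below.
func encodeToString(enc func(any, io.Writer) error, v any) (string, error) {
	var buf bytes.Buffer
	err := enc(v, &buf)
	return buf.String(), err
}

func main() {
	compact, _ := encodeToString(Encode, []int{1, 2})
	pretty, _ := encodeToString(EncodeOptions{Indent: "  "}.Encode, []int{1, 2})
	fmt.Print(compact, pretty)
}
```

Only the package-level function is a candidate for binding to an indicator code, because a code number has no room to carry the `Indent` setting.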
### Marshal vs Encode
It's common to see the terms "marshal" and "unmarshal" used in golang.
Those terms are usually describing when structured data is transformed into linearized, tokenized data
(and then, perhaps, all the way to serially encoded data), or vice versa.
We would use the words the same way... except we don't end up using them,
because that feature doesn't really come up in our codec layer.
In IPLD, we would describe mapping some typed data into Data Model as "marshalling".
(It's one step shy of tokenizing, but barely: Data Model does already have defined ordering for every element of data.)
And we do have systems that do this:
`bindnode` and our codegen systems both do this, implicitly, when they give you an `ipld.Node` of the representation of some data.
We just don't end up talking about it as "marshalling" because of how it's done implicitly by those systems.
As a result, all of our features relating to codecs only end up speaking about "encoding" and "decoding".
### Legacy code
There are some appearances of the words "marshal" and "unmarshal" in some of our subpackages here.
That verbiage is generally on the way out.
For functions and structures with those names, you'll notice their docs marking them as deprecated.
Why have "batteries-included" codecs?
-------------------------------------
These codecs live in this repo because they're commonly used, highly supported,
and general-purpose codecs that we recommend for widespread usage in new developments.
Also, it's just plain nice to have something in-repo for development purposes.
It makes sure that if we try to make any API changes, we immediately see if they'd make codecs harder to implement.
We also use the batteries-included codecs for debugging, for test fixtures, and for benchmarking.
Further yet, the batteries-included codecs let us offer getting-started APIs.
For example, we offer some helper APIs which use codecs such as JSON to give consumers of the libraries
one-step helper methods that "do the right thing" with zero config... so long as they happen to use that codec.
Even for consumers who don't use those codecs, such functions then serve as natural documentation
and examples for what to do to put their codec of choice to work.

vendor/github.com/ipld/go-ipld-prime/codec/api.go

@@ -0,0 +1,128 @@
package codec
import (
"io"
"github.com/ipld/go-ipld-prime/datamodel"
)
// The following two types define the two directions of transform that a codec can be expected to perform:
// from Node to serial stream (aka "encoding", sometimes also described as "marshalling"),
// and from serial stream to Node (via a NodeAssembler) (aka "decoding", sometimes also described as "unmarshalling").
//
// You'll find a couple of implementations matching this shape in subpackages of 'codec'.
// (These are the handful of encoders and decoders we ship as "batteries included".)
// Other encoder and decoder implementations can be found in other repositories/modules.
// It should also be easy to implement encoders and decoders of your own!
//
// Encoder and Decoder functions can be used on their own, but are also often used via the `ipld/linking.LinkSystem` construction,
// which handles all the other related operations necessary for a content-addressed storage system at once.
//
// Encoder and Decoder functions can be registered in the multicodec table in the `ipld/multicodec` package
// if they're providing functionality that matches the expectations for a multicodec identifier.
// This table will be used by some common EncoderChooser and DecoderChooser implementations
// (namely, the ones in LinkSystems produced by the `linking/cid` package).
// It's not strictly necessary to register functions there, though; you can also just use them directly.
//
// There are furthermore several conventions that codec packages are recommended to follow, but are only conventions:
//
// Most codec packages should have a ReusableEncoder and ReusableDecoder type,
// which contain any working memory needed by the implementation, as well as any configuration options,
// and those types should have an Encode and Decode function respectively which match these function types.
// They may alternatively have EncoderConfig and DecoderConfig types, which have similar purpose,
// but aren't promising memory reuse if kept around.
//
// By convention, a codec package that expects to fulfill a multicodec contract will also have
// a package-scope exported function called Encode or Decode which also matches this interface,
// and is the equivalent of creating a zero-value ReusableEncoder or ReusableDecoder (aka, default config)
// and using its Encode or Decode methods.
// This package-scope function may also internally use a sync.Pool
// to keep some ReusableEncoder values on hand to avoid unnecessary allocations.
//
// Note that an EncoderConfig or DecoderConfig type that supports configuration options
// does not functionally expose those options when invoked by the multicodec system --
// multicodec indicator codes do not provide room for extended configuration info.
// Codecs that expose configuration options are doing so for library users to enjoy;
// it does not mean those non-default configurations will necessarily be available
// in all scenarios that use codecs indirectly.
// There is also no standard interface for such configurations: by nature,
// if they exist at all, they tend to vary per codec.
type (
// Encoder defines the shape of a function which traverses a Node tree
// and emits its data in a serialized form into an io.Writer.
//
// The dual of Encoder is a Decoder, which takes a NodeAssembler
// and fills it with deserialized data consumed from an io.Reader.
// Typically, Decoder and Encoder functions will be found in pairs,
// and will be expected to be able to round-trip each other's data.
//
// Encoder functions can be used directly.
// Encoder functions are also often used via a LinkSystem when working with content-addressed storage.
// LinkSystem methods will helpfully handle the entire process of traversing a Node tree,
// encoding this data, hashing it, streaming it to the writer, and committing it -- all as one step.
//
// An Encoder works with Nodes.
// If you have a native golang structure, and want to serialize it using an Encoder,
// you'll need to figure out how to transform that golang structure into an ipld.Node tree first.
//
// It may be useful to understand "multicodecs" when working with Encoders.
// In IPLD, a system called "multicodecs" is typically used to describe encoding formats.
// A "multicodec indicator" is a number which describes an encoding;
// the Link implementations used in IPLD (CIDs) store a multicodec indicator in the Link;
// and in this library, a multicodec registry exists in the `multicodec` package,
// and can be used to associate a multicodec indicator number with an Encoder function.
// The default EncoderChooser in a LinkSystem will use this multicodec registry to select Encoder functions.
// However, you can construct a LinkSystem that uses any EncoderChooser you want.
// It is also possible to have and use Encoder functions that aren't registered as a multicodec at all...
// we just recommend being cautious of this, because it may make your data less recognizable
// when working with other systems that use multicodec indicators as part of their communication.
Encoder func(datamodel.Node, io.Writer) error
// Decoder defines the shape of a function which produces a Node tree
// by reading serialized data from an io.Reader.
// (Decoder doesn't itself return a Node directly, but rather takes a NodeAssembler as an argument,
// because this allows the caller more control over the Node implementation,
// as well as some control over allocations.)
//
// The dual of Decoder is an Encoder, which takes a Node and
// emits its data in a serialized form into an io.Writer.
// Typically, Decoder and Encoder functions will be found in pairs,
// and will be expected to be able to round-trip each other's data.
//
// Decoder functions can be used directly.
// Decoder functions are also often used via a LinkSystem when working with content-addressed storage.
// LinkSystem methods will helpfully handle the entire process of opening block readers,
// verifying the hash of the data stream, and applying a Decoder to build Nodes -- all as one step.
//
// A Decoder works with Nodes.
// If you have a native golang structure, and want to populate it with data using a Decoder,
// you'll need to either get a NodeAssembler which proxies data into that structure directly,
// or assemble a Node as intermediate storage and copy the data to the native structure as a separate step.
//
// It may be useful to understand "multicodecs" when working with Decoders.
// See the documentation on the Encoder function interface for more discussion of multicodecs,
// the multicodec table, and how this is typically connected to linking.
Decoder func(datamodel.NodeAssembler, io.Reader) error
)
// -------------------
// Errors
//
type ErrBudgetExhausted struct{}
func (e ErrBudgetExhausted) Error() string {
return "decoder resource budget exhausted (message too long or too complex)"
}
// ---------------------
// Other valuable and reused constants
//
type MapSortMode uint8
const (
MapSortMode_None MapSortMode = iota
MapSortMode_Lexical
MapSortMode_RFC7049
)


@@ -0,0 +1,3 @@
package dagcbor
const linkTag = 42
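
For orientation, the byte layout that tag 42 implies can be sketched by hand. The sketch below is an assumption-laden illustration rather than this package's encoder: `encodeLink` is a hypothetical helper, it only handles payloads under 256 bytes, and the CID bytes are placeholders shaped like a CIDv1 header followed by zeroes, not a real CID.

```go
package main

import (
	"errors"
	"fmt"
)

const linkTag = 42

// encodeLink (hypothetical) lays out raw CID bytes as a DAG-CBOR link:
// tag 42 head, byte-string head, a 0x00 prefix byte, then the CID bytes.
func encodeLink(cidBytes []byte) ([]byte, error) {
	payloadLen := len(cidBytes) + 1 // +1 for the 0x00 prefix byte
	out := []byte{0xd8, linkTag}    // major type 6 (tag) with a one-byte tag number
	switch {
	case payloadLen < 24:
		out = append(out, 0x40|byte(payloadLen)) // major type 2, length packed into the head
	case payloadLen < 256:
		out = append(out, 0x58, byte(payloadLen)) // major type 2, 8-bit length follows
	default:
		return nil, errors.New("sketch only handles payloads under 256 bytes")
	}
	out = append(out, 0x00)
	return append(out, cidBytes...), nil
}

func main() {
	// Placeholder: 4 header bytes plus a 32-byte all-zero digest.
	fake := make([]byte, 36)
	copy(fake, []byte{0x01, 0x71, 0x12, 0x20})
	enc, err := encodeLink(fake)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x (%d bytes total)\n", enc[:5], len(enc))
}
```

The total of 41 bytes for a 36-byte CID matches the arithmetic in `EncodedLength`: 2 bytes of tag, 2 bytes of byte-string head, and 37 bytes of payload.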


@@ -0,0 +1,42 @@
/*
The dagcbor package provides a DAG-CBOR codec implementation.
The Encode and Decode functions match the codec.Encoder and codec.Decoder function interfaces,
and can be registered with the go-ipld-prime/multicodec package for easy usage with systems such as CIDs.
Importing this package will automatically have the side-effect of registering Encode and Decode
with the go-ipld-prime/multicodec registry, associating them with the standard multicodec indicator numbers for DAG-CBOR.
This implementation follows most of the rules of DAG-CBOR, namely:
- by and large, it does emit and parse CBOR!
- only explicit-length maps and lists will be emitted by Encode;
- only tag 42 is accepted, and it must parse as a CID;
- only 64 bit floats will be emitted by Encode.
This implementation is also not strict about certain rules:
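- EncodeOptions.Encode is order-passthrough when emitting maps if MapSortMode is codec.MapSortMode_None, the zero value (it does not sort, nor abort in error if unsorted data is encountered).
To emit sorted data in that mode, the node should be sorted before applying the encoder. The package-level Encode function sorts map keys with RFC7049 ordering.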
- Decode is order-passthrough when parsing maps (it does not sort, nor abort in error if unsorted data is encountered).
To be strict about the ordering of data, additional validation must be applied to the result of the Decode function.
- Decode will accept indeterminate length lists and maps without complaint.
(These should not be allowed according to the DAG-CBOR spec, nor will the Encode function re-emit such values,
so this behavior should almost certainly be seen as a bug.)
- Decode does not consistently verify that ints and floats use the smallest representation possible (or, the 64-bit version, in the float case).
(Only these numeric encodings should be allowed according to the DAG-CBOR spec, and the Encode function will not re-emit variations,
so this behavior should almost certainly be seen as a bug.)
A note for future contributors: some functions in this package expose references to packages from the refmt module, and/or use them internally.
Please avoid adding new code which expands the visibility of these references.
In future work, we'd like to reduce or break this relationship entirely
(in part, to reduce dependency sprawl, and in part because several of
the imprecisions noted above stem from that library's lack of strictness).
*/
package dagcbor
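
The package comment above says that strict key ordering needs additional validation after Decode. One way such a check could look is sketched below. It is an assumption, not part of this package: `keysInOrder` is a hypothetical helper that takes the map keys in the order they were decoded, and it requires strictly increasing order, so duplicate keys are rejected as well.

```go
package main

import "fmt"

// lessLengthFirst reports whether a sorts before b under the
// length-first ordering: shorter keys first, then string order.
func lessLengthFirst(a, b string) bool {
	if len(a) != len(b) {
		return len(a) < len(b)
	}
	return a < b
}

// keysInOrder (hypothetical) reports whether keys are strictly increasing.
func keysInOrder(keys []string) bool {
	for i := 1; i < len(keys); i++ {
		if !lessLengthFirst(keys[i-1], keys[i]) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(keysInOrder([]string{"a", "b", "aa"})) // true
	fmt.Println(keysInOrder([]string{"aa", "b"}))      // false
}
```

Note that the string comparison only applies when two keys have the same length; comparing `"b"` and `"aa"` as plain strings would give the wrong answer.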


@@ -0,0 +1,381 @@
package dagcbor
import (
"fmt"
"io"
"sort"
"github.com/polydawn/refmt/cbor"
"github.com/polydawn/refmt/shared"
"github.com/polydawn/refmt/tok"
"github.com/ipld/go-ipld-prime/codec"
"github.com/ipld/go-ipld-prime/datamodel"
cidlink "github.com/ipld/go-ipld-prime/linking/cid"
)
// This file should be identical to the general feature in the parent package,
// except for the `case datamodel.Kind_Link` block,
// which is dag-cbor's special sauce for schemafree links.
// EncodeOptions can be used to customize the behavior of an encoding function.
// The Encode method on this struct fits the codec.Encoder function interface.
type EncodeOptions struct {
// If true, allow encoding of Link nodes as CBOR tag(42);
// otherwise, reject them as unencodable.
AllowLinks bool
// Control the sorting of map keys, using one of the `codec.MapSortMode_*` constants.
MapSortMode codec.MapSortMode
}
// Encode walks the given datamodel.Node and serializes it to the given io.Writer.
// Encode fits the codec.Encoder function interface.
//
// The behavior of the encoder can be customized by setting fields in the EncodeOptions struct before calling this method.
func (cfg EncodeOptions) Encode(n datamodel.Node, w io.Writer) error {
// Probe for a builtin fast path. Shortcut to that if possible.
type detectFastPath interface {
EncodeDagCbor(io.Writer) error
}
if n2, ok := n.(detectFastPath); ok {
return n2.EncodeDagCbor(w)
}
// Okay, generic inspection path.
return Marshal(n, cbor.NewEncoder(w), cfg)
}
// Future work: we would like to remove the Marshal function,
// and in particular, stop seeing types from refmt (like shared.TokenSink) be visible.
// Right now, some kinds of configuration (e.g. for whitespace and prettyprint) are only available through interacting with the refmt types;
// we should improve our API so that this can be done with only our own types in this package.
// Marshal is a deprecated function.
// Please consider switching to EncodeOptions.Encode instead.
func Marshal(n datamodel.Node, sink shared.TokenSink, options EncodeOptions) error {
var tk tok.Token
return marshal(n, &tk, sink, options)
}
func marshal(n datamodel.Node, tk *tok.Token, sink shared.TokenSink, options EncodeOptions) error {
switch n.Kind() {
case datamodel.Kind_Invalid:
return fmt.Errorf("cannot traverse a node that is absent")
case datamodel.Kind_Null:
tk.Type = tok.TNull
_, err := sink.Step(tk)
return err
case datamodel.Kind_Map:
return marshalMap(n, tk, sink, options)
case datamodel.Kind_List:
// Emit start of list.
tk.Type = tok.TArrOpen
l := n.Length()
tk.Length = int(l) // TODO: overflow check
if _, err := sink.Step(tk); err != nil {
return err
}
// Emit list contents (and recurse).
for i := int64(0); i < l; i++ {
v, err := n.LookupByIndex(i)
if err != nil {
return err
}
if err := marshal(v, tk, sink, options); err != nil {
return err
}
}
// Emit list close.
tk.Type = tok.TArrClose
_, err := sink.Step(tk)
return err
case datamodel.Kind_Bool:
v, err := n.AsBool()
if err != nil {
return err
}
tk.Type = tok.TBool
tk.Bool = v
_, err = sink.Step(tk)
return err
case datamodel.Kind_Int:
if uin, ok := n.(datamodel.UintNode); ok {
v, err := uin.AsUint()
if err != nil {
return err
}
tk.Type = tok.TUint
tk.Uint = v
} else {
v, err := n.AsInt()
if err != nil {
return err
}
tk.Type = tok.TInt
tk.Int = v
}
_, err := sink.Step(tk)
return err
case datamodel.Kind_Float:
v, err := n.AsFloat()
if err != nil {
return err
}
tk.Type = tok.TFloat64
tk.Float64 = v
_, err = sink.Step(tk)
return err
case datamodel.Kind_String:
v, err := n.AsString()
if err != nil {
return err
}
tk.Type = tok.TString
tk.Str = v
_, err = sink.Step(tk)
return err
case datamodel.Kind_Bytes:
v, err := n.AsBytes()
if err != nil {
return err
}
tk.Type = tok.TBytes
tk.Bytes = v
_, err = sink.Step(tk)
return err
case datamodel.Kind_Link:
if !options.AllowLinks {
return fmt.Errorf("cannot Marshal ipld links to CBOR")
}
v, err := n.AsLink()
if err != nil {
return err
}
switch lnk := v.(type) {
case cidlink.Link:
if !lnk.Cid.Defined() {
return fmt.Errorf("encoding undefined CIDs are not supported by this codec")
}
tk.Type = tok.TBytes
tk.Bytes = append([]byte{0}, lnk.Bytes()...)
tk.Tagged = true
tk.Tag = linkTag
_, err = sink.Step(tk)
tk.Tagged = false
return err
default:
return fmt.Errorf("schemafree link emission only supported by this codec for CID type links")
}
default:
panic("unreachable")
}
}
func marshalMap(n datamodel.Node, tk *tok.Token, sink shared.TokenSink, options EncodeOptions) error {
// Emit start of map.
tk.Type = tok.TMapOpen
expectedLength := int(n.Length())
tk.Length = expectedLength // TODO: overflow check
if _, err := sink.Step(tk); err != nil {
return err
}
if options.MapSortMode != codec.MapSortMode_None {
// Collect map entries, then sort by key
type entry struct {
key string
value datamodel.Node
}
entries := []entry{}
for itr := n.MapIterator(); !itr.Done(); {
k, v, err := itr.Next()
if err != nil {
return err
}
keyStr, err := k.AsString()
if err != nil {
return err
}
entries = append(entries, entry{keyStr, v})
}
if len(entries) != expectedLength {
return fmt.Errorf("map Length() does not match number of MapIterator() entries")
}
// Apply the desired sort function.
switch options.MapSortMode {
case codec.MapSortMode_Lexical:
sort.Slice(entries, func(i, j int) bool {
return entries[i].key < entries[j].key
})
case codec.MapSortMode_RFC7049:
sort.Slice(entries, func(i, j int) bool {
// RFC7049 style sort as per DAG-CBOR spec
li, lj := len(entries[i].key), len(entries[j].key)
if li == lj {
return entries[i].key < entries[j].key
}
return li < lj
})
}
// Emit map contents (and recurse).
for _, e := range entries {
tk.Type = tok.TString
tk.Str = e.key
if _, err := sink.Step(tk); err != nil {
return err
}
if err := marshal(e.value, tk, sink, options); err != nil {
return err
}
}
} else { // no sorting
// Emit map contents (and recurse).
var entryCount int
for itr := n.MapIterator(); !itr.Done(); {
k, v, err := itr.Next()
if err != nil {
return err
}
entryCount++
tk.Type = tok.TString
tk.Str, err = k.AsString()
if err != nil {
return err
}
if _, err := sink.Step(tk); err != nil {
return err
}
if err := marshal(v, tk, sink, options); err != nil {
return err
}
}
if entryCount != expectedLength {
return fmt.Errorf("map Length() does not match number of MapIterator() entries")
}
}
// Emit map close.
tk.Type = tok.TMapClose
_, err := sink.Step(tk)
return err
}
// EncodedLength will calculate the length in bytes that the encoded form of the
// provided Node will occupy.
//
// Note that this function requires a full walk of the Node's graph, which may
// not necessarily be a trivial cost and will incur some allocations. Using this
// method to calculate buffers to pre-allocate may not result in performance
// gains, but rather incur an overall cost. Use with care.
func EncodedLength(n datamodel.Node) (int64, error) {
switch n.Kind() {
case datamodel.Kind_Invalid:
return 0, fmt.Errorf("cannot traverse a node that is absent")
case datamodel.Kind_Null:
return 1, nil // 0xf6
case datamodel.Kind_Map:
length := uintLength(uint64(n.Length())) // length prefixed major 5
for itr := n.MapIterator(); !itr.Done(); {
k, v, err := itr.Next()
if err != nil {
return 0, err
}
keyLength, err := EncodedLength(k)
if err != nil {
return 0, err
}
length += keyLength
valueLength, err := EncodedLength(v)
if err != nil {
return 0, err
}
length += valueLength
}
return length, nil
case datamodel.Kind_List:
nl := n.Length()
length := uintLength(uint64(nl)) // length prefixed major 4
for i := int64(0); i < nl; i++ {
v, err := n.LookupByIndex(i)
if err != nil {
return 0, err
}
innerLength, err := EncodedLength(v)
if err != nil {
return 0, err
}
length += innerLength
}
return length, nil
case datamodel.Kind_Bool:
return 1, nil // 0xf4 or 0xf5
case datamodel.Kind_Int:
v, err := n.AsInt()
if err != nil {
return 0, err
}
if v < 0 {
v = -v - 1 // negint is stored as one less than actual
}
return uintLength(uint64(v)), nil // major 0 or 1, as small as possible
case datamodel.Kind_Float:
return 9, nil // always major 7 and 64-bit float
case datamodel.Kind_String:
v, err := n.AsString()
if err != nil {
return 0, err
}
return uintLength(uint64(len(v))) + int64(len(v)), nil // length prefixed major 3
case datamodel.Kind_Bytes:
v, err := n.AsBytes()
if err != nil {
return 0, err
}
return uintLength(uint64(len(v))) + int64(len(v)), nil // length prefixed major 2
case datamodel.Kind_Link:
v, err := n.AsLink()
if err != nil {
return 0, err
}
switch lnk := v.(type) {
case cidlink.Link:
length := int64(2) // tag,42: 0xd82a
bl := int64(len(lnk.Bytes())) + 1 // additional 0x00 in front of the CID bytes
length += uintLength(uint64(bl)) + bl // length prefixed major 2
return length, err
default:
return 0, fmt.Errorf("schemafree link emission only supported by this codec for CID type links")
}
default:
panic("unreachable")
}
}
// Calculate how many bytes an integer occupies when encoded, and therefore also
// the size of the leading bytes of a length-prefixed token. CBOR will pack it up
// into the smallest possible uint representation, even merging it with the major if it's <=23.
type boundaryLength struct {
upperBound uint64
length int64
}
var lengthBoundaries = []boundaryLength{
{24, 1}, // packed major|minor
{256, 2}, // major, 8-bit length
{65536, 3}, // major, 16-bit length
{4294967296, 5}, // major, 32-bit length
{0, 9}, // major, 64-bit length
}
func uintLength(ii uint64) int64 {
for _, lb := range lengthBoundaries {
if ii < lb.upperBound {
return lb.length
}
}
// maximum number of bytes to pack this int
// if this int is used as a length prefix for a map, list, string or bytes
// then we likely have a very bad Node that shouldn't be encoded, but the
// encoder may raise problems with that if the memory allocator doesn't first.
return lengthBoundaries[len(lengthBoundaries)-1].length
}
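
The size arithmetic used by EncodedLength can be checked with a few worked values. The sketch below reimplements the boundary table as a switch so that it stands alone; `headLen`, `stringLen`, and `intLen` are hypothetical names chosen for this example and are not exported by the package.

```go
package main

import "fmt"

// headLen returns how many bytes a CBOR head needs to carry the argument n.
func headLen(n uint64) int64 {
	switch {
	case n < 24:
		return 1 // value packed into the initial byte
	case n < 1<<8:
		return 2 // initial byte + 8-bit value
	case n < 1<<16:
		return 3 // initial byte + 16-bit value
	case n < 1<<32:
		return 5 // initial byte + 32-bit value
	default:
		return 9 // initial byte + 64-bit value
	}
}

// stringLen is the encoded size of a string: head plus the raw bytes.
func stringLen(s string) int64 {
	return headLen(uint64(len(s))) + int64(len(s))
}

// intLen is the encoded size of a signed integer.
// Negative values are stored as one less than their magnitude.
func intLen(v int64) int64 {
	if v < 0 {
		v = -v - 1
	}
	return headLen(uint64(v))
}

func main() {
	fmt.Println(headLen(23), headLen(24), headLen(256), headLen(65536)) // 1 2 3 5
	fmt.Println(stringLen("hello"))                                     // 6
	fmt.Println(intLen(-24), intLen(-25))                               // 1 2
}
```

The asymmetry in the last line comes from the negative-integer rule: -24 is stored as 23 and still fits in the initial byte, while -25 is stored as 24 and needs one more byte.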


@@ -0,0 +1,48 @@
package dagcbor
import (
"io"
"github.com/ipld/go-ipld-prime/codec"
"github.com/ipld/go-ipld-prime/datamodel"
"github.com/ipld/go-ipld-prime/multicodec"
)
var (
_ codec.Decoder = Decode
_ codec.Encoder = Encode
)
func init() {
multicodec.RegisterEncoder(0x71, Encode)
multicodec.RegisterDecoder(0x71, Decode)
}
// Decode deserializes data from the given io.Reader and feeds it into the given datamodel.NodeAssembler.
// Decode fits the codec.Decoder function interface.
//
// A similar function is available on DecodeOptions type if you would like to customize any of the decoding details.
// This function uses the defaults for the dag-cbor codec
// (meaning: links (indicated by tag 42) are decoded).
//
// This is the function that will be registered in the default multicodec registry during package init time.
func Decode(na datamodel.NodeAssembler, r io.Reader) error {
return DecodeOptions{
AllowLinks: true,
}.Decode(na, r)
}
// Encode walks the given datamodel.Node and serializes it to the given io.Writer.
// Encode fits the codec.Encoder function interface.
//
// A similar function is available on EncodeOptions type if you would like to customize any of the encoding details.
// This function uses the defaults for the dag-cbor codec
// (meaning: links are encoded, and map keys are sorted (with RFC7049 ordering!) during encode).
//
// This is the function that will be registered in the default multicodec registry during package init time.
func Encode(n datamodel.Node, w io.Writer) error {
return EncodeOptions{
AllowLinks: true,
MapSortMode: codec.MapSortMode_RFC7049,
}.Encode(n, w)
}
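
The registration pattern in this file (a compile-time interface check plus an init function that registers under code 0x71) can be shown in miniature. This is a toy under stated assumptions: the `Encoder` type, the `encoders` map, and `encodeWith` are hypothetical stand-ins for the real codec and multicodec packages, and the toy `Encode` just writes the string through rather than producing DAG-CBOR.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Encoder is a hypothetical stand-in for codec.Encoder.
type Encoder func(v string, w io.Writer) error

// encoders is a toy registry keyed by multicodec indicator code.
var encoders = map[uint64]Encoder{}

// Compile-time check: the build fails if Encode stops matching Encoder.
var _ Encoder = Encode

// init runs when the package is loaded, so importing it is enough to register.
func init() {
	encoders[0x71] = Encode
}

// Encode is a placeholder that writes v unchanged.
func Encode(v string, w io.Writer) error {
	_, err := io.WriteString(w, v)
	return err
}

// encodeWith selects an encoder by indicator code, as a chooser would.
func encodeWith(code uint64, v string) (string, error) {
	enc, ok := encoders[code]
	if !ok {
		return "", fmt.Errorf("no encoder registered for code 0x%x", code)
	}
	var buf bytes.Buffer
	err := enc(v, &buf)
	return buf.String(), err
}

func main() {
	out, err := encodeWith(0x71, "hello")
	fmt.Println(out, err)
	_, err = encodeWith(0x99, "hello")
	fmt.Println(err)
}
```

Because the lookup carries only a code number, whatever is registered must work with no further configuration, which is why the registered functions hard-code their options.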


@@ -0,0 +1,307 @@
package dagcbor
import (
"errors"
"fmt"
"io"
"math"
cid "github.com/ipfs/go-cid"
"github.com/polydawn/refmt/cbor"
"github.com/polydawn/refmt/shared"
"github.com/polydawn/refmt/tok"
"github.com/ipld/go-ipld-prime/datamodel"
cidlink "github.com/ipld/go-ipld-prime/linking/cid"
"github.com/ipld/go-ipld-prime/node/basicnode"
)
var (
ErrInvalidMultibase = errors.New("invalid multibase on IPLD link")
ErrAllocationBudgetExceeded = errors.New("message structure demanded too many resources to process")
ErrTrailingBytes = errors.New("unexpected content after end of cbor object")
)
const (
mapEntryGasScore = 8
listEntryGasScore = 4
)
// This file should be identical to the general feature in the parent package,
// except for the `case tok.TBytes` block,
// which has dag-cbor's special sauce for detecting schemafree links.
// DecodeOptions can be used to customize the behavior of a decoding function.
// The Decode method on this struct fits the codec.Decoder function interface.
type DecodeOptions struct {
// If true, parse DAG-CBOR tag(42) as Link nodes, otherwise reject them
AllowLinks bool
// TODO: ExperimentalDeterminism enforces map key order, but not the other parts
// of the spec such as integers or floats. See the fuzz failures spotted in
// https://github.com/ipld/go-ipld-prime/pull/389.
// When we're done implementing strictness, deprecate the option in favor of
// StrictDeterminism, but keep accepting both for backwards compatibility.
// ExperimentalDeterminism requires decoded DAG-CBOR bytes to be canonical as per
// the spec. For example, this means that integers and floats must be encoded in
// a particular way, and map keys must be sorted.
//
// The decoder does not enforce this requirement by default, as the codec
// was originally implemented without these rules. Because of that, there's
// a significant amount of published data that isn't canonical but should
// still decode with the default settings for backwards compatibility.
//
// Note that this option is experimental as it only implements partial strictness.
ExperimentalDeterminism bool
// If true, the decoder stops reading from the stream at the end of a full,
// valid CBOR object. This may be useful for parsing a stream of undelimited
// CBOR objects.
// As per standard IPLD behavior, in the default mode the parser considers the
// entire block to be part of the CBOR object and will error if there is
// extraneous data after the end of the object.
DontParseBeyondEnd bool
}
// Decode deserializes data from the given io.Reader and feeds it into the given datamodel.NodeAssembler.
// Decode fits the codec.Decoder function interface.
//
// The behavior of the decoder can be customized by setting fields in the DecodeOptions struct before calling this method.
func (cfg DecodeOptions) Decode(na datamodel.NodeAssembler, r io.Reader) error {
// Probe for a builtin fast path. Shortcut to that if possible.
type detectFastPath interface {
DecodeDagCbor(io.Reader) error
}
if na2, ok := na.(detectFastPath); ok {
return na2.DecodeDagCbor(r)
}
// Okay, generic builder path.
err := Unmarshal(na, cbor.NewDecoder(cbor.DecodeOptions{
CoerceUndefToNull: true,
}, r), cfg)
if err != nil {
return err
}
if cfg.DontParseBeyondEnd {
return nil
}
var buf [1]byte
_, err = io.ReadFull(r, buf[:])
switch err {
case io.EOF:
return nil
case nil:
return ErrTrailingBytes
default:
return err
}
}
// Future work: we would like to remove the Unmarshal function,
// and in particular, stop seeing types from refmt (like shared.TokenSource) be visible.
// Right now, some kinds of configuration (e.g. for whitespace and prettyprint) are only available through interacting with the refmt types;
// we should improve our API so that this can be done with only our own types in this package.
// Unmarshal is a deprecated function.
// Please consider switching to DecodeOptions.Decode instead.
func Unmarshal(na datamodel.NodeAssembler, tokSrc shared.TokenSource, options DecodeOptions) error {
// Have a gas budget, which will be decremented as we allocate memory, and an error returned when exceeded (or about to be exceeded).
// This is a DoS defense mechanism.
// It's *roughly* in units of bytes (but only very, VERY roughly) -- it also treats words as 1 in many cases.
// FUTURE: this ought to be configurable somehow. (How, and at what granularity though?)
var gas int64 = 1048576 * 10
return unmarshal1(na, tokSrc, &gas, options)
}
func unmarshal1(na datamodel.NodeAssembler, tokSrc shared.TokenSource, gas *int64, options DecodeOptions) error {
var tk tok.Token
done, err := tokSrc.Step(&tk)
if err == io.EOF {
return io.ErrUnexpectedEOF
}
if err != nil {
return err
}
if done && !tk.Type.IsValue() && tk.Type != tok.TNull {
return fmt.Errorf("unexpected eof")
}
return unmarshal2(na, tokSrc, &tk, gas, options)
}
// starts with the first token already primed. Necessary to get recursion
// to flow right without a peek+unpeek system.
func unmarshal2(na datamodel.NodeAssembler, tokSrc shared.TokenSource, tk *tok.Token, gas *int64, options DecodeOptions) error {
// FUTURE: check for schema.TypedNodeBuilder that's going to parse a Link (they can slurp any token kind they want).
switch tk.Type {
case tok.TMapOpen:
expectLen := int64(tk.Length)
allocLen := int64(tk.Length)
if tk.Length == -1 {
expectLen = math.MaxInt64
allocLen = 0
} else {
if *gas-allocLen < 0 { // halt early if this will clearly demand too many resources
return ErrAllocationBudgetExceeded
}
}
ma, err := na.BeginMap(allocLen)
if err != nil {
return err
}
var observedLen int64
lastKey := ""
for {
_, err := tokSrc.Step(tk)
if err != nil {
return err
}
switch tk.Type {
case tok.TMapClose:
if expectLen != math.MaxInt64 && observedLen != expectLen {
return fmt.Errorf("unexpected mapClose before declared length")
}
return ma.Finish()
case tok.TString:
*gas -= int64(len(tk.Str) + mapEntryGasScore)
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
// continue
default:
return fmt.Errorf("unexpected %s token while expecting map key", tk.Type)
}
observedLen++
if observedLen > expectLen {
return fmt.Errorf("unexpected continuation of map elements beyond declared length")
}
if observedLen > 1 && options.ExperimentalDeterminism {
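// Length-first ordering: compare the key strings only when the lengths are equal.
if len(lastKey) > len(tk.Str) || (len(lastKey) == len(tk.Str) && lastKey > tk.Str) {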
return fmt.Errorf("map key %q is not after %q as per RFC7049", tk.Str, lastKey)
}
}
lastKey = tk.Str
mva, err := ma.AssembleEntry(tk.Str)
if err != nil { // return in error if the key was rejected
return err
}
err = unmarshal1(mva, tokSrc, gas, options)
if err != nil { // return in error if some part of the recursion errored
return err
}
}
case tok.TMapClose:
return fmt.Errorf("unexpected mapClose token")
case tok.TArrOpen:
expectLen := int64(tk.Length)
allocLen := int64(tk.Length)
if tk.Length == -1 {
expectLen = math.MaxInt64
allocLen = 0
} else {
if *gas-allocLen < 0 { // halt early if this will clearly demand too many resources
return ErrAllocationBudgetExceeded
}
}
la, err := na.BeginList(allocLen)
if err != nil {
return err
}
var observedLen int64
for {
_, err := tokSrc.Step(tk)
if err != nil {
return err
}
switch tk.Type {
case tok.TArrClose:
if expectLen != math.MaxInt64 && observedLen != expectLen {
return fmt.Errorf("unexpected arrClose before declared length")
}
return la.Finish()
default:
*gas -= listEntryGasScore
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
observedLen++
if observedLen > expectLen {
return fmt.Errorf("unexpected continuation of array elements beyond declared length")
}
err := unmarshal2(la.AssembleValue(), tokSrc, tk, gas, options)
if err != nil { // return in error if some part of the recursion errored
return err
}
}
}
case tok.TArrClose:
return fmt.Errorf("unexpected arrClose token")
case tok.TNull:
return na.AssignNull()
case tok.TString:
*gas -= int64(len(tk.Str))
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignString(tk.Str)
case tok.TBytes:
*gas -= int64(len(tk.Bytes))
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
if !tk.Tagged {
return na.AssignBytes(tk.Bytes)
}
switch tk.Tag {
case linkTag:
if !options.AllowLinks {
return fmt.Errorf("unhandled cbor tag %d", tk.Tag)
}
if len(tk.Bytes) < 1 || tk.Bytes[0] != 0 {
return ErrInvalidMultibase
}
elCid, err := cid.Cast(tk.Bytes[1:])
if err != nil {
return err
}
return na.AssignLink(cidlink.Link{Cid: elCid})
default:
return fmt.Errorf("unhandled cbor tag %d", tk.Tag)
}
case tok.TBool:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignBool(tk.Bool)
case tok.TInt:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignInt(tk.Int)
case tok.TUint:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
// note that this pushes any overflow errors up the stack when AsInt() may
// be called on a UintNode that is too large to cast to an int64
if tk.Uint > math.MaxInt64 {
return na.AssignNode(basicnode.NewUint(tk.Uint))
}
return na.AssignInt(int64(tk.Uint))
case tok.TFloat64:
*gas -= 1
if *gas < 0 {
return ErrAllocationBudgetExceeded
}
return na.AssignFloat(tk.Float64)
default:
panic("unreachable")
}
}
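
The gas budget used by Unmarshal follows a simple pattern: one shared counter is passed by pointer through the recursion, each element subtracts its cost, and decoding stops as soon as the counter goes negative. A minimal sketch of that pattern follows; `chargeList` is a hypothetical helper, and the per-entry cost of 4 mirrors `listEntryGasScore` in this file.

```go
package main

import (
	"errors"
	"fmt"
)

var errBudgetExceeded = errors.New("message structure demanded too many resources to process")

const listEntryGasScore = 4 // same per-entry charge as the constant in this file

// chargeList (hypothetical) deducts the cost of a list of strings from the
// shared budget: a fixed charge per entry plus the length of each string.
func chargeList(items []string, gas *int64) error {
	for _, s := range items {
		*gas -= listEntryGasScore + int64(len(s))
		if *gas < 0 {
			return errBudgetExceeded
		}
	}
	return nil
}

func main() {
	gas := int64(20)

	err := chargeList([]string{"abcd", "efgh"}, &gas)
	fmt.Println(err, gas) // <nil> 4

	// The same budget is shared by later calls, so this one runs out.
	err = chargeList([]string{"ijkl"}, &gas)
	fmt.Println(err)
}
```

Passing the counter by pointer is what makes the limit apply to the whole message rather than to each nested list or map separately.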