Notes Metadata Shaping


What is the incentive to maintain Metadata?
For Digital Rights Management, there is a strong incentive for both royalties and attribution.
Citations and Backlinks Metadata are key for Content Discovery -> Traffic -> Influence -> Money
The Problem of Canonical Digital Content
We are the Signer of Contents. Not the ID provider! We use the IDs from the ID providers:
Books: ISBN
Journals: ISSN
Papers/Works: DOI
Humans/Researchers: ORCID
We offer archiving, persistence, and storage.
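As a rough illustration of using IDs from the ID providers without becoming one, a signer could check that an identifier is well formed before attaching it to signed content. The sketch below assumes only the public checksum rules for ISBN-13 and ORCID; the function names are hypothetical, and a passing check does not prove that the ID was actually issued by the provider.

```python
import re


def is_valid_isbn13(isbn: str) -> bool:
    """Check the ISBN-13 check digit (weights alternate 1, 3, 1, 3, ...)."""
    digits = [int(c) for c in isbn if c in "0123456789"]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


def is_valid_orcid(orcid: str) -> bool:
    """Check the ORCID check character (ISO 7064 MOD 11-2)."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    total = 0
    for ch in compact[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return compact[15] == expected


# External IDs are stored as references only; we sign the content, not the ID.
external_ids = {
    "isbn": "978-0-306-40615-7",
    "orcid": "0000-0002-1825-0097",
}
```

A checksum only catches typos and malformed input. Whether the ID really belongs to the work or person still depends on the trust mechanism discussed in these notes.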
Who provides and validates Content Authenticity?
Trust mechanism. The path from the paper of record to the HyperDocument is based on trust in the Knowledge Community's Archiver/Librarian/Journal authority.
Who is maintaining Academic Metadata?
Consortia of Universities and Publishers.
Publisher Associations around a Topic.
Journals for internal purposes.
Big information companies: Clarivate, Easy Knowledge, etc.
Who is Using Metadata Ontologies? Who is querying this information?
Information companies, for their backends.
Academics for Diagrams for Content-Hypertext Navigation.
Normally in a personal knowledge management context
Where do we get the Schemes?
Internet Archive.
Journals and Libraries have defined their own Schemes, such as the ACM.
How do we deal with different Schemes applicable to the same Content?
Extensible Schemas? Hierarchical Schemas?
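One way to picture the "hierarchical and extensible" option is a schema that inherits fields from a parent and tolerates extra fields it does not know. This is a minimal sketch, not a proposal for the actual format: the `Schema` class, the schema names, and the field names (`title`, `doi`, `ccs_concepts`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Schema:
    name: str
    fields: Dict[str, type]
    parent: Optional["Schema"] = None

    def all_fields(self) -> Dict[str, type]:
        """Fields of this schema plus everything inherited from its parents."""
        merged = dict(self.parent.all_fields()) if self.parent else {}
        merged.update(self.fields)
        return merged

    def validate(self, record: dict) -> List[str]:
        """Return a list of problems; unknown extra keys are allowed (extensible)."""
        errors = []
        for key, expected in self.all_fields().items():
            if key not in record:
                errors.append(f"missing field: {key}")
            elif not isinstance(record[key], expected):
                errors.append(f"wrong type for {key}")
        return errors


# Hypothetical hierarchy: a generic work, a paper, and a journal-specific paper.
work = Schema("work", {"title": str})
paper = Schema("paper", {"doi": str}, parent=work)
acm_paper = Schema("acm_paper", {"ccs_concepts": list}, parent=paper)

record = {"title": "Example", "doi": "10.0000/example", "ccs_concepts": ["Hypertext"]}
print(work.validate(record), paper.validate(record), acm_paper.validate(record))
```

The same record validates against every schema in its ancestry, which is one possible answer to having several schemes applicable to the same content.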
When do we use Hyperdoc internal Metadata, and when do we use an external Document?
How do we make it decentralized with Plugins?
How are we representing the Hyperdoc Metadata?
Hyperdocs will have a Metadata Viewer. Mark Bernstein is the Title of the document. The document might be empty.
Metadata/Attributes Templates
What should be a block Type, and what is a field on the document or a field in a different document?
How to enforce schemas?
This could be a good argument to use one of the three options.
blocktypes enforce logic to render.
What is the current Mintter Software Architecture?
Document Top Level Fields
Set by the user.
Derived by the system, not set by the user. The system doesn't allow changes to these fields.
Document Entity
Block types
Open-ended attributes. It is a Map of strings to strings. When serializing, all the attributes are flattened to the top level of the block. This gives type safety in protobuf while keeping flexibility.
Values are enforced from the front end.
Values are not enforced in the Schema in the Backend.
The backend is resilient to types and values that it doesn't know about.
Annotation types. It is a Map of string to strings.
Open-ended attributes
Not enforced in the Schema
We don't have Open-ended Attributes at the Document Level. The problem we have is that we want to mix purposes and data types for the same fields.
Triplet Knowledge Bases
Plugins Attributes
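The block attribute handling described above (a string-to-string map that is flattened to the top level of the block on serialization, with a backend that tolerates unknown types) could look roughly like the following. This is a sketch under assumptions: the `Block` class, the reserved key names, and the dict-based wire form are hypothetical stand-ins, not the actual Mintter protobuf definitions.

```python
from dataclasses import dataclass, field
from typing import Dict

# Typed, fixed fields of a block; everything else is an open-ended attribute.
RESERVED = {"id", "type", "text"}


@dataclass
class Block:
    id: str
    type: str
    text: str = ""
    attributes: Dict[str, str] = field(default_factory=dict)


def serialize(block: Block) -> dict:
    """Flatten the open-ended attributes next to the typed fields."""
    out = {"id": block.id, "type": block.type, "text": block.text}
    for key, value in block.attributes.items():
        if key in RESERVED:
            raise ValueError(f"attribute {key!r} collides with a typed field")
        out[key] = value
    return out


def deserialize(data: dict) -> Block:
    """Rebuild a block; unknown types and keys are kept, not rejected."""
    attrs = {k: str(v) for k, v in data.items() if k not in RESERVED}
    return Block(
        id=data["id"],
        type=data["type"],
        text=data.get("text", ""),
        attributes=attrs,
    )


heading = Block("b1", "heading", "Intro", {"level": "2"})
print(serialize(heading))
```

Because nothing here checks the attribute values, any enforcement has to happen in the front end, which matches the split described in the notes.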
What are the types of data structure and metadata of a Hyperdoc?
Content structure such as parent-child hierarchy or block types.
Hypermedia Presentation, Web Presentation, Viewspecs, Layout Attributes
Cover Image, Fonts, etc.
Children Style
Copyright Metadata.
Web importing
Citations/References Metadata.
Metadata to build Linked Data or Knowledge Structures.
RDF, Turtle, etc.
LLMs required/optimized Metadata.
Custom Knowledge Community Metadata.
Claim attributes for web importing
Original Publish date
Original Author
Web capture date
Files for image capture, pdf, html
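The claim attributes for web importing listed above could be grouped into a single record. The sketch below is only an illustration: the class name, field names, the `doc:example` identifier, and the choice of ISO 8601 strings for dates are assumptions, not an agreed format.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class WebImportClaim:
    target_document: str                          # document the claim is about
    web_capture_date: str                         # ISO 8601 timestamp of the capture
    original_publish_date: Optional[str] = None   # ISO 8601 date, if known
    original_author: Optional[str] = None
    capture_files: Tuple[str, ...] = ()           # image capture, PDF, HTML

    def __post_init__(self) -> None:
        # Both parsers raise ValueError on malformed input.
        datetime.fromisoformat(self.web_capture_date)
        if self.original_publish_date is not None:
            date.fromisoformat(self.original_publish_date)


claim = WebImportClaim(
    target_document="doc:example",
    web_capture_date="2024-01-15T10:30:00+00:00",
    original_publish_date="2019-05-02",
    original_author="Jane Doe",
    capture_files=("capture.png", "capture.pdf", "capture.html"),
)
```

Keeping the original author and publish date optional reflects that a web page may not expose them reliably at capture time.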
Do we need Claims for Digital Rights?
Are these Triplets?
Can we use the same triplets system for Digital Rights and Knowledge Structures?
A reason not to use Claims/Triple sets for Digital Rights is that there is no guarantee of getting the copyright information if you don't use the hypermedia system. We lose the Distribution Factor. The same happens with comments. For example, if a consumer uses IPLD directly, it won't get the Metadata.
Create indexes or databases and signing the route. Subscriptions
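To make the question about one triplet system for both purposes concrete, here is a minimal in-memory sketch in which a rights statement and a citation are stored and queried the same way. The `TripleStore` class, the predicate names, and the `doc:` identifiers are hypothetical, and the sketch says nothing about signing or distribution.

```python
from typing import List, NamedTuple, Optional, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


store = TripleStore()
store.add("doc:1", "license", "CC-BY-4.0")   # digital rights
store.add("doc:1", "cites", "doc:2")         # knowledge structure
store.add("doc:3", "cites", "doc:2")

print(store.match(subject="doc:1"))
print(store.match(predicate="cites", obj="doc:2"))
```

The sketch also shows the drawback raised above: the license triple lives in the store, not in `doc:1` itself, so a consumer that reads only the raw document never sees it.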
BlockType Copyright or License?
A special-purpose type at three levels: document, paragraph, annotation.
We don't want
What if an Archiver or Library wants to store Documents and build their Cataloging and Metadata?
It makes sense to have the cataloguing/archiving Metadata outside the document!
That is external cataloguing, but a document has intrinsic digital rights attached to it, and those should be inside the document.


Metadata is data that describes other data.
data structure
Cryptographic level for logic.
Metadata is the selection of literal values.


Top level Document Attributes for Site and App Presentation Layer
Open-ended attributes for the document.
Enforce at Client Level.
Daemon won't Index this information, so it should not require Indexing.
Header as in HTML, Field, or Blocktype?
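The split between client-level enforcement and a daemon that neither validates nor indexes these attributes might be sketched as follows. Everything named here is an assumption for illustration: the template keys (`cover_image`, `font`), the allowed font values, and the `client_validate` and `daemon_store` functions.

```python
from typing import Callable, Dict, List

ALLOWED_FONTS = {"serif", "sans-serif", "monospace"}

# Client-side template: known presentation keys and a check for each.
PRESENTATION_TEMPLATE: Dict[str, Callable[[str], bool]] = {
    "cover_image": lambda v: len(v) > 0,
    "font": lambda v: v in ALLOWED_FONTS,
}


def client_validate(attrs: Dict[str, str]) -> List[str]:
    """Check known keys; unknown keys pass because the map is open-ended."""
    errors = []
    for key, value in attrs.items():
        check = PRESENTATION_TEMPLATE.get(key)
        if check is None:
            continue
        if not check(value):
            errors.append(f"invalid value for {key}: {value!r}")
    return errors


def daemon_store(storage: Dict[str, Dict[str, str]], doc_id: str, attrs: Dict[str, str]) -> None:
    """Store the attributes as an opaque map: no validation, no indexing."""
    storage[doc_id] = dict(attrs)


storage: Dict[str, Dict[str, str]] = {}
attrs = {"font": "serif", "x-custom": "anything"}
if not client_validate(attrs):
    daemon_store(storage, "doc:example", attrs)
```

Since the daemon accepts any map, a client with a different template can still read and write the same document without a backend change.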
Archiver Process
Import web, PDFs, etc
Define the Source Block/ License Block/ Authorship Field?
Includes License.
Could be block/field/annotation
Different scheme fields per License/source, which could be a reason to use the block as it has open-ended attributes.
Provide determinism in the import process. The result of the import is deterministic and detached from the identity.
There are hashable deterministic processes:
Web Importing
Overleaf can be deterministic if we define the source and Overleaf software versions.
We attach the original PDFs.
When is it signed by the archiver?
Say somehow that the archiver is not the author.
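The idea of a deterministic import that is detached from identity can be sketched as a digest computed only from the source, the importer version, and the captured bytes, with the archiver's identity added afterwards in a separate envelope. The function names, field names, and the `signer_role` marker are assumptions, and the sketch deliberately stops before any real cryptographic signature.

```python
import hashlib
import json
from typing import Optional


def import_digest(source_url: str, importer: str, importer_version: str, content: bytes) -> str:
    """Hash of the import inputs; contains no information about who ran it."""
    manifest = {
        "source": source_url,
        "importer": importer,
        "importer_version": importer_version,
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    # Sorted keys and fixed separators give one canonical byte sequence.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def archive_envelope(digest: str, archiver_id: str, original_author: Optional[str]) -> dict:
    """Wrap the digest with the archiver's identity, kept distinct from authorship."""
    return {
        "import_digest": digest,
        "signer": archiver_id,
        "signer_role": "archiver",   # states that the signer is not the author
        "original_author": original_author,
    }


content = b"<html><body>Example page</body></html>"
digest = import_digest("https://example.org/post", "web-importer", "1.0.0", content)
envelope = archive_envelope(digest, "archiver:alice", "Jane Doe")
```

Two archivers running the same importer version on the same bytes would get the same digest and differ only in the envelope, which is where a signature would go.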
Research Claims/Metadata
External from the document.
The relationship between claims and documents is the same as between comments and documents.
We can start by using our schema for our repository.
We are missing tables.
Use cases for Claims and Metadata:
Build Knowledge structures or Knowledge Bases
Cataloguing is Discovering!
Discover content and Indexing
Team Workflows.
Metadata schemes for a Group.