Metastandard for data reusability
Ever since the first software systems appeared (life itself among them), something akin to evolving file formats has emerged, and these formats have taken very specific shapes. For example, the fact that most programming languages use very specific symbols such as curly braces (the closing brace is 01111101 in ASCII) to group larger chunks of code may be quite arbitrary. The fact that DNA-based life often uses very specific sequences (such as stop codons) to delimit a protein may be just as arbitrary. Every new version of Microsoft Excel may define a slightly different .xls file format. The same is true of APIs and their versions. We face a problem of complex, evolving data and systems.
Today, part of the problem is addressed through intermediate-level standardization, such as REST (Representational State Transfer) and, more specifically, OpenAPI (also known as Swagger), which makes it possible to create generic clients. For the sake of interoperability, engineers tend to comply with such standards. However, the standardization approach does not generalize.
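As a rough illustration of what a "generic client" means here, the sketch below derives a request purely from a machine-readable description of an API, so the same code works for any API described in the same way. The SPEC dictionary is a hypothetical, hand-written fragment in the shape of an OpenAPI 3 document, and api.example.com, getDataset, listDatasets and build_request are names assumed for illustration only. No request is actually sent.

```python
# Hypothetical fragment shaped like an OpenAPI 3 document (illustration only).
SPEC = {
    "servers": [{"url": "https://api.example.com/v1"}],
    "paths": {
        "/datasets": {
            "get": {"operationId": "listDatasets"},
        },
        "/datasets/{dataset_id}": {
            "get": {"operationId": "getDataset"},
        },
    },
}


def build_request(spec, operation_id, **path_params):
    """Return (HTTP method, URL) for an operation, using only the spec."""
    base_url = spec["servers"][0]["url"]
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            # Skip path-level entries that are not operation objects.
            if not isinstance(operation, dict):
                continue
            if operation.get("operationId") == operation_id:
                # Fill path template placeholders such as {dataset_id}.
                return method.upper(), base_url + path.format(**path_params)
    raise KeyError(f"unknown operation: {operation_id}")


print(build_request(SPEC, "getDataset", dataset_id="42"))
# prints ('GET', 'https://api.example.com/v1/datasets/42')
```

Nothing in build_request is specific to datasets: a different description yields a client for a different API. The sketch also hints at the limit of the approach, since it only works for systems that already describe themselves in this one format.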
From the perspective of scientists and data analysts, much like text written in the languages of other cultures, the format and the very specific shape in which data arrives carry information about the circumstances and the nature of the sources that created it. These formats are a cultural and technological legacy that, ideally, is left functional, accessible and easily reusable.
The goal is to create a maximally general metastandard that maximizes the reusability of any data and any system, so that they stay relevant in the future course of this evolution.