Book's object format A

Booklordofthedings 3a4b4c9424 Merge pull request 'Update to version 1.0.0' (#3) from dev into main (Reviewed-on: #3) 2024-08-26 13:33:37 +02:00

Bofa - Books Object Format A - 1.0.0

Bofa is an object notation for general-purpose serialization and deserialization.
The primary focus is on being human-readable while also being extremely easy to parse.
This repository implements a Bofa parser in the Beef programming language.

# A floating point number
n number 125.435
l line this is a normal line
a Array
 t text you can
 - put more text in here
 t text2 and even
 - more in here

Index

How to install
Changelog
The Bofa notation
Parser notes
Bofa parser API usage
Potential improvements

How to install

1. Download the repository using git, the ZIP download, or the releases tab.
2. Create a new Beef project or open an existing one.
3. Right-click the workspace panel.
4. Select "Add Existing Project".
5. Navigate to the BeefProj.toml file inside the Bofa folder.
6. Right-click the project you want to use Bofa with.
7. Click "Properties" -> "Dependencies".
8. Check the box labeled Bofa.
9. Add the Bofa namespace (using Bofa;) to the files that use Bofa.
10. Use Bofa.

Changelog

Version 1.0.0
- Initial Version
- Fast parsing
- Add automated serialization generation

The Bofa notation

The notation itself is simple.
Every line contains one Bofa "entry"; empty lines and comments are also allowed.
A normal Bofa entry has three parts: {Type} {Name} {Value}

The possible values of {Type} are:

l - Line - A string (UTF-8) without a newline character
t - Text - A string (UTF-8) that may contain newline characters
b - Boolean - A boolean value of either "true" or "false"
i - Integer - A 32-bit signed integer
ui - UnsignedInteger - A 32-bit unsigned integer
bi - BigInteger - A 64-bit signed integer
bui - BigUnsignedInteger - A 64-bit unsigned integer
n - Number - A 32-bit floating point number
bn - BigNumber - A 64-bit floating point number
c - Custom - Any custom type in a string (UTF-8) representation
o - Object - A key/value container for other Bofa entries
a - Array - A container for other Bofa entries, where entries may have identical names

{Name} is simply a string that does not contain a space or tab character.
{Value} is a string that represents the chosen datatype and can be parsed into that datatype.
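
As a rough illustration of how a {Value} string could map onto the types listed above, here is a minimal Python sketch. It is not the Beef parser from this repository: the name convert_value, the INTEGER_RANGES table and the choice to raise ValueError on bad input are all assumptions made for this example.

```python
# Hypothetical helper (not part of the Bofa repository): converts a Bofa
# {Value} string into a Python value according to its {Type}.

# Inclusive value ranges for the four integer types of the notation.
INTEGER_RANGES = {
    "i": (-2**31, 2**31 - 1),    # 32-bit signed
    "ui": (0, 2**32 - 1),        # 32-bit unsigned
    "bi": (-2**63, 2**63 - 1),   # 64-bit signed
    "bui": (0, 2**64 - 1),       # 64-bit unsigned
}


def convert_value(kind: str, value: str):
    """Return the Python value for a Bofa entry, or raise ValueError."""
    if kind in ("l", "t", "c"):
        # Line, text and custom entries keep their string representation.
        return value
    if kind == "b":
        if value == "true":
            return True
        if value == "false":
            return False
        raise ValueError(f"not a boolean: {value!r}")
    if kind in INTEGER_RANGES:
        # int() accepts a few spellings a strict parser might reject.
        number = int(value)
        low, high = INTEGER_RANGES[kind]
        if not low <= number <= high:
            raise ValueError(f"{number} does not fit type {kind!r}")
        return number
    if kind in ("n", "bn"):
        # Python only has 64-bit floats, so n and bn are treated alike here.
        return float(value)
    raise ValueError(f"type {kind!r} has no scalar value")
```

Objects and arrays are deliberately rejected by this helper, since they carry no {Value} of their own.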

Empty lines are fully ignored. Comments start with a # as the first non-whitespace character in the line.

An entry of type custom needs additional information to work.
Instead of {Type} {Name} {Value}, a custom entry uses {Type} {Typename} {Name} {Value}.
Similarly, arrays and objects don't need a {Value} and just use {Type} {Name}.
To allow multiline text without too many issues, the - character in place of {Type} indicates that the content of this line should be appended to the last entry if it was of type text.
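
The entry shapes described above can be sketched in Python as a single line splitter. This is only an illustration under stated assumptions, not the Beef implementation: Entry and split_entry are hypothetical names, and raising ValueError for a malformed line is a choice made for this example (a real parser would record the line and continue).

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    depth: int                # number of leading spaces/tabs
    type: str                 # e.g. "n", "c", "a", or "-" for a continuation
    typename: Optional[str]   # only set for custom entries
    name: Optional[str]       # None for continuation lines
    value: Optional[str]      # None for objects and arrays


def split_entry(line: str) -> Optional[Entry]:
    """Split one Bofa line into its parts; None for empty lines and comments."""
    stripped = line.rstrip("\r\n")
    body = stripped.lstrip(" \t")
    depth = len(stripped) - len(body)
    if not body or body.startswith("#"):
        return None

    kind = re.split(r"[ \t]+", body, maxsplit=1)[0]
    if kind == "-":
        # Continuation line: everything after the marker is text.
        parts = re.split(r"[ \t]+", body, maxsplit=1)
        return Entry(depth, "-", None, None, parts[1] if len(parts) > 1 else "")
    if kind == "c":
        # Custom entry: {Type} {Typename} {Name} {Value}
        parts = re.split(r"[ \t]+", body, maxsplit=3)
        if len(parts) < 3:
            raise ValueError("custom entry needs a typename and a name")
        return Entry(depth, "c", parts[1], parts[2], parts[3] if len(parts) > 3 else "")

    parts = re.split(r"[ \t]+", body, maxsplit=2)
    if len(parts) < 2:
        raise ValueError("entry needs a name")
    if kind in ("o", "a"):
        # Containers: {Type} {Name}, no value.
        return Entry(depth, kind, None, parts[1], None)
    # Everything else: {Type} {Name} {Value}
    return Entry(depth, kind, None, parts[1], parts[2] if len(parts) > 2 else "")
```

The splitter does not validate the value itself; it only separates the fields so that later stages can convert and insert them.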

Membership of containers like arrays or objects is indicated via a depth value.
To indicate depth, a line may start with any number of tabs or spaces; their total count is the depth of the entry.
An entry is always a member of the most recently defined container one level shallower than itself.

a Array1
 a Array2
 a Array3
  l Line this line is a member of Array3 which is itself contained inside of Array1

Objects enforce key uniqueness, while arrays don't (entries of an array still need a proper name though, even if it's only "-").
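
To make the depth and uniqueness rules concrete, here is a small Python sketch that inserts already-split entries into a tree. It is an interpretation, not the repository's Beef code: build_tree and the dictionary layout of a node are hypothetical, the depth rule is read literally (the last container seen one level up wins), continuation lines are assumed to be joined with a newline, and the top level is assumed to behave like an object.

```python
def build_tree(entries):
    """Insert (depth, type, name, value) tuples into a tree.

    Returns (root, errors), where errors lists the indexes of the
    entries that could not be inserted.
    """
    root = {}             # top level, treated like an object (unique names)
    last_container = {}   # depth -> most recent container defined at that depth
    last_node = None      # most recent successfully inserted entry
    errors = []

    for index, (depth, kind, name, value) in enumerate(entries):
        if kind == "-":
            # Continuation: only valid directly after a text entry.
            if last_node is not None and last_node["type"] == "t":
                last_node["value"] += "\n" + value
            else:
                errors.append(index)
            continue

        last_node = None
        node = {"type": kind, "name": name, "value": value}
        if kind == "o":
            node["children"] = {}
        elif kind == "a":
            node["children"] = []

        # Find the container this entry belongs to.
        if depth == 0:
            members = root
        elif depth - 1 in last_container:
            members = last_container[depth - 1]["children"]
        else:
            errors.append(index)   # no container one level up
            continue

        if isinstance(members, dict):
            if name in members:
                errors.append(index)   # objects require unique names
                continue
            members[name] = node
        else:
            members.append(node)       # arrays allow identical names

        if kind in ("o", "a"):
            last_container[depth] = node
        last_node = node

    return root, errors
```

A stricter parser might also forget deeper containers once a shallower entry appears; the text above does not settle that, so the sketch keeps the simplest reading.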

Parser notes

Bofa should always be parsed loosely and to the best of the parser's ability, but parsing should never fail completely.
Instead, the parser can indicate on which line an error occurred and leave the user to decide whether to stop or continue.
This also enables a pattern where Bofa data is structured so that any parser can read a document written for a newer or different parser:

# The following line has a type that was added by an imaginary parser using the Bofa 2.0 standard
rgb color 255:0:255
c rgb color 255:0:255

A parser that supports 2.0 should parse the first entry and ignore the second, as it contains a duplicate name.
A parser still on an old version will ignore the first one, since it doesn't know a type named rgb, and will instead parse the second one correctly.
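
The fallback pattern can be simulated with a toy Python parser for flat documents. Everything here is an assumption made for illustration: parse_flat, its extra_types parameter, the 1-based line numbers in the error list and the single-space field separator are not taken from the Beef implementation.

```python
# Types defined by the notation; anything else is unknown to an "old" parser.
BASE_TYPES = {"l", "t", "b", "i", "ui", "bi", "bui", "n", "bn", "c", "o", "a"}


def parse_flat(text, extra_types=()):
    """Parse a flat Bofa document loosely.

    Returns (entries, errors): entries maps name -> (typename, value),
    errors lists the 1-based numbers of the lines that were skipped.
    """
    known = BASE_TYPES | set(extra_types)
    entries = {}
    errors = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        body = line.strip()
        if not body or body.startswith("#"):
            continue
        kind, _, rest = body.partition(" ")
        if kind not in known:
            errors.append(line_number)   # unknown type: skip, keep going
            continue
        if kind == "c":
            typename, _, rest = rest.partition(" ")
        else:
            typename = kind
        name, _, value = rest.partition(" ")
        if not name or name in entries:
            errors.append(line_number)   # missing or duplicate name
            continue
        entries[name] = (typename, value)
    return entries, errors


document = "# color\nrgb color 255:0:255\nc rgb color 255:0:255\n"
old_entries, old_errors = parse_flat(document)
new_entries, new_errors = parse_flat(document, extra_types={"rgb"})
```

Both calls end up with a single color entry; they differ only in which of the two lines is reported as skipped.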

While insertion at a depth needs to happen in document order, the parsing and validation of each individual line does not.
As such, you can theoretically parse the individual lines on multiple threads for a potential speedup.
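
The split between order-independent line parsing and ordered insertion might look like the following Python sketch. It only shows the structure: parse_line and parse_lines_parallel are hypothetical names, and in CPython threads will not actually speed up this kind of CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_line(line):
    """Parse one line on its own; needs no knowledge of other lines."""
    body = line.lstrip(" \t")
    if not body or body.startswith("#"):
        return None
    depth = len(line) - len(body)
    kind, _, rest = body.partition(" ")
    return depth, kind, rest


def parse_lines_parallel(text, workers=4):
    """Phase 1: parse every line independently on a thread pool.

    Returns (line_number, parsed) pairs in document order, ready for
    phase 2, the sequential insertion by depth.
    """
    lines = text.splitlines()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whichever worker ran them.
        parsed = list(pool.map(parse_line, lines))
    return [(number, entry)
            for number, entry in enumerate(parsed, start=1)
            if entry is not None]
```

Because the results come back in document order, the insertion phase can stay single-threaded and unchanged.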

Different parsers may add different types for their use cases, but one- and two-character type names should stay reserved for official notation types.

The entire standard is case sensitive.

Bofa parser API usage

The following example shows how to use the parser API:

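namespace Example;

using System;
using System.Collections; // Dictionary and List
using Bofa;

class ExampleClass
{
    public static void Main()
    {
        // Members of Array are indented one level deeper than the array itself.
        String bofaString = """
        # A floating point number
        n number 125.435
        l line this is a normal line
        a Array
         t text you can
         - put more text in here
         t text2 and even
         - more in here
        """;

        // Receives the parsed top-level entries, keyed by name.
        Dictionary<StringView, Bofa> output = new .();
        // Receives the errors reported by the parser (see Parser notes).
        List<int64> errors = new .();
        BofaParser.Parse(bofaString, output, errors);

        // Both collections were allocated with new, so free them explicitly.
        DeleteDictionaryAndValues!(output);
        delete errors;
    }
}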

Potential improvements

There is still a lot of speed left on the table
- Reduce the amount of allocations
- Multithread initial parsing
New useful types
- Color
- Version
- Encrypted raw data
More automatic serialization/deserialization in Corlib
- Hashes
- Pointers
- Tuples