Creating a Datafile

The actual output. The data that gets flashed onto the FX chip.

I’ll just make up an input format as I go.
(Probably using XML.)

It would be super cool if you used JSON because it’s already being used in the Arduboy file 🙂

The endianness is little-endian as far as I know, and the file size might need to be a multiple of the block size (256 bytes, I guess). Another output needs to be a .h file, as you proposed. I would say just give it a shot, and then let’s discuss whether it fits or whether we need to change or add things. Some things only pop up once people start using it. I am willing to try it out soon on my devkit.
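
A minimal sketch of those two constraints in Python, assuming multi-byte values are written little-endian and the image is padded up to a multiple of a 256-byte block. The block size, the 0xFF fill value and the function names are assumptions for illustration, not confirmed requirements of the FX chip.

```python
import struct

BLOCK_SIZE = 256  # assumed block size, per the guess above
FILL_BYTE = 0xFF  # assumed fill value (erased flash typically reads as 0xFF)


def pack_u16_le(value: int) -> bytes:
    """Encode a 16-bit unsigned integer as two little-endian bytes."""
    return struct.pack("<H", value)


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pad data with FILL_BYTE so its length is a multiple of block_size."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + bytes([FILL_BYTE]) * (block_size - remainder)


image = pad_to_block(pack_u16_le(0x1234) + b"hello")
print(pack_u16_le(0x1234).hex(), len(image))  # prints "3412 256"
```

If the real requirements turn out to be different (another block size, no padding at all), only the two constants need to change.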

Yes, maybe we should try to stick to the same language for the input files of our tools. Personally I have no preference, but it might be good not to use too many different languages, so we don’t clutter the development environment. Maybe JSON is even easier for beginners to understand? Or more human-readable?

Once you have a library to parse it, it’s trivial, but in my experience it’s a lot easier to programmatically traverse JSON in C.

EDIT: I guess it’s probably Python that is going to have to gobble this data up, so whatever. I guess it’s mostly up to blinky if he is writing the utility.

I’m going to need to start thinking about writing a front end that can talk to Python soon, or getting someone else on that.

Oh, maybe I got something wrong here. The JSON file is just the input to a parser running on a PC, or am I wrong? Or do you mean the PC tool would be written in C? I guess @Pharap will never touch C 😅

I wasn’t thinking; I made an edit. It doesn’t matter as much as I thought it did. It’s mostly convention, so go with whatever makes the tool easiest to write.

What I am/was proposing is:

{ Resource Builder } -> [Resource Description File] -> { Resource Packer } -> [Binary Image File] -> { FX Flasher } -> [FX chip]

(Where [] represents a file/data and {} represents a tool/script/program.)

There’s probably more to it than just making sure the data is little endian and the final file size is a multiple of 256…

The best I could do at the moment is just dump the data into a file,
so it probably won’t work.

They’re about the same to understand, but they vary in easiness to write, to read and to parse.

Nope.
As far as I’m concerned C is obsolete.
C++ does everything C does, and does it better.
The only thing C has over C++ is restrict.


The ease of data traversal depends entirely on the library.

Yes. I think some developers might want to add padding or align data inside the flash in some manner. Maybe even fill areas with some value.
I can even imagine a maximum size; that is probably something we need to think about as well.

Actually, now I think about it, there’s two things I can think of that might make JSON a bad option.

Firstly, JSON doesn’t accept hexadecimal integers.

And secondly, as far as I’m aware JSON is a key-value format,
which means you can’t have duplicate keys.

So instead of being able to have full control over the data order you’d be forced to bundle all images together, all text together, all sound together and (crucially) all raw data together.

There might be a way around it, but I suspect that would mean complicating the format more.
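
Both points can be checked quickly with Python’s standard json module. This is only a sketch of how one parser behaves; other JSON libraries may treat duplicate keys differently. The key names "value" and "raw" are just examples.

```python
import json

# Hexadecimal literals are not valid JSON numbers, so parsing fails.
try:
    json.loads('{"value": 0xFF}')
    hex_accepted = True
except json.JSONDecodeError:
    hex_accepted = False

# With duplicate keys, Python's json module keeps only the last value.
collapsed = json.loads('{"raw": [1], "raw": [2]}')

# object_pairs_hook exposes every key/value pair in document order,
# which is one possible workaround at the cost of a custom loader.
pairs = json.loads('{"raw": [1], "raw": [2]}', object_pairs_hook=list)

print(hex_accepted, collapsed, pairs)
```

An array of objects, each carrying its own name, avoids the duplicate-key problem and keeps the order, but it does make the format more verbose.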


A comparison of XML vs roughly equivalent JSON:

XML

<?xml version="1.0" encoding="UTF-8"?>
<fxdata>
	<image name="player">
		<frame index="0" source="/images/player/frame0.png"/>
		<frame index="1" source="/images/player/frame1.png"/>
		<frame index="2" source="/images/player/frame2.png"/>
	</image>
	<image name="enemy">
		<frame index="0" source="/images/enemy/frame0.png"/>
		<frame index="1" source="/images/enemy/frame1.png"/>
		<frame index="2" source="/images/enemy/frame2.png"/>
	</image>
	<text name="dialogue">
		<textblock index="0" source="/text/dialogue/block0.txt"/>
	</text>
	<sound name="backgroundMusic">
		<track index="0" source="/sounds/music/track0.wav"/>
	</sound>
	<raw name="puzzle0">
		0x00, 0xFF, 0x88, 0x92,
	</raw>
	<raw name="data0" source="data/data0.raw"/>
</fxdata>

JSON

{
	"fxdata":
	{
		"images":
		[
			{
				"name": "player",
				"frames":
				[
					{ "index": 0, "source": "/images/player/frame0.png" },
					{ "index": 1, "source": "/images/player/frame1.png" },
					{ "index": 2, "source": "/images/player/frame2.png" }
				]
			},
			{
				"name": "enemy",
				"frames":
				[
					{ "index": 0, "source": "/images/enemy/frame0.png" },
					{ "index": 1, "source": "/images/enemy/frame1.png" },
					{ "index": 2, "source": "/images/enemy/frame2.png" }
				]
			}
		],
		"text":
		[
			{
				"name": "dialogue",
				"textblocks":
				[
					{ "index": 0, "source": "/text/dialogue/block0.txt" }
				]
			}
		],
		"sound":
		[
			{
				"name": "backgroundMusic",
				"tracks":
				[
					{ "index": 0, "source": "/sounds/music/track0.wav" }
				]
			}
		],
		"raw":
		[
			{ "name": "puzzle0", "data": [ "0x00", "0xFF", "0x88", "0x92" ] },
			{ "name": "data0", "source": "data/data0.raw" }
		]
	}
}

<image name="player" offset="0x010000">
	<frame index="0" source="/images/player/frame0.png"/>
	<frame index="1" source="/images/player/frame1.png"/>
	<frame index="2" source="/images/player/frame2.png"/>
</image>

Note: index is optional in both formats.

What is your intended use for index?

To allow the frame tags to be out of order when the files are being written by hand.
That also allows the data to be hastily reordered if need be.
The generator probably wouldn’t bother to emit the index.
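
A rough sketch of how a packer might honour an optional index, using Python’s xml.etree.ElementTree. The function name is hypothetical, and falling back to document order when index is missing is an assumption; mixing indexed and unindexed frames is not handled specially here.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<image name="player">
    <frame index="2" source="/images/player/frame2.png"/>
    <frame index="0" source="/images/player/frame0.png"/>
    <frame index="1" source="/images/player/frame1.png"/>
</image>
"""


def ordered_frame_sources(image: ET.Element) -> list[str]:
    """Return frame sources sorted by index, or document order if absent."""
    keyed = [
        # enumerate() supplies the document position as the fallback index.
        (int(frame.get("index", position)), frame.get("source"))
        for position, frame in enumerate(image.findall("frame"))
    ]
    # Sort on the index only; sorted() is stable, so ties keep file order.
    return [source for _, source in sorted(keyed, key=lambda item: item[0])]


sources = ordered_frame_sources(ET.fromstring(SNIPPET))
print(sources)
```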

Maybe that’s not even that much of a benefit and index could just be dropped?

We shall see I suppose.

I wonder if something simple like:

[
	{
		"name": "…",
		"source": "file or list of values"
	},
	{
		"name": "…",
		"source": "file or list of values",
		"offset": "from previous entry",
		"pad": "optional pad value for data between previous entry and this one"
	},
	{
		"name": "…",
		"source": "file or list of values",
		"align": "align this data to a specific boundary"
	}
]

would be enough. Maybe you have some things in mind with all the tags. Can you elaborate?
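
For what it’s worth, here is a sketch of how a packer could interpret those three optional keys. It assumes each entry’s data has already been loaded into a "data" field as bytes (a stand-in for resolving "source"), that "offset" is a byte count measured from the end of the previous entry, and that gaps default to 0xFF when no "pad" is given. None of that is settled.

```python
def align_up(address: int, boundary: int) -> int:
    """Round address up to the next multiple of boundary."""
    return (address + boundary - 1) // boundary * boundary


def build_image(entries: list[dict]) -> tuple[bytes, dict[str, int]]:
    """Concatenate entries, honouring the optional offset/pad/align keys."""
    image = bytearray()
    addresses = {}
    for entry in entries:
        start = len(image) + entry.get("offset", 0)
        if "align" in entry:
            start = align_up(start, entry["align"])
        # Fill any gap before this entry with the pad value (assumed 0xFF).
        image.extend([entry.get("pad", 0xFF)] * (start - len(image)))
        addresses[entry["name"]] = start
        image.extend(entry["data"])
    return bytes(image), addresses


entries = [
    {"name": "a", "data": b"\x01\x02"},
    {"name": "b", "data": b"\x03", "offset": 2, "pad": 0x00},
    {"name": "c", "data": b"\x04", "align": 8},
]
image, addresses = build_image(entries)
print(addresses, image.hex())
```

The returned address table is what a generated header would be built from.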

What do you mean by “all the tags”?

I mean the tags you use in your XML example, e.g. sound, textblock, frame, image, fxdata, track. They seem to complicate the direct conversion to JSON. I am just asking because I want to understand your intention behind this, not judging anything.
I am not a big user of JSON, but I wonder why it seems unsuitable, as it was invented for object storage in JavaScript. So it looks like it was meant for something like what we intend.

The different tags are there to make it easier for the packer to identify the type of data.
Otherwise the packer has to rely solely on the file extension for external data, and it makes it almost impossible to represent sprites (unless you add an attribute to specify ‘this data is a set of sprites’).
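
As a sketch of what that buys the packer: the tag name can select a handler directly. The handler names and the returned values are made up for illustration, and real image handling would obviously decode the files rather than just count them.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<fxdata>
    <image name="player">
        <frame source="/images/player/frame0.png"/>
    </image>
    <raw name="puzzle0"> 0x00, 0xFF, 0x88, 0x92, </raw>
</fxdata>
"""


def describe_image(element: ET.Element) -> str:
    """Placeholder image handler: only reports how many frames were listed."""
    frames = [frame.get("source") for frame in element.findall("frame")]
    return f"image {element.get('name')} with {len(frames)} frame(s)"


def parse_raw(element: ET.Element) -> bytes:
    """Parse inline comma-separated values; base 0 accepts 0x prefixes."""
    tokens = (element.text or "").split(",")
    return bytes(int(token.strip(), 0) for token in tokens if token.strip())


# The tag name selects the handler, so no file-extension guessing is needed.
HANDLERS = {"image": describe_image, "raw": parse_raw}

results = {
    child.get("name"): HANDLERS[child.tag](child)
    for child in ET.fromstring(DOCUMENT)
}
print(results)
```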

I suppose strictly some of the other tags could be simplified until it’s just like this:

<?xml version="1.0" encoding="UTF-8"?>
<fxdata>
	<image name="player">
		<frame source="/images/player/frame0.png"/>
		<frame source="/images/player/frame1.png"/>
		<frame source="/images/player/frame2.png"/>
	</image>
	<image name="enemy">
		<frame source="/images/enemy/frame0.png"/>
		<frame source="/images/enemy/frame1.png"/>
		<frame source="/images/enemy/frame2.png"/>
	</image>
	<text name="dialogue" source="/text/dialogue.txt"/>
	<sound name="backgroundMusic" source="/sounds/backgroundMusic.wav"/>
	<raw name="puzzle0"> 0x00, 0xFF, 0x88, 0x92 </raw>
	<raw name="data0" source="data/data0.raw"/>
</fxdata>

But I was thinking there might be some similar index-based API for playing sounds and loading text.

All the offset, alignment and padding stuff can be accounted for easily enough with attributes.

Why would I be converting XML to JSON?

I think “invented” is a bit of a strong word.
JSON’s essentially just a subset of JavaScript.

When it was first put to use it was literally just a matter of dumping valid JavaScript object definitions into a file with a .json extension and calling eval to load the objects.
(Hopefully everyone has switched to a proper parser by now and nobody is still using eval.)

The same thing could be done with any language that has an eval,
the reason JSON got popular is because JavaScript dominates the web.

I see. We could use a type key in the JSON file to give hints to the parser: "type": "image".

Language barrier. You gave two examples, one in XML and then one in JSON, and to me it looked like the JSON file was complicated because you based the JSON output on the XML input.

Yes. Maybe because of its popularity it is worth considering it as the input language.

However, one strong argument for an XML file, for me at least, would be a DTD file. That way we can check that a file is correctly structured.

Just for understanding…
This datafile is supposed to be generated somehow with a magic tool by feeding it assets.
Then the actual assets are packed with another magic tool with the information from the datafile into a binary which can then be put on the fx chip?

Why do we need different tools for that? Can’t one tool build it, pack it, and create a ready-to-use binary file?

Sorry for interrupting with stupid questions.

It was supposed to be a one-to-one comparison of data with the same structure.

Ok, let’s try this the other way around then.
Here’s a ‘flat’ structure in both formats:

JSON

{
	"name": "player",
	"type": "image",
	"width": 32,
	"height": 32,
	"frames":
	[
		{ "source": "/images/player/frame0.png" },
		{ "source": "/images/player/frame1.png" },
		{ "source": "/images/player/frame2.png" }
	]
},
{
	"name": "enemy",
	"type": "image",
	"width": 32,
	"height": 32,
	"frames":
	[
		{ "source": "/images/enemy/frame0.png" },
		{ "source": "/images/enemy/frame1.png" },
		{ "source": "/images/enemy/frame2.png" }
	]
},
{
	"name": "dialogue",
	"type": "text",
	"source": "/text/dialogue.txt",
	"relativeoffset": 64,
	"padding": 128
},
{
	"name": "backgroundMusic",
	"type": "sound",
	"source": "/sounds/backgroundMusic.wav",
	"alignment": 256
},
{
	"name": "puzzle0",
	"type": "raw",
	"data": [ 0, 255, 136, 146 ]
},
{
	"name": "data0",
	"type": "raw",
	"source": "data/data0.raw"
}

XML

<?xml version="1.0" encoding="UTF-8"?>
<image name="player" width="32" height="32">
	<frame source="/images/player/frame0.png"/>
	<frame source="/images/player/frame1.png"/>
	<frame source="/images/player/frame2.png"/>
</image>
<image name="enemy" width="32" height="32">
	<frame source="/images/enemy/frame0.png"/>
	<frame source="/images/enemy/frame1.png"/>
	<frame source="/images/enemy/frame2.png"/>
</image>
<text name="dialogue" source="/text/dialogue.txt" relativeoffset="64" padding="128"/>
<sound name="backgroundMusic" source="/sounds/backgroundMusic.wav" alignment="256"/>
<raw name="puzzle0"> 0x00, 0xFF, 0x88, 0x92 </raw>
<raw name="data0" source="data/data0.raw"/>

Both are popular with different groups of people.

That’s a good point.

JSON does have JSON Schema, but I don’t think it’s as ubiquitous as DTD.
There’s also XML Schema, which is apparently more powerful than DTDs,
but I only have experience with DTDs so I can’t vouch for what it’s like.


The resource builder is the program that builds the resource description file (the .xml/.json file).
I imagine the resource builder to be a GUI program that allows the data to be reordered/organised et cetera.
Presumably there would be a way to make a command line program too.

The resource description file lists all the resources that need to be included and either includes the data directly or references an external file.
This file is needed so people can have fine control over where their data goes.
The reason it’s a file is so that:

  • The configuration can be saved and moved around
  • There can be more than one ‘resource builder’ program

The resource packer is the program that takes all the data from the resource description file (and the external files it references) and builds them into a binary image file that can be flashed onto the FX chip.
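
Alongside the binary image, the packer would presumably also emit the .h file mentioned earlier, so that a sketch can refer to resources by name. Below is one possible shape for that step. The header layout, the Address suffix, the include guard and the choice of uint32_t are all assumptions rather than an agreed format.

```python
def make_header(addresses: dict[str, int], guard: str = "FXDATA_H") -> str:
    """Render resource addresses as C++ constants (hypothetical layout)."""
    lines = [f"#ifndef {guard}", f"#define {guard}", "", "#include <stdint.h>", ""]
    for name, address in addresses.items():
        # One constant per resource, in the order the packer laid them out.
        lines.append(f"constexpr uint32_t {name}Address = 0x{address:06X};")
    lines += ["", "#endif"]
    return "\n".join(lines) + "\n"


header = make_header({"player": 0x000000, "enemy": 0x000300})
print(header)
```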

The binary image file is just raw data that can be instantly flashed onto an FX chip.

The FX flasher is the tool that flashes the FX chip.

The FX chip is… the FX chip.

The three ‘programs’ don’t have to be different programs,
this is just an illustration of the process.
We may get to a point where there’s just one program that does it all.

However, having the programs/tools separate allows us to:

  • Build different tools at different times
    • I.e. we start by building the flasher (I believe we already have one written in Python), then we build the resource packer, then we build the resource builder. That way we’re not stuck, people can start off working the hard way and things will gradually get easier as we build more tools
  • Swap tools out
    • If we decide one of the tools or file formats isn’t working, we can swap it out for something different.
  • Add extra processing steps
    • This may or may not be a bonus, but by having a chain of programs it’s easier to insert extra programs to do additional processing

Your questions were perfectly valid.

In that case we wouldn’t need a datafile like the one discussed here, would we?
