Hey thanks a lot for all the positive responses!
@JO3RI thanks a lot for the links. Actually, what you did for Mystical Balloon is very close to what I do currently. However, I use 2 bits per tile so I can differentiate between empty tiles, grave tiles (background props), 'wall' tiles and ground tiles. Each tile is rendered differently depending on its neighbors. For example, the first empty tile under a wall tile has a small effect, the first ground tile has some grass, etc.
However, I think this is not enough, as we also want interior stages with subtler tiles such as chains holding the platforms and windows/walls in the background. Because of that I need to go up to 4 bits per tile, which gives 16 tiles to play with! But it also doubles the size of all stages...
So I thought of another technique, which I will try to explain here. Maybe somebody has already tried it and can tell me in advance that it's a bad idea.
In the levels there is a lot of repetition, so it seems that RLE could give a good compression rate (to be checked with real data). The problem is that it would take too much RAM to decompress a full level.
Frame rate is 60FPS.
Each tile is 8x8 pixels and there are 16 different tiles (4 bits per tile).
The game only scrolls horizontally, so a stage is always exactly 8 tiles high. The width of a stage can vary (max is 256 tiles).
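To put numbers on the size concern, here is a small sketch of the uncompressed stage size for these dimensions. The helper name `stageBytes` is made up for illustration, and it assumes tiles are packed tightly with no padding.

```cpp
#include <stdint.h>

// Hypothetical helper: bytes needed to store a stage uncompressed,
// assuming tiles are bit-packed back to back (rounded up to a whole byte).
constexpr uint16_t stageBytes(uint16_t widthTiles, uint8_t heightTiles, uint8_t bitsPerTile) {
    return static_cast<uint16_t>(
        (static_cast<uint32_t>(widthTiles) * heightTiles * bitsPerTile + 7) / 8);
}

// Largest stage (256x8 tiles): 512 bytes at 2 bits per tile, 1024 bytes at 4.
static_assert(stageBytes(256, 8, 2) == 512, "2 bits per tile");
static_assert(stageBytes(256, 8, 4) == 1024, "4 bits per tile");
```

So a full-width stage at 4 bits per tile would need 1024 bytes if it were unpacked in one go.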
The idea: split each stage into RLE-compressed blocks of 16x8 tiles. The RLE would simply be a 4-bit run length + 4-bit tile data, so each run fits into one byte, which makes it easier/faster to decompress. As one block spans the whole screen horizontally, only two blocks need to be in RAM at the same time (16x8x2x4 = 1024 bits = 128 bytes). When the player enters a new block I only need to decompress one block of 16x8 tiles into RAM. Now the big question is: can I decompress the block within one frame without the player noticing it?
If this is too slow I can also try smaller blocks (e.g. 8x8). However, I'm scared this would dramatically reduce the efficiency of the RLE...
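To make the format concrete, here is a minimal sketch of what the block decoder could look like. It makes some assumptions the description leaves open: the high nibble stores the run length minus one (so one byte covers 1 to 16 tiles), tiles are stored in one linear order, and runs never cross a block boundary. The names `decodeBlock` and `tileAt` are made up for illustration. On the real device the source data would probably live in flash and be read with `pgm_read_byte`; plain pointers are used here so it also compiles on a PC.

```cpp
#include <stdint.h>
#include <stddef.h>

constexpr uint8_t BLOCK_W = 16;                     // tiles per block, horizontally
constexpr uint8_t BLOCK_H = 8;                      // tiles per block, vertically
constexpr uint8_t BLOCK_TILES = BLOCK_W * BLOCK_H;  // 128 tiles
constexpr uint8_t BLOCK_BYTES = BLOCK_TILES / 2;    // 64 bytes at 4 bits per tile

// Decode one RLE block from src into dst (BLOCK_BYTES bytes, two tiles per byte).
// Assumed encoding per source byte: high nibble = run length - 1, low nibble = tile id.
// Returns the number of source bytes consumed, i.e. the offset of the next block.
size_t decodeBlock(const uint8_t *src, uint8_t *dst) {
    size_t used = 0;
    uint8_t written = 0;
    while (written < BLOCK_TILES) {
        const uint8_t b = src[used++];
        uint8_t run = static_cast<uint8_t>((b >> 4) + 1);
        const uint8_t tile = b & 0x0F;
        while (run > 0 && written < BLOCK_TILES) {
            uint8_t &cell = dst[written >> 1];
            if (written & 1) {
                cell = static_cast<uint8_t>((cell & 0xF0) | tile);         // odd index: low nibble
            } else {
                cell = static_cast<uint8_t>((cell & 0x0F) | (tile << 4));  // even index: high nibble
            }
            ++written;
            --run;
        }
    }
    return used;
}

// Read the tile id stored at a linear index inside a decoded block.
uint8_t tileAt(const uint8_t *buf, uint8_t index) {
    const uint8_t cell = buf[index >> 1];
    return (index & 1) ? (cell & 0x0F) : (cell >> 4);
}
```

With this encoding a completely uniform block compresses to 8 bytes instead of 64, while the worst case (no two consecutive tiles equal) takes 128 bytes, twice the unpacked size. The decode loop touches each of the 128 tiles once, so the cost per block is fixed regardless of how well the data compresses.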
So basically the idea is to have a 'streaming RLE algorithm'.