
Archive folder format?

Posted: Mon Nov 28, 2011 11:37 pm
by MadPumpkin
I've written a small dynamic library for archiving files, and will soon add encryption and compression functionality. Right now I'm working on the 64-bit encryption functions. It's actually a bit harder than I thought it would be, given that I had minimal knowledge of the subject before I started coding and researching it :P.

I have my own archive format for use in my game and other applications, just like .zip, .rar, .jar, .pak, etc. Those formats have many differences, most of them small or subtle, but I chose to go with my own format to make writing the tools for it easier. The main question is: how should I store folders, and why? I've thought of two methods that sound logical:
  • Each file's name is a path containing slashes, which the library interprets as directory separators.
  • Store directories as entries of their own, using the same linked-list system I use for files, and write them to the archive the same way files are written.
They both seem logical, but does anyone have experience with which one might be better, or at least some reasoning as to why?

Re: Archive folder format?

Posted: Tue Nov 29, 2011 12:56 am
by tappatekie
May I ask what language you are using?

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:09 am
by MadPumpkin
It's in C++. I could easily have done all of it in standard C (and could even change it to that), but I love namespaces, and the programs that will be using it (mine at least) are mostly C++ anyway.

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:16 am
by tappatekie
MadPumpkin wrote: It's in C++. I could easily have done all of it in standard C (and could even change it to that), but I love namespaces, and the programs that will be using it (mine at least) are mostly C++ anyway.
I don't really know C++, but I know the syntax well enough since I know C#.

From my own experience: I developed a file system stored in a single file (an archive, really), and I used a file table that contained both directories and file data. Every entry had an ID, a parent entry ID, and a boolean flag telling the system whether it is a directory or a file.
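
A minimal C++ sketch of that kind of table, for illustration only. The struct name, the field names, and the convention that parent ID 0 means the archive root are all assumptions, not the actual layout used in that file system.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical entry: every file or directory is one row in a flat table.
struct Entry {
    std::uint32_t id;        // unique ID of this entry (never 0)
    std::uint32_t parentId;  // ID of the containing directory; 0 = archive root (assumed)
    bool isDirectory;        // true for a directory, false for a file
    std::string name;        // a single path component, no slashes
};

// Rebuild the full path of an entry by following parent links up to the root.
// Returns an empty string if an ID is missing or the links form a cycle.
std::string fullPath(const std::vector<Entry>& table, std::uint32_t id) {
    std::string path;
    for (std::size_t steps = 0; id != 0; ++steps) {
        if (steps > table.size()) return std::string();  // cycle guard
        const Entry* found = nullptr;
        for (const Entry& e : table) {
            if (e.id == id) { found = &e; break; }
        }
        if (!found) return std::string();  // dangling ID
        path = path.empty() ? found->name : found->name + "/" + path;
        id = found->parentId;
    }
    return path;
}
```

The trade-off against path-style names is that renaming or moving a directory touches one row, but showing a tree means resolving parent links first.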

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:21 am
by MadPumpkin
Hmm, sounds like it could work out. But I've already started coding it the first way I described (I think it was the first), where each file's "name" is a full path to the file, with the archive file itself as the top-level directory. I generate folders from the strings before the slashes and use the slashes to place each file. To me it seems like this would make it easier to show a file tree without having to store nearly as much data in the archive file.

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:30 am
by tappatekie
OK, sounds like you're going to have something like
myDir/file.txt

Then you define myDir as a directory by processing the slashes, right?

If that is the case, what if you have a file without an extension? Unless you're checking whether it ends with a slash?
Whoops, didn't read the "before the slashes" bit.

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:34 am
by MadPumpkin
Yup, using the slashes, and specifically what's before them :]. So if a slash is found and the text after it contains no further slash, that text is the file name.

EDIT: I'll probably do something semi-token-based. It would make processing things as separate folders easier anyway, since I could more easily check whether a folder has already been created. That avoids directory duplicates, which are not only disallowed but unwanted.
EDIT AGAIN: Hmm... I'll probably just store the directory names in an array of some kind. Simple enough :P.
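
Roughly what the slash processing and the duplicate check could look like in C++. This is only a sketch: the function names are made up, it uses a std::set rather than a plain array so that duplicates drop out automatically, and it assumes names have no leading or doubled slashes.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns every directory prefix implied by a slash-separated entry name.
// For "a/b/c.txt" this yields "a" and "a/b"; whatever follows the last
// slash is treated as the file name and is not returned.
std::vector<std::string> directoryPrefixes(const std::string& entryName) {
    std::vector<std::string> dirs;
    std::string::size_type pos = entryName.find('/');
    while (pos != std::string::npos) {
        if (pos > 0) dirs.push_back(entryName.substr(0, pos));
        pos = entryName.find('/', pos + 1);
    }
    return dirs;
}

// Collects each directory exactly once across all entry names.
// std::set ignores repeated inserts, so no explicit duplicate check is needed.
std::set<std::string> collectDirectories(const std::vector<std::string>& entryNames) {
    std::set<std::string> seen;
    for (const std::string& name : entryNames) {
        for (const std::string& dir : directoryPrefixes(name)) {
            seen.insert(dir);
        }
    }
    return seen;
}
```

A name with no slash at all (with or without an extension) produces no directories, so it lands directly in the archive root.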

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:37 am
by tappatekie
Hmm, just out of curiosity, how are you going to save the content to the file? Every time a file is changed, do you re-save the whole table, or do you write only the necessary changes?

And about your edit: it will make things faster too :D

Re: Archive folder format?

Posted: Tue Nov 29, 2011 1:50 am
by MadPumpkin
I already have the saving, and sort of the loading, complete. The header contains a "magic number", which you probably know about: it's just preset byte data, so you can check whether it exists (and what it says) and not bother reading anything more if it doesn't. Next is a file version, to know which version of the load tools to use, or whether the file is even compatible. Then a unique header ID; I'm not sure why I bothered with this, since I'm not even sure what I'll use it for. Last is the most important part of the header: how many file entries the archive has.

The data is stored file by file like so:
  • file name
  • size
  • byte offset (calculated at file output/save time)
  • the full file data in bytes
It's loaded by using offsets, together with the entry count from the header, to find where each individual file's data starts. Then it's loaded back into a linked list after decompression and decryption (if needed).
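
A sketch of how the byte offsets could be calculated at save time for the layout listed above. The post doesn't say how the name is stored or how wide the size and offset fields are, so the record layout assumed here (a 32-bit name length, the name bytes, a 64-bit size, a 64-bit offset, then the data) and the names FileEntry and assignOffsets are guesses for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory entry, filled in before the archive is written.
struct FileEntry {
    std::string name;          // file name (a path with slashes)
    std::vector<char> data;    // the full file data in bytes
    std::uint64_t offset = 0;  // absolute position of the data in the archive
};

// Assumed on-disk record: [u32 name length][name][u64 size][u64 offset][data].
// Walks the entries in write order and records where each data block will start.
void assignOffsets(std::vector<FileEntry>& entries, std::uint64_t headerSize) {
    std::uint64_t cursor = headerSize;  // first record starts right after the header
    for (FileEntry& e : entries) {
        cursor += sizeof(std::uint32_t) + e.name.size();  // name length + name bytes
        cursor += 2 * sizeof(std::uint64_t);              // size field + offset field
        e.offset = cursor;                                // data begins here
        cursor += e.data.size();                          // next record follows the data
    }
}
```

With the offsets stored, a loader can seek straight to one file's data instead of reading everything that comes before it.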