Modding rundown: Sound

Started by Noelemahc, December 05, 2006, 11:53PM

I thought you used the PS2 voices okay? The side-effect of the compression the PSP version uses is that you CANNOT make it sound any better than it does on the PS2 version, because its native format uses 16700 Hz sound, which is way weaker than the 22050 Hz the non-Sony versions use. Then again, nobody knows what the PS3 uses...

Crimson Dynamo says:
In Soviet Russia, the games mod YOU!

If anyone needs any art for icons or portraits, feel  free to ask, I'll see what I can do.

December 13, 2006, 02:54AM #16 Last Edit: December 13, 2006, 03:18AM by Noelemahc
Alright, time for a smallish update: I've started work on getting Hawkeye's voice back into the game. Aside from the small issue that the PSP version has WAY more sounds than the PC one (and as I do not really know the name tags for them as the PS2/PSP format does not STORE them), it's going... goingy. I've converted all the sounds to the proper format (I checked it by mimicking the procedures I did when I was seeking out how to decode them, and believe you me, it sounds as awful as it should) and am now seeking logic in how the ZSS/ZSM header is formed and how to get the sounds to work.

Good news is, there IS a logic to the header. Bad news? I have yet to get any real results. The game still thinks the sounds (makeshift, thrown together on the base of Colossus' ZSMs) aren't there, which bugs me incessantly.

EDIT: Okay, so the Wiki @ Xentax (link) holds all the necessary data on how the header is formed. Goody. I feel slightly worried by the prospect of facing a hash check in there.

Quote from: Noelemahc on December 12, 2006, 01:05PM
I thought you used the PS2 voices okay? The side-effect of the compression the PSP version uses is that you CANNOT make it sound any better than it does on the PS2 version, because its native format uses 16700 Hz sound, which is way weaker than the 22050 Hz the non-Sony versions use. Then again, nobody knows what the PS3 uses...

The Colos and MK sounds from the PS2 aren't final sounds: the quality is bad and some sounds are missing. But it is better than no voices at all.

Oh. Still, I do not know how to pack files into that format. It is completely different from the one used in the PC version, so I wouldn't even know where to look. Sorry.

Anti-update: the Xentax Wiki is right, those portions separating the hexes listing the number of entries per section and the offsets to section starts do feel a lot like hashes. Which means that until we (I?) figure out how they're calculated or what they are if not hashes, the project is at a standstill as the game blatantly refuses to see the file. Which explains, in a way, why you can't just rename someone else's voice file and use that - the game shall initiate a hash check and the hash won't match up with the different filename. Poof, no sound.
Makes me want to find and gut whoever invented this format of storage.

Quote: Because, a year or so ago, the only part of XML2's PSP characters that I could get working was the sounds. And all I did was insert them into the proper folders.
Tried it out with the files Piccolo provided. Didn't work. Granted, the game doesn't CTD, but it's only another sign that XML2 was better adapted to the PC environment than MUA was. The sound is still not working. So, all hope is now on the manipulation of files. Yesh.

December 20, 2006, 06:12AM #21 Last Edit: December 20, 2006, 07:13AM by Noelemahc
Okay, people (and especially nba2k!), here's what I know about the ZSM/ZSS format (based in part on the data from Xentax):
{for purposes of simplification, let's say that everything within square brackets is a hex number in a single byte, like so: [7c][4a], etc.}

They are of Little Endian order (which means that an offset of 1234 hex will be recorded as [34][12]).
The first 100 bytes make up the header data. They consist of the following:

Bytes 0x0 to 0x7 - file type (I've seen this be ZSNDPC, ZSNDXBOX and ZSNDPS2; if the string is shorter than 8 symbols, the superfluous ones will be filled up with spaces, that is, [20]).
Bytes 0x8 to 0xb - file size. This is the total file size in bytes, in hex, and a multiple of 8. Sometimes it is off by quite a few bytes in either way (and NOT a multiple of 8), yet miraculously works in-game. This probably has to mean something, but for the purposes of our study, let's assume that it should always equal the file size or be rounded UP from the file size to the nearest multiple of 8.
Bytes 0xc to 0xf - distance from offset 0x10 (including that offset) to the start of the actual sound data (or, rather, the last byte just before that). The Xentax Wiki calls this "directory length", you'll now learn why.
Bytes 0x10 to 0x63, the remainder of the header, is split up into seven twelve-byte sections (that would be 0x10 to 0x1b; 0x1c to 0x27 and so on). Each of those, in turn, is split into three four-byte subsections (for group 0x10 to 0x1b that would be 0x10 to 0x13, 0x14 to 0x17 and 0x18 to 0x1b). Each twelve-byte group is devoted to a section of the file, like a subdirectory of sorts.
The first four bytes of each twelve-byte group lists the amount of entries this twelve-byte section (directory?) covers. If the ZSS houses, say, X sound files, the first group shall claim it has lots more than the others (I do not know why, or how and what it counts -- for some files it is X+2, for others, it is heaps more), the second and third groups will have X listed in that part; the others will have all zeroes in that part of their listing.
The second four bytes of each group are an offset to a seemingly random string of symbols the Xentax Wiki suspects to be a hash check for that section. Obviously, for the first group it will always be offset 100d (0x64).
The third four bytes of each group refer to the offset to the start of the section's actual data.

For directories 4 to 7 the first four bytes are always all zeroes, and the second and third always equal the numbers at offset 0xc. I have no idea why, but that's the way it works. Sheesh.
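
To make the header layout above easier to check against a real file, here is a minimal Python sketch that splits the first 100 bytes into the fields just described. The function name `parse_zsnd_header` and the dictionary keys are made up for illustration, and treating every four-byte field as an unsigned integer is an assumption, not something the game documents.

```python
import struct

HEADER_SIZE = 100  # bytes 0x00 to 0x63


def parse_zsnd_header(data: bytes) -> dict:
    """Split the 100-byte header into the fields described in the post."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 100 bytes of header")

    # Bytes 0x0 to 0x7: file type, padded with spaces ([20]).
    file_type = data[0x0:0x8].decode("ascii").rstrip(" ")

    # "<" means little endian, so [34][12][00][00] reads back as 0x1234.
    # Assumption: both fields are unsigned 32-bit integers.
    file_size, dir_length = struct.unpack_from("<II", data, 0x8)

    # Bytes 0x10 to 0x63: seven twelve-byte groups of three 4-byte values.
    directories = []
    for i in range(7):
        count, hash_offset, data_offset = struct.unpack_from(
            "<III", data, 0x10 + 12 * i
        )
        directories.append(
            {"count": count, "hash_offset": hash_offset, "data_offset": data_offset}
        )

    return {
        "file_type": file_type,
        "file_size": file_size,
        "dir_length": dir_length,
        "directories": directories,
    }
```

Feeding it the first 100 bytes of an archive (for example `open(path, "rb").read(100)`) should give a first directory whose `hash_offset` is 0x64, if the description above holds.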

Now, on actual data.

Directory 1 houses seemingly random numbers as 'files'. Each 'file' is a 24-byte entry, with the following format:
First two bytes are the entry number. They start from zero and move on until the number before the one named in the header's 'amount of entries in directory' entry for this directory (which makes the total equal to the number listed in the header -- as entry zero obviously get counted too).
Then comes a seemingly meaningless fixed set of symbols: [00][10][7f][00][1f][00][00][7f][00][7f], and then come twelve zero bytes. This part is identical for every file in MUA.

The hash in this section is always 8 times Y long, where Y is the amount of entries the header lists for this directory. This comes off as a confusing matter for the obvious reason that Y is not in any way related to X, the amount of actual SOUND FILES the archive stores.
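
As a rough illustration of the Directory 1 layout, the sketch below rebuilds the 24-byte entries from the pattern described above and computes the expected hash length. The helper names are invented here, and reading the two-byte entry number as a little-endian unsigned value is an assumption based on the rest of the format.

```python
import struct

# The ten fixed bytes that follow the entry number, copied from the post.
DIR1_FIXED = bytes([0x00, 0x10, 0x7F, 0x00, 0x1F, 0x00, 0x00, 0x7F, 0x00, 0x7F])


def build_dir1_entry(index: int) -> bytes:
    """2-byte entry number + 10 fixed bytes + 12 zero bytes = 24 bytes."""
    return struct.pack("<H", index) + DIR1_FIXED + bytes(12)


def build_dir1(count: int) -> bytes:
    """All entries for a directory that lists `count` entries (0 .. count-1)."""
    return b"".join(build_dir1_entry(i) for i in range(count))


def dir1_hash_length(count: int) -> int:
    """The hash block is reported to be 8 bytes per listed entry."""
    return 8 * count
```

This only reproduces the entry data; it says nothing about how the hash bytes themselves are calculated.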

Directory 2 stores the file properties. Here, the hash is almost the same, as it is always 8 times X bytes long, where X is the amount of sound files stored in the archive. Yet, there is no discernible pattern to it (i.e. how or why does it look like that).
Each 'file' in this directory is 24 bytes long, the first two bytes being the number in hex, like in the previous one, followed by two zero bytes. Then come two more bytes, housing [22][56] for every entry -- it's 22050, the frequency of the sound for the PC and X-Box versions, in hex (and little endian order, of course). The remaining 18 bytes are zeroes.
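
The Directory 2 entry can be sketched the same way. This is only an illustration of the byte layout described above: the function names are hypothetical, and it assumes the frequency is a two-byte little-endian unsigned value, which is what [22][56] for 22050 suggests.

```python
import struct


def build_dir2_entry(index: int, sample_rate: int = 22050) -> bytes:
    """2-byte number, 2 zero bytes, 2-byte frequency, 18 zero bytes = 24 bytes."""
    return struct.pack("<HHH", index, 0, sample_rate) + bytes(18)


def read_dir2_entry(entry: bytes):
    """Return (entry number, frequency in Hz) from a 24-byte entry."""
    if len(entry) != 24:
        raise ValueError("a Directory 2 entry is 24 bytes long")
    index, _zero, sample_rate = struct.unpack_from("<HHH", entry, 0)
    return index, sample_rate
```

22050 is 0x5622 in hex, so the frequency bytes come out as [22][56], matching what the archives contain.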

Directory 3 lists the file names and locations. The hash for this section is identical in size to the one for the previous one.
The entries are all 76 bytes long, and are in the following format:
First four bytes -- offset to this file's start. Always a multiple of 16 (the listed offset always has a zero for its last digit).
Second four bytes -- length of this file.
Third four bytes -- the format code. This corresponds to the WAVE FMT table, I think, as the X-Box version lists [01] for every entry (which I took to stand for its ADPCM format, though in the standard WAVE table 0x0001 is plain PCM) and the PC one lists [6a] for it (which I'm yet to find in any table I've looked at so far -- anyone know any tables for the WAV format tags?). Suffice to say that in the PC version it should ALWAYS be [6a][00][00][00] if we want to keep the same sound format. One of these days I've got to try what will happen if I use 01 instead of 6a...
Anyway, the remaining 64 bytes of each entry are devoted to storing the filename (with its extension). According to shared_sounds.XMLB, only the start of a filename matters -- the numbers don't mean anything for real, nor are fixed in any place other than the hashes.
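
A 76-byte Directory 3 entry could be read and written roughly like this. The names are invented for the example, and one detail is a guess: the post does not say what fills the unused part of the 64-byte filename field, so the sketch assumes zero bytes.

```python
import struct

DIR3_ENTRY_SIZE = 76
NAME_FIELD_SIZE = 64


def parse_dir3_entry(entry: bytes) -> dict:
    """Read offset, length, format code and filename from one entry."""
    if len(entry) != DIR3_ENTRY_SIZE:
        raise ValueError("a Directory 3 entry is 76 bytes long")
    offset, length, format_code = struct.unpack_from("<III", entry, 0)
    # Assumption: the filename is padded with zero bytes up to 64 bytes.
    raw_name = entry[12:].split(b"\x00", 1)[0]
    return {
        "offset": offset,
        "length": length,
        "format_code": format_code,
        "name": raw_name.decode("ascii", errors="replace"),
    }


def build_dir3_entry(offset: int, length: int, format_code: int, name: str) -> bytes:
    """Inverse of parse_dir3_entry, under the same padding assumption."""
    raw_name = name.encode("ascii")
    if len(raw_name) > NAME_FIELD_SIZE:
        raise ValueError("filename does not fit into 64 bytes")
    if offset % 16 != 0:
        raise ValueError("file offsets are reported to be multiples of 16")
    return struct.pack("<III", offset, length, format_code) + raw_name.ljust(
        NAME_FIELD_SIZE, b"\x00"
    )
```

For a PC archive the format code would be 0x6A, which packs to [6a][00][00][00].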

Once this section ends, there is an indeterminate amount of zeroed bytes until the sound data starts. It would appear that the zeroes don't mean duck since the game only cares for the offsets in the header to find the stuff (or, in this case, the offsets in section three to the file data).

The files are headerless WAVs in their own silly exotic format which VOX Studio 3 identifies as a VoIP sound format utilized by the cards manufactured by BiCOM - a 4-bit, 22050 Hz ADPCM subtype, to be exact.
Once a file ends, unless it is the last file of the archive, the distance to the next file is padded with zeroes till the next multiple of 16 (well, 10 in hex). BUT if the padding ends up being less than 10 (in decimal) bytes, it gets padded until the multiple of 16 AFTER that. Why - beats me, but no pad is smaller than 10 bytes because of it. The only case in which it won't get padded is if, by a stroke of luck, the byte right after the end of this file is on an offset that is a multiple of 16. Convoluted, yes, but still.
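
The padding rule is easier to follow as arithmetic. The sketch below is one reading of the description above, not a confirmed algorithm: `end_offset` means the offset of the byte right after a file's last byte, and both function names are made up for illustration.

```python
def padding_after(end_offset: int) -> int:
    """Zero bytes to insert after a file that ends just before end_offset."""
    remainder = end_offset % 16
    if remainder == 0:
        return 0  # already on a multiple of 16, no padding at all
    pad = 16 - remainder  # distance to the next multiple of 16
    if pad < 10:
        pad += 16  # too short, so go to the multiple of 16 after that
    return pad


def layout_files(first_offset: int, lengths: list) -> list:
    """Start offsets for consecutive files, given the first file's offset."""
    offsets = []
    position = first_offset
    for length in lengths:
        offsets.append(position)
        end = position + length
        position = end + padding_after(end)
    return offsets
```

For example, a file ending at 0x107 is 9 bytes short of 0x110, which is less than 10, so the next file would start at 0x120 with 25 bytes of padding in between.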


Okay, we can extract and convert the files OUT of this format all fine, obviously. Even convert them INTO this format... But how, in the name of all that's (un)holy, do we build our own ZSS/ZSMs and make custom voicepacks for custom characters? By hand? Shooor, via a hex editor. But the game will refuse to recognize the file -- trust me, I've tried already.
How is the hash formed? Is that the key? Does it include the filename of the archive as an overencryptor or not? Is it universal for each directory OR is it counted on a per-entry basis? The second sounds more plausible as there are obvious three-[00] patches in the hashes corresponding to the big blocks of zeroes in the descriptors of each section that separate what I think to be sub-hashes pertaining to each entry. That would mean that the mini-hashes are only 5 bytes long, and are aimed against the hugely different amounts of data seen in different sections' entries. All in all, this format might win the confusion prize even over the IGB one -- that one at least has a publicly available, if dated, builder program.

While there's nothing else to do, I tore out all there is to tear, audio-wise, from THQ's excellent Punisher game. I'm now processing the long list of Punisher lines for the purpose of crafting a potential soundset for you-know-when.
"It's uninhabited now."
"Where can I get more guns?"
"Left my scorecard at home."
Etc., etc.
We can even add Tom Jane to the actor list once (if?) we do this, y'know. THAT much we already can do.

December 26, 2006, 07:26AM #23 Last Edit: December 26, 2006, 07:42AM by Noelemahc
Okay, fresh nothings from my brainbox.

The PSP/PS2 format (ZSNDPS2) is actually the same, despite my previous allegations. It just has the filenames cropped out and instead replaced by numbers (which is exactly why Game Extractor thinks it to be a different format - it FIRST looks for the names and THEN for everything else). The rest of it functions in exactly the same manner - the header, the hashes, the rest of it too. At least now, knowing this, I can check if the X-Box soundsets of the PSP exclusives that I've got are complete or not, because the file amount number doesn't go anywhere :P

EDIT: Better yet, two of the three hashes don't change between the PSP and PC versions (based on simple comparison). That means they only reflect the contents of a specific 'directory'. Good. The bad thing is, of course, the difference of the third hash and the impact it may have on my editing.

EDIT2: More comparisons. I'm on a roll, dudes. The X-Box and PC sound formats use the same hashes for the first and third directories. Which means that, with a bit of stubborn idiocy, I may actually be able to compose stuff, by taking the first and second hashes from the PSP version, the third one from the X-Box one and just changing the sound data. It won't allow me to, say, make a Punisher sound set, but it may allow me to at least convert the X-Box audio for the exclusives to the PC.

Okay, I've recompiled a Hawkeye FX set, hashes and all, only to have it CRASH on me at loading. Damn. After killing two hours on recalculating all the offsets by hand. All I want for New Year is a program that would do this for me...

You're not gonna believe this, kids. I've (accidentally) found a game that actually uses the same ZSND format as MUA and XML... Yes, of course it is the monstrosity mostly known as X-Men III The Official Game. I DID remember I saw those weird ZSS files in some other game-- and finally remembered what game this was.
Yeah, I know I should be hung, drawn and quartered for buying shovelware like that -- then again, I occasionally find playable gems among the shovelware kind (kill.switch had managed to depose Splinter Cell Pandora Tomorrow from my most-played-per-day-game position for quite a while).

But this is beside the point. XM3TOG uses the same sound format as MUA, right down to the frequencies and encoding... And ZSS formatting. Gives me more materiel to study the header styling with, mwahaha.

January 16, 2007, 09:18AM #26 Last Edit: January 16, 2007, 09:31AM by Noelemahc
Fresh progress report from the sound editing front. This is still the same Deadpool power sound file being torn apart, just because I like it.
http://h1.ripway.com/ivank/dphero_m-rev2.rar
How is this different from the old version?
( http://h1.ripway.com/ivank/dphero_m.rar )
One thing: it uses a non-VOX codec to encode the inserted file. To be exact, this is a simple IMA ADPCM 4-bit 22050 Hz WAV. Not too pretty, and it sounds distorted as heck in-game (well, as distorted as the previous version, actually)... BUT. It not only PLAYS inside of the game without a hitch, hang or CTD, it also can be played back without any conversion upon being taken OUT of the ZSM and getting a WAV header slapped on. So there. I'm still lost without a way to get the sound I import to play back in-game without corruption, but at least I can now preview what I insert outside of VOX Studio... And I'm still looking for codecs that won't CTD the game, so as to get away from VOX Studio and its ghastly limit of 5 seconds of audio, which would put a crimp on getting Widow's voice into the PC version given her eloquence.

EDIT: Screw that, kids.
I'm an idiot. I changed the sound type from [6a] (VOX) to [01] (Microsoft WAV) and inserted a pure unmodified sound dump from the X-Box.
And? Right-o. It not only WORKS, it sounds a heckuva lot cleaner than the native stuff.
http://h1.ripway.com/ivank/dphero_m-rev3.rar
Note the size jump though. This is from ONE file change. Imagine now, that Hawky's taunt file (hawkeye_v.zss) has 76 files in it. Total weight of the source WAVs (one of which was used for the example)... 7 megs. So I'll have to compress them (but these are unmodified unfiltered 22050 16-bit WAVs, kids -- they can be made a lot smaller without much in the way of losses).
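
For previewing data pulled out of an archive that uses format code [01], a plain WAV header can be put back on with Python's standard `wave` module. This is a sketch under assumptions: it treats the extracted bytes as raw 16-bit PCM at 22050 Hz, and mono is a guess, since the channel count is not recorded anywhere in the notes above. The function name is hypothetical.

```python
import io
import wave


def wrap_pcm_as_wav(raw: bytes, sample_rate: int = 22050,
                    channels: int = 1, sample_width: int = 2) -> bytes:
    """Prepend a standard PCM WAV header to headerless sample data."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)        # assumption: mono
        wav_file.setsampwidth(sample_width)    # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(raw)
    return buffer.getvalue()
```

The size jump follows from the sample width: 16-bit PCM needs four times as many bytes per sample as a 4-bit ADPCM variant at the same frequency.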

Okay, working on. Despite the ability to define the WAV format codes in the ZSS, the game fails to interpret some of them correctly. As such, OKI [10] and IMA [11] ADPCM, which aren't much different from the format the game itself uses as default, don't work even at gunpoint, they come out as pure static. Same goes even for pure vanilla Microsoft ADPCM [02], which is quite surprising.
Equally, WAVs at other bit depths come out distorted. I haven't tried 24-bit or 32-bit WAVs because my system can't even play those back, but 8-bit ones don't work at all. So, it's either 16-bit or 4-bit. Either humongous sizes, or crappy quality.

I'm done jumping through hoops, kids. Gonna try to put the Hawkeye voice files together now and see if it works out.

On a seemingly unrelated note: further study of the X-Men Official Game ZSS files revealed one interesting fact... No hashes. The hash sections (or, rather, what I, after the Xentax Wiki, suppose to be hashes) are simply missing, and so are all the references to them. Gonna try that idea too, what would happen if I was to remove them altogether.

Could the sounds from XM3TOG be used for the NC mod that's in the works?

Yes, I think so. Would require lots of repacking though, as they're packed on a per-level basis (i.e. to make taunts we'd have to rip them, rework them and convert them back). Do you want to?
I'd rather use XML2 ones.
