|
Post by SheWolf on Dec 19, 2011 11:48:23 GMT -5
bros,
this isn't really about the number side, or balance or anything like that. still, with the large portion of people with a coding/programming background i thought i'd ask for help here. plus people here have experience with decrypting/unpacking the game files used by BF3, from what i understand.
so what i'm trying to do is this: i want the americans to speak english, and the russians to speak russian. no hilarious english with borderline-racism fake accents, but russian^^
i think i narrowed it down to the en.toc and en.sb files (or ru.toc and ru.sb in the case of the russian version)
having both sets of the files in the folders allows me to switch voiceover language from english to russian. the problem with that is: now everyone speaks russian. even the americans, only with hilariously offensive stereotypical american accents on their side now.
so what i aim to do is:
- decrypt / unpack / whatever i have to do to be able to work on the .toc and .sb files
- combine the english content of the american side from the english files with the russian portion of the russian side from the russian files
- re-pack the whole thing to have a set of files that has the americans speak english and the russians russian.
however, i have no clue how to even start, how i would go about taking apart the .toc and .sb files, or if i'm even right about those being responsible! this is where i need you bros, especially you who are less clueless than me about the whole inner workings of the game files. every kind of help would be appreciated.
|
|
|
Post by frankelstner on Dec 19, 2011 16:03:47 GMT -5
You are correct about the en.toc and en.sb files. The toc files are XOR encrypted (yeah, the notorious encrypted files; the only encrypted files in fact). I've written a tool to undo the encryption: www.bfeditor.org/forums/index.php?showtopic=15524&view=findpost&p=105052
As a result you get a binary table of contents (hence the extension toc). If you stretch it to 0x31 or 49 bytes per line the file reveals its very neat structure (this toc is really neat compared to some other toc files). Each entry specifies the file size in just 4 bytes, along with a 16-byte ID and an 8-byte offset. The offset specifies the position of the file in the sb file, the size is its length in the sb file, and the ID is how the file is generally looked up.
I suppose I could write an extractor in a few minutes, however a certain issue remains: There are no filenames but merely IDs, making it impossible to easily distinguish Russian and English voices. However, if you extract and convert the binary text files (I should call them gameplay files, I do NOT refer to the sb or toc files right now) with my tools here: denkirson.proboards.com/index.cgi?action=display&board=general2&thread=3248&page=1#53233
you will then see the gameplay files importing the speech data files (located in the sb files) by their ID. In this picture the text file calls the ID a3c6b4ffa85a5d40a086bdd721d4a268 which is also found in the table of contents: i.imgur.com/B8qIn.jpg
So you will need to run through all gameplay files and figure out which team every ID belongs to. After that you could change the gameplay files (will get you kicked for modified content) or the toc/sb files (might get you kicked too; I don't know). You could decrypt the en.toc with my script and try to join an online game. If joining fails even at this point then all considerations are in vain anyway.
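To make the "stretch it to 49 bytes per line" step concrete, here is a minimal sketch of a fixed-width hex dump, the same view a hex editor gives when its row width is set to 49. The helper names rows and dump and the start parameter are made up for illustration; the post does not say where in the decrypted toc the first entry begins, so start defaults to 0 and may need adjusting.

```python
def rows(data: bytes, width: int = 49, start: int = 0):
    """Yield (offset, chunk) pairs, one per fixed-width row."""
    for pos in range(start, len(data), width):
        yield pos, data[pos:pos + width]


def dump(data: bytes, width: int = 49, start: int = 0) -> str:
    """Render the data as one line per row: offset, hex bytes, printable text."""
    lines = []
    for pos, chunk in rows(data, width, start):
        # Non-printable bytes are shown as dots, like in most hex editors.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{pos:08x}  {chunk.hex()}  {text}")
    return "\n".join(lines)
```

Calling print(dump(open("en.toc", "rb").read())) on a decrypted toc should line the repeated "id", "offset" and "size" labels up in columns if the row width matches the entry length.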
|
|
|
Post by didjeridu on Dec 19, 2011 16:06:45 GMT -5
Wasn't it that way by default in BC2? I seem to remember trying to figure out how to get the Russians to speak English. Anyway, I have no idea, but you can simulate 1% of the Russian experience by shouting GRRRENADA whenever you throw a grenade.
|
|
|
Post by SheWolf on Dec 19, 2011 16:17:57 GMT -5
thank you very much for your help, frankelstner. unfortunately i don't think i possess the skills and knowledge to identify the IDs by the reference in the gameplay files. but i will cross that bridge when i get there. first i will now mess around a bit in the .sb files and see if i get kicked by punkbuster. if this happens there is no point to it anyway, as you said.
oh, one more thing: which .sb and .toc files are relevant? the ones in /data? what about those in the /update folder?
didjeridu: don't i do it already ;P and yes, in BC2 you could set it to either way. everyone could speak english, everyone could speak their native language, or russians could speak english only when you were on their team. in BF3 they ditched that option, because consoles didn't have enough resources for supporting multiple sets of language files. so it doesn't exist for pc either. pc is leading platform my ass.^^
edit: hah, turns out i'm even worse at this than i thought, i can't even get that python script to run ;_;
edit 2: so, i just opened the en.sb and the en.toc each in a hexeditor and changed a few random values across the file to FF. i then joined a punkbuster server and could play a complete round without getting kicked. is this enough to assume that we can mod the files without interference from punkbuster, or would i have to make more severe modifications to be sure?
|
|
|
Post by tiesieman on Dec 19, 2011 17:23:48 GMT -5
If those values don't matter, I can't imagine random voice-related ones would either. I hope you get it to work, the high-pitched Russian voice you get to play as is annoying me to no end
are all other voice bits from the same voice actor that does the squad orders? Cause he is awesome. He kind of says "somethingrussianhere NOOO" like his puppy is getting run over
it touches the innest of my soul everytime
|
|
|
Post by SheWolf on Dec 19, 2011 17:32:59 GMT -5
yeah that's the guy^^ i have the russian version of the game, so i can set all voices to russian. it is completely awesome.
|
|
|
Post by frankelstner on Dec 19, 2011 17:35:52 GMT -5
Well, you need to download and install Python (took me just two minutes to download and install; I upgraded from 2.6 to 2.7 just recently): python.org/download/releases/2.7.2/
After that you use the magical drag and drop: i.imgur.com/RzKsh.jpg
"oh, one more thing: which .sb and .toc files are relevant? the ones in /data? what about those in the /update folder?"
I assume that the game loads everything in the non-update folder first; after that the update folder is loaded, overwriting some of the files in the memory of your computer. One thing I am unsure about is whether the game makes some consistency check right while loading the files or afterwards. So if you happened to modify parts of the non-patch file which are overwritten by patch files anyway, I'm not perfectly sure if the game would kick you. Basically, working with the update folder gives you a chance of 100% to see changes in game, whereas you can't really tell for certain with the unpatched files.
"edit 2: so, i just opened the en.sb and the en.toc each in a hexeditor and changed a few random values across the file to FF. i then joined a punkbuster server and could play a complete round without getting kicked. is this enough to assume that we can mod the files without interference from punkbuster, or would i have to make more severe modifications to be sure?"
Whoa, I really didn't expect this. Just run the Python script on the (unmodded) en.toc in the update folder and try to play the game. If this is successful then there's a very good chance it is possible to pull this off (and in any case worth a try). I've just read that you have the Russian version. You should modify the Russian version in this case. Or even better, just drag and drop every language .toc file in the update folder on the Python script file to be sure.
|
|
|
Post by SheWolf on Dec 19, 2011 17:39:26 GMT -5
i just thought of another point: when a patch hits, i have to do it all again, no?^^ so...how much work are we talking about here, exactly?
|
|
|
Post by frankelstner on Dec 19, 2011 17:46:23 GMT -5
No idea actually. Too much work to do this by hand in my opinion, even though I haven't taken a closer look at it. Basically you would need to use my scripts to extract and convert the gameplay files into something readable and then check out the folders within \Sound\VO\. I'm not even sure if this is the only place to look at, I've discovered this folder only after you asked. I suppose if you can play online with all tocs modified then I could create a script. Or you could write one. I think a good estimate of the workload is the number of bytes of a toc file divided by 49 (as this is the length of one entry in the toc file). The result is about 4298, so you'd need to look up about 4300 IDs in the gameplay files.
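The estimate above is just a division, sketched here for anyone who wants to repeat it on their own files. The function name estimate_entries is made up, and the example size of 210602 bytes is a hypothetical value chosen only because it matches the "about 4298" figure; it is not a measured file size. Any header bytes in front of the first entry are ignored, so the result is a rough count.

```python
ENTRY_SIZE = 49  # length of one toc entry in bytes, as stated in the post


def estimate_entries(toc_size_bytes: int, entry_size: int = ENTRY_SIZE) -> int:
    """Rough number of entries in a toc file of the given size."""
    return round(toc_size_bytes / entry_size)


# Hypothetical file size, picked to reproduce the figure from the post.
print(estimate_entries(210602))
```

For a real file the size could come from os.path.getsize("en.toc") instead of a hard-coded number.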
|
|
|
Post by SheWolf on Dec 19, 2011 18:03:10 GMT -5
well ok, i'm sure as hell not doing that by hand, and the only programming i've done was the mandatory c++ back in highschool, half a decade ago. i don't think i could write a hello world program in any language anymore, let alone a script to do this.
bear with me please, i just want to make sure i understand this right. so we basically have three options:
1: we mess around in the game files themselves (will get us a nice PB kick)
2: we change the .toc files, so the gamefiles grab resources from the ru.sb instead of the en.sb
3: we change the .sb files themselves, creating one multi-lingo sb file that contains both russian and english voices.
correct?
|
|
|
Post by frankelstner on Dec 19, 2011 18:20:27 GMT -5
"and the only programming i've done was the mandatory c++ back in highschool, half a decade ago. i don't think i could write a hello world program in any language anymore, let alone a script to do this."
You can always convert to Python. I've never learned anything else and even that is only from ebooks and practice.
"bear with me please, i just want to make sure i understand this right. so we basically have three options: 1: we mess around in the game files themselves (will get us a nice PB kick) 2: we change the .toc files, so the gamefiles grab resources from the ru.sb instead of the en.sb 3: we change the .sb files themselves, creating one multi-lingo sb file that contains both russian and english voices. correct?"
Yup, correct. Either change the IDs in the toc files so the gameplay files will be misled, or just change the data itself. Obviously changing IDs is much more appealing because toc files are small and simple.
|
|
|
Post by SheWolf on Dec 19, 2011 19:44:01 GMT -5
toc files then, eh? ok, then i will focus on finding a solution to that.. but really, i'm a poor and hungry medstudent, no time to pick up a programming language. i'll figure out something i guess.
oh, one more thing: can we even tell the .toc files to access anything else besides the en.sb? because the russian voices are of course stored in the ru.sb.
|
|
|
Post by frankelstner on Dec 19, 2011 20:21:12 GMT -5
"oh, one more thing: can we even tell the .toc files to access anything else besides the en.sb? because the russian voices are of course stored in the ru.sb."
I keep thinking about this and come up with different answers every time. My current idea is to change several toc files and swap some of the IDs inside between the files. Hopefully the game will see which IDs to load and grab them from all the different sb files. If that fails however and the game loads a single localization file before anything, modding the sb files should fix it. Did I mention these particular toc files are really simple? Writing something to extract the sb files will probably take less than an hour, same for packing. Connecting IDs and names on the other hand seems to be the harder part.
|
|
|
Post by SheWolf on Dec 20, 2011 6:22:43 GMT -5
i think it must load both files at the start of a game, because in the ingame menu i can switch between english and russian instantly without waiting for it to load. it seems unlikely that this would go that quick with a gigabyte sized file, no? or am i making a false connection here?
edit: next problem. the magical drag & drop doesn't work for me. i created the xor.py file now, according to your post in the other forum, i copied the code into it, i told python to use files with that extension. but i can't do magical drag and drop on it with anything. if i doubleclick the newly created script i just get a short black cmd window, as expected.
edit 2: ok, nevermind that first edit. i went through the trouble of making the script an .exe (you may laugh but this is a major endeavour for me that brings me to the edges of my IT skills ;P ) so now drag and drop works. but when i drag and drop the en.toc on the exe file, nothing happens except the black cmd window again. where does the script output the processed file?
edit 3: i kind of sort of maybe a little got it to work somehow. i now got a en.toc.txt. when i open it with a hexeditor and set it to 49 byte stretch, the text is nicely aligned. however, i have a lot more gibberish nonsensical characters there than you. other than "offset" and "size" everything is a clusterFoxtrot of weird special symbols. but whatever, i could in theory now make changes to the en.toc. now all i need is the addresses of the russian voices in the ru.sb and the places in the en.toc where to put them. so, i'm pretty much 0,5% there xD
i can't possibly connect all the voices with the IDs by hand, can i?
|
|
|
Post by frankelstner on Dec 20, 2011 10:11:52 GMT -5
"i think it must load both files at the start of a game, because in the ingame menu i can switch between english and russian instantly without waiting for it to load. it seems unlikely that this would go that quick with a gigabyte sized file, no? or am i making a false connection here?"
No idea actually, time will tell.
"edit 3: i kind of sort of maybe a little got it to work somehow. i now got a en.toc.txt. when i open it with a hexeditor and set it to 49 byte stretch, the text is nicely aligned. however, i have a lot more gibberish nonsensical characters there than you. other than "offset" and "size" everything is a clusterFoxtrot of weird special symbols."
As I've said, it's a table of contents. First it tells you the name/ID of the file, then it tells you the position/offset in the sb file. Additionally it specifies the length/size of the file. Of course the length is redundant because one file ends when the next one starts. Numbers are in little-endian. Thus when it says "size" f8790000 the actual size is 000079f8 or just 79f8, which is 31224 as a decimal number.
"i can't possibly connect all the voices with the IDs by hand, can i?"
Sure can, but it will probably take 100+ hours of highly repetitive work.
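The little-endian example from the post can be checked in a couple of lines. This is only a sketch of the byte-order conversion, using the standard struct module; the variable names are arbitrary.

```python
import struct

raw = bytes.fromhex("f8790000")  # the four "size" bytes as they appear in the toc

# "<I" means: little-endian, unsigned 32-bit integer.
size = struct.unpack("<I", raw)[0]

# int.from_bytes gives the same result without struct.
assert size == int.from_bytes(raw, "little")

print(hex(size), size)  # 0x79f8 31224
```

The 8-byte offset field works the same way with the format "<Q" (unsigned 64-bit).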
|
|
|
Post by SheWolf on Dec 20, 2011 12:46:43 GMT -5
so...would it be any less work to maybe get the .sb files open? maybe they are nice and cleanly structured, with things like "russian_yelling_for_ammo" and such? no? if not i have to find someone who will make me a script or program to go through the addresses in the game files and...what exactly? check for corresponding addresses in the .toc? and then what? i admit i'm kind of at a dead end here.
oh, also something that was brought up by someone who actually knows something about the matter: if we manipulate the .sb file and swap in russian sound data instead of the english stuff, will it even "fit"? by that i mean, will the pointers in the .toc still be accurate? what if a russian piece of data is longer than the corresponding english one? you understand what i mean?
|
|
|
Post by frankelstner on Dec 20, 2011 13:01:33 GMT -5
Nope, the toc will not fit anymore. It's already part of my considerations though. As of now, there are still two ways to accomplish your goal. Either change the toc files or change the sb files. Changing the toc files would be easy because it would require changing IDs only. We just mislead the gameplay files by making them think that when they call up a3c6b4ffa85a5d40a086bdd721d4a268, which used to be the English reload message, they actually call a totally different file by that ID (because the IDs in the toc are changed).
Alternatively changing sb files obviously requires changing the toc files as well, but as I've said, writing an extractor and packer is really simple due to the simple file structure.
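To show why a repacked sb file needs a matching toc, here is a rough sketch of the packing side: the blobs are written back to back and every offset and size is recomputed from the new data. It assumes the sb file has no padding or headers between blobs, which fits the remark that one file ends where the next one starts but has not been verified. The function name pack_blobs and the index format are invented for this example; writing the index back into the binary toc layout is left out.

```python
def pack_blobs(blobs):
    """Concatenate (id, data) pairs and return (sb_bytes, index).

    index is a list of (id, offset, size) tuples describing where each
    blob ended up, i.e. the values a rebuilt toc would have to contain.
    """
    sb = bytearray()
    index = []
    for file_id, data in blobs:
        index.append((file_id, len(sb), len(data)))  # offset = current end of sb
        sb += data
    return bytes(sb), index
```

If a Russian clip is longer than the English one it replaces, every later offset shifts by the difference, which is exactly why the old toc would no longer fit.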
My current idea is to modify my gameplay file reader script to check for certain IDs and create a dictionary of ID->language/team. With this dictionary it should be possible to run through the toc file and replace the right IDs.
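A minimal sketch of the "run through the toc and replace the right IDs" step, written for Python 3. It assumes the entry layout used by the extractor posted in this thread (entries start at byte 603, each is 49 bytes long and begins with the 6-byte tag \x82\x2f\x0fid\x00 followed by the 16-byte ID). The function name remap_ids and the shape of the mapping are hypothetical, and building the actual ID->ID mapping from the gameplay files is the hard part that this does not solve.

```python
ID_TAG = b"\x82\x2f\x0fid\x00"  # marks the start of each entry (from the extractor)
ENTRY_SIZE = 49                  # bytes per toc entry
ID_SIZE = 16


def remap_ids(toc: bytes, mapping: dict, start: int = 603) -> bytes:
    """Return a copy of the decrypted toc with IDs replaced according to mapping.

    mapping maps old 16-byte IDs to new 16-byte IDs; IDs not in the
    mapping are left untouched. Offsets and sizes are not changed.
    """
    out = bytearray(toc)
    pos = start
    while out[pos:pos + len(ID_TAG)] == ID_TAG:
        id_pos = pos + len(ID_TAG)
        old = bytes(toc[id_pos:id_pos + ID_SIZE])
        new = mapping.get(old)
        if new is not None:
            if len(new) != ID_SIZE:
                raise ValueError("replacement ID must be 16 bytes")
            out[id_pos:id_pos + ID_SIZE] = new
        pos += ENTRY_SIZE
    return bytes(out)
```

Because each entry is rewritten exactly once from the original bytes, a mapping like {a: b, b: a} swaps two IDs cleanly instead of mapping both onto the same value.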
|
|
|
Post by frankelstner on Dec 20, 2011 17:17:05 GMT -5
Here's a simple tocsb extractor for this toc type. There's lots of metastructure in the resulting files which I didn't expect. E.g. the third largest en file 6d6538f3504bd6513539d623df73ec80 has 4800000C1400BB8040 appearing 73 times, so there might be 73 files inside this file.
    from struct import unpack
    from binascii import hexlify
    import os

    language="en"

    # create the output folder; ignore the error if it already exists
    try:
        os.mkdir(language)
    except OSError:
        pass

    toc=open(language+".toc","rb")   # decrypted toc
    sb=open(language+".sb","rb")

    toc.seek(603)   # the entries start here
    while toc.read(6)=="\x82\x2f\x0fid\x00":
        id=hexlify(toc.read(16))   # 16 byte ID, used as the file name
        if toc.read(8)!="\x09offset\x00":
            print "no offset"
            break
        offset=unpack("Q",toc.read(8))[0]   # position in the sb file
        if toc.read(6)!="\x08size\x00":
            print "no size"
            break
        size=unpack("I",toc.read(4))[0]   # length in the sb file
        outfile=open(language+"/"+id,"wb")
        sb.seek(offset)
        outfile.write(sb.read(size))
        outfile.close()
        toc.seek(1,1)   # skip the last byte of the entry
    toc.close()
    sb.close()
At a second glance I think there's a good chance that these files are not the files we were looking for. They are too structured in my opinion. On the other hand I don't know what kind of file format they use, so I can't rule it out either. In any case it'll be useful to figure out how these files work. I really wonder what this is about for example: i.imgur.com/6UZBE.jpg
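For readers on Python 3, the same parsing logic can be sketched as functions that work on bytes already read into memory. This is an adaptation, not the posted script: the names parse_toc and extract are invented, the byte order is spelled out as little-endian ("<Q", "<I"), and the layout (start at byte 603, tags, field sizes) is taken unchanged from the extractor above and has only been described for this one toc type.

```python
import struct

ID_TAG = b"\x82\x2f\x0fid\x00"
OFFSET_TAG = b"\x09offset\x00"
SIZE_TAG = b"\x08size\x00"


def parse_toc(toc: bytes, start: int = 603):
    """Return a list of (id_hex, offset, size) tuples from a decrypted toc."""
    entries = []
    pos = start
    while toc[pos:pos + 6] == ID_TAG:
        pos += 6
        file_id = toc[pos:pos + 16].hex()
        pos += 16
        if toc[pos:pos + 8] != OFFSET_TAG:
            raise ValueError("no offset tag at byte %d" % pos)
        pos += 8
        (offset,) = struct.unpack("<Q", toc[pos:pos + 8])
        pos += 8
        if toc[pos:pos + 6] != SIZE_TAG:
            raise ValueError("no size tag at byte %d" % pos)
        pos += 6
        (size,) = struct.unpack("<I", toc[pos:pos + 4])
        pos += 4 + 1  # size field plus the one trailing byte of the entry
        entries.append((file_id, offset, size))
    return entries


def extract(sb: bytes, entries):
    """Map each ID to its slice of the sb data."""
    return {file_id: sb[offset:offset + size] for file_id, offset, size in entries}
```

Reading both files with open(..., "rb").read() and writing each value of extract(...) to a file named after its ID would reproduce what the script above does, at the cost of holding the whole sb file in memory.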
|
|
|
Post by SheWolf on Dec 20, 2011 18:14:48 GMT -5
that looks...odd.
well, i can tell you this: i have the russian version of the game. and in order to switch language (text as well as sound) i had to move the en.toc and .sb into the data folder as well as the update folder. also a .dll in the main directory.
|
|
|
Post by frankelstner on Dec 20, 2011 18:27:56 GMT -5
Yeah you are right. Initially I expected only one file type, but there seem to be several distinct types.
|
|
|
Post by raxcoswell on Dec 21, 2011 20:50:43 GMT -5
How good even is the russian? I remember some guy translated some of the stuff from MW2 and they were saying stuff like 'I create the flash grenade!'
|
|
|
Post by SheWolf on Dec 22, 2011 7:14:53 GMT -5
well, funny enough, the russian differs from version to version, at least SP. the russian parts of the english version sound awful. very thick american accent, very unsure reading and pronunciation. it's pretty obvious that they just gave a sheet of paper with the pronunciation of the russian lines written out in latin letters to some voice actor. the russian of the russian version is excellent. professional native speakers as voice actors. fortunately, the russian in the multiplayer was apparently done by the good guys. the pronunciation is right, sentences are accent free and make sense grammatically and logically. no creating of flash grenades here. all that from what i can tell at least, i'm no native russian speaker after all.
|
|
|
Post by raxcoswell on Dec 22, 2011 9:40:24 GMT -5
Me neither, all I can recognise is 'blyat' before he says fall back. On that note, whatever happened to that totally mortified american shouting 'what the fuck happened' when you lose?
|
|
|
Post by frankelstner on Dec 23, 2011 20:35:17 GMT -5
Alright, they are indeed sound files. I wasn't aware that there's so much structure in audio files in general. The sample rate for example is stored right in the beginning. i.imgur.com/cXcMa.jpg
Let's just hope it's not a decoy.
|
|
|
Post by SheWolf on Jan 3, 2012 12:32:05 GMT -5
aaaaallright, back at home, back to tinker with the files. i don't think they would plant decoys just in order to screw with modders, no? god i hope not, if they really did, we are foxtroted. regardless, i think i have come as far as i will with my personal skills and abilities. without a program or script to find the addresses in the .sb files i'm stuck. writing anything like this is far out of my league though, so i will run this by some tech-savvy friends and hope someone is nice enough to make anything like that for me.
|
|
|
Post by frankelstner on Jan 3, 2012 14:40:42 GMT -5
Nah things got a bit more complicated too. I've totally ignored that the gameplay files actually specify the chunk length (it was so obvious that I failed to see it). So there are indeed problems with the length, but not directly caused by the tocsb files. A workaround would be to decompress the audio of the important files and adjust their length a bit before compressing them again.
I've tried some disassembling to figure out how the audio compression works, but I'm pretty bad at it so there are no useful results. I don't have the motivation to create anything BF3 binary related either.
And I'm also sure there are no decoys, the audio format is just some custom type.
|
|