I don’t know of any modern encoder for PASC/mp1. Even if it wouldn’t add much value to the project, I would like to strip down an mp3 or mp2 encoder to that level. Has anyone tried to do this before?
I am pretty sure @Jac has worked on this. lol. I see he is typing already.
I did some work on trying to do this a few weeks ago because I’m thinking it would be cool to have a new version of the DCC2WAV software that works with 24 bit source files and sample frequencies other than 44.1kHz.
The main problem seems to be that all the MP1 encoders that I looked at (there are not that many open-source encoders and they all seem to be based on the same ISO sample code) use a frame size that probably wouldn’t be acceptable for DCC (EDIT: this was based on a false assumption; the frame size is not a problem). I tried to change the source code of one encoder to see if it would be possible to fix this but I wasn’t successful: I couldn’t compile the project because it had a Linux-only build system and a lot of dependencies. I’ll surely get back to it later.
===Jac
@drdcc Would an emulated, sped-up version of DCC2WAV be of any use to you as a first, relatively easy step? How often do you use it?
Hi,
I use this only when compiling a new tape, so not very often.
@jac has created different software in the past, I believe.
Maybe he can share his version as a start?
As I said above, I looked into the possibility of using an MPEG 1 layer 1 encoder to encode audio for DCC-Studio but I never finished it. It’s certainly on the agenda.
I tried in the 1990s to use an MPEG 1 Layer 1 encoder to generate a file to use with DCC-Studio. I know it’s capable of reading a PASC file and generating a TRK file from it. But I was unable to generate an MP1 file that DCC-Studio would recognize as a legitimate PASC file. I gave up on the effort because it took a lot of time (my computer was a 200MHz Pentium at that time) and I didn’t seem to get anywhere.
A few weeks ago I was reminded of this because @drdcc and I were talking about releasing a prerecorded DCC that was mastered at 48kHz and DCC2WAV apparently cannot do anything other than 44.1kHz. So I started looking around for open source MP1 encoders that could do this, and ran into the same problem: The streams they generate are apparently not acceptable for DCC-Studio.
I suspect that this is a matter of small changes to tweak the frame size and to control the framing (at 44.1 kHz 384kbps, the encoder should insert padding to keep the bit rate constant, but this part of the standard is mostly ignored by encoders because it’s only important for tape). But I have to do some more research.
===Jac
If you decide which encoder to settle on, I would be happy to help.
I’m open to suggestions! I spent a whole afternoon a couple of weeks ago trying to find an encoder that I could understand but there are only a few left: Most encoders have abandoned MP1 and MP2.
I would prefer to use LAME or something because they have a giant user base and are well maintained but I don’t know if they would be interested.
I forked a repository into GitHub - DigitalCompactCassette/CDex: famous CD-Ripper, new version 1.71, this is a cloned Repo from cdexos.sourceforge.net with fixed cddb-bug but I didn’t do any work on it. It hasn’t had any work done on it for 6 years so I don’t know if it’s still being maintained and if it’s even useful anymore. Also because it’s a CD ripper there is a lot of code in there that we don’t want to maintain. It appears that the MP1 encoder is at CDex/MpgLibDll/layer1.c at master · DigitalCompactCassette/CDex · GitHub.
Anyway, if you want, I can make you a member of the “digitalcompactcassette” team on Github. Or you can send suggestions via this forum or send me pull requests for any of my projects of course.
===Jac
GitHub - njh/twolame: MPEG Audio Layer 2 (MP2) encoder is the cleanest and best maintained MP2 encoder I could find. MP2 with LAME-compatible API, last release October 2019. They do have some special options for DAB, so might be open for DCC-related pull requests. Or do you think a fork would suit us better?
That looks very interesting! Layer 1 is a simplified version of Layer 2 if I understand correctly (I should read ISO/IEC 11172-3 again) so I think changing the encoder module to support MP1/PASC should be fairly trivial.
I cloned it to GitHub - DigitalCompactCassette/twolame: MPEG Audio Layer 2 (MP2) encoder. If you’re interested, I can give you access rights there. I may do some work on it too as soon as I have time. Once we have something that works, we can file a pull request. If they’re not interested in our changes, we’ll just keep our version. We’ll see.
===Jac
What does the “Adaptive Encoding” option in DCC2WAV do exactly, in technical terms? Please share your findings on the software to avoid duplicate research. @Jac, what exactly are the three output files for?
I have no idea what the “Adaptive Encoding” option does, so if you’re going to research that, at least you’re not reinventing the wheel. I suspect that it may enable Joint Stereo encoding or something. If you try encoding a sample with and without the flag, a quick inspection of the PASC header (the first 32 bits of each frame in the .MPP file) will probably tell you what difference it makes.
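To make that header inspection concrete, here is a minimal sketch that decodes the 32-bit frame header using the field layout from ISO/IEC 11172-3 for Layer I. The names `FrameHeader` and `parse_layer1_header` are made up for this example, and it assumes you have already located the start of a frame in the file; nothing here has been checked against real DCC2WAV output.

```python
from typing import NamedTuple

# Field values as defined for MPEG-1 audio; index 3 of the sample rate table is reserved.
MODES = ("stereo", "joint stereo", "dual channel", "single channel")
SAMPLE_RATES_HZ = (44100, 48000, 32000)


class FrameHeader(NamedTuple):
    bitrate_kbps: int
    sample_rate_hz: int
    padding: bool
    mode: str
    mode_extension: int


def parse_layer1_header(header: bytes) -> FrameHeader:
    """Decode the first 4 bytes of an MPEG-1 Layer I frame (hypothetical helper)."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(header[:4], "big")
    if word >> 20 != 0xFFF:
        raise ValueError("sync word not found")
    if (word >> 19) & 0x1 != 1:
        raise ValueError("not an MPEG-1 stream")
    if (word >> 17) & 0x3 != 0b11:
        raise ValueError("not Layer I")
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 0x3
    if bitrate_index in (0, 15) or rate_index == 3:
        raise ValueError("free-format or reserved header values")
    return FrameHeader(
        bitrate_kbps=bitrate_index * 32,  # Layer I bit rates are multiples of 32 kbps
        sample_rate_hz=SAMPLE_RATES_HZ[rate_index],
        padding=bool((word >> 9) & 0x1),
        mode=MODES[(word >> 6) & 0x3],
        mode_extension=(word >> 4) & 0x3,
    )
```

Running this on the first frame of two encodes of the same sample and comparing the `mode` and `mode_extension` fields would show whether the flag switches to joint stereo. If the headers turn out identical, the difference must be somewhere in the frame payload instead.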
The DCC-Studio program works with:
- .TRK files which are a representation (in ASCII) of which PASC files are used, combined with frame numbers (each frame is 420 bytes), markers, volume and equalization settings, etc. I never bothered to figure out the exact format but a .TRK file for a single PASC file, should be easy to reproduce. DCC-Studio can generate .TRK files if you tell it to search the drive for orphan PASC files.
- .LVL files are a representation of the levels of a sound file. I haven’t ever looked at them so I don’t know exactly what’s in there but I suspect that the program calculates an average or maximum volume of each MPEG frame by looking at the scale values, and storing the resulting “volume” for each frame as a byte or nibble for each channel or something. If I remember correctly, DCC-Studio automatically generates the .LVL file if you open a .TRK file for which there is no .LVL file.
- .MPP files contain the actual audio data. They start with a word that contains $2C $00 (*), followed by MPEG/PASC frames. The frames are described in ISO/IEC 11172-3 but the language is difficult to follow. I learned a lot from the document at this location which describes a decoder implemented in VHDL for an FPGA. Unfortunately, the actual source code is not included and doesn’t seem to be available on the Internet.
As I already mentioned, I did some research a few weeks ago to find an MP1 encoder that could encode at 48 kHz. I didn’t try to convert an output file to DCC-Studio format because I suspected that the bit stream that came out of that encoder was wrong because it used a different frame size than the 420 bytes that I was used to seeing.
But now that I’ve spent some time reading the ISO/IEC standard and the document linked above, I realize I was wrong: the 384 byte frame size for 48kHz was correct. So I’m going to try and find some time to see if I can get some 48kHz encoded MP1 audio onto my Windows 98 laptop and see if DCC-Studio can work with it. I have a funny feeling that it can. In fact, I think that 48kHz as well as 32kHz will work fine from any MP1 encoder (if concatenated after the required 2-byte DCC-Studio header).
I remember in the 1990s I tried to find out if there was an MPEG-1 encoder that could be made to generate .MPP files for DCC-Studio. I knew that the problem with MP1 decoders was that they couldn’t handle the padding bit and I knew how to work around that. But I had no idea why the padding bit was necessary (I found out around 2003 when I first had a look at the ISO/IEC standard) and up until now, I didn’t know when and how an encoder should produce a shorter frame with a padding bit. The answer is:
For 44.1kHz encoded MP1/PASC, every regular frame is followed by a padded frame, except for every 49th frame in a sequence.
I now suspect that many MP1 encoders get this wrong (or simply don’t care), which makes the MPEG data unusable for DCC or DCC-Studio. Everything else in the encoding for MPEG 1 Layer 1 is the same as, or compatible with DCC. And that’s probably why I could never get any other MPEG 1 Layer 1 encoders to generate a usable file for DCC-Studio, but my memory is vague – I don’t even know what happens if you try.
The problem is caused by this: Each frame in the MPEG/PASC data contains a number of “slots” where each slot is 32 bits. The number of slots per frame is fixed at 12 x (bit rate / sample rate). For 44.1 kHz this results in a non-integer result, so that’s where padding is used.
- For 32kHz there are 12 * (384000 / 32000) = 12 * 12 = 144 slots, i.e. 576 bytes per frame. Every frame represents exactly 12ms of audio.
- For 48kHz there are 12 * (384000 / 48000) = 12 * 8 = 96 slots = 384 bytes per frame. Every frame represents exactly 8ms of audio.
- For 44.1kHz the formula results in 12 * (384000 / 44100) = 12 * 8.707… = 104.48… slots.
The calculated number of slots for 44.1kHz is actually 104 + (24/49). That basically means that for any sequence of 49 frames of Layer 1 data, 25 frames should have 104 slots (416 bytes) and 24 frames should have 105 slots (420 bytes) so that the average over time evens out and the time per frame is 8.7ms.
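The arithmetic above can be checked with exact fractions. This is only a sketch: the function names are invented for the example, and the accumulator in `padding_pattern` is one simple way to spread the padded frames evenly, not necessarily the exact order a particular encoder or DCC2WAV uses.

```python
from fractions import Fraction


def slots_per_frame(bitrate_bps: int, sample_rate_hz: int) -> Fraction:
    """Exact (possibly fractional) number of 32-bit slots per Layer I frame."""
    return Fraction(12 * bitrate_bps, sample_rate_hz)


def padding_pattern(bitrate_bps: int, sample_rate_hz: int, n_frames: int) -> list[int]:
    """Return 1 for each frame that gets an extra slot, 0 otherwise.

    Assumption: a plain running-remainder scheme, which keeps the long-term
    average at the exact slots-per-frame value.
    """
    remainder = (12 * bitrate_bps) % sample_rate_hz
    accumulated = 0
    pattern = []
    for _ in range(n_frames):
        accumulated += remainder
        if accumulated >= sample_rate_hz:
            accumulated -= sample_rate_hz
            pattern.append(1)  # padded frame: one extra slot (105 slots at 44.1kHz)
        else:
            pattern.append(0)  # regular frame (104 slots at 44.1kHz)
    return pattern


for rate in (32000, 44100, 48000):
    print(rate, slots_per_frame(384000, rate), sum(padding_pattern(384000, rate, 49)))
```

For 44.1kHz the exact value is 5120/49 slots, i.e. 104 + 24/49, and the pattern contains 24 padded frames in every run of 49. For 32kHz and 48kHz the remainder is zero, so no frame is ever padded.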
===Jac
(*) EDIT: I haven’t verified this at the time of this writing but I now suspect that the $2C $00 header indicates the sample rate used in the file: $2C equals 44 in decimal. I never used DCC-Studio with 48kHz or 32kHz sound sources but I would not be surprised if the first byte changes to $20 (32 decimal) for 32 kHz and $30 (48 decimal) for 48kHz.
I just found out that I made a major mistake in the interpretation of the padding bit. I’ll explain in a new thread.
===Jac
You could just keep it here, but your choice. I have not yet looked at the standards in detail. Could you summarize the difference between MP2 and PASC? To be honest, I would just purpose-build a PASC encoder by stripping all incompatible features. A proper implementation ready for a PR with the upstream would of course be nice, but I see no direct advantage and the disadvantage of unnecessary complexity.
@Jac
I do know what the sub-band coding does; just a few weeks ago we did some measurements.
First we measured the analog signal from the source and then the same one on the recorded DCC.
There is a difference in the low-level sounds if there are high-level sounds, as they call it “In time, forward & Backward Masking”.
And this is done in each sub-band (frequency range) based on the human hearing threshold.
In the near future we will make a video and document to visualize this.
I know, that’s the concept of it. It was explained to us by an ex-Philips guy in great detail at the premiere.
For MP1, each frame is basically Header - Allocations - Scalefactors - Samples.
For MP2, each frame is Header - Scalefactor Selection Info - Scalefactors - Sample Codes - Samples.
I still don’t know the standards thoroughly but it may be possible to make an MP1 codec out of an MP2 codec by leaving out the processing of the scale factor selection info and scale factor codes. But of course it would be easier to start with a working MP1 encoder.
The difference between MP1 and PASC is: nothing. My discovery yesterday basically means that, quite likely, the extra zeroes between frames in .MPP files are an optimization in the DCC-Studio program to make it easier to seek to a particular frame in a file, not a difference between PASC and MP1, and not a misinterpretation of the MPEG standard by codec writers.
MP1 encoders/decoders already exist, possibly with varying speed and quality. I think it’s possible to make improvements, e.g. by using integer-only encoding/decoding for speed, 32 bit internal operations for accuracy, and the possibility of using alternate Psycho-Acoustic Models for the encoding. Even if you understand the standard very well, I think that might still be quite a bit of work. It might help to have some knowledge about DSP or MMX or GPU programming because a lot of the processing can be done in parallel (that document I linked to earlier, about the FPGA implementation of an MP1 decoder, would help with parallelization).
However: Making an MPP file out of an MP1 file, or vice versa, is trivial. Modifying an existing MP1 encoder to generate an MPP file instead of an MP1 file or modifying an MP1 decoder or player to accept an MPP file as input, should be very easy.
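A rough sketch of that conversion for a 44.1kHz stream is below. It rests entirely on assumptions taken from this thread and not verified against DCC-Studio: that an .MPP file is the two bytes $2C $00 followed by frames that each occupy exactly 420 bytes, with unpadded 416-byte frames filled out with zero bytes. The names `layer1_frame_length` and `mp1_to_mpp` are hypothetical.

```python
MPP_HEADER = b"\x2c\x00"  # assumption: the 2-byte header described in this thread
MPP_FRAME_BYTES = 420     # assumption: every frame occupies 420 bytes in the file
SAMPLE_RATES_HZ = (44100, 48000, 32000)


def layer1_frame_length(header: bytes) -> int:
    """Length in bytes of the MPEG-1 Layer I frame that starts with this header."""
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xF8) != 0xF8:
        raise ValueError("no MPEG-1 sync word")
    if (header[1] >> 1) & 0x3 != 0b11:
        raise ValueError("not Layer I")
    bitrate_index = header[2] >> 4
    rate_index = (header[2] >> 2) & 0x3
    padding = (header[2] >> 1) & 0x1
    if bitrate_index in (0, 15) or rate_index == 3:
        raise ValueError("free-format or reserved header values")
    bitrate_bps = bitrate_index * 32000
    slots = 12 * bitrate_bps // SAMPLE_RATES_HZ[rate_index] + padding
    return slots * 4  # one slot is 32 bits in Layer I


def mp1_to_mpp(mp1: bytes) -> bytes:
    """Copy frames one by one, zero-filling each to MPP_FRAME_BYTES."""
    out = bytearray(MPP_HEADER)
    pos = 0
    while pos + 4 <= len(mp1):
        length = layer1_frame_length(mp1[pos:pos + 4])
        if pos + length > len(mp1):
            break  # drop a truncated final frame
        if length > MPP_FRAME_BYTES:
            raise ValueError("frame does not fit in a 420-byte cell")
        out += mp1[pos:pos + length].ljust(MPP_FRAME_BYTES, b"\x00")
        pos += length
    return bytes(out)
```

Going the other way would mean skipping the 2-byte header and reading each 420-byte cell, keeping only the number of bytes the frame header says the frame has. Streams at 32kHz or 48kHz have different frame sizes, so the fixed 420-byte cell would not apply to them as written.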
If you want to create a replacement for DCC2WAV, that would not be very useful I think. DCC2WAV can only convert from 44.1kHz stereo PCM WAV to .MPP/.TRK/.LVL, or vice versa, as far as I know. Sure, a fast version of that program (and a version that works on other operating systems besides DOS) would be useful, but why stop there?
Instead, I think it would be useful to start with a library or application that already exists, and can convert many different formats to many different other formats, and/or can play many different formats. After going through the learning curve of understanding how the existing program works, adding functionality to convert from/to .MPP/.TRK/.LVL should be trivial if such a program already supports MP1.
The TwoLame software doesn’t support MP1. Adding MP1 support should be possible because as I understand MP2 is a more sophisticated version of MP1 (I may be wrong). Whether the authors are interested in an MP1/PASC feature, is another question. Maybe CDex (which already supports MP1) is a better idea after all. It would only be necessary to change the decoder to skip the header and the extra slots in .MPP files, and the encoder to insert those slots into encoded MPP files.
===Jac
I found an MPEG 1 Audio Layer 1/2 encoder, still in the Debian repositories: mp2enc included in the package/compilation mjpegtools. Mostly sample code, but could be a fast solution for the upcoming release.
mp2enc -l 1 -b 384 < /mnt/c/Users/Max/in.wav -o "out.mp1"
INFO: [mp2enc] Opened WAV file, freq = 44100 Hz, channels = 2, bits = 16
INFO: [mp2enc] format = 0x1, audio length = 39321488 bytes
INFO: [mp2enc] SpF=96, frac SpF=0.000, bitrate=384 kbps, sfreq=48.0 kHz
INFO: [mp2enc] System is little endian
INFO: [mp2enc] Num frames 27863 Avg slots/frame = 96.000; b/smp = 8.00; br = 384.000 kbps
INFO: [mp2enc] Encoding to layer 1 with psychoacoustic model 2 is finished
INFO: [mp2enc] The MPEG encoded output file name is "out.mp1"
It only supports 8 or 16 bit depth; 18 or 20 bit, as discussed, would require massive extra work, as 18 bit recording is not even common for PCM. How would we put that into practice?
They are quite honest in the manual:
It is actually a very mildly warmed over version of the MPEG Software Simulation Group’s reference encoder.
INFO: [mp2enc] Num frames 27863 Avg slots/frame = 96.000; b/smp = 8.00; br = 384.000 kbps
96 slots per frame? That must be not counting the header and the scaling slots or something. The total number of slots per frame is a little under 104.5 (it’s 104 24/49)
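One way to cross-check the numbers in that log is to apply the slot formula from earlier in the thread in reverse. This is a sketch under stated assumptions: that the reported audio length is pure 16-bit stereo PCM, and that a Layer I frame always carries 384 samples per channel. The helper names are invented for the example.

```python
from fractions import Fraction

SAMPLES_PER_LAYER1_FRAME = 384  # 12 samples in each of 32 sub-bands

# Values copied from the mp2enc log above.
audio_bytes = 39321488
input_rate_hz = 44100
bytes_per_stereo_sample = 2 * 2  # 16 bits, 2 channels

duration_s = Fraction(audio_bytes, bytes_per_stereo_sample * input_rate_hz)


def implied_sample_rate(bitrate_bps: int, slots: int) -> Fraction:
    """Sample rate at which a frame holds exactly `slots` 32-bit slots."""
    return Fraction(12 * bitrate_bps, slots)


def whole_frames(output_rate_hz: int) -> int:
    """Number of complete Layer I frames needed for the input duration."""
    return int(duration_s * output_rate_hz / SAMPLES_PER_LAYER1_FRAME)


print(implied_sample_rate(384000, 96))  # sample rate implied by 96 slots/frame
print(whole_frames(48000), whole_frames(44100))
```

With these inputs, 96 slots per frame at 384 kbps works out to 48000 Hz, and the frame count for a 48kHz output is 27863, the same figure as in the log, while a 44.1kHz output would give 25599 frames. That lines up with the `sfreq=48.0 kHz` line and suggests the encoder converted the 44.1kHz input to 48kHz before encoding.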
You will definitely see a difference between the data in the frames if you encode the same WAV file with this encoder vs. DCC2WAV if you use Psychoacoustic Model 2. I don’t know what the PAM is that DCC uses but it’s probably either PAM 1, or yet something else. If you have good ears, you might hear the difference but it also depends on what kind of audio you encode. To be clear: decoding is always the same (so a PASC decoder in a DCC recorder should be able to decode any PAM), the PAM just determines which parts of the audio are thrown away during encoding.
There is just one very minor problem with using a different PAM, which is multiple generations: if you encode audio with one PAM and then decode the result, and then encode that again with another PAM, there is a small chance that both PAMs decide to throw away different parts of the music, degrading the audio slightly. However if you encode audio with one PAM and then encode it again with the same PAM, the encoder should find nothing to throw away, so in theory the second generation is equal to the first generation.
About the 8-bit/16-bit limitation: I think I’ve seen that MP1 encoding is specified to require 18 bit processing at least. The document about the FPGA version of an MP1 decoder describes how they tried different bit depths and the author comes to the conclusion that the difference between the decoded audio and the original audio before compression, stabilizes around 24 bits. I think it’s possible to make an efficient encoder/decoder that uses 32-bit integer math (which would match current hardware nicely). How much work it would be, I don’t know.
It’s all a matter of what you want to accomplish and how far you want to go. I think it should be trivial to e.g. go into the LAME source code and make a version that generates (and decodes) MPP files. I understand that LAME has its own PAM that’s renowned for its quality, but you can choose other PAMs too. For me personally, it probably wouldn’t matter; my ears are very sensitive to compression artifacts but I’ve never heard any of those with PASC or MP1 because of the high bit rate. And I’ve never been able to hear the difference between CDs (16 bit audio) and DCCs (all my DCC recorders were 18 bits in the 1990s).
===Jac
Do you mean TwoLAME which is heavily based on LAME or do you want to jump down the rabbit hole and build upon LAME?