Syntrillium Backup Forums :: View topic

MDC · Posted - Sat May 31, 2003 11:29 am

My project is to transfer seminars recorded on tape to CD and MP3. These tapes are voice recordings with no music. I have the following questions:

1) Transferring from tape to PC: Is there a high-speed recorder that can transfer 90-minute tapes to PC in a few minutes?

2) Recording Format: Some tapes will need a lot of cleanup using noise reduction, FFT, and EQ while others will need only one pass of the noise reduction. Do I need to record at 48/32 and then downsample to 44.1/16 to get the best result when using CEP 2.1 effects? Perhaps 44.1/16 will produce the same result since the audio is voice only.

3) MP3 Codec and Bitrates: There are three codecs in CEP 2.1 to choose from (Fast, Medium, and High Quality). The menu indicates to use the High Quality (slowest) but when clicking on the help button, it says to use the Fast codec as the default. Which one will produce the better result in MP3? Also, for voice recordings, will a bitrate of 256 kbps give a better result than 128 kbps? Maybe beyond a certain kbps level for voice recordings it doesn't improve the listening quality.

Thanks, Michel

SteveG · Location: United Kingdom

1) No. You would need to be able to digitise signals at somewhere between 1/4 and 1/2 MHz to be able to do this. That's RF! Soundcards can only digitise audio.

2) For seminar speech, 44.1k/16-bit is going to be fine. You won't need to convert anything at all. You are going to end up on MP3, for heaven's sake! And this lot doesn't sound to be too great in the first place...

3) You need a codec expert to tell you about the coding speed choices, but I can't see that anything above 128k is going to improve things for your listeners, whatever speed you code it.
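To put rough numbers on point 1, here is a minimal Python sketch. It assumes the tape is simply played N times faster, so every frequency on it is shifted up by N and the converter has to sample N times faster than the 44.1 kHz baseline. The function name and the 10- and 15-minute transfer times are illustrative choices, not figures from the posts above.

```python
def required_sample_rate(tape_minutes, transfer_minutes, base_rate_hz=44100):
    """Sample rate needed to capture a tape played faster than real time.

    Playing the tape N times faster shifts all of its content up by a
    factor of N, so the sample rate has to rise by the same factor.
    """
    speedup = tape_minutes / transfer_minutes
    return base_rate_hz * speedup


# A 90-minute tape transferred in 10 and in 15 minutes (illustrative figures).
for minutes in (10, 15):
    rate = required_sample_rate(90, minutes)
    print(f"{minutes} min transfer: {rate / 1000:.1f} kHz")
```

Under these assumptions the two cases come out at roughly 397 kHz and 265 kHz, which is the 1/4 to 1/2 MHz territory mentioned above.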

didj · Posted - Sat May 31, 2003 2:27 pm

1) I have a Yamaha MT4X which will play back standard cassettes at 2x speed. Then... If I recorded at 88200 Hz and played back at 44100 Hz all would be well. This setup would save you 45 minutes out of 90.
I haven't heard of anything that would do it any faster, and 4x would be the max at 192 kHz.
If the quality is that important you probably should transfer it in real time anyway.

2) For the NR, EQ... etc. I'd record at 88200/32, do the dirty work, then downsample to 44100/16 for your cd.
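The "record at 88200, play back at 44100" trick in point 1 amounts to keeping the samples and changing only the rate stored in the file header. A minimal sketch with Python's standard `wave` module follows; the helper names are hypothetical, and the demo uses an in-memory file of silence rather than a real capture.

```python
import io
import wave


def relabel_sample_rate(src, dst, new_rate_hz):
    """Copy a WAV file unchanged except for the sample rate in its header."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(reader.getnframes())
    with wave.open(dst, "wb") as writer:
        writer.setnchannels(params.nchannels)
        writer.setsampwidth(params.sampwidth)
        writer.setframerate(new_rate_hz)  # the only thing that changes
        writer.writeframes(frames)


def duration_seconds(src):
    """Playing time implied by the frame count and the header sample rate."""
    with wave.open(src, "rb") as reader:
        return reader.getnframes() / reader.getframerate()


# Demo on an in-memory file: one second of 16-bit mono silence at 88200 Hz,
# standing in for a tape captured at 2x speed.
fast = io.BytesIO()
with wave.open(fast, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(88200)
    w.writeframes(b"\x00\x00" * 88200)
fast.seek(0)

normal = io.BytesIO()
relabel_sample_rate(fast, normal, 44100)
normal.seek(0)
print(duration_seconds(normal))  # twice the captured duration
```

No resampling happens here: the same samples are just played at half the rate, which is what brings the pitch and speed back to normal.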

SteveG · Location: United Kingdom

ozpeter · Location: Australia

If you had the means to digitise to CD offline (i.e. from cassette to an audio CD recorder) you could be getting on with the processing and burning of tape A while tape B was digitising - you'd then rip the CD of tape B into your computer in a few minutes. It would be worth thinking about if you had a lot of these to do and someone was paying you a lot to do it!

- Ozpeter

SteveG · Location: United Kingdom

Yes, there are lots of different ways of saving time. I archive stuff using two networked PCs - you digitise on one and process on the other - this saves a lot of inter-stage CDs!

MDC · Posted - Sun Jun 01, 2003 1:00 pm

Steve,

Can you please further explain the steps about the dynamic range:

"... you make a decision about the dynamic range of the speech, possibly clip the top 6dB off it, and compress it to the extent that the quieter parts aren't a strain to hear. Normalise it, and have a listen."

That is, I've used NR with great results in reducing background noise such as hiss, and I know that reducing too much background noise will create a robotic or metallic sound in the speaker's voice. It should be noted that the new Spectral Decay Rate feature in CEP 2.1 doesn't work well on speech recordings (I have to leave it at zero). Now, I tried the Dynamic Range Processing but it didn't seem to work well (I guess I don't know how to use it).

Thanks, MDC

SteveG · Location: United Kingdom

Hmm... okay... some further thoughts and explanations about a possible approach to making speech easier to listen to for longer periods:

One thing that you have to be aware of with most speech is the unnecessarily high peak to average ratios, and that's what this is all about. Most of the important part of the sound isn't in these peaks at all - and if you just hard-limit the top 6dB, it hardly sounds any different - unless it's a really high-class recording most people couldn't tell at all. But if you do this, it means that the main body of the speech can then be normalised 6dB higher. So unless you had a really bad, heavily limited recording in the first place, that's 6dB louder for the speech for nothing!
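The arithmetic behind that "6dB for nothing" can be sketched in a few lines of Python. This is only an illustration on a toy list of sample values, assuming a simple clip-then-scale chain; the function names are made up here and this is not how CEP implements its hard limiter.

```python
import math


def hard_limit(samples, limit_db=6.0):
    """Clip everything above (peak - limit_db) to that ceiling."""
    peak = max(abs(s) for s in samples)
    ceiling = peak * 10 ** (-limit_db / 20)
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def normalise(samples, target_peak=1.0):
    """Scale so the largest absolute sample equals target_peak.

    Assumes the input is not pure silence (peak must be non-zero).
    """
    peak = max(abs(s) for s in samples)
    return [s * target_peak / peak for s in samples]


# Toy "speech": a quiet main body with one stray peak at full scale.
speech = [0.1, -0.2, 0.25, 1.0, -0.3, 0.2]
louder = normalise(hard_limit(speech, 6.0))

# Gain applied to the main body of the signal, in dB.
gain_db = 20 * math.log10(louder[0] / speech[0])
print(round(gain_db, 2))
```

Only the stray peak is touched by the limiter; everything below the ceiling is simply scaled up when the result is normalised.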

You have to watch it slightly with cassettes, though. If you record at anything like the maximum level, an effect called 'tape squash' occurs, which is very similar in effect to what I've just described. So, as with all things audio, you have to listen to the results carefully and decide what's best for what you've actually got - you might not need to go as far as I've suggested.

As far as extended dynamics goes, this varies rather from speaker to speaker. But in general, whilst a 12-15dB dynamic range is relatively easy to listen to, a greater one can cause some people to have to strain to catch the quieter parts of speech. Note that this isn't the dynamic range of the entire recording I'm talking about, but the variations in the speech itself - you can have the noise floor really quite noisy, and have speech that's hard to listen to. But equally, you can have a quiet noise floor and it's still hard to listen to! You have to do experiments, and it's never the same twice.

And it's worth pointing out that the easy way to reduce the dynamics is from the top down - I said 6dB, but there have been occasions when I've gone twice that far... but they were pretty exceptional. And that brings me neatly to the decision you will have to make about limiting and compression. It's fine to limit 6dB at the top of most speech - this you will get away with. But if you need to reduce the dynamic range more than this, then you should consider squashing the dynamics progressively - ie compression. You are not going to need a massively high compression ratio if you can do this across the entire speech envelope - if the envelope (which you can get some idea of by zooming right out and looking at the approximate outline of the tops) is 20dB across, then you will only need a slope that reduces 20dB to 15 at the top - and that's not too much to ask, as a rule - just make sure that the release time is quite short. So typically, this is a 1:1 ratio except for the top 20dB, where the slope will be slightly shallower, ending at -5dB. Then you normalise afterwards - I wouldn't use makeup gain, because the results of the compression can be variable.
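The curve described above (1:1 everywhere except the top 20dB, which is squeezed so that it ends at -5dB) can be written down as a static transfer function. A minimal Python sketch follows, assuming levels in dB relative to full scale and a straight-line segment above the knee; it models only the curve, not attack or release behaviour, and the parameter names are illustrative.

```python
def compress_db(level_db, knee_db=-20.0, top_out_db=-5.0):
    """Static transfer curve: 1:1 below the knee, gentler slope above it.

    Input levels from knee_db up to 0 dB are mapped onto the range
    knee_db..top_out_db, so 20 dB of input becomes 15 dB of output.
    """
    if level_db <= knee_db:
        return level_db  # untouched: 1:1 region
    slope = (top_out_db - knee_db) / (0.0 - knee_db)  # 15 / 20 = 0.75
    return knee_db + (level_db - knee_db) * slope


for level in (-30.0, -20.0, -10.0, 0.0):
    print(level, "->", compress_db(level))
```

A slope of 0.75 corresponds to a ratio of about 1.33:1, which is why no massively high compression ratio is needed. Normalising afterwards lifts the -5dB peak back to full scale.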

This approach will not bounce the BG noise up and down at all, and with care, will result in speech that's easier to listen to for longer periods. But you will have to experiment a bit to get this right, because it varies a lot. And as with any processing of this sort, there are no guarantees - you have to learn to judge the results for yourself and act accordingly.

I agree about the NR spectral decay with speech - there doesn't seem to be an inaudible way to use this at all. I'd rather leave the BG a little higher and not go that way at all. It's much better on musical applications, though.

Havoc · Posted - Sun Jun 01, 2003 2:34 pm

Another way to speed it up: borrow a few tape machines and record in parallel with the multitrack. As these are seminars, they will all be about the same length, so it could be efficient.

Speech is a funny thing. On one hand NR doesn't work as well; on the other hand, it stays intelligible even when severely mutilated and bandwidth limited (telephone, anyone?). So cutting off above 10-15 kHz could be possible. Same for MP3: 128 kbps will do.

SteveG · Location: United Kingdom

Since most cassette recordings fall off at an alarming rate beyond 12kHz, you could probably go a little lower than you were suggesting and still have it perfectly intelligible!

OBuckley · Posted - Mon Jun 02, 2003 5:34 am

Also bear in mind, you don't actually have to be there watching while all this happens. Set up the tape on auto-reverse to record in CEP through your sound card. Set the input level so it doesn't clip. Then do the washing up, file some papers, go out to the pub, whatever you would usually do for the next 90 minutes if you weren't doing this. Or set it up each night before you go to bed, although you may need to apply some thought to avoid using up all the space on your HDD.
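For the disk-space side of that, a rough estimate is easy to work out. The sketch below assumes plain uncompressed PCM and counts audio data only (no file header); the function name is hypothetical.

```python
def pcm_bytes(minutes, sample_rate_hz=44100, bits=16, channels=2):
    """Size of the raw audio data for an uncompressed PCM recording."""
    return int(minutes * 60 * sample_rate_hz * (bits // 8) * channels)


# One 90-minute tape at a few of the formats discussed in this thread.
for label, args in [
    ("44.1/16 stereo", (90, 44100, 16, 2)),
    ("44.1/16 mono", (90, 44100, 16, 1)),
    ("88.2/32 stereo", (90, 88200, 32, 2)),
]:
    print(f"{label}: {pcm_bytes(*args) / 1e6:.0f} MB")
```

So a single 90-minute tape at 44.1/16 stereo is a little under 1 GB, and an unattended overnight run of several tapes adds up quickly.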

jkrantman · Posted - Tue Jul 08, 2003 11:38 pm

I don't know whether the quality of this device will be of any help, but the upload to the computer is blindingly fast. I am using a SONY Memory Stick recorder. This records voice at 16Kbps. I tie it directly into the main sound board whenever possible. With a 128 MB stick, it can hold 18 hours of standard play talk.

And the best part is that with a USB connection, I can upload hours of recordings to the computer in seconds.
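As a sanity check on the 18-hour figure, here is a small Python sketch. It assumes the whole stick is available for audio, that "128 MB" means 128 x 1024 x 1024 bytes, and that 16 Kbps means 16,000 bits per second; none of these details are stated above.

```python
def recording_hours(capacity_bytes, bitrate_bps):
    """Hours of audio that fit in capacity_bytes at a constant bitrate."""
    return capacity_bytes * 8 / bitrate_bps / 3600


hours = recording_hours(128 * 1024 * 1024, 16_000)
print(f"{hours:.1f} hours")
```

With a decimal megabyte (128,000,000 bytes) the same sum comes out a little under 18 hours, so the quoted capacity is plausible either way.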

Good luck.

MusicConductor · Location: USA

With all respect for Rantman and the applications he uses the memory stick for, Don't Do This! There is no need to further degrade the quality of the final result by using very, very lossy compression for the capture.

Go to Options/Timed Record Mode and enable this tool; next time you click on the Record button, it will give you the options of specifying when you record, including "right away," as well as setting a maximum length. That auto-reverse deck, a 93-minute file, and a good night's rest may make good companions.

dpower128 · Posted - Wed Jul 23, 2003 8:29 pm

Hello everyone, thought I'd add a question to this thread rather than starting a new one since I'm doing ABOUT the same thing.

The only other thing I'm doing in addition to the person who started this thread is I will be creating some quick playing .wav files for my web pages to invoke the Windows Media player.

The original seminars I've created are on cassette. They were recorded from the telephone. At 44.1 and 16 they sound fine. But I've resampled to 8 and 8, and now there's hiss.

Can someone give me the quick and dirty to be able to create a low-hiss .wav file with what I've got?

All is well other than this. I just don't know how to properly do it. The hiss is constant and uniform throughout the 8 and 8 file.

Thank you,

David

Craig Jackman · Location: Canada

Since these are voice only, I would suggest that you use mono 44.1/16 to record and clean up any noise problems. You'll get a smaller file size and faster transforms. When you convert to MP3 using those file properties, the best bit rate you'll be offered is 128 kbps, which is equivalent to 256 kbps on a stereo file. Once you've cleaned up the files, you'll be able to judge whether you want to use a still lower bit rate. You should be able to get away with something like 96 kbps on a mono voice file with pretty good quality and a small file size.
You'll have to convert the mono voice files to stereo to make them into audio CDs, though.

dpower128 · Posted - Thu Jul 24, 2003 7:16 am

Thanks Craig, I guess I'm really looking for the specifics within CE2000 for how to reduce the hiss that occurs when going from 44.1/16 to 8/8. But maybe you answered that and I didn't catch it. You could put everything I know about doing this in a thimble.

David

MusicConductor · Location: USA

Any 8-bit format will have a noticeable hiss level -- there are only 256 possible amplitudes! This noise is called "quantization error" (i.e., the noise amounts to the discrepancies between the source amplitudes and the paltry choices available to represent it) and is unavoidable with such a coarse format. There is no way to mask it at such a low sample rate, either. You're far, far better off to use compression, not a linear format like 8/8.

BTW, 8/8 amounts to 64 kbps. A mono MP3 at that rate will sound very, very close to the original.
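Both points can be checked numerically. The Python sketch below rounds a sine wave to the nearest available amplitude and measures the resulting signal-to-noise ratio, then works out the bitrate of 8/8. It assumes "8/8" means 8 kHz, 8-bit, mono, uses a plain rounding quantiser with no dither, and the test-tone frequency is an arbitrary choice.

```python
import math


def quantisation_snr_db(bits, n=20000):
    """Measure the SNR of a near-full-scale sine after rounding to `bits` bits."""
    steps = 2 ** (bits - 1)          # step size is 1/steps (1/128 at 8 bits)
    amplitude = (steps - 1) / steps  # just under full scale, so nothing clips
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * 0.1234567 * i)
        q = round(x * steps) / steps  # nearest representable amplitude
        signal_power += x * x
        noise_power += (x - q) ** 2   # the quantisation error
    return 10 * math.log10(signal_power / noise_power)


def pcm_bitrate_bps(sample_rate_hz, bits, channels):
    """Bitrate of uncompressed PCM audio."""
    return sample_rate_hz * bits * channels


snr8 = quantisation_snr_db(8)
snr16 = quantisation_snr_db(16)
print(f"8-bit: {snr8:.1f} dB, 16-bit: {snr16:.1f} dB")
print(pcm_bitrate_bps(8000, 8, 1))  # 64000 bits per second
```

The textbook estimate for a full-scale sine is about 6.02 dB per bit plus 1.76 dB, so 8 bits gives a noise floor only around 50 dB down, against roughly 98 dB at 16 bits. That gap is the hiss.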

dpower128 · Posted - Thu Jul 24, 2003 1:15 pm

Music Conductor,
So would I first record as 44.1/16 stereo (then save it) and then CONVERT to maybe 16/8 mono instead of 8/8 mono?

I really have no idea what you are telling me to actually do. I'm not in love with 8/8. I just want it to play quickly from the web via Windows Media Player as a .wav file. If 16/8 mono will work, great! I'm all for it. Looking for WHAT to do. And thanks for at least letting me know that 8/8 will likely never work. No problem. I just need to know a good alternative, with maybe a few steps thrown in to help me see what to do. As I said, this is NOT my expertise.

Thanks,

David

didj · Posted - Tue Aug 05, 2003 5:56 pm

If you just want to stream quickly from the web then .wav isn't the best format for your audio. MP3, .wma and RealAudio are some other choices.

And, two months later, thanks Steve for pointing out that playing back a tape at double speed would lop off a good chunk of HF. The top octave of whatever the machine's response is, I guess.