New VorbisGain Patch For Ogg123
I was searching for a vorbisgain patch for ogg123 and stumbled upon vgplay, a patch for ogg123 that was derived from the xmms patch for vorbisgain.
Why is this so important? Well, I had assumed that amaroK supported vorbisgain normalization. In reality, it doesn’t (long answer: it does, but only through a filter script, and I don’t really like that approach).
But my songs don’t have equal/similar loudness! How should I listen to them? Should I continue the practice of volume-knob-turning-for-every-song-change? I’ve decided that the answer is: NO.
Once I had vgplay in my hands, I started playing with it. In short, it works! But only if I give ogg123 a command line option to enable the gain (it’s disabled by default). So I decided to make the patch simpler, without user-configurable options, but with reasonable compiled-in defaults.
To cut a long story short, here are my changes:
1. Remove album gain. This implies removing the options for setting the gain; ogg123 will now always use vorbisgain’s track gain (or none if the tag is not set).
2. Remove the preamp option and use the default of 6.0 dB (controllable via a #define).
3. Remove the hard limiter and use the better scale_hard limiter.
4. Remove the fade option.
5. Remove state changes.
6. Rename vgplay_read to ov_read_vg and resync it with ov_read (look at the comments for the function).
The first four changes were easy; it was the last two (and especially the last one) that gave me trouble. In the end, it still works, so it was worth it.
As a side note, I considered modifying the ov_read() function in libvorbis instead, so that all programs can benefit from this change. But this turned out to be a bad idea, because
1. Vorbis decoders would be affected as well, which means that vorbis files decoded into pcm might be distorted (vorbisgain isn’t perfect, you know)
2. Users would then have no control over the vorbisgain settings.
As such, I propose some changes to libvorbis:
1. libvorbis should read and interpret the vorbisgain tags
2. ov_read() should accept an extra parameter (maybe similar to vgain_state) that contains at least: the track gain, track peak, album gain, album peak, whether vorbisgain is applied, the type of vorbisgain applied, the type of limiting desired, and the PreAmp in dB. This parameter will be used to determine the gain applied.
3. Alternatively, ov_read() can accept a new parameter for a callback "filter" function. This filter function can be a vorbisgain applier function, or some other filter function defined by the player. This will require minimal changes to ov_read(), but more changes to the player.
Actually, I’m not sure which of the last two options is better. Both will break the ABI, and both will require changes, however minor, to existing player software. Maybe I’ll mail the developers and let them know about this idea.
This is vgplay.h:
/* -*- mode: c; c-basic-indent: 2; indent-tabs-mode: nil -*-
*
* vgplay.h 1.0 (c) 2003 John Morton
*
* Portions of this file are (C) COPYRIGHT 1994-2002 by
* the XIPHOPHORUS Company http://www.xiph.org/
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* - Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* - Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the
* distribution.
*
* - Neither the name of the Xiph.org Foundation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
* STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
* OF THE POSSIBILITY OF SUCH DAMAGE.
*
**********************************************************************
*
* vgplay - a drop in replacement for ov_read from libvorbisfile that
* supports replaygain volume adjustments with a fixed preamp and
* limiting.
*
*/
#ifndef __VGPLAY_H
#define __VGPLAY_H
#undef VGPLAY_DEBUG
#ifdef __cplusplus
extern "C" {
#endif
#include <stdio.h>
#include <math.h>
#include <ogg/os_types.h>
#include <vorbis/codec.h>
#include <vorbis/vorbisfile.h>
/* BEGIN : STOLEN CODE FROM os.h IN LIBVORBIS */
#if defined(__i386__) && defined(__GNUC__) && !defined(__BEOS__)
# define VORBIS_FPU_CONTROL
/* both GCC and MSVC are kinda stupid about rounding/casting to int.
Because of encapsulation constraints (GCC can’t see inside the asm
block and so we end up doing stupid things like a store/load that
is collectively a noop), we do it this way */
/* we must set up the fpu before this works!! */
typedef ogg_int16_t vorbis_fpu_control;
static inline void vorbis_fpu_setround(vorbis_fpu_control *fpu){
ogg_int16_t ret;
ogg_int16_t temp;
__asm__ __volatile__("fnstcw %0\n\t"
"movw %0,%%dx\n\t"
"orw $62463,%%dx\n\t"
"movw %%dx,%1\n\t"
"fldcw %1\n\t":"=m"(ret):"m"(temp): "dx");
*fpu=ret;
}
static inline void vorbis_fpu_restore(vorbis_fpu_control fpu){
__asm__ __volatile__("fldcw %0":: "m"(fpu));
}
/* assumes the FPU is in round mode! */
static inline int vorbis_ftoi(double f){ /* yes, double! Otherwise,
we get extra fst/fld to
truncate precision */
int i;
__asm__("fistl %0": "=m"(i) : "t"(f));
return(i);
}
#endif
#if defined(_WIN32) && !defined(__GNUC__) && !defined(__BORLANDC__)
# define VORBIS_FPU_CONTROL
typedef ogg_int16_t vorbis_fpu_control;
static __inline int vorbis_ftoi(double f){
int i;
__asm{
fld f
fistp i
}
return i;
}
static __inline void vorbis_fpu_setround(vorbis_fpu_control *fpu){
}
static __inline void vorbis_fpu_restore(vorbis_fpu_control fpu){
}
#endif
#ifndef VORBIS_FPU_CONTROL
typedef int vorbis_fpu_control;
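static int vorbis_ftoi(double f){
/* floor() is needed so that negative values round correctly;
a plain cast truncates towards zero */
return (int)floor(f+.5);
}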
/* We don’t have special code for this compiler/arch, so do it the slow way */
# define vorbis_fpu_setround(vorbis_fpu_control) {}
# define vorbis_fpu_restore(vorbis_fpu_control) {}
#endif
/* END : STOLEN CODE FROM os.h IN LIBVORBIS */
/* Default PreAmp in dB */
#define VG_PREAMP_DB 6.0
typedef struct {
/* Internal scale factor */
float scale_factor; /* The scale factor */
float max_scale; /* The maximum scale factor before clipping occurs */
} vgain_state;
/* Stores the replaygain settings of a track in the vgain_state structure,
* setting the target scale factor, and other internals.
*/
extern void vg_new_track(vgain_state *vg_state, vorbis_comment *vc);
/* A replacement for ov_read, just supply it with an initialized vgain_state
* and it will behave in the same way.
*/
extern long ov_read_vg(vgain_state *vg_state, OggVorbis_File *vf,
char *buffer, int length, int bigendianp,
int word, int sgned, int *bitstream);
/* If you have other sound processing you want to do before the pcm is
* packed into integers, first call ov_read_float, call vg_apply_gain on
* the pcm before or after your transforms, then pack the pcm into
* integers yourself. ov_read_vg is implemented in this way.
*/
extern void vg_apply_gain(float scale_factor, float max_scale,
float **pcm, long samples, long channels);
#ifdef __cplusplus
}
#endif
#endif /* __VGPLAY_H */
This is vgplay.c:
/* -*- mode: c; c-basic-indent: 2; indent-tabs-mode: nil -*-
*
* vgplay.c 1.0 (c) 2003 John Morton
*
* Portions of this file are (C) COPYRIGHT 1994-2002 by
* the XIPHOPHORUS Company http://www.xiph.org/
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* - Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* - Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the
* distribution.
*
* - Neither the name of the Xiph.org Foundation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
* STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
* OF THE POSSIBILITY OF SUCH DAMAGE.
*
**********************************************************************
*
* vgplay - a drop in replacement for ov_read from libvorbisfile that
* supports replaygain volume adjustments.
*
*/
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <math.h>
#include <ogg/os_types.h>
#include <vorbis/codec.h>
#include <vorbis/vorbisfile.h>
#include "vgplay.h"
/* Copied from vorbisfile, which doesn’t export the function. */
static int host_is_big_endian() {
ogg_int32_t pattern = 0xfeedface; /* deadbeef */
unsigned char *bytewise = (unsigned char *)&pattern;
if (bytewise[0] == 0xfe) return 1;
return 0;
}
/* Set up a new track, extracting replay gain information from the vorbis comments. */
void vg_new_track(vgain_state *vg, vorbis_comment *vc) {
float track_gain_db = 0.00, track_peak = 1.00;
char *tag = NULL;
if (vc) {
if ((tag = vorbis_comment_query(vc, "replaygain_track_gain", 0))
|| (tag = vorbis_comment_query(vc, "rg_radio", 0)))
track_gain_db = atof(tag);
if ((tag = vorbis_comment_query(vc, "replaygain_track_peak", 0))
|| (tag = vorbis_comment_query(vc, "rg_peak", 0)))
track_peak = atof(tag);
}
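/* Guard against a malformed or zero peak tag, which would divide by zero */
if (track_peak <= 0.0) track_peak = 1.00;
vg->scale_factor = pow(10.0, (track_gain_db + VG_PREAMP_DB)/20);
vg->max_scale = 1.0 / track_peak;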
}
void vg_apply_gain(float scale_factor, float max_scale,
float **pcm, long samples, long channels) {
int i, j;
float cur_sample;
/* Apply the gain, and any limiting necessary */
if (scale_factor > 0.0) {
for(j = 0; j < samples; j++) {
for(i = 0; i < channels; i++) {
cur_sample = pcm[i][j];
cur_sample *= scale_factor;
/* This is essentially the scaled hard-limiting algorithm */
if (scale_factor > max_scale) {
if (cur_sample < -0.5)
cur_sample = tanh((cur_sample + 0.5) / (1-0.5)) * (1-0.5) - 0.5;
else if (cur_sample > 0.5)
cur_sample = tanh((cur_sample - 0.5) / (1-0.5)) * (1-0.5) + 0.5;
}
pcm[i][j] = cur_sample;
}
}
}
}
/* This function is derived from ov_read in libvorbisfile.
* In fact, we have introduced only a minimal set of changes, namely:
* 1. Change the name from ov_read to ov_read_vg
* 2. Add a new parameter vgain_state *vg before all the other parameters
* 3. Replace the line
* if(vf->ready_state<OPENED)return(OV_EINVAL);
* and the while(1) loop just below it, with
* samples=ov_read_float(vf, &pcm, length/(word*ov_info(vf,-1)->channels), bitstream);
* 4. Add the line
* vg_apply_gain(vg->scale_factor, vg->max_scale, pcm, samples, channels);
* after the line
* if(samples>length/bytespersample)samples=length/bytespersample;
* 5. Remove the lines
* vorbis_synthesis_read(&vf->vd,samples);
* vf->pcm_offset+=samples;
* if(bitstream)*bitstream=vf->current_link;
* near the end
* Because of this, we have introduced some dead code, namely:
* 1. The line
* if(samples>length/bytespersample)samples=length/bytespersample;
* because the if condition can never be true, since we passed
* length/(word*ov_info(vf,-1)->channels)
* to ov_read_float
* 2. The lines
* if(samples <= 0)
* return OV_EINVAL;
* because in this part of the code, samples>0 is guaranteed by the if
* statement of the enclosing block, and also because the line
* if(samples>length/bytespersample)samples=length/bytespersample;
* is dead code, as shown above.
*/
long ov_read_vg(vgain_state *vg, OggVorbis_File *vf,char *buffer,int length,
int bigendianp,int word,int sgned,int *bitstream) {
int i,j;
int host_endian = host_is_big_endian();
float **pcm;
long samples;
samples=ov_read_float(vf, &pcm, length/(word*ov_info(vf,-1)->channels), bitstream);
if(samples>0){
/* yay! proceed to pack data into the byte buffer */
long channels=ov_info(vf,-1)->channels;
long bytespersample=word * channels;
vorbis_fpu_control fpu;
if(samples>length/bytespersample)samples=length/bytespersample;
vg_apply_gain(vg->scale_factor, vg->max_scale, pcm, samples, channels);
if(samples <= 0)
return OV_EINVAL;
/* a tight loop to pack each size */
{
int val;
if(word==1){
int off=(sgned?0:128);
vorbis_fpu_setround(&fpu);
for(j=0;j<samples;j++)
for(i=0;i<channels;i++){
val=vorbis_ftoi(pcm[i][j]*128.f);
if(val>127)val=127;
else if(val<-128)val=-128;
*buffer++=val+off;
}
vorbis_fpu_restore(fpu);
}else{
int off=(sgned?0:32768);
if(host_endian==bigendianp){
if(sgned){
vorbis_fpu_setround(&fpu);
for(i=0;i<channels;i++) { /* It’s faster in this order */
float *src=pcm[i];
short *dest=((short *)buffer)+i;
for(j=0;j<samples;j++) {
val=vorbis_ftoi(src[j]*32768.f);
if(val>32767)val=32767;
else if(val<-32768)val=-32768;
*dest=val;
dest+=channels;
}
}
vorbis_fpu_restore(fpu);
}else{
vorbis_fpu_setround(&fpu);
for(i=0;i<channels;i++) {
float *src=pcm[i];
short *dest=((short *)buffer)+i;
for(j=0;j<samples;j++) {
val=vorbis_ftoi(src[j]*32768.f);
if(val>32767)val=32767;
else if(val<-32768)val=-32768;
*dest=val+off;
dest+=channels;
}
}
vorbis_fpu_restore(fpu);
}
}else if(bigendianp){
vorbis_fpu_setround(&fpu);
for(j=0;j<samples;j++)
for(i=0;i<channels;i++){
val=vorbis_ftoi(pcm[i][j]*32768.f);
if(val>32767)val=32767;
else if(val<-32768)val=-32768;
val+=off;
*buffer++=(val>>8);
*buffer++=(val&0xff);
}
vorbis_fpu_restore(fpu);
}else{
int val;
vorbis_fpu_setround(&fpu);
for(j=0;j<samples;j++)
for(i=0;i<channels;i++){
val=vorbis_ftoi(pcm[i][j]*32768.f);
if(val>32767)val=32767;
else if(val<-32768)val=-32768;
val+=off;
*buffer++=(val&0xff);
*buffer++=(val>>8);
}
vorbis_fpu_restore(fpu);
}
}
}
return(samples*bytespersample);
}else{
return(samples);
}
}
This is the patch for oggvorbis_format.c:
--- vorbis-tools-1.1.1/ogg123/oggvorbis_format.c 2005-06-03 18:15:09.000000000 +0800
+++ vorbis-tools-1.1.1/ogg123/oggvorbis_format.c 2007-06-23 00:55:40.430659487 +0800
@@ -27,6 +27,7 @@
#include "vorbis_comments.h"
#include "utf8.h"
#include "i18n.h"
+#include "vgplay.h"
typedef struct ovf_private_t {
@@ -38,6 +39,7 @@
int bos; /* At beginning of logical bitstream */
decoder_stats_t stats;
+ vgain_state vg;
} ovf_private_t;
/* Forward declarations */
@@ -87,6 +89,9 @@
private->stats.current_time = 0.0;
private->stats.instant_bitrate = 0;
private->stats.avg_bitrate = 0;
+
+ private->vg.scale_factor = 1.0;
+ private->vg.max_scale = 1.0;
} else {
fprintf(stderr, _("Error: Out of memory.\n"));
exit(1);
@@ -123,6 +128,8 @@
decoder->actual_fmt.rate = priv->vi->rate;
decoder->actual_fmt.channels = priv->vi->channels;
+
+ vg_new_track(&priv->vg, priv->vc);
print_vorbis_stream_info(decoder);
@@ -136,7 +143,7 @@
while (nbytes > 0) {
old_section = priv->current_section;
- ret = ov_read(&priv->vf, ptr, nbytes, audio_fmt->big_endian,
+ ret = ov_read_vg(&priv->vg, &priv->vf, ptr, nbytes, audio_fmt->big_endian,
audio_fmt->word_size, audio_fmt->signed_sample,
&priv->current_section);
This is the patch for the makefiles:
--- vorbis-tools-1.1.1/ogg123/Makefile.am 2005-06-13 21:11:44.000000000 +0800
+++ vorbis-tools-1.1.1/ogg123/Makefile.am 2007-06-23 00:51:34.716657044 +0800
@@ -32,11 +32,11 @@
cfgfile_options.c cmdline_options.c \
file_transport.c format.c http_transport.c \
ogg123.c oggvorbis_format.c playlist.c \
- status.c transport.c vorbis_comments.c \
+ status.c transport.c vorbis_comments.c vgplay.c \
audio.h buffer.h callbacks.h compat.h \
cfgfile_options.h cmdline_options.h \
format.h ogg123.h playlist.h status.h \
- transport.h vorbis_comments.h \
+ transport.h vorbis_comments.h vgplay.h \
$(flac_sources) $(speex_sources)
man_MANS = ogg123.1
--- vorbis-tools-1.1.1/ogg123/Makefile.in 2005-06-27 17:29:11.000000000 +0800
+++ vorbis-tools-1.1.1/ogg123/Makefile.in 2007-06-23 00:51:34.716657044 +0800
@@ -71,7 +71,7 @@
cmdline_options.$(OBJEXT) file_transport.$(OBJEXT) \
format.$(OBJEXT) http_transport.$(OBJEXT) ogg123.$(OBJEXT) \
oggvorbis_format.$(OBJEXT) playlist.$(OBJEXT) status.$(OBJEXT) \
- transport.$(OBJEXT) vorbis_comments.$(OBJEXT) $(am__objects_1) \
+ transport.$(OBJEXT) vorbis_comments.$(OBJEXT) vgplay.$(OBJEXT) $(am__objects_1) \
$(am__objects_2)
ogg123_OBJECTS = $(am_ogg123_OBJECTS)
DEFAULT_INCLUDES = -I. -I$(srcdir) -I$(top_builddir)
@@ -279,11 +279,11 @@
cfgfile_options.c cmdline_options.c \
file_transport.c format.c http_transport.c \
ogg123.c oggvorbis_format.c playlist.c \
- status.c transport.c vorbis_comments.c \
+ status.c transport.c vorbis_comments.c vgplay.c \
audio.h buffer.h callbacks.h compat.h \
cfgfile_options.h cmdline_options.h \
format.h ogg123.h playlist.h status.h \
- transport.h vorbis_comments.h \
+ transport.h vorbis_comments.h vgplay.h \
$(flac_sources) $(speex_sources)
man_MANS = ogg123.1