Posted by Five, 2020-02-16, 10:57 PM
Re: The Validity of MD5 Checksums

shntool tutorial by Jason Jordan, 2004-05-05

part three of three

Code:
======================
3. Miscellaneous modes
======================

The following modes don't quite fit in the categories listed above.

------------
3a. cat mode
------------

The purpose of cat mode is to write (catenate) the WAVE header, WAVE data,
and/or extra RIFF chunks from one or more files to standard output.  This can be
useful for things like on-the-fly CD burning or streaming audio.  Typing:

% shntool cat filename

or

% echo "filename" | shntool cat

will write the WAVE header, WAVE data, and any extra RIFF chunks from the given
file to standard output.  If you want to suppress the WAVE header and extra RIFF
chunks (which, if not suppressed, can cause 'clicks' at the beginning and/or end
of tracks when piping output to a CD-burning program that expects raw WAVE
data), then give the '-nh' (no header) and '-nr' (no extra RIFF chunks)
switches:

% shntool cat -nh -nr filename

If you only need to output the WAVE header, then use the '-nd' (no data) and
'-nr' switches:

% shntool cat -nd -nr filename

The only use I can think of for this at the moment is for bug reporting.
Sometimes I only need to see WAVE headers to debug a problem, and this provides
a way for you to get them to me.  I hope you never have to use this option for
that reason, though.  :^)
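
To make the header/data distinction concrete, here is a minimal Python sketch
(not part of shntool). It builds a tiny CD-quality WAVE file in memory with the
standard 'wave' module, then pulls out only the audio data, which is roughly
the part that 'cat -nh -nr' is described as writing. The helper names make_wav
and wave_data_only are made up for this illustration.

```python
import io
import wave


def make_wav(pcm: bytes) -> bytes:
    """Build a small 16-bit stereo 44.1 kHz WAVE file in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)      # stereo
        w.setsampwidth(2)      # 16 bits per channel
        w.setframerate(44100)  # CD sample rate
        w.writeframes(pcm)
    return buf.getvalue()


def wave_data_only(wav_bytes: bytes) -> bytes:
    """Return only the audio data, without the header or other chunks."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.readframes(w.getnframes())


pcm = bytes(range(16))       # four stereo sample frames of dummy data
wav = make_wav(pcm)          # header + data
data = wave_data_only(wav)   # data only

print("whole file:", len(wav), "bytes; data only:", len(data), "bytes")
```

The difference between the two lengths is the header, which is what a program
expecting raw data would otherwise play as a 'click'.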


------------
3b. cmp mode
------------

The purpose of cmp mode is to compare the WAVE data contained within two files.
This is useful if you want to verify the results of CD-audio extraction against
the original files.  This can also be used to compare the WAVE data within two
files of different formats (e.g. file1.wav and file2.shn), for which other
methods of verification such as md5sum will not do.  Since this mode ignores
WAVE headers and extra RIFF chunks when making the comparison, it can also be
used to compare files that have been stripped (see strip mode above) to their
unstripped counterparts, which is another situation where md5sum would not be
sufficient.

Before running the comparison, the WAVE headers are examined for differences.
If any header value other than the reported data size or the block align
differs, then an error is reported and the files are not compared.
Otherwise, if the size of the WAVE data differs in the two files to be compared,
then a comparison is run only up to the size of the smaller of the two; and if
the block align values differ, then a warning is printed (since some CD-quality
WAVE files have headers that report a block align of 2 instead of 4), but the
comparison continues (since it seems that the block align is often ignored by
programs).

When run normally, cmp mode will either say the files are identical, or exit at
the first differing byte.  If you want to see all differing bytes (and their
values), use the -l option.  If any bytes differ, you will see a list of
offsets, similar to 'cmp -l' under UNIX.  In particular, offsets are 1-based,
meaning the first byte is offset 1, not 0.  The byte values of the differing
bytes in each file are also shown for reference.
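
As an illustration of the reporting conventions described above (1-based
offsets, comparison only up to the smaller size, byte values from each file),
here is a small Python sketch.  It is not shntool's code; the function names
first_diff and diff_bytes are invented for this example, and values are shown
in decimal to match the sample output further down.

```python
def first_diff(a: bytes, b: bytes):
    """Return the 1-based offset of the first differing byte, or None."""
    for offset, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return offset
    return None


def diff_bytes(a: bytes, b: bytes):
    """Return (offset, value_in_a, value_in_b) for every differing byte.

    Offsets are 1-based, and only the common length is compared.
    """
    n = min(len(a), len(b))
    return [(i + 1, a[i], b[i]) for i in range(n) if a[i] != b[i]]


file1 = bytes([10, 20, 30, 40, 50])
file2 = bytes([10, 21, 30, 41, 50, 60])   # one byte longer than file1

print("first difference at offset", first_diff(file1, file2))
print("    offset   1   2")
for offset, v1, v2 in diff_bytes(file1, file2):
    print(f"{offset:10d} {v1:3d} {v2:3d}")
```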

Sometimes you might want to compare data in two files, one of which might
contain extra bytes at the beginning (such as a file ripped from a CD burned TAO
which might have an initial 2-second gap of silence, depending on the program
used to rip it).  In this case, you can use the -s option to have shntool
determine whether one of the files contains extra bytes at the beginning of the
WAVE data.  This option can also help identify the combined read/write offset
of a CD burner/CD reader pair.  Currently, only the first 529200 bytes (3
seconds of CD-quality WAVE data) are searched for a match, but this should be more
than enough for most purposes.  If for some reason you believe that the files
are byte-shifted, but shntool does not think so, you can use the -f switch to
give shntool a "fuzz factor" that it will use.  This fuzz factor is simply a
positive integer that represents the maximum number of allowable byte mismatches
within the first 529200 bytes.  This allows you to check for differing bytes
between two files that (a) are byte-shifted and (b) contain at least one error
within the first 529200 bytes (an error that could have been caused by an
unreadable section of the CD, an unreliable CD reader, a bad hard drive/hard
drive cable, a network error, buggy hardware drivers, etc.).  The higher the
fuzz factor, the longer the -s option takes, so set it low at first (e.g. 8),
and increase it in small steps if needed.  Note that the -f option can only be
used with the -s option, since that's the only time the fuzz factor is used.
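
The idea of a byte-shift search with a fuzz factor can be sketched in a few
lines of Python.  This is a brute-force illustration under my own assumptions,
not the algorithm shntool actually uses: find_shift, its 'window' parameter,
and the test data are all hypothetical.  It tries each candidate shift and
accepts the first one where at most 'fuzz' bytes mismatch within the window.

```python
def find_shift(ref: bytes, shifted: bytes, window: int, fuzz: int = 0):
    """Return the number of extra leading bytes in `shifted`, or None.

    For each candidate shift, the first `window` bytes of `ref` are compared
    against `shifted` starting at that shift.  A shift is accepted when the
    number of mismatching bytes is at most `fuzz`.
    """
    head = ref[:window]
    for shift in range(window):
        candidate = shifted[shift:shift + len(head)]
        if len(candidate) < len(head):
            break  # not enough data left to compare a full window
        mismatches = sum(1 for x, y in zip(head, candidate) if x != y)
        if mismatches <= fuzz:
            return shift
    return None


# Dummy "audio" in which no two nearby bytes are equal, so only the true
# shift can match.
ref = bytes((i * 37 + 11) % 256 for i in range(200))
ripped = bytes(12) + ref                  # 12 extra bytes of silence in front

damaged = bytearray(ripped)
damaged[12 + 5] ^= 0xFF                   # two read errors inside the window
damaged[12 + 20] ^= 0xFF
damaged = bytes(damaged)

print(find_shift(ref, ripped, window=50))            # clean rip
print(find_shift(ref, damaged, window=50))           # errors, no fuzz
print(find_shift(ref, damaged, window=50, fuzz=8))   # errors, fuzz factor 8
```

Every candidate shift costs a full window comparison here, so the work grows
with the window size; a real implementation may be organized quite
differently.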

Here's an example comparison:

% shntool cmp test.wav test2.shn
comparing WAVE data in files 'test.wav' and 'test2.shn' ... 

contents of these files are identical.
%

Here's what you will see if the WAVE data sizes differ, but the data is
identical up to the WAVE data size of the smaller file:

% shntool cmp cmp1a.wav cmp1b.wav
shntool [cmp]: warning: size of data to be compared differs between these files -
                        WAVE data will only be compared up to the smaller size
comparing WAVE data in files 'cmp1a.wav' and 'cmp1b.wav' ... 

contents of these files are identical (up to the first 29332188 bytes of WAVE data).
%

Here's an example of what you might see if the WAVE data itself differs:

% shntool cmp test.wav test3.wav
comparing WAVE data in files 'test.wav' and 'test3.wav' ... 

WAVE data differs at byte offset 9847801.
%

Curious what the differences were in the above case?  Let's see:

% shntool cmp -l test.wav test3.wav
comparing WAVE data in files 'test.wav' and 'test3.wav' ... 

    offset   1   2
   ----------------
   9847801  13 157
  10619542 216  47

contents of these files differed as indicated above.
%

Now let's check a ripped file against its pre-burned counterpart:

% shntool cmp preburned.shn ripped.wav
shntool [cmp]: warning: size of data to be compared differs between these files -
                        WAVE data will only be compared up to the smaller size
comparing WAVE data in files 'preburned.shn' and 'ripped.wav' ... 

WAVE data differs at byte offset 1.
%

Oops, we should have used the -s option:

% shntool cmp -s preburned.shn ripped.wav
checking for byte-shift between input files...

file 'ripped.wav' seems to have:

  350448 extra bytes (87612 extra samples, or 149 extra sectors)

these extra bytes will be discarded before comparing the data.

preparing to do full comparison...

comparing aligned WAVE data in files 'preburned.shn' and 'ripped.wav' ... 

aligned contents of these files are identical.
%
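
The numbers in that report are related by simple arithmetic for CD-quality
audio: one sample (both channels, 16 bits each) is 4 bytes, and one CD sector
holds 2352 bytes.  The short Python sketch below reproduces them; the constant
and function names are mine, chosen only for this example.

```python
BYTES_PER_SAMPLE = 4      # 2 channels x 2 bytes (16-bit stereo)
BYTES_PER_SECTOR = 2352   # one CD audio sector, 1/75 of a second
SAMPLE_RATE = 44100       # samples per second


def describe_extra_bytes(n: int):
    """Convert a byte count into (samples, sectors) for CD-quality audio."""
    return n // BYTES_PER_SAMPLE, n // BYTES_PER_SECTOR


samples, sectors = describe_extra_bytes(350448)
seconds = samples / SAMPLE_RATE
search_window = 3 * SAMPLE_RATE * BYTES_PER_SAMPLE   # 3 seconds, in bytes

print(samples, "samples,", sectors, "sectors,", round(seconds, 2), "seconds")
print("search window:", search_window, "bytes")
```

The extra data works out to just under two seconds, which fits the initial
gap described for TAO-burned discs, and the 3-second search window is the
529200 bytes mentioned earlier.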

Let's check two files that we think are identical, but one of which was ripped
from a sun-bleached, flaking, scratched CD:

% shntool cmp -s preburned2.shn ripped2.wav
checking for byte-shift between input files...

files 'preburned2.shn' and 'ripped2.wav' do not share identical data within the first 529200 bytes.
%

Hmm, I still think they are identical... let's use a fuzz factor to check:

% shntool cmp -s -f 8 preburned2.shn ripped2.wav
checking for byte-shift between input files...

with fuzz factor 8, file 'ripped2.wav' seems to have:

  350448 extra bytes (87612 extra samples, or 149 extra sectors)

these extra bytes will be discarded before comparing the data.

preparing to do full comparison...

shntool [cmp]: warning: size of data to be compared differs between these files -
                        WAVE data will only be compared up to the smaller size
comparing aligned WAVE data in files 'preburned2.shn' and 'ripped2.wav' ... 

WAVE data differs at byte offset 137.
%

Hmm, I wonder where else they differ:

% shntool cmp -s -f 8 -l preburned2.shn ripped2.wav
checking for byte-shift between input files...

with fuzz factor 8, file 'ripped2.wav' seems to have:

  350448 extra bytes (87612 extra samples, or 149 extra sectors)

these extra bytes will be discarded before comparing the data.

preparing to do full comparison...

shntool [cmp]: warning: size of data to be compared differs between these files -
                        WAVE data will only be compared up to the smaller size
comparing aligned WAVE data in files 'preburned2.shn' and 'ripped2.wav' ... 

    offset   1   2
   ----------------
       137 227 228
      1653 216 215
     56820  13 157

aligned contents of these files differed as indicated above.
%


========================
4. Custom format modules
========================

There is only one custom format module, 'cust'.  It is described below.

---------------
4a. cust format
---------------

The cust format provides the user with a means of specifying the precise
program and arguments shntool should use to encode output files.  This is
useful for overriding shntool's defaults for existing output formats, and
for enabling shntool to create files in a format that it does not yet
officially support.  The cust format has a simple syntax:

  { program argument_1 argument_2 ... argument_N }

This will create files with an extension of .custom by default.  If you wish
to provide your own extension (e.g. 'abc'), use this format:

  ext=abc { program argument_1 argument_2 ... argument_N }

NOTE:
-----
  At least one of the arguments must contain the string '%f', which is the
  placeholder for the output filename.  The cust format module will stick
  the output filename here, with the appropriate extension.

NOTE TO WINDOWS USERS:
----------------------
  Due to the way the Windows command prompt operates, you will need to put
  quotes around the curly braces, i.e.:  '{' program argument_1 ... '}'
  Also, if you use the cust format inside a batch file, you must use '%%f'
  instead of '%f'.


Here are two example uses for the cust output format.  The first shows how
you can use it to override an existing format module's default options, and
the second shows how you can use it to encode to a totally new format.

1.  Suppose you want to fix a set of files, and in the process create .shn's
    that are NOT seekable.  Here's one way to do it.  Note that although
    the output shows an extension of .custom, the files will really have the
    specified extension of .shn:

% shntool fix -noskip -o cust ext=shn { shorten -v2 - %f } *.shn
shntool [fix]: warning: no shift direction specified - assuming backward shift
gd80-10-11d1t01.shn --> gd80-10-11d1t01-fixed.custom ... done.
gd80-10-11d1t02.shn --> gd80-10-11d1t02-fixed.custom ... done.
gd80-10-11d1t03.shn --> gd80-10-11d1t03-fixed.custom ... done.
gd80-10-11d1t04.shn --> gd80-10-11d1t04-fixed.custom ... done.
gd80-10-11d1t05.shn --> gd80-10-11d1t05-fixed.custom ... done.
gd80-10-11d1t06.shn --> gd80-10-11d1t06-fixed.custom ... done.
gd80-10-11d1t07.shn --> gd80-10-11d1t07-fixed.custom ... done.
gd80-10-11d1t08.shn --> gd80-10-11d1t08-fixed.custom ... done.
gd80-10-11d1t09.shn --> gd80-10-11d1t09-fixed.custom ... done.
gd80-10-11d1t10.shn --> gd80-10-11d1t10-fixed.custom ... done.
No padding needed for 'gd80-10-11d1t10-fixed.custom'.
% shntool info *fixed.shn | grep seekable
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
  seekable:                   no
%

An alternate way of specifying the .shn extension is as follows:

% shntool fix -noskip -o cust ext= { shorten -v2 - %fshn } *.shn


2.  Suppose you want to convert a set of files to a new format that shntool
    does not currently support.  Let's make one up named 'crunch'.  Assuming
    this fictitious format has a command-line program that will encode WAVE data
    read on standard input, all you have to do is figure out the proper
    arguments for doing so and plug it into the cust format module:

% shntool conv -o cust ext=crunch { wavcrunch -level 9 -input - -output new-%f } *.wav
converting 'test1.wav' to 'test1.custom' ... done.
converting 'test2.wav' to 'test2.custom' ... done.
converting 'test3.wav' to 'test3.custom' ... done.
%

Again, the files created above would have the specified extension of .crunch, not
the .custom extension shown.  Also, each file would be prefixed with the string
"new-", since that's what was specified to the encoder.  Thus, the three files
created in the above example would be named "new-test1.crunch", "new-test2.crunch"
and "new-test3.crunch".
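
The '%f' substitution described above can be pictured with a short Python
sketch.  This only mimics the naming result shown in the example; how shntool
builds the name internally is not covered here, and expand_cust_args is a
made-up helper.

```python
def expand_cust_args(args, base, ext="custom"):
    """Replace '%f' in each argument with the output filename.

    `base` is the output name without extension; `ext` defaults to 'custom',
    the default extension of the cust format.
    """
    if not any("%f" in arg for arg in args):
        raise ValueError("at least one argument must contain '%f'")
    filename = f"{base}.{ext}"
    return [arg.replace("%f", filename) for arg in args]


encoder = ["wavcrunch", "-level", "9", "-input", "-", "-output", "new-%f"]

for base in ("test1", "test2", "test3"):
    print(" ".join(expand_cust_args(encoder, base, ext="crunch")))
```

Because the "new-" prefix is part of the argument itself, it ends up in front
of whatever name is substituted for '%f'.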


==================
Document revision:
==================

$Id: TUTORIAL,v 1.77 2004/05/05 07:37:46 jason Exp $