Importing Timestamp Information
Audacity cannot yet read the timestamp information which can be stored by professional recording hardware using the Broadcast Wave (BWF) format. Timestamps are useful in identifying different parts of the recording.
This page proposes a workaround using a Python program to write the timestamp information to a label file which can then be imported into Audacity.
Python analysis program
The following Python program analyzes a WAV file in the Broadcast Wave (BWF) format, as produced for example by the M-Audio Microtrack recorder. The program writes the timestamp data to the file labels.txt, which can be imported into Audacity as a label track using the label import menu item (under the File menu in current Audacity and legacy 1.3, or the Project menu in legacy Audacity 1.2).
Amendment made to script 11 May 2013
Added an error message if no cues are found, and changed syntax to (hopefully) run with python3.
Amendment made to script 29 February 2016
The code line
cuechunk=="data" and cuename==i and cuepos==cueoffset):
was updated to
cuechunk=="data" and cuename==i):
because Rmunn observed that the "cuepos==cueoffset" check was too strict for the Zoom H4n which creates markers with cuepos always equal to 0, and cueoffset equal to the actual position. The cuepos value is never used by that Python code, so it really doesn't matter what it is.
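For reference, the six values discussed above make up one 24-byte cue point record. The following is a minimal sketch of decoding such a record with the standard struct module. The helper name parse_cue_point and the sample values are invented for illustration and are not part of the script.

```python
import struct

def parse_cue_point(record):
    """Decode one 24-byte cue point record (hypothetical helper)."""
    # Six little-endian fields: ID, position, chunk ID (4 characters),
    # chunk start, block start, sample offset.
    (cuename, cuepos, cuechunk,
     cuechunkstart, cueblockstart, cueoffset) = struct.unpack("<II4sIII", record)
    return {
        "cuename": cuename,
        "cuepos": cuepos,
        "cuechunk": cuechunk.decode("ascii"),
        "cuechunkstart": cuechunkstart,
        "cueblockstart": cueblockstart,
        "cueoffset": cueoffset,
    }

# Invented record in the style described for the Zoom H4n:
# cuepos is 0 while cueoffset carries the actual position.
record = struct.pack("<II4sIII", 1, 0, b"data", 0, 0, 441000)
cue = parse_cue_point(record)
print(cue["cueoffset"] / 44100.0, "seconds")
```

With a record like this one, a "cuepos==cueoffset" test fails even though cueoffset alone is enough to compute the position in seconds.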
Script
#!/usr/bin/python
from __future__ import print_function  # same print syntax on Python 2 and 3

from sys import exit, argv
import struct
import wave

def formatline(position1,position2,name): # for the labels.txt file
    num1 = "%8.6f" % position1
    num2 = "%8.6f" % position2
    return num1+ "\t"+ num2 +"\t" +name+"\n"

def error(s):
    print (s)
    exit()

def readname(f):
    # read a 4-character chunk identifier as text ("" at end of file)
    return f.read(4).decode("latin-1")

def readnumber(f):
    # read a 4-byte little-endian unsigned integer
    c = f.read(4)
    if len(c)<4:
        error("Sorry, no cue information found.")
    return struct.unpack("<I", c)[0]

def findcues(filename):
    # let the wave module work out the audio parameters
    f = wave.open(filename,"r")
    framerate = f.getframerate()
    channels = f.getnchannels()
    bytespersample = f.getsampwidth()
    totalframes = f.getnframes()
    totalduration = float(totalframes)/framerate
    byterate = framerate * channels * bytespersample
    print (str(framerate) + " samples = "+str(byterate)+ " bytes per second")
    f.close()

    # walk through the chunks by hand; the file must be opened in binary mode
    f = open(filename,"rb")
    if readname(f) != "RIFF":
        error("Unknown file format (not RIFF)")
    f.read(4)
    if readname(f) != "WAVE":
        error("Unknown file format (not WAVE)")
    name = readname(f)
    while name != "cue ":
        leng= readnumber(f)
        f.seek(leng + leng%2,1) # relative skip; chunks are padded to even length
        name = readname(f)
    leng= readnumber(f)
    num = readnumber(f)
    if leng != 4+24*num:
        error("Inconsistent length of cue chunk")
    print (str(num) + " MARKER(S) found *********")
    if num>0:
        oldmarker = 0.0
        out=open("labels.txt","w")
        for i in range(1,num+1):
            cuename = readnumber(f)
            cuepos = readnumber(f)
            cuechunk = readname(f)
            cuechunkstart = readnumber(f)
            cueblockstart = readnumber(f)
            cueoffset = readnumber(f)
            if not (cuechunkstart==0 and cueblockstart==0 and
                    cuechunk=="data" and cuename==i):
                print (cuename, cuepos, cuechunk, cuechunkstart,
                       cueblockstart, cueoffset)
                error("unexpected marker data")
            else:
                position = float(cueoffset)/framerate
                print("Marker",i, " offset =",cueoffset,"samples =",position,"seconds")
                if position>oldmarker:
                    # prefer to mark them as regions (intervals)
                    out.write(formatline(oldmarker,position, "Section "+str(i)))
                else:
                    out.write(formatline(position,position, "Marker "+str(i)))
                oldmarker = position
        if totalduration>oldmarker:
            out.write(formatline(oldmarker,totalduration, "Section "+str(num+1)))
        out.close()
        print ("Marker data was written to labels.txt")
    else:
        print ("No marker data file was written.")
    f.close()

if __name__ == "__main__":
    if len(argv)<=1:
        print ("Usage: python "+ argv[0] + " WAV-file")
        exit()
    filename = argv[1]
    print ("extract position markers from file " +filename)
    findcues(filename)
How to run
You need Python to run this. I wrote this file for Python version 2.7.6. I have not tested it with the new Python version 3, but I think it should run. Store this text in a file called findcues.py. Run it with
python findcues.py your-WAVE-filename.wav
from the shell command line, or with
import findcues
findcues.findcues("your-WAVE-filename.wav")
in a Python window (assuming that findcues.py is in a place where the Python system finds it). If it finds timestamp information, it will generate a file labels.txt. So far, I have tested it for the files of the M-Audio Microtrack-II recorder. If someone has positive or negative experience with files produced by other devices, please mention this here.
- Also works with Sony PCM-M10 when using the "T-MARK" button to set track marks. BTW: the Python program crashes when I try it on a wav file without cues.
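If no recorder file is at hand, a small test file can be generated for trying out the script. The following is a minimal sketch: it writes a short silent WAV file with the standard wave module and appends a "cue " chunk after the audio data. The function name write_test_wav and the file name test-cues.wav are made up for illustration, and the layout (cue chunk placed after the data chunk, cuepos set equal to cueoffset) is an assumption; real recorders may arrange their files differently.

```python
import struct
import wave

def write_test_wav(filename, cue_offsets, framerate=8000, seconds=2):
    """Write a silent mono 16-bit WAV and append a "cue " chunk to it."""
    w = wave.open(filename, "wb")
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(framerate)
    w.writeframes(b"\x00\x00" * (framerate * seconds))
    w.close()

    # Build the cue chunk: a count followed by one 24-byte record per cue.
    body = struct.pack("<I", len(cue_offsets))
    for i, offset in enumerate(cue_offsets, 1):
        body += struct.pack("<II4sIII", i, offset, b"data", 0, 0, offset)
    chunk = b"cue " + struct.pack("<I", len(body)) + body

    # Append the chunk, then correct the overall RIFF size field at byte 4.
    with open(filename, "r+b") as f:
        f.seek(0, 2)
        f.write(chunk)
        size = f.tell() - 8
        f.seek(4)
        f.write(struct.pack("<I", size))

if __name__ == "__main__":
    # Two cues at 0.5 and 1.5 seconds in a 2-second file (8000 samples/second).
    write_test_wav("test-cues.wav", [4000, 12000])
```

Running findcues.py on the resulting file should then report two markers and write three sections to labels.txt. Passing an empty list produces a cue chunk with zero markers, which is a convenient way to check the "no markers" path.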
Acknowledgements
I have benefited from the information on the following pages:
- http://www.it.fht-esslingen.de/~schmidt/vorlesungen/mm/seminar/ss00/HTML/node119.html
- http://www.sonicspot.com/guide/wavefiles.html#cue
- http://www.blitter.com/~russtopia/MIDI/~jglatt/tech/wave.htm
The following program lists all chunks of a WAV file and helped me to analyze and understand the structure of the files that I had:
from __future__ import print_function  # same print syntax on Python 2 and 3

from sys import argv, exit
import struct

def readname(f):
    # read a 4-character chunk identifier as text
    return f.read(4).decode("latin-1")

def readnumber(f):
    # read a 4-byte little-endian unsigned integer
    return struct.unpack("<I", f.read(4))[0]

def readchunk(f,level=0,searchcues=False):
    # print one chunk (and, recursively, its sub-chunks);
    # return the total number of bytes it occupies
    pos = f.tell()
    name= readname(f)
    leng= readnumber(f)
    totleng = leng+8
    print(" "*level,name,"len-8 =%8d"%leng," start of chunk =",pos,"bytes")
    if name in ("RIFF","LIST","list"):
        print(" "*level,readname(f),"recursive sublist")
        sublen = leng-4
        while sublen>0:
            sublen -= readchunk(f,level+1,searchcues)
        if sublen !=0:
            print("ERROR:",sublen)
    elif searchcues and name=="cue ":
        sublen=leng-4
        num = readnumber(f)
        print(num,"MARKER(S) *********")
        for i in range(num):
            sublen -= 24
            cuename = readnumber(f)
            cuepos = readnumber(f)
            cuechunk = readname(f)
            cuechunkstart = readnumber(f)
            cueblockstart = readnumber(f)
            cueoffset = readnumber(f)
            if not (cuechunkstart==0 and cueblockstart==0 and
                    cuechunk=="data" and cuename==i+1 and cuepos==cueoffset):
                print("unexpected marker data", cuename, cuepos, cuechunk,
                      cuechunkstart, cueblockstart, cueoffset)
            else:
                print("Marker#",i+1," offset =",cueoffset,"samples ***")
        if sublen !=0:
            print("ERROR:",sublen)
    elif searchcues and name=="labl" and level==2:
        sublen=leng-4
        labelname = readnumber(f)
        labeltext = f.read(sublen).decode("latin-1")
        print("Label #",labelname," name = >>"+labeltext.rstrip("\x00")+"<<")
    else:
        f.seek(leng,1) # relative skip
    if leng % 2:
        # chunks are padded to an even number of bytes
        f.seek(1,1)
        totleng += 1
    return totleng

def allchunks(f):
    readchunk(f)
    c = f.read()
    if c:
        print("error", len(c), c[:20])

def cues(f):
    readchunk(f,searchcues=True)

if __name__ == "__main__":
    if len(argv)<=1:
        print("Usage: python", argv[0],"WAV-file")
        exit()
    filename = argv[1]
    print("analyze chunk structure from WAV-file",filename)
    f = open(filename,"rb") # binary mode
    cues(f)
    f.close()
Future features
A future version might look at the "list/adtl" chunks in order to find the true names of the breakpoints. Hopefully, this will be supported in Audacity itself soon. Also, my files have a "regn" chunk that seems to contain some "regions", called "Section 1" etc.
--Guenterrote 22:49, 9 March 2010 (CST)
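As a rough sketch of the first idea, the "labl" sub-chunks of an "adtl" list could be collected into a table that maps cue numbers to names. The function read_adtl_labels, the helper labl and the sample data below are invented for illustration. The sketch assumes that each "labl" sub-chunk holds a 4-byte cue number followed by zero-terminated text, and it has not been tried on files from the recorders mentioned on this page.

```python
import struct

def read_adtl_labels(body):
    """Map cue numbers to names, given the bytes that follow the "adtl" tag."""
    labels = {}
    pos = 0
    while pos + 8 <= len(body):
        name = body[pos:pos + 4]
        (leng,) = struct.unpack("<I", body[pos + 4:pos + 8])
        data = body[pos + 8:pos + 8 + leng]
        if name == b"labl" and len(data) >= 4:
            (cue_id,) = struct.unpack("<I", data[:4])
            text = data[4:].split(b"\x00", 1)[0]  # drop the terminating zero
            labels[cue_id] = text.decode("latin-1")
        # sub-chunks are padded to an even number of bytes
        pos += 8 + leng + (leng & 1)
    return labels

def labl(cue_id, text):
    """Build one "labl" sub-chunk for the sample data (hypothetical helper)."""
    data = struct.pack("<I", cue_id) + text + b"\x00"
    pad = b"\x00" * (len(data) & 1)
    return b"labl" + struct.pack("<I", len(data)) + data + pad

# Invented sample: the first label has an odd length and needs a pad byte.
sample = labl(1, b"Solo") + labl(2, b"Intro")
print(read_adtl_labels(sample))
```

The resulting names could then replace the generic "Section" and "Marker" texts when the label file is written.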