An Automated Tool for Detecting Speech Onset
The Chronset website allows users to upload a set of .wav files and receive a list of onset latency estimates via e-mail.
A full description of the Chronset algorithm is available in Roux, F., Armstrong, B. C., & Carreiras, M. (2016). Chronset: An automated tool for detecting speech onset. Behavior Research Methods.
Run via website. To use the Chronset website, create a .zip file that contains your .wav files (and *only* your .wav files, without any subdirectory structure) and upload it to the website. Currently, Chronset requires that these files be readable using Matlab's wavread2 routine, which is able to read most standard .wav files. There have been some issues with using this routine in the latest versions of Matlab, so the program will try to use the audioread routine in these later versions as a fallback. This fallback routine has not been tested on as wide a range of systems as the primary routine, however.
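As an illustration of the packaging requirement above, the following minimal Python sketch collects .wav files and writes them into a flat .zip archive with no folder structure. It is not part of Chronset; the function name build_flat_wav_zip and the choice to refuse duplicate file names are assumptions made for this example.

```python
import zipfile
from pathlib import Path


def build_flat_wav_zip(source_dir, zip_path):
    """Zip every .wav file found under source_dir into a flat archive.

    Hypothetical helper (not part of Chronset). Files from nested folders
    are stored under their bare file name, so the archive has no
    subdirectories. Non-.wav files are skipped.
    """
    source = Path(source_dir)
    wavs = sorted(
        p for p in source.rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )

    by_name = {}
    for wav in wavs:
        # Two files with the same name would collide once folders are dropped.
        if wav.name in by_name:
            raise ValueError(f"duplicate file name: {wav.name}")
        by_name[wav.name] = wav

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, wav in by_name.items():
            # arcname strips the directory part of the stored path.
            archive.write(wav, arcname=name)

    return list(by_name)
```

Calling build_flat_wav_zip("recordings", "upload.zip") would return the list of file names that were stored in the archive, which can be compared against the expected number of trials before uploading.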
If your recording is in stereo, the first channel will be used. Better performance can be achieved by using a sampling rate of 22kHz or higher (e.g., 44kHz) with the current version of Chronset. Once your files have been uploaded, you will receive a confirmation message indicating that they have been scheduled for processing.
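The sampling-rate and channel notes above can be checked locally before uploading. The sketch below uses Python's standard wave module, which reads uncompressed PCM .wav files only; the function name describe_wav and the 22000 Hz threshold (a literal reading of "22kHz") are assumptions for this example, not Chronset settings.

```python
import wave

# Assumed threshold: a literal reading of the "22kHz" recommendation.
MIN_RATE_HZ = 22000


def describe_wav(path):
    """Report channel count, sampling rate, and duration of a PCM .wav file.

    Hypothetical helper (not part of Chronset).
    """
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),   # 2 means stereo
            "rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
            "rate_ok": rate >= MIN_RATE_HZ,
        }
```

A file for which "rate_ok" is False may still be processed, but it falls below the sampling rate recommended above.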
Run on your computer. Alternatively, you can download the Chronset source code (written in Matlab) and run it on your local computer. Click here to download the source code.
Webserver processing time
When the server is not under load and your job begins immediately, it will take approximately 10 seconds to process each .wav file, including Chronset startup time on the server (under 20 minutes for 100 high-quality .wav files of 3 seconds in duration at 44kHz stereo). However, as the popularity of Chronset has increased, there are times when your Chronset job will be queued behind other Chronset jobs. In some instances, you may find yourself behind multiple jobs started by multiple users. The current number of jobs in the queue appears just above the submission link at the bottom of this page to give you a basic idea of how many jobs must be completed ahead of yours. Jobs are processed in the order in which they are received, one at a time. Load is typically lowest on weekends.
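The arithmetic behind the estimate above can be sketched as follows. The 10 seconds per file comes from the text; the function name estimate_minutes and the files_ahead parameter are assumptions for this example, since the queue display reports the number of jobs rather than the number of files in them.

```python
# Approximate per-file processing time quoted above, in seconds.
SECONDS_PER_FILE = 10


def estimate_minutes(n_files, files_ahead=0):
    """Rough wait in minutes for a job of n_files recordings.

    files_ahead is a hypothetical guess at how many recordings are queued
    before yours; the server only shows the number of queued jobs.
    """
    return (n_files + files_ahead) * SECONDS_PER_FILE / 60
```

For 100 files on an idle server this gives 1000 seconds, or roughly 17 minutes, in line with the "under 20 minutes" figure above.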
In response to popular demand, we have already completed one hardware upgrade on the server to facilitate processing for our end users and another upgrade is planned in the medium term.
Tips and troubleshooting
Users are strongly encouraged to upload an initial small job (~4 recordings) to confirm that they are uploading files in the correct format and to get a sense for current wait times. Such small jobs should complete within minutes of upload if there are no other jobs in the queue.
We have observed that better performance can be achieved by using a sampling rate of 22kHz or higher (e.g., 44kHz) with the current version of Chronset.
Errors will sometimes occur if you upload very short (less than ~50 ms) recordings or blank recordings with no sound in them whatsoever. This may generate an error like the following. If you encounter such an error, we suggest looking for very short/small files and checking if removing those files addresses your issue.
Error using parallel_function (line 589)
Undefined function or variable "a".
Error stack:
runline.m at 33
locdetrend.m at 44
compute_feat_data.m at 7
chronset_batch>(parfor body) at 87
Error in chronset_batch (line 41)
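One way to look for the very short or blank recordings mentioned above is sketched below in Python, using the standard wave module (uncompressed PCM .wav files only). The function name find_problem_recording is hypothetical, the 50 ms cut-off is taken from the approximate figure above, and "blank" is interpreted here as every sample frame being identical, which is an assumption about what triggers the error.

```python
import wave

# Approximate lower bound on duration mentioned above (~50 ms).
MIN_DURATION_S = 0.05


def find_problem_recording(path):
    """Return "short", "silent", or None for a PCM .wav file.

    Hypothetical helper (not part of Chronset). "silent" means that every
    sample frame equals the first one, i.e. a perfectly flat signal.
    """
    with wave.open(str(path), "rb") as wav:
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        frame_size = wav.getsampwidth() * wav.getnchannels()
        frames = wav.readframes(n_frames)

    if n_frames / rate < MIN_DURATION_S:
        return "short"

    # Compare the data against its first frame repeated over the full length.
    n_read = len(frames) // frame_size
    if frames == frames[:frame_size] * n_read:
        return "silent"

    return None
```

Running this over every file in a job and removing the ones that return "short" or "silent" is one way to test whether such files are the cause of the error.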
Chronset has been found to work well in standard psycholinguistic experiments. Use in extremely noisy environments (e.g., fMRI) is not recommended at this time. NaNs are produced when no speech is detected in a recording relative to the file's baseline noise, and -1's are produced when there is a speech-like signal throughout the recording, starting at waveform onset. Lower sampling rates tend to increase the likelihood of such effects due to reduced information in the input file, although the number of NaNs typically remains very low in standard recordings. As reported in the Chronset paper, such issues, or misestimations of a small number of onsets, are very unlikely to affect the results of an experiment with a modest amount of experimental power. Similarly, if you have higher numbers of NaNs for only a few participants, it may be due to some specific attribute of the recordings for that participant (e.g., line noise, or a microphone or sound card powering down and no longer recording any signal).
Please feel free to contact us if you have questions about the output that Chronset is producing or if Chronset does not appear to be behaving as expected given the description in the paper and the notes in the previous paragraph. Ideally, send us a few .wav files so that we can inspect the file properties when responding to you. Our contact information appears below.
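When checking output against the NaN and -1 codes described above, it can help to tally them per participant first. The Python sketch below assumes the onset estimates have already been parsed into a list of numbers; the function name summarize_onsets is hypothetical, and no particular layout of the e-mailed results is implied.

```python
import math


def summarize_onsets(onsets):
    """Count NaN, -1, and remaining onset estimates in a list of numbers.

    Hypothetical helper (not part of Chronset). NaN marks recordings with
    no detected speech; -1 marks a speech-like signal from waveform onset.
    """
    values = list(onsets)
    n_nan = sum(1 for v in values if math.isnan(v))
    n_minus_one = sum(1 for v in values if v == -1)
    return {
        "total": len(values),
        "nan": n_nan,
        "minus_one": n_minus_one,
        "valid": len(values) - n_nan - n_minus_one,
    }
```

A participant whose "nan" count is much higher than that of the others is a candidate for the recording-specific problems described above.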
Questions, comments, suggestions, data sharing?
If you do not receive your onsets from the webserver in a reasonable time frame based on the load when you submitted your job, please contact us.
Likewise, we would be very happy if you could share your experience with Chronset, as this helps us target further fine-tuning of the software to maximize its performance in a wide variety of contexts.
In particular, if you have manually annotated .wav files that you are willing to share (ideally from a published article), this would be very useful for future optimization tasks. We would be happy to acknowledge such a contribution in any future Chronset work.
Privacy and data protection policy
The developers of Chronset respect the privacy and security of all data uploaded to the Chronset server. Your data is handled within a secure server environment by an automatic script. None of the Chronset developers have access to any of the files that are uploaded, and all uploaded files are deleted automatically after the processing job is complete.