whisper.cpp

ggerganov committed on Nov 23, 2022

Commit 679d38e (unverified)

1 Parent(s): ff21a60

talk.wasm : add audio pre-processing + bump memory

Files changed (4)

examples/talk.wasm/CMakeLists.txt

@@ -31,8 +31,8 @@ set_target_properties(${TARGET} PROPERTIES LINK_FLAGS " \
     --bind \
     -s USE_PTHREADS=1 \
     -s PTHREAD_POOL_SIZE=8 \
-    -s INITIAL_MEMORY=1400MB \
-    -s TOTAL_MEMORY=1400MB \
+    -s INITIAL_MEMORY=1600MB \
+    -s TOTAL_MEMORY=1600MB \
     -s FORCE_FILESYSTEM=1 \
     -s EXPORTED_RUNTIME_METHODS=\"['print', 'printErr', 'ccall', 'cwrap']\" \
     ${EXTRA_FLAGS} \

examples/talk.wasm/README.md

@@ -34,7 +34,7 @@ In order to run this demo efficiently, you need to have the following:
 - Latest Chrome or Firefox browser (Safari is not supported)
 - Run this on a desktop or laptop with modern CPU (a mobile phone will likely not be good enough)
 - Speak phrases that are no longer than 10 seconds - this is the audio context of the AI
-- The web-page uses about 1.4GB of RAM
+- The web-page uses about 1.6GB of RAM
 Notice that this demo is using the smallest GPT-2 model, so the generated text responses are not always very good.
 Also, the prompting strategy can likely be improved to achieve better results.

examples/talk.wasm/gpt-2.cpp

@@ -513,7 +513,7 @@ bool gpt2_eval(
     const int n_head  = hparams.n_head;
     const int n_vocab = hparams.n_vocab;
-    static size_t buf_size = 512u*1024*1024;
+    static size_t buf_size = 640u*1024*1024;
     static void * buf = malloc(buf_size);
     if (mem_per_token > 0 && mem_per_token*N > buf_size) {

examples/talk.wasm/index-tmpl.html

@@ -504,7 +504,13 @@
             function startRecording() {
                 if (!context) {
-                    context = new AudioContext({sampleRate: 16000, noiseSuppression: true});
+                    context = new AudioContext({
+                        sampleRate: 16000,
+                        channelCount: 1,
+                        echoCancellation: true,
+                        autoGainControl:  true,
+                        noiseSuppression: true,
+                    });
                 }
                 Module.set_status("");
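
Review note on the hunk above: as far as I can tell from the Web Audio specification, the `AudioContext` constructor options cover `sampleRate` and `latencyHint` (plus `sinkId` in later revisions), while `channelCount`, `echoCancellation`, `autoGainControl` and `noiseSuppression` are media track constraints that are normally passed to `getUserMedia`. Unknown keys in the constructor options are silently ignored, so the new pre-processing flags may have no effect where they are placed. Whether the demo's own `getUserMedia` call already carries such constraints is not visible in this hunk. A minimal sketch of the alternative placement, with hypothetical helper names (`buildMicConstraints`, `openMic`) that are not part of the talk.wasm sources:

```javascript
// Hypothetical helpers for illustration; not taken from index-tmpl.html.

// Build MediaStreamConstraints carrying the same pre-processing options
// that the hunk above passes to the AudioContext constructor.
function buildMicConstraints(sampleRate) {
    return {
        audio: {
            sampleRate: sampleRate,
            channelCount: 1,
            echoCancellation: true,
            autoGainControl: true,
            noiseSuppression: true,
        },
        video: false,
    };
}

// Create the context with the sample rate only, then request the microphone
// with the pre-processing constraints and connect it to the context.
// `mediaDevices` and `AudioContextCtor` are injected so the function can be
// exercised outside a browser; in a page they would be
// navigator.mediaDevices and window.AudioContext.
async function openMic(mediaDevices, AudioContextCtor, sampleRate = 16000) {
    const context = new AudioContextCtor({ sampleRate: sampleRate });
    const stream = await mediaDevices.getUserMedia(buildMicConstraints(sampleRate));
    const source = context.createMediaStreamSource(stream);
    return { context, stream, source };
}
```

In a browser this would be called as `openMic(navigator.mediaDevices, window.AudioContext)`. Browsers treat these constraints as requests rather than guarantees, so the applied values are best checked with `stream.getAudioTracks()[0].getSettings()`.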