openai:whisper.cpp
$ git clone https://github.com/ggerganov/whisper.cpp
$ cd whisper.cpp
$ make
cc  -I.              -O3 -std=c11   -pthread -mfma -mf16c -mavx -mavx2   -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -std=c++11 -pthread -c whisper.cpp -o whisper.o
g++ -I. -I./examples -O3 -std=c++11 -pthread examples/main/main.cpp ggml.o whisper.o -o main 
$ ./main -h

usage: ./main [options] file0.wav file1.wav ...

options:
  -h,       --help           show this help message and exit
  -s SEED,  --seed SEED      RNG seed (default: -1)
  -t N,     --threads N      number of threads to use during computation (default: 4)
  -p N,     --processors N   number of processors to use during computation (default: 1)
  -ot N,    --offset-t N     time offset in milliseconds (default: 0)
  -on N,    --offset-n N     segment index offset (default: 0)
  -d  N,    --duration N     duration of audio to process in milliseconds (default: 0)
  -mc N,    --max-context N  maximum number of text context tokens to store (default: max)
  -ml N,    --max-len N      maximum segment length in characters (default: 0)
  -wt N,    --word-thold N   word timestamp probability threshold (default: 0.010000)
  -su,      --speed-up       speed up audio by factor of 2 (faster processing, reduced accuracy, default: false)
  -v,       --verbose        verbose output
            --translate      translate from source language to english
  -otxt,    --output-txt     output result in a text file
  -ovtt,    --output-vtt     output result in a vtt file
  -osrt,    --output-srt     output result in a srt file
  -owts,    --output-words   output script for generating karaoke video
  -ps,      --print_special  print special tokens
  -pc,      --print_colors   print colors
  -nt,      --no_timestamps  do not print timestamps
  -l LANG,  --language LANG  spoken language (default: en)
  -m FNAME, --model FNAME    model path (default: models/ggml-base.en.bin)
  -f FNAME, --file FNAME     input WAV file path
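
The options above can be combined in a single call. The following is a minimal sketch, not part of whisper.cpp: the function name `transcribe` and the variables `WHISPER_MAIN` and `WHISPER_MODEL` are hypothetical names chosen for illustration. It uses only flags listed in the help text (`-m`, `-f`, `-otxt`, `-osrt`) and assumes the model file has already been downloaded.

```shell
#!/bin/sh
# Hypothetical helper (not part of whisper.cpp): transcribe one WAV file
# and also write the result as text and SRT, using flags from the help text.
# WHISPER_MAIN and WHISPER_MODEL are illustrative override variables.
transcribe() {
    "${WHISPER_MAIN:-./main}" \
        -m "${WHISPER_MODEL:-models/ggml-base.en.bin}" \
        -otxt -osrt \
        -f "$1"
}
```

After sourcing this in the whisper.cpp directory, `transcribe samples/jfk.wav` would run the same transcription as the plain `./main` call, with the two output flags added.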
$ bash ./models/download-ggml-model.sh base.en
Downloading ggml model base.en from https://huggingface.co/datasets/ggerganov/whisper.cpp ...
ggml-base.en.bin                                     100%[=====================================================================================================================>] 141.11M   434KB/s    in 3m 56s
Done! Model base.en saved in models/ggml-base.en.bin
You can now use it like this:

  $ ./main -m models/ggml-base.en.bin -f samples/jfk.wav

$ ./main -m models/ggml-base.en.bin -f samples/jfk.wav
whisper_model_load: loading model from models/ggml-base.en.bin
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 2
whisper_model_load: mem_required  = 506.00 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: ggml ctx size = 140.60 MB
whisper_model_load: memory size =    22.83 MB
whisper_model_load: model size  =   140.54 MB

system_info: n_threads = 4 / 4 | AVX2 = 1 | AVX512 = 0 | NEON = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 |

main: processing samples/jfk.wav (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:11.000]   And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.


whisper_print_timings:     load time =   542.31 ms
whisper_print_timings:      mel time =   174.94 ms
whisper_print_timings:   sample time =    14.82 ms
whisper_print_timings:   encode time =  6282.06 ms / 1047.01 ms per layer
whisper_print_timings:   decode time =   373.59 ms / 62.27 ms per layer
whisper_print_timings:    total time =  7390.82 ms
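
To put the timings above in perspective, a rough sketch like the following computes the real-time factor (processing time divided by audio length) and the encoder's share of the total. The numbers are copied by hand from the log above; the variable names are illustrative only.

```shell
#!/bin/sh
# Values copied from the log above.
audio_sec=11.0      # "176000 samples, 11.0 sec"
total_ms=7390.82    # "total time"
encode_ms=6282.06   # "encode time"

# Real-time factor: processing time / audio duration.
# A value below 1 means the run was faster than real time.
rtf=$(awk -v t="$total_ms" -v a="$audio_sec" 'BEGIN { printf "%.2f", t / (a * 1000) }')

# Percentage of the total run spent in the encoder.
enc_pct=$(awk -v e="$encode_ms" -v t="$total_ms" 'BEGIN { printf "%.0f", 100 * e / t }')

echo "real-time factor: $rtf"
echo "encoder share:    ${enc_pct}%"
```

For this run the result is a real-time factor of about 0.67, with roughly 85% of the time spent in the encoder.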

openai/whisper.cpp.txt · Last modified: 2022/11/21 03:04 by matoken