Alternative? #81
Comments
Hi @msqr1! Great initiative :) Software evolves and needs to be maintained. I do not have time to dedicate to this repository, so it is good that better alternatives emerge and gain traction. I'll have a deeper look at your work later this week. In the end, users decide based on the developer experience and the features of these libraries, so I'd be interested in what other users like @Yahweasel or @erikh2000 think.
The core thing I need out of vosk-browser is to not have an AudioContext-level API. I do all of my own audio capturing and ten other layers of processing. Further, although in my own project I do use threads, so SharedArrayBuffer is a nonissue, it's valuable to have a version that runs synchronously, because some users (including myself) manage their own threads. I would rather have a vosk running synchronously with a Worker thread I created on my own than running asynchronously with a Worker thread created by a library. To excessively toot my own horn, my own libav.js allows the user to load it in a synchronous mode, a worker mode, or a threaded mode, and provides the same API in all three. Basically: I wouldn't mind a more up-to-date vosk adapter, but as it stands, your API is too opinionated for me.
You're right, I try to make this as easy to use as possible: just some minimal setup and you can start recognizing. I agree that more features should be added, but as this is the first version, I want to make it as fast and easy to set up as possible. Other use cases can be addressed later.
@msqr1 I'm interested in your project, but I'm likely to stick with vosk-browser out of inertia and not having any complaints with it. The main thing I saw in Vosklet that I'd like to see in vosk-browser, if practical, is more of the Vosk functions exposed. I had told myself that at some point I'd get vosk-browser building and try to contribute that myself, but I never got around to it. The faster processing time is intriguing too. What kind of metrics are you seeing?
I didn't really measure it, ngl, so maybe I should remove that line. But I moved hot computations to C++ like
No worries, @msqr1. I don't expect you to be super-scientific in your claims. I was just curious about what kind of speed increase you might be seeing. Your changes for performance seem promising.
FYI, simd will not do a damned thing (other than make it not work on Safari) unless the code is specifically written to use it. wasm simd is broadly compatible with x86 simd, but only the C API, and nobody uses the C API. I would be stunned to learn that that's gaining you anything. I had a simd version of libav.js for years and finally ditched it because it wasn't actually beneficial.
Well, the thing is Kaldi just refuses to compile with simd off, so I have to turn it on. It may or may not do anything though.
Oh, well that's just lovely X-D
Just curious, how do you use a speech recognition library with your libav project? Isn't that for audio formats?
I do not. I use both in Ennuicastr.
I can make a sync version; I just don't see how it would work. If you block the current thread to recognize, how do you stop it? Synchronous model and recognizer loading should be easy. I'm not sure about the recognizer loop.
We're on an issue submitted to a synchronous version of the same API ;)
I can't see how the recognizer can be synchronous. It can't block the very thread that is controlling it.
The API of Vosk just takes a chunk at a time. That API is synchronous.
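To illustrate the point about chunk-at-a-time calls, here is a minimal sketch. It does not use the real Vosk binding: `ChunkRecognizer`, `StubRecognizer`, `feedChunks`, and the boolean "utterance ended" return value are all assumptions made up for illustration. The idea is that each call blocks only for one chunk, so the caller owns the loop and stops by simply not calling again.

```typescript
// Hypothetical stand-in for a chunk-at-a-time recognizer; not the real Vosk binding.
interface ChunkRecognizer {
  // Assumed semantics: returns true when this chunk ended an utterance.
  acceptWaveformFloat(chunk: Float32Array, sampleRate: number): boolean;
}

class StubRecognizer implements ChunkRecognizer {
  chunksSeen = 0;

  acceptWaveformFloat(_chunk: Float32Array, _sampleRate: number): boolean {
    this.chunksSeen++;
    // Pretend every third chunk ends an utterance.
    return this.chunksSeen % 3 === 0;
  }
}

// The caller owns the loop. Each call blocks only while one chunk is processed,
// and "stopping" just means checking a flag and not calling again.
function feedChunks(
  rec: ChunkRecognizer,
  chunks: Float32Array[],
  sampleRate: number,
  shouldStop: () => boolean,
): number {
  let utterances = 0;
  for (const chunk of chunks) {
    if (shouldStop()) break;
    if (rec.acceptWaveformFloat(chunk, sampleRate)) utterances++;
  }
  return utterances;
}

// Ten chunks of silence, 100 ms each at 16 kHz.
const audio: Float32Array[] = [];
for (let i = 0; i < 10; i++) audio.push(new Float32Array(1600));

const stub = new StubRecognizer();
let checks = 0;
// Ask to stop once seven chunks have been fed.
const finished = feedChunks(stub, audio, 16000, () => checks++ >= 7);
```

In a Worker, the `shouldStop` check would typically read state set by a message handler between chunks, since the thread is free again after every call returns.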
I get it, but wouldn't that block itself from other actions? I can surely add an acceptWaveformSync() that recognizes (blocking) on the same thread and returns the result. Will that fit your use case? Ngl, a fully synchronous API is even easier than the current one. I only need to translate it over without managing task queues and other stuff.
My case is that I have vosk-browser loaded in a Worker thread which is also responsible for echo cancellation, noise suppression, audio metrics, and encoding. Each of these steps takes raw Float32Array audio in and spits raw Float32Array audio out, and I want them all to be synchronous because I'm managing all the threading myself. What I mean when I say that your API is opinionated is that it's doing more than just vosk: it's handling capture, it's handling threading, it's handling formats. For some people, that's presumably very useful. For me, that's actively unhelpful. Also, to be clear: you should not be writing your code to fit my use case if that doesn't help you in any way. I'm perfectly happy with vosk-browser, and have no urgent need for a more updated version, though as a general principle I'd like for things to be up to date. I'm only presenting my case on this thread because I was asked to.
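The "Float32Array in, Float32Array out" arrangement described above can be sketched roughly as follows. This is not Ennuicastr's actual code: `Stage`, `gain`, `clamp`, `recognizerTap`, and `runPipeline` are hypothetical names, and the stages are trivial placeholders for things like noise suppression. It only shows why a synchronous recognizer slots into such a chain as one more function call.

```typescript
// A stage takes raw samples in and hands raw samples back, synchronously.
type Stage = (samples: Float32Array) => Float32Array;

// Hypothetical placeholder stages standing in for real audio processing.
const gain = (factor: number): Stage => (samples) => samples.map((s) => s * factor);

const clamp: Stage = (samples) => samples.map((s) => Math.max(-1, Math.min(1, s)));

// Hypothetical recognizer tap: passes audio through unchanged and hands each
// chunk to a callback, which is where a synchronous recognizer call would go.
function recognizerTap(onChunk: (samples: Float32Array) => void): Stage {
  return (samples) => {
    onChunk(samples);
    return samples;
  };
}

// Run every stage in order on the caller's own thread.
function runPipeline(stages: Stage[], input: Float32Array): Float32Array {
  let current = input;
  for (const stage of stages) current = stage(current);
  return current;
}

let samplesRecognized = 0;
const pipeline: Stage[] = [
  gain(4),
  clamp,
  recognizerTap((s) => {
    samplesRecognized += s.length;
  }),
];

const output = runPipeline(pipeline, new Float32Array([0.1, 0.5, -0.5]));
```

Because nothing here returns a Promise or spawns a thread, whoever calls `runPipeline` decides which thread the work runs on, which is the property being asked for.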
No, I just wanted to find out how you use it, to see what use cases a synchronous vosk would be needed for, so thanks for the information! The above really helped me learn!
I can be totally precise: https://github.com/ennuicastr/ennuicastr/blob/3b3830fc979b039c245429a5ec7657594af4a705/awp/ennuicastr-worker.ts#L786 There's my call to acceptWaveformFloat :)
I completely understand it now :))))))))
@ccoreilly did you go over it?
FWIW I'd also be interested in an "updated" alternative that is actively maintained. Yet I would need to better understand how the alternative differs. If it is entirely compatible, e.g
even without providing any improvement, I would probably be interested. Yet if it does have any trade-offs, e.g. breaks compatibility with some contexts, like older browsers, or is Chromium only, etc., then IMHO they should be made explicit. PS: to clarify, even though https://github.com/ccoreilly/vosk-browser/tree/master/examples/modern-vanilla is 2 years old, it works for me even in rather "exotic" contexts, e.g. Oculus browser for WebXR.
I made Vosklet as an alternative. You may want to check it out, @Utopiah! It does need SABs, though. I could make it SAB-less, but I think that is just too much work.
This is not really an issue, but I remade the repo from scratch using newer web technology and features: https://github.com/msqr1/Vosklet. Can I merge some changes from there into here? There is a lot of stuff to improve, as this is getting outdated. @ccoreilly