
Make lazy loading of the language models optional #79

Closed
davidecaroselli opened this issue Nov 23, 2020 · 4 comments


@davidecaroselli

Hello!

Running the very first detection takes a lot of time. Moreover, the cost depends on the actual language detected:

detector.detectLanguageOf("This is an example") // takes ~8 seconds
detector.detectLanguageOf("This is an example") // takes ~4 ms
detector.detectLanguageOf("Questo è un esempio") // takes ~14 seconds

So my question is: how can I create a "warmup" procedure that runs once and loads all the models?
A very trivial implementation would be to run detection on a multi-language sample, as sketched below. Is there anything more elegant than that?
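A minimal sketch of the trivial approach I mean, assuming the detector is built from a fixed set of languages and that one short sentence per language is enough to trigger loading:

```kotlin
import com.github.pemistahl.lingua.api.Language
import com.github.pemistahl.lingua.api.LanguageDetectorBuilder

fun main() {
    val detector = LanguageDetectorBuilder
        .fromLanguages(Language.ENGLISH, Language.ITALIAN, Language.GERMAN)
        .build()

    // One short sample per language; detecting each one forces the
    // corresponding models to be loaded before real traffic arrives.
    listOf(
        "This is an example",
        "Questo è un esempio",
        "Das ist ein Beispiel"
    ).forEach { detector.detectLanguageOf(it) }
}
```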

Thank you!

@pemistahl
Owner

Hi Davide, thanks for your question.

The behavior you are describing is intentional. The language models are loaded only for those languages that are plausible for the given input text. That is why the first detection takes more time than the second one. The third detection again takes more time because the text contains special characters for which all matching models have to be loaded.
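Conceptually, this amounts to keeping one lazily initialized model per language, along these lines (a toy sketch of the general pattern only, not Lingua's actual internals):

```kotlin
// Toy illustration; `NgramModel` and `loadModelFromResources`
// are invented names for this sketch, not Lingua's real internals.
class NgramModel

fun loadModelFromResources(language: String): NgramModel {
    // Expensive deserialization would happen here, once per language.
    return NgramModel()
}

val models: Map<String, Lazy<NgramModel>> =
    listOf("ENGLISH", "ITALIAN").associateWith { language ->
        lazy { loadModelFromResources(language) }
    }

// The first read of models.getValue("ITALIAN").value pays the loading
// cost; every later read returns the cached instance.
```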

I could change this behavior and load all models before the first detection, but why should I do that? The total running time of your three detections would stay the same.

Do you have any specific problem with the current behavior?

@davidecaroselli
Author

Hi @pemistahl !

No doubt this behavior is perfect for some use cases! I don't mean to complain as if it were a bug. :)
My idea would be to have an optional way to pre-load all models at once (maybe even in parallel, using multiple threads).

Why can this be useful? Imagine you want to expose the service to your users: you would want to pre-load the models at startup time, so that no initial request takes 1000x the expected time per request. A rough sketch of what I mean follows.
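Something like this, assuming the workaround with sample sentences and that the detector can be called from several threads at once (the sample texts are placeholders):

```kotlin
import com.github.pemistahl.lingua.api.LanguageDetector
import com.github.pemistahl.lingua.api.LanguageDetectorBuilder
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

fun buildWarmDetector(): LanguageDetector {
    val detector = LanguageDetectorBuilder.fromAllLanguages().build()

    // Placeholder warmup sentences; a real service would keep one
    // per language it needs to serve.
    val samples = listOf("This is an example", "Questo è un esempio")

    val pool = Executors.newFixedThreadPool(samples.size)
    // Assuming detectLanguageOf() is safe to call from several threads at once.
    samples.forEach { text ->
        pool.execute { detector.detectLanguageOf(text) }
    }
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.MINUTES)

    return detector // all models triggered by the samples are now in memory
}
```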

Is there a way I can achieve that without keeping a list of sentences in all languages and manually detecting them at startup?

Thanks!

@pemistahl
Owner

pemistahl commented Nov 24, 2020

You are right. For this use case, preloading all models at once would be beneficial. At the moment this is not possible, but I could implement an optional setting to allow it in the next release. I will put it on my to-do list. :)
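From the caller's side, such an optional setting might look roughly like this; the method name below is purely a placeholder for whatever the release actually ships:

```kotlin
import com.github.pemistahl.lingua.api.LanguageDetectorBuilder

// Hypothetical builder option; the actual name and semantics
// are up to whatever the next release ships.
val detector = LanguageDetectorBuilder
    .fromAllLanguages()
    .withPreloadedLanguageModels() // hypothetical: eagerly load all models during build()
    .build()
```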

Thanks again for your input and for using my library. Very much appreciated.

@davidecaroselli
Author

Thank you @pemistahl !

@pemistahl pemistahl changed the title How to warmup models? Make lazy loading of the language models optional Dec 9, 2020
@pemistahl pemistahl added this to the Lingua 1.1.0 milestone Dec 9, 2020