stanbol-dev mailing list archives

From Suman Saurabh <ss.sumansaurab...@gmail.com>
Subject Multiple model files names need to be parsed to ContentItem simultaneously, is it possible
Date Fri, 18 Jul 2014 22:41:48 GMT
Hi Rupert,

I would like to know whether a client can supply multiple model files at the
same time. For example, Sphinx requires an acoustic model ("feat.params",
"mdef", "means", "mixture_weights", "noisedict", "transition_matrices",
"variances"), a language model (en-us.lm.dmp) and a dictionary
(en-cmudict.0.6d) simultaneously. Also, the files of one acoustic model
cannot be interchanged with those of another acoustic model.


I went through the relevant source code, and in every case only a single
model file is passed in.

1) Is it possible to pass multiple model file names to the ContentItem? If
yes, please give me brief details of the usage.

2) Can the client pass the bundle name of the model files (i.e.
org.apache.stanbol.data.model.wsj) to the ContentItem?

If yes, that would be helpful for the client. Just by *installing* his own
set of acoustic, language and dictionary bundles and *passing* the bundle
name, he can use the Sphinx engine, instead of being asked for such a large
number of model files.

In the ModelProvider interface I have done the above (passing the bundle
name to the method LanguageModel getModel(String lang, String bundleName)),
which is a bit different from what you asked for.
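
To make the bundle-name idea concrete, here is a rough sketch of what I have
in mind. It is not the actual ModelProvider code: SphinxModelSet and
ModelSetRegistry are hypothetical names (not existing Stanbol classes), and
the sketch assumes that one bundle ships a complete, consistent model set.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value object (not an existing Stanbol class): the files that
// make up one consistent Sphinx model set shipped in a single bundle.
final class SphinxModelSet {
    // Acoustic model file names as listed earlier in this mail; they are
    // only usable together as a complete set.
    static final List<String> ACOUSTIC_FILES = Collections.unmodifiableList(Arrays.asList(
            "feat.params", "mdef", "means", "mixture_weights",
            "noisedict", "transition_matrices", "variances"));

    final String bundleName;
    final String languageModelFile;
    final String dictionaryFile;

    SphinxModelSet(String bundleName, String languageModelFile, String dictionaryFile) {
        this.bundleName = bundleName;
        this.languageModelFile = languageModelFile;
        this.dictionaryFile = dictionaryFile;
    }
}

// Hypothetical registry: resolves a bundle name to its installed model set,
// so the client only has to send one name instead of every file name.
final class ModelSetRegistry {
    private final Map<String, SphinxModelSet> byBundleName =
            new HashMap<String, SphinxModelSet>();

    void register(SphinxModelSet modelSet) {
        byBundleName.put(modelSet.bundleName, modelSet);
    }

    // Returns the model set for the bundle, or null if it is not installed.
    SphinxModelSet lookup(String bundleName) {
        return byBundleName.get(bundleName);
    }
}
```

With something like this the client would send only the bundle name, and the
engine would look up the whole set, so acoustic files from different models
could never be mixed by accident.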

If it is possible to pass *multiple model* file names, then I will update
the code accordingly.

3) I am building the code for the SpeechToTextEngine (using the Tika Engine
source code as a reference). Is there anything else I should know for
building the engine?


Regards,
Suman Saurabh
