Polyglot requires a model for each task and language, and the library cannot function without them. Because some of the models are large, they are distributed separately through a download manager, which has several modes of operation.
In [2]:
!polyglot download --help
In [3]:
!polyglot download morph2.en
In [ ]:
!polyglot download
In [ ]:
from polyglot.downloader import downloader
downloader.download("embeddings2.en")
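When several packages are needed, the Python call above can be wrapped in a loop. The following is a minimal sketch, not part of polyglot: `download_packages` and `fake_download` are hypothetical names, and the stand-in function is used only so the example runs without polyglot or network access.

```python
def download_packages(names, download):
    """Call download(name) for each package name and collect the results."""
    results = {}
    for name in names:
        results[name] = download(name)
    return results


# Hypothetical stand-in for downloader.download, so the sketch runs offline.
def fake_download(name):
    print(f"would download {name}")
    return True


status = download_packages(["embeddings2.en", "morph2.en"], fake_download)
print(status)
```

With polyglot installed, passing `downloader.download` in place of `fake_download` should fetch the real packages.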
As you may have noticed by now, a specific model is installed by giving its name and the target language.
The package name format is task_name.language_code
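The naming rule can be made concrete with a small helper. This is only a sketch: `split_package_name` is a hypothetical function, not part of polyglot, and it assumes that the task name itself contains no dot.

```python
def split_package_name(name):
    """Split 'task_name.language_code' into a (task, language) pair."""
    task, sep, lang = name.partition(".")
    # Reject names with no dot, or with an empty task or language part.
    if not sep or not task or not lang:
        raise ValueError(f"expected task_name.language_code, got {name!r}")
    return task, lang


print(split_package_name("embeddings2.en"))
print(split_package_name("morph2.en"))
```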
Packages are grouped by language. For example, to download all the models that are specific to Arabic, use the Arabic collection, whose name is LANG: followed by the language code of Arabic, which is ar.
Therefore, we can just run:
In [6]:
!polyglot download LANG:ar
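Collection names follow a prefix scheme: LANG: for a language, as above, and TASK: for a task, as in the next cell. A minimal sketch of that scheme follows; `collection_name` is a hypothetical helper, not a polyglot function, and it assumes these two prefixes are the only ones of interest here.

```python
def collection_name(kind, value):
    """Build a collection identifier such as 'LANG:ar' or 'TASK:transliteration2'."""
    if kind not in ("LANG", "TASK"):
        raise ValueError(f"unknown collection kind: {kind!r}")
    return f"{kind}:{value}"


print(collection_name("LANG", "ar"))
print(collection_name("TASK", "transliteration2"))
```

The resulting string can be passed to the download manager in the same way as a single package name.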
In [7]:
downloader.download("TASK:transliteration2", quiet=True)
Out[7]:
We can query the download manager for the tasks that polyglot supports for a given language, as follows:
In [8]:
downloader.supported_tasks(lang="en")
Out[8]:
We can also query the download manager for the languages supported by polyglot's named entity recognition subsystem, as follows:
In [9]:
print(downloader.supported_languages_table(task="ner2"))
You can view all the available and/or installed collections or packages through the list function:
In [14]:
downloader.list(show_packages=False)