Author: Stephan stephan@bayesimpact.org
Skip the run test because it would take too much time to train the models.
This notebook crashes when I run it in Docker, probably because something runs out of memory. It works when I run it directly on my machine.
This notebook explains how to create the vector space models from Wikipedia that are used in the job-description-to-skill analysis. We use the Gensim Python library to do all the heavy lifting.
First, download the latest French Wikipedia dump:
ftp://wikipedia.c3sl.ufpr.br/wikipedia/frwiki/
The file to download ends with -pages-articles.xml.bz2
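As a minimal sketch of picking the right file out of the directory listing, the suffix rule above can be applied with a small filter. The function name and the listing below are hypothetical and only serve as an illustration; the real directory may contain different names.

```python
def find_articles_dump(filenames, suffix="-pages-articles.xml.bz2"):
    """Return the file names that end with the articles-dump suffix, sorted."""
    return sorted(name for name in filenames if name.endswith(suffix))


# Hypothetical directory listing, for illustration only.
listing = [
    "frwiki-latest-abstract.xml.gz",
    "frwiki-latest-pages-articles.xml.bz2",
    "frwiki-latest-pages-meta-current.xml.bz2",
]
candidates = find_articles_dump(listing)
print(candidates)
```

Matching on the full suffix, rather than on a substring, keeps out similarly named files that carry extra qualifiers before the extension.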
Gensim comes with a handy script that operates on exactly this dump. You can even leave the dump compressed to save disk space. To create a TF-IDF representation of the text, along with its dictionary, simply run the following command (watch out, it takes around 6 hours):
python -m gensim.scripts.make_wiki DUMPNAME OUTPUT_PREFIX
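To give an idea of what a TF-IDF representation is, here is a toy sketch in plain Python. It uses a textbook variant (term count times the logarithm of the inverse document frequency) and is not necessarily the exact weighting or normalization that Gensim applies. The function name and the three tiny documents are made up for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Weight each term by its count times log(N / document frequency)."""
    n_docs = len(docs)
    # Count in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in Counter(doc).items()
        }
        for doc in docs
    ]


# Toy tokenized documents, for illustration only.
docs = [["roi", "empereur", "roi"], ["film", "acteur"], ["roi", "film"]]
weights = tfidf(docs)
print(weights[0])
```

Terms that occur in every document get a weight of zero under this variant, which is why very common words contribute little to the representation.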
I used frwiki as the OUTPUT_PREFIX and stored both the dump and the script's output in bob_emploi/data/wiki. Other notebooks rely on this location and naming convention.
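Before moving on, it can be worth checking that the files this notebook loads later are actually in place. The sketch below assumes the two file suffixes used by the paths in this notebook (`_wordids.txt.bz2` and `_tfidf.mm`). The helper names are hypothetical, and the script may write other files that are not checked here.

```python
from pathlib import Path


def expected_outputs(wiki_dir, prefix="frwiki"):
    """Build the paths of the two files this notebook loads later."""
    wiki_dir = Path(wiki_dir)
    return {
        "dictionary": wiki_dir / f"{prefix}_wordids.txt.bz2",
        "tfidf_corpus": wiki_dir / f"{prefix}_tfidf.mm",
    }


def missing_outputs(wiki_dir, prefix="frwiki"):
    """Return the names of the expected files that are not on disk yet."""
    return sorted(
        name
        for name, path in expected_outputs(wiki_dir, prefix).items()
        if not path.exists()
    )


# Relative location used by this notebook; adjust it if your layout differs.
print(missing_outputs("../../data/wiki"))
```

An empty list means both files were found under the given directory and prefix.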
To train the LSA model, simply run the following cells. Training took about 4 hours on my MacBook Pro.
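Because training runs for hours, it can help to follow its progress from the log. Separately from the notebook cells, here is a small sketch that summarizes log lines of the kind the training cell printed for me, such as `processed documents up to #20000`. The helper name is hypothetical, and the patterns assume the log wording seen in this notebook, which may differ in other Gensim versions.

```python
import re

# Patterns based on the log lines printed by the training cell of this notebook.
PROGRESS_RE = re.compile(r"processed documents up to #(\d+)")
DISCARD_RE = re.compile(r"discarding ([\d.]+)% of energy spectrum")


def summarize_lsi_log(lines):
    """Return (last document count seen, list of discarded-energy percentages)."""
    last_doc = None
    discarded = []
    for line in lines:
        progress = PROGRESS_RE.search(line)
        if progress:
            last_doc = int(progress.group(1))
        discard = DISCARD_RE.search(line)
        if discard:
            discarded.append(float(discard.group(1)))
    return last_doc, discarded


# Two lines copied from the training output of this notebook.
sample = [
    "INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.013% of energy spectrum)",
    "INFO:gensim.models.lsimodel:processed documents up to #20000",
]
summary = summarize_lsi_log(sample)
print(summary)
```

Comparing the last document count with the number of documents in the corpus gives a rough idea of how far along the training is.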
In [1]:
from __future__ import division
import logging
import pickle
import pandas as pd
from gensim.models.lsimodel import LsiModel
from gensim.corpora import WikiCorpus, MmCorpus, Dictionary
dict_path = '../../data/wiki/frwiki_wordids.txt.bz2'
corpus_path = '../../data/wiki/frwiki_tfidf.mm'
lsi_model_path = '../../data/wiki/frwiki_lsi'
title2id_path = '../../data/wiki/title2id_mapping.pckl'
skills_wiki_path = '../../data/linkedin_skills_wiki.csv'
skills_corpus_path = '../../wiki/data/skills_corpus.json'
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
In [2]:
dictionary = Dictionary.load_from_text(dict_path)
tfidf_corpus = MmCorpus(corpus_path)
In [3]:
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
lsi_model = LsiModel(corpus=tfidf_corpus, id2word=dictionary, num_topics=400)
lsi_model.save(lsi_model_path)
INFO:gensim.models.lsimodel:using serial LSI version on this node
INFO:gensim.models.lsimodel:updating model with new documents
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.013% of energy spectrum)
INFO:gensim.models.lsimodel:processed documents up to #20000
INFO:gensim.models.lsimodel:topic #0(18.130): 0.173*"roi" + 0.128*"calvados" + 0.119*"américain" + 0.107*"église" + 0.103*"communes" + 0.099*"ii" + 0.098*"calendrier" + 0.097*"jpg" + 0.096*"empereur" + 0.089*"peintre"
INFO:gensim.models.lsimodel:topic #1(14.910): -0.357*"calvados" + -0.220*"communes" + -0.187*"altitudes" + -0.183*"manche" + -0.177*"normandie" + -0.174*"peuplée" + -0.169*"ign" + -0.161*"monuments" + -0.154*"coordonnées" + -0.146*"insee"
INFO:gensim.models.lsimodel:topic #2(11.772): 0.303*"député" + 0.209*"législature" + -0.205*"roi" + 0.170*"ve" + -0.152*"av" + -0.148*"calendrier" + 0.140*"maire" + 0.135*"république" + -0.125*"empereur" + -0.122*"ii"
INFO:gensim.models.lsimodel:topic #3(11.358): -0.370*"américain" + -0.248*"acteur" + -0.196*"actrice" + 0.185*"av" + -0.138*"chanteur" + -0.132*"canadien" + -0.130*"compositeur" + -0.129*"britannique" + -0.128*"québécois" + -0.127*"américaine"
INFO:gensim.models.lsimodel:topic #4(10.976): 0.376*"député" + 0.262*"législature" + 0.214*"ve" + 0.194*"maire" + 0.145*"république" + 0.138*"circonscription" + -0.129*"genre" + 0.124*"roi" + 0.120*"conseil" + 0.117*"ump"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.580% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 13.130% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #40000
INFO:gensim.models.lsimodel:topic #0(22.604): 0.146*"roi" + 0.099*"jpg" + 0.099*"communes" + 0.092*"église" + 0.082*"ii" + 0.078*"américain" + 0.077*"fichier" + 0.074*"louis" + 0.073*"empereur" + 0.069*"film"
INFO:gensim.models.lsimodel:topic #1(16.156): -0.303*"communes" + -0.295*"calvados" + -0.173*"monuments" + -0.159*"altitudes" + -0.156*"manche" + -0.155*"normandie" + -0.154*"église" + -0.152*"habitants" + -0.149*"peuplée" + -0.146*"canton"
INFO:gensim.models.lsimodel:topic #2(14.026): 0.268*"roi" + -0.188*"genre" + 0.148*"ii" + 0.139*"empereur" + 0.123*"calendrier" + 0.119*"av" + 0.115*"duc" + -0.102*"espèces" + 0.097*"règne" + -0.096*"plantes"
INFO:gensim.models.lsimodel:topic #3(12.997): -0.307*"genre" + 0.294*"député" + 0.215*"législature" + 0.174*"président" + 0.163*"maire" + 0.161*"république" + 0.151*"ve" + -0.146*"genres" + -0.145*"av" + -0.141*"plantes"
INFO:gensim.models.lsimodel:topic #4(12.523): 0.545*"genre" + 0.255*"genres" + 0.228*"plantes" + 0.208*"espèces" + 0.203*"député" + 0.184*"dicotylédones" + 0.151*"législature" + 0.137*"botanique" + 0.125*"tropicales" + 0.116*"maire"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.430% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 10.485% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #60000
INFO:gensim.models.lsimodel:topic #0(26.176): 0.121*"roi" + 0.102*"communes" + 0.099*"jpg" + 0.094*"canton" + 0.089*"église" + 0.079*"film" + 0.073*"fichier" + 0.072*"louis" + 0.067*"ii" + 0.066*"américain"
INFO:gensim.models.lsimodel:topic #1(17.951): -0.416*"canton" + -0.338*"communes" + -0.188*"calvados" + -0.158*"département" + -0.142*"monuments" + -0.139*"habitants" + -0.134*"église" + -0.126*"démographie" + -0.111*"insee" + -0.109*"maire"
INFO:gensim.models.lsimodel:topic #2(15.607): 0.274*"roi" + 0.148*"ii" + 0.137*"empereur" + 0.123*"duc" + 0.109*"pape" + -0.107*"genre" + 0.103*"calendrier" + 0.100*"comte" + 0.097*"fils" + 0.097*"charles"
INFO:gensim.models.lsimodel:topic #3(14.822): 0.573*"canton" + -0.220*"calvados" + 0.151*"cantons" + -0.144*"église" + 0.134*"député" + -0.129*"normandie" + -0.129*"monuments" + -0.125*"altitudes" + -0.116*"peuplée" + -0.112*"ign"
INFO:gensim.models.lsimodel:topic #4(14.062): 0.349*"film" + -0.174*"canton" + 0.173*"américain" + -0.156*"roi" + -0.150*"av" + 0.144*"acteur" + 0.112*"actrice" + 0.101*"meilleur" + 0.101*"prix" + 0.099*"calvados"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.201% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 9.647% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #80000
INFO:gensim.models.lsimodel:topic #0(29.955): 0.166*"communes" + 0.132*"canton" + 0.131*"église" + 0.108*"jpg" + 0.098*"roi" + 0.084*"population" + 0.084*"département" + 0.079*"habitants" + 0.079*"château" + 0.075*"monuments"
INFO:gensim.models.lsimodel:topic #1(21.902): -0.364*"bar" + -0.285*"text" + -0.280*"canton" + -0.272*"communes" + -0.162*"till" + -0.158*"fontsize" + -0.141*"shift" + -0.121*"église" + -0.120*"département" + -0.113*"monuments"
INFO:gensim.models.lsimodel:topic #2(21.508): 0.529*"bar" + 0.418*"text" + 0.240*"till" + 0.229*"fontsize" + -0.206*"communes" + 0.206*"shift" + 0.159*"from" + -0.155*"canton" + 0.138*"at" + -0.107*"église"
INFO:gensim.models.lsimodel:topic #3(16.798): 0.508*"canton" + 0.150*"roi" + 0.122*"cantons" + 0.117*"député" + 0.099*"président" + 0.097*"ministre" + 0.090*"parti" + -0.089*"église" + 0.087*"législature" + 0.085*"république"
INFO:gensim.models.lsimodel:topic #4(16.656): 0.473*"canton" + -0.228*"roi" + -0.135*"église" + -0.121*"ii" + 0.112*"cantons" + -0.105*"empereur" + -0.104*"duc" + -0.090*"comte" + -0.087*"calvados" + -0.086*"fils"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.486% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 8.143% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #100000
INFO:gensim.models.lsimodel:topic #0(32.447): 0.150*"communes" + 0.121*"église" + 0.121*"canton" + 0.104*"jpg" + 0.093*"roi" + 0.080*"film" + 0.079*"population" + 0.077*"département" + 0.075*"château" + 0.074*"habitants"
INFO:gensim.models.lsimodel:topic #1(24.141): -0.634*"bar" + -0.488*"text" + -0.284*"till" + -0.264*"fontsize" + -0.243*"shift" + -0.184*"from" + -0.154*"at" + -0.092*"canton" + -0.069*"value" + -0.064*"id"
INFO:gensim.models.lsimodel:topic #2(23.044): -0.346*"communes" + -0.337*"canton" + -0.163*"église" + -0.156*"département" + -0.142*"monuments" + 0.140*"bar" + -0.136*"démographie" + -0.119*"lieux" + -0.116*"administration" + 0.114*"film"
INFO:gensim.models.lsimodel:topic #3(18.049): -0.376*"canton" + -0.165*"parti" + -0.150*"député" + -0.145*"ministre" + -0.140*"président" + -0.137*"roi" + 0.134*"genre" + -0.115*"république" + 0.111*"espèces" + -0.104*"législature"
INFO:gensim.models.lsimodel:topic #4(17.728): 0.527*"canton" + -0.216*"roi" + -0.154*"église" + 0.129*"cantons" + -0.113*"ii" + -0.102*"duc" + 0.099*"genre" + -0.097*"empereur" + 0.095*"film" + -0.090*"comte"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.674% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.935% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #120000
INFO:gensim.models.lsimodel:topic #0(37.300): 0.259*"communes" + 0.190*"église" + 0.136*"canton" + 0.134*"monuments" + 0.133*"département" + 0.124*"lieux" + 0.121*"administration" + 0.119*"démographie" + 0.115*"liées" + 0.115*"jpg"
INFO:gensim.models.lsimodel:topic #1(28.015): -0.298*"communes" + -0.160*"monuments" + -0.150*"démographie" + -0.144*"liées" + -0.143*"département" + -0.142*"lieux" + -0.142*"église" + -0.133*"canton" + -0.132*"personnalités" + -0.131*"administration"
INFO:gensim.models.lsimodel:topic #2(24.726): -0.667*"bar" + -0.488*"text" + -0.290*"till" + -0.260*"fontsize" + -0.241*"shift" + -0.189*"from" + -0.154*"at" + -0.073*"value" + -0.068*"id" + -0.050*"color"
INFO:gensim.models.lsimodel:topic #3(19.749): -0.710*"canton" + -0.171*"cantons" + 0.135*"église" + -0.119*"parti" + -0.113*"élections" + -0.110*"arrondissement" + -0.108*"député" + -0.094*"maire" + -0.093*"conseillers" + 0.093*"monuments"
INFO:gensim.models.lsimodel:topic #4(18.698): -0.229*"roi" + 0.173*"genre" + 0.140*"canton" + 0.135*"espèces" + -0.122*"ii" + -0.117*"ministre" + -0.116*"parti" + -0.114*"duc" + -0.103*"louis" + -0.101*"président"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.584% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.469% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #140000
INFO:gensim.models.lsimodel:topic #0(40.463): 0.272*"communes" + 0.204*"église" + 0.147*"monuments" + 0.144*"département" + 0.136*"lieux" + 0.134*"canton" + 0.131*"administration" + 0.130*"démographie" + 0.127*"liées" + 0.126*"jpg"
INFO:gensim.models.lsimodel:topic #1(30.014): -0.283*"communes" + -0.160*"monuments" + -0.149*"démographie" + -0.146*"liées" + -0.141*"lieux" + -0.140*"département" + -0.133*"église" + -0.133*"personnalités" + -0.130*"administration" + 0.127*"film"
INFO:gensim.models.lsimodel:topic #2(25.486): -0.666*"bar" + -0.488*"text" + -0.291*"till" + -0.257*"fontsize" + -0.240*"shift" + -0.190*"from" + -0.153*"at" + -0.075*"value" + -0.070*"id" + -0.053*"color"
INFO:gensim.models.lsimodel:topic #3(20.974): -0.725*"canton" + -0.170*"cantons" + 0.139*"église" + -0.117*"élections" + -0.115*"arrondissement" + -0.107*"parti" + -0.100*"conseillers" + -0.095*"député" + 0.095*"jpg" + 0.094*"monuments"
INFO:gensim.models.lsimodel:topic #4(19.944): 0.513*"film" + 0.146*"canton" + 0.143*"meilleur" + 0.142*"vf" + 0.126*"album" + 0.119*"acteur" + 0.108*"américain" + 0.108*"série" + 0.106*"cinéma" + 0.099*"prix"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.391% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.109% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #160000
INFO:gensim.models.lsimodel:topic #0(44.052): 0.308*"communes" + 0.214*"église" + 0.170*"monuments" + 0.161*"département" + 0.155*"lieux" + 0.150*"liées" + 0.149*"démographie" + 0.149*"administration" + 0.141*"personnalités" + 0.132*"géographie"
INFO:gensim.models.lsimodel:topic #1(32.362): -0.266*"communes" + -0.156*"monuments" + -0.146*"liées" + -0.145*"démographie" + -0.137*"lieux" + -0.132*"personnalités" + -0.131*"département" + -0.126*"administration" + -0.124*"géographique" + 0.123*"film"
INFO:gensim.models.lsimodel:topic #2(25.987): -0.671*"bar" + -0.483*"text" + -0.294*"till" + -0.250*"fontsize" + -0.234*"shift" + -0.192*"from" + -0.149*"at" + -0.078*"value" + -0.072*"id" + -0.063*"canton"
INFO:gensim.models.lsimodel:topic #3(21.591): -0.674*"canton" + 0.185*"film" + -0.155*"cantons" + -0.116*"élections" + -0.115*"parti" + -0.108*"arrondissement" + 0.107*"église" + 0.102*"monuments" + -0.099*"député" + 0.097*"liées"
INFO:gensim.models.lsimodel:topic #4(21.031): 0.449*"film" + 0.263*"canton" + -0.140*"jpg" + 0.134*"communes" + 0.131*"meilleur" + 0.128*"album" + 0.125*"vf" + -0.116*"église" + 0.105*"acteur" + 0.104*"américain"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.184% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.844% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #180000
INFO:gensim.models.lsimodel:topic #0(47.231): 0.326*"communes" + 0.214*"église" + 0.182*"monuments" + 0.172*"département" + 0.167*"lieux" + 0.161*"liées" + 0.160*"démographie" + 0.159*"administration" + 0.152*"personnalités" + 0.142*"géographie"
INFO:gensim.models.lsimodel:topic #1(34.407): -0.256*"communes" + -0.150*"monuments" + -0.143*"liées" + -0.141*"démographie" + -0.133*"lieux" + -0.130*"personnalités" + -0.126*"département" + -0.123*"géographique" + -0.122*"administration" + 0.121*"film"
INFO:gensim.models.lsimodel:topic #2(27.102): -0.669*"bar" + -0.483*"text" + -0.294*"till" + -0.250*"fontsize" + -0.235*"shift" + -0.192*"from" + -0.148*"at" + -0.078*"value" + -0.073*"id" + -0.065*"canton"
INFO:gensim.models.lsimodel:topic #3(22.592): -0.442*"canton" + 0.310*"film" + 0.150*"club" + 0.113*"liées" + 0.113*"coupe" + 0.106*"album" + 0.106*"football" + 0.106*"monuments" + 0.105*"géographique" + 0.105*"personnalités"
INFO:gensim.models.lsimodel:topic #4(22.429): -0.463*"canton" + -0.281*"club" + -0.208*"coupe" + -0.202*"football" + -0.157*"joueur" + -0.150*"fc" + 0.131*"église" + -0.131*"championnat" + -0.128*"équipe" + -0.127*"champion"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.106% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 4.951% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #200000
INFO:gensim.models.lsimodel:topic #0(49.733): 0.340*"communes" + 0.211*"église" + 0.183*"monuments" + 0.173*"département" + 0.168*"lieux" + 0.164*"liées" + 0.161*"démographie" + 0.161*"administration" + 0.154*"personnalités" + 0.143*"géographie"
INFO:gensim.models.lsimodel:topic #1(36.127): -0.256*"communes" + -0.142*"monuments" + -0.137*"liées" + -0.135*"démographie" + -0.126*"lieux" + -0.124*"personnalités" + 0.122*"film" + -0.119*"département" + -0.116*"administration" + -0.116*"géographique"
INFO:gensim.models.lsimodel:topic #2(27.942): -0.667*"bar" + -0.485*"text" + -0.295*"till" + -0.251*"fontsize" + -0.235*"shift" + -0.194*"from" + -0.149*"at" + -0.079*"value" + -0.073*"id" + -0.063*"canton"
INFO:gensim.models.lsimodel:topic #3(24.157): 0.345*"club" + 0.258*"coupe" + 0.251*"football" + 0.210*"joueur" + 0.191*"fc" + 0.162*"équipe" + 0.158*"championnat" + 0.152*"champion" + 0.137*"saison" + 0.135*"vainqueur"
INFO:gensim.models.lsimodel:topic #4(23.692): -0.324*"canton" + 0.291*"film" + -0.287*"taux" + -0.202*"population" + -0.156*"club" + -0.152*"calais" + -0.124*"départemental" + -0.117*"coupe" + -0.117*"football" + 0.103*"album"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.814% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 4.585% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #220000
INFO:gensim.models.lsimodel:topic #0(51.339): 0.337*"communes" + 0.209*"église" + 0.183*"monuments" + 0.174*"département" + 0.169*"lieux" + 0.165*"liées" + 0.162*"démographie" + 0.162*"administration" + 0.155*"personnalités" + 0.144*"géographie"
INFO:gensim.models.lsimodel:topic #1(37.685): -0.257*"communes" + -0.143*"monuments" + -0.139*"liées" + -0.136*"démographie" + -0.128*"lieux" + -0.126*"personnalités" + 0.122*"film" + -0.121*"département" + -0.119*"géographique" + -0.118*"administration"
INFO:gensim.models.lsimodel:topic #2(28.126): -0.667*"bar" + -0.483*"text" + -0.296*"till" + -0.249*"fontsize" + -0.234*"shift" + -0.195*"from" + -0.148*"at" + -0.079*"value" + -0.073*"id" + -0.071*"canton"
INFO:gensim.models.lsimodel:topic #3(26.330): -0.344*"club" + -0.284*"coupe" + -0.256*"joueur" + -0.248*"football" + -0.228*"fc" + -0.211*"équipe" + -0.172*"rugby" + -0.168*"championnat" + -0.159*"champion" + -0.143*"vainqueur"
INFO:gensim.models.lsimodel:topic #4(24.621): 0.400*"film" + -0.255*"canton" + -0.173*"taux" + 0.155*"album" + -0.137*"population" + 0.121*"série" + 0.114*"vf" + 0.103*"géographique" + 0.099*"meilleur" + 0.099*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.901% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 4.287% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #240000
INFO:gensim.models.lsimodel:topic #0(52.339): 0.333*"communes" + 0.202*"église" + 0.174*"monuments" + 0.170*"département" + 0.161*"lieux" + 0.158*"démographie" + 0.157*"administration" + 0.156*"liées" + 0.148*"personnalités" + 0.144*"canton"
INFO:gensim.models.lsimodel:topic #1(38.914): -0.267*"communes" + -0.142*"monuments" + -0.138*"démographie" + -0.137*"liées" + -0.127*"lieux" + -0.125*"département" + -0.123*"personnalités" + 0.122*"film" + -0.119*"administration" + -0.116*"géographique"
INFO:gensim.models.lsimodel:topic #2(28.627): -0.301*"club" + -0.268*"joueur" + -0.249*"coupe" + -0.232*"rugby" + -0.222*"équipe" + -0.219*"bar" + -0.199*"football" + -0.194*"fc" + -0.158*"text" + -0.150*"xv"
INFO:gensim.models.lsimodel:topic #3(28.489): -0.615*"bar" + -0.435*"text" + -0.270*"till" + -0.223*"fontsize" + -0.209*"shift" + -0.179*"from" + -0.173*"canton" + -0.133*"at" + 0.114*"club" + 0.111*"joueur"
INFO:gensim.models.lsimodel:topic #4(26.950): -0.745*"canton" + 0.178*"bar" + -0.176*"cantons" + 0.128*"text" + 0.116*"église" + 0.112*"monuments" + -0.111*"conseillers" + -0.107*"arrondissement" + 0.105*"liées" + 0.100*"lieux"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.294% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 4.062% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #260000
INFO:gensim.models.lsimodel:topic #0(52.977): 0.322*"communes" + 0.196*"église" + 0.167*"monuments" + 0.165*"département" + 0.154*"lieux" + 0.151*"administration" + 0.151*"démographie" + 0.149*"liées" + 0.141*"personnalités" + 0.139*"canton"
INFO:gensim.models.lsimodel:topic #1(40.084): -0.280*"communes" + -0.147*"monuments" + -0.142*"démographie" + -0.140*"liées" + -0.131*"département" + -0.131*"lieux" + -0.126*"personnalités" + 0.123*"film" + -0.123*"administration" + -0.119*"géographique"
INFO:gensim.models.lsimodel:topic #2(30.956): -0.316*"joueur" + -0.295*"club" + -0.290*"rugby" + -0.262*"équipe" + -0.257*"coupe" + -0.193*"fc" + -0.191*"xv" + -0.188*"football" + -0.138*"champion" + -0.138*"championnat"
INFO:gensim.models.lsimodel:topic #3(28.713): -0.654*"bar" + -0.463*"text" + -0.288*"till" + -0.236*"fontsize" + -0.222*"shift" + -0.199*"canton" + -0.190*"from" + -0.141*"at" + -0.079*"value" + -0.073*"id"
INFO:gensim.models.lsimodel:topic #4(27.184): -0.679*"canton" + 0.166*"bar" + -0.160*"cantons" + 0.152*"film" + 0.118*"text" + 0.118*"monuments" + 0.113*"liées" + 0.106*"lieux" + 0.103*"géographique" + 0.102*"personnalités"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.471% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 4.915% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #280000
INFO:gensim.models.lsimodel:topic #0(54.096): 0.320*"communes" + 0.191*"église" + 0.162*"monuments" + 0.160*"département" + 0.154*"administration" + 0.147*"lieux" + 0.144*"démographie" + 0.142*"liées" + 0.136*"personnalités" + 0.135*"géographie"
INFO:gensim.models.lsimodel:topic #1(41.102): -0.290*"communes" + -0.149*"monuments" + -0.140*"démographie" + -0.138*"liées" + -0.131*"département" + -0.131*"administration" + -0.130*"lieux" + -0.126*"personnalités" + 0.124*"film" + -0.122*"géographie"
INFO:gensim.models.lsimodel:topic #2(32.186): -0.322*"joueur" + -0.309*"rugby" + -0.282*"club" + -0.265*"équipe" + -0.247*"coupe" + -0.206*"xv" + -0.183*"fc" + -0.180*"football" + -0.146*"sélections" + -0.135*"champion"
INFO:gensim.models.lsimodel:topic #3(29.243): -0.658*"bar" + -0.465*"text" + -0.291*"till" + -0.237*"fontsize" + -0.224*"shift" + -0.192*"from" + -0.178*"canton" + -0.142*"at" + -0.080*"value" + -0.074*"id"
INFO:gensim.models.lsimodel:topic #4(28.587): 0.498*"tv" + 0.299*"film" + 0.256*"série" + 0.144*"album" + 0.131*"acteur" + 0.102*"américain" + 0.096*"communes" + 0.089*"actrice" + 0.088*"réalisateur" + 0.083*"vf"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.932% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.638% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #300000
INFO:gensim.models.lsimodel:topic #0(54.874): 0.310*"communes" + 0.186*"église" + 0.155*"monuments" + 0.154*"département" + 0.148*"administration" + 0.141*"lieux" + 0.137*"démographie" + 0.136*"liées" + 0.130*"géographie" + 0.130*"personnalités"
INFO:gensim.models.lsimodel:topic #1(42.258): -0.300*"communes" + -0.151*"monuments" + -0.142*"démographie" + -0.140*"liées" + -0.135*"département" + -0.134*"administration" + -0.132*"lieux" + -0.127*"personnalités" + -0.124*"géographie" + 0.124*"film"
INFO:gensim.models.lsimodel:topic #2(34.528): -0.360*"rugby" + -0.326*"joueur" + -0.271*"équipe" + -0.252*"club" + -0.241*"xv" + -0.220*"coupe" + -0.159*"fc" + -0.154*"football" + -0.151*"sélections" + -0.137*"match"
INFO:gensim.models.lsimodel:topic #3(31.123): 0.685*"tv" + 0.278*"série" + 0.179*"film" + 0.130*"acteur" + 0.098*"communes" + 0.090*"américain" + 0.082*"actrice" + 0.081*"album" + 0.076*"réalisateur" + 0.074*"épisode"
INFO:gensim.models.lsimodel:topic #4(29.366): -0.658*"bar" + -0.464*"text" + -0.291*"till" + -0.236*"fontsize" + -0.223*"shift" + -0.193*"from" + -0.180*"canton" + -0.143*"at" + -0.080*"value" + -0.074*"id"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.955% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.686% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #320000
INFO:gensim.models.lsimodel:topic #0(55.694): 0.299*"communes" + 0.181*"église" + 0.148*"département" + 0.147*"monuments" + 0.142*"administration" + 0.134*"lieux" + 0.131*"démographie" + 0.129*"liées" + 0.125*"géographie" + 0.124*"canton"
INFO:gensim.models.lsimodel:topic #1(43.208): -0.310*"communes" + -0.154*"monuments" + -0.144*"démographie" + -0.141*"liées" + -0.139*"département" + -0.137*"administration" + -0.135*"lieux" + -0.129*"personnalités" + -0.127*"géographie" + 0.122*"joueur"
INFO:gensim.models.lsimodel:topic #2(36.151): -0.358*"rugby" + -0.324*"joueur" + -0.261*"équipe" + -0.251*"club" + -0.241*"xv" + -0.215*"coupe" + -0.158*"fc" + -0.154*"football" + -0.146*"sélections" + -0.128*"match"
INFO:gensim.models.lsimodel:topic #3(32.440): 0.711*"tv" + 0.277*"série" + 0.151*"film" + 0.127*"acteur" + 0.102*"communes" + 0.087*"américain" + 0.077*"actrice" + 0.074*"réalisateur" + 0.071*"album" + 0.070*"épisode"
INFO:gensim.models.lsimodel:topic #4(29.922): -0.681*"bar" + -0.455*"text" + -0.292*"till" + -0.230*"fontsize" + -0.218*"shift" + -0.193*"from" + -0.145*"canton" + -0.139*"at" + -0.080*"value" + -0.074*"id"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.934% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.382% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #340000
INFO:gensim.models.lsimodel:topic #0(56.481): 0.287*"communes" + 0.174*"église" + 0.142*"département" + 0.140*"monuments" + 0.135*"administration" + 0.127*"lieux" + 0.124*"démographie" + 0.121*"liées" + 0.120*"canton" + 0.119*"géographie"
INFO:gensim.models.lsimodel:topic #1(44.096): -0.321*"communes" + -0.156*"monuments" + -0.146*"démographie" + -0.143*"département" + -0.142*"liées" + 0.141*"tv" + -0.140*"administration" + -0.137*"lieux" + -0.130*"personnalités" + -0.129*"géographie"
INFO:gensim.models.lsimodel:topic #2(37.393): -0.366*"rugby" + -0.325*"joueur" + -0.255*"équipe" + -0.247*"club" + -0.242*"xv" + -0.207*"coupe" + 0.171*"tv" + -0.158*"fc" + -0.151*"football" + -0.142*"sélections"
INFO:gensim.models.lsimodel:topic #3(34.440): 0.741*"tv" + 0.266*"série" + 0.120*"acteur" + 0.113*"communes" + 0.099*"film" + 0.081*"américain" + 0.072*"scénariste" + 0.072*"réalisateur" + 0.069*"actrice" + 0.068*"filmographie"
INFO:gensim.models.lsimodel:topic #4(30.716): -0.674*"bar" + -0.452*"text" + -0.305*"till" + -0.236*"fontsize" + -0.209*"shift" + -0.202*"from" + -0.135*"at" + -0.107*"canton" + -0.085*"value" + -0.082*"width"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.118% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.232% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #360000
INFO:gensim.models.lsimodel:topic #0(57.314): 0.277*"communes" + 0.169*"église" + 0.137*"département" + 0.133*"monuments" + 0.129*"administration" + 0.121*"lieux" + 0.118*"démographie" + 0.115*"liées" + 0.115*"canton" + 0.114*"géographie"
INFO:gensim.models.lsimodel:topic #1(44.900): -0.330*"communes" + -0.158*"monuments" + 0.155*"tv" + -0.148*"département" + -0.148*"démographie" + -0.143*"liées" + -0.142*"administration" + -0.138*"lieux" + -0.131*"géographie" + -0.131*"personnalités"
INFO:gensim.models.lsimodel:topic #2(38.369): -0.361*"rugby" + -0.328*"joueur" + -0.250*"équipe" + -0.242*"club" + -0.238*"xv" + 0.211*"tv" + -0.203*"coupe" + -0.159*"fc" + -0.148*"football" + -0.141*"sélections"
INFO:gensim.models.lsimodel:topic #3(35.690): 0.739*"tv" + 0.257*"série" + 0.123*"communes" + 0.115*"acteur" + 0.079*"film" + 0.076*"américain" + 0.071*"scénariste" + 0.068*"rugby" + 0.068*"réalisateur" + 0.067*"filmographie"
INFO:gensim.models.lsimodel:topic #4(30.894): -0.673*"bar" + -0.452*"text" + -0.305*"till" + -0.236*"fontsize" + -0.210*"shift" + -0.203*"from" + -0.135*"at" + -0.102*"canton" + -0.086*"value" + -0.081*"width"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.264% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.115% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #380000
INFO:gensim.models.lsimodel:topic #0(58.128): 0.264*"communes" + 0.163*"église" + 0.131*"département" + 0.125*"monuments" + 0.123*"administration" + 0.115*"lieux" + 0.111*"démographie" + 0.110*"canton" + 0.109*"géographie" + 0.108*"liées"
INFO:gensim.models.lsimodel:topic #1(45.647): -0.338*"communes" + -0.159*"monuments" + 0.156*"tv" + -0.151*"département" + -0.149*"démographie" + -0.144*"liées" + -0.144*"administration" + -0.140*"lieux" + 0.139*"joueur" + -0.133*"géographie"
INFO:gensim.models.lsimodel:topic #2(39.454): -0.345*"rugby" + -0.328*"joueur" + -0.246*"équipe" + -0.236*"club" + -0.227*"xv" + 0.214*"tv" + -0.201*"coupe" + -0.161*"fc" + -0.148*"football" + -0.142*"sélections"
INFO:gensim.models.lsimodel:topic #3(36.479): 0.736*"tv" + 0.252*"série" + 0.131*"communes" + 0.114*"acteur" + 0.078*"film" + 0.077*"américain" + 0.070*"scénariste" + 0.067*"réalisateur" + 0.067*"filmographie" + 0.065*"rugby"
INFO:gensim.models.lsimodel:topic #4(31.140): -0.674*"bar" + -0.451*"text" + -0.305*"till" + -0.234*"fontsize" + -0.208*"shift" + -0.203*"from" + -0.135*"at" + -0.098*"canton" + -0.086*"value" + -0.081*"width"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.208% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.898% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #400000
INFO:gensim.models.lsimodel:topic #0(58.990): 0.253*"communes" + 0.157*"église" + 0.126*"département" + 0.119*"monuments" + 0.117*"administration" + 0.108*"lieux" + 0.106*"canton" + 0.105*"démographie" + 0.105*"jpg" + 0.104*"géographie"
INFO:gensim.models.lsimodel:topic #1(46.301): -0.345*"communes" + -0.160*"monuments" + -0.154*"département" + -0.150*"démographie" + 0.149*"joueur" + -0.145*"administration" + 0.145*"tv" + -0.144*"liées" + -0.141*"lieux" + -0.138*"église"
INFO:gensim.models.lsimodel:topic #2(40.397): -0.335*"joueur" + -0.326*"rugby" + -0.241*"équipe" + -0.236*"club" + -0.214*"xv" + -0.201*"coupe" + 0.187*"tv" + -0.173*"fc" + -0.150*"football" + -0.139*"sélections"
INFO:gensim.models.lsimodel:topic #3(36.808): 0.727*"tv" + 0.255*"série" + 0.134*"communes" + 0.117*"acteur" + 0.097*"film" + 0.080*"américain" + 0.071*"scénariste" + 0.070*"réalisateur" + 0.067*"filmographie" + 0.067*"actrice"
INFO:gensim.models.lsimodel:topic #4(31.461): -0.673*"bar" + -0.448*"text" + -0.307*"till" + -0.231*"fontsize" + -0.206*"shift" + -0.204*"from" + -0.133*"at" + -0.093*"canton" + -0.088*"value" + -0.083*"width"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.323% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.923% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #420000
INFO:gensim.models.lsimodel:topic #0(59.805): 0.241*"communes" + 0.151*"église" + 0.121*"département" + 0.112*"monuments" + 0.111*"administration" + 0.103*"jpg" + 0.103*"lieux" + 0.101*"canton" + 0.100*"démographie" + 0.099*"géographie"
INFO:gensim.models.lsimodel:topic #1(46.859): -0.351*"communes" + -0.161*"monuments" + -0.157*"département" + 0.153*"joueur" + -0.152*"démographie" + -0.147*"administration" + -0.145*"liées" + -0.143*"église" + -0.142*"lieux" + 0.139*"tv"
INFO:gensim.models.lsimodel:topic #2(41.168): -0.338*"joueur" + -0.320*"rugby" + -0.237*"club" + -0.236*"équipe" + -0.209*"xv" + -0.199*"coupe" + -0.178*"fc" + 0.174*"tv" + -0.151*"football" + -0.135*"sélections"
INFO:gensim.models.lsimodel:topic #3(37.184): 0.711*"tv" + 0.257*"série" + 0.139*"communes" + 0.118*"acteur" + 0.114*"film" + 0.082*"américain" + 0.072*"réalisateur" + 0.072*"scénariste" + 0.069*"actrice" + -0.069*"parti"
INFO:gensim.models.lsimodel:topic #4(31.901): -0.441*"tv" + 0.403*"film" + 0.354*"album" + 0.151*"sorti" + -0.149*"parti" + -0.119*"ministre" + 0.110*"musique" + 0.103*"vf" + -0.093*"député" + -0.086*"président"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.017% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.293% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #440000
INFO:gensim.models.lsimodel:topic #0(60.650): 0.232*"communes" + 0.147*"église" + 0.116*"département" + 0.107*"monuments" + 0.107*"administration" + 0.102*"jpg" + 0.100*"démographie" + 0.098*"lieux" + 0.098*"canton" + 0.095*"géographie"
INFO:gensim.models.lsimodel:topic #1(47.439): -0.349*"communes" + 0.165*"joueur" + -0.159*"monuments" + -0.156*"département" + -0.156*"démographie" + -0.145*"administration" + -0.144*"église" + -0.142*"liées" + -0.140*"lieux" + -0.136*"géographie"
INFO:gensim.models.lsimodel:topic #2(42.021): -0.341*"joueur" + -0.312*"rugby" + -0.236*"club" + -0.231*"équipe" + -0.204*"xv" + -0.195*"coupe" + -0.179*"fc" + 0.165*"tv" + -0.148*"football" + -0.132*"sélections"
INFO:gensim.models.lsimodel:topic #3(37.562): 0.699*"tv" + 0.257*"série" + 0.144*"communes" + 0.125*"film" + 0.118*"acteur" + 0.083*"américain" + -0.074*"parti" + 0.073*"réalisateur" + 0.072*"scénariste" + 0.070*"actrice"
INFO:gensim.models.lsimodel:topic #4(32.549): -0.452*"tv" + 0.385*"film" + 0.347*"album" + -0.152*"parti" + 0.148*"sorti" + -0.112*"ministre" + -0.111*"serbie" + 0.107*"musique" + 0.098*"vf" + -0.091*"série"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.361% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 3.218% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #460000
INFO:gensim.models.lsimodel:topic #0(61.584): 0.221*"communes" + 0.142*"église" + 0.112*"département" + 0.104*"population" + 0.104*"démographie" + 0.102*"monuments" + 0.102*"administration" + 0.100*"jpg" + 0.096*"canton" + 0.094*"village"
INFO:gensim.models.lsimodel:topic #1(48.259): -0.319*"communes" + 0.171*"joueur" + -0.171*"démographie" + -0.151*"serbie" + -0.146*"monuments" + -0.143*"département" + -0.134*"église" + -0.134*"population" + -0.133*"administration" + -0.130*"liées"
INFO:gensim.models.lsimodel:topic #2(43.018): -0.315*"joueur" + -0.312*"serbie" + -0.269*"rugby" + -0.213*"club" + -0.207*"équipe" + -0.176*"coupe" + -0.175*"xv" + -0.170*"fc" + -0.135*"football" + -0.124*"population"
INFO:gensim.models.lsimodel:topic #3(42.191): 0.527*"serbie" + -0.211*"communes" + 0.183*"population" + 0.164*"district" + 0.163*"localité" + 0.159*"évolution" + 0.159*"serbe" + 0.159*"maplandia" + 0.159*"localités" + 0.152*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(37.848): 0.683*"tv" + 0.256*"série" + 0.146*"communes" + 0.139*"film" + 0.118*"acteur" + 0.085*"américain" + -0.083*"parti" + 0.078*"album" + 0.075*"réalisateur" + 0.072*"scénariste"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.283% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.918% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #480000
INFO:gensim.models.lsimodel:topic #0(62.400): 0.210*"communes" + 0.138*"église" + 0.108*"département" + 0.105*"population" + 0.101*"démographie" + 0.099*"jpg" + 0.098*"administration" + 0.097*"monuments" + 0.092*"village" + 0.092*"canton"
INFO:gensim.models.lsimodel:topic #1(48.951): -0.307*"communes" + 0.187*"joueur" + -0.174*"serbie" + -0.173*"démographie" + -0.147*"population" + -0.140*"monuments" + -0.138*"département" + 0.132*"rugby" + -0.132*"église" + -0.128*"administration"
INFO:gensim.models.lsimodel:topic #2(44.073): -0.329*"serbie" + -0.313*"joueur" + -0.260*"rugby" + -0.206*"club" + -0.197*"équipe" + -0.169*"xv" + -0.167*"coupe" + -0.167*"fc" + -0.136*"population" + -0.131*"football"
INFO:gensim.models.lsimodel:topic #3(43.063): 0.495*"serbie" + -0.232*"communes" + 0.172*"population" + 0.167*"district" + 0.154*"localité" + 0.151*"serbe" + 0.148*"évolution" + 0.148*"maplandia" + 0.148*"localités" + -0.147*"joueur"
INFO:gensim.models.lsimodel:topic #4(38.257): 0.663*"tv" + 0.251*"série" + 0.154*"communes" + 0.154*"film" + 0.117*"acteur" + 0.090*"album" + 0.086*"américain" + -0.086*"parti" + 0.075*"réalisateur" + 0.073*"épisode"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.547% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.755% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #500000
INFO:gensim.models.lsimodel:topic #0(63.189): 0.201*"communes" + 0.134*"église" + 0.104*"département" + 0.102*"population" + 0.097*"jpg" + 0.096*"démographie" + 0.093*"administration" + 0.093*"monuments" + 0.091*"village" + 0.089*"film"
INFO:gensim.models.lsimodel:topic #1(49.508): -0.306*"communes" + 0.200*"joueur" + -0.171*"démographie" + -0.166*"serbie" + -0.147*"population" + -0.140*"département" + -0.139*"monuments" + 0.136*"rugby" + -0.135*"église" + 0.130*"club"
INFO:gensim.models.lsimodel:topic #2(44.775): -0.335*"joueur" + -0.266*"rugby" + -0.231*"serbie" + -0.216*"club" + -0.205*"équipe" + -0.180*"fc" + -0.174*"coupe" + -0.173*"xv" + -0.137*"football" + 0.135*"film"
INFO:gensim.models.lsimodel:topic #3(43.246): 0.547*"serbie" + -0.225*"communes" + 0.195*"population" + 0.186*"district" + 0.171*"localité" + 0.168*"serbe" + 0.164*"évolution" + 0.163*"maplandia" + 0.163*"localités" + 0.157*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(38.682): 0.635*"tv" + 0.244*"série" + 0.174*"film" + 0.164*"communes" + 0.115*"acteur" + 0.100*"album" + -0.091*"parti" + 0.088*"américain" + 0.076*"réalisateur" + -0.074*"ministre"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.169% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.396% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #520000
INFO:gensim.models.lsimodel:topic #0(64.036): 0.191*"communes" + 0.130*"église" + 0.099*"département" + 0.099*"population" + 0.095*"jpg" + 0.093*"film" + 0.091*"démographie" + 0.089*"administration" + 0.088*"village" + 0.088*"monuments"
INFO:gensim.models.lsimodel:topic #1(49.985): -0.306*"communes" + 0.211*"joueur" + -0.169*"démographie" + -0.159*"serbie" + -0.147*"population" + -0.140*"département" + 0.139*"rugby" + -0.139*"monuments" + -0.139*"église" + 0.135*"club"
INFO:gensim.models.lsimodel:topic #2(45.469): -0.340*"joueur" + -0.262*"rugby" + -0.216*"club" + -0.204*"équipe" + -0.186*"serbie" + -0.182*"fc" + -0.174*"coupe" + -0.171*"xv" + 0.150*"film" + -0.137*"football"
INFO:gensim.models.lsimodel:topic #3(43.354): 0.564*"serbie" + -0.216*"communes" + 0.206*"population" + 0.193*"district" + 0.177*"localité" + 0.173*"serbe" + 0.169*"évolution" + 0.168*"localités" + 0.168*"maplandia" + 0.162*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(39.078): 0.599*"tv" + 0.235*"série" + 0.196*"film" + 0.174*"communes" + 0.121*"album" + 0.112*"acteur" + -0.098*"parti" + 0.091*"américain" + -0.079*"ministre" + 0.078*"monuments"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.338% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.421% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #540000
INFO:gensim.models.lsimodel:topic #0(64.869): 0.182*"communes" + 0.126*"église" + 0.096*"population" + 0.096*"film" + 0.095*"département" + 0.094*"jpg" + 0.086*"joueur" + 0.086*"village" + 0.086*"démographie" + 0.086*"administration"
INFO:gensim.models.lsimodel:topic #1(50.443): -0.303*"communes" + 0.224*"joueur" + -0.166*"démographie" + -0.153*"serbie" + -0.147*"population" + 0.143*"rugby" + -0.141*"église" + 0.140*"club" + -0.139*"département" + 0.138*"équipe"
INFO:gensim.models.lsimodel:topic #2(46.141): -0.340*"joueur" + -0.252*"rugby" + -0.214*"club" + -0.199*"équipe" + -0.183*"fc" + -0.172*"coupe" + -0.165*"xv" + -0.165*"serbie" + 0.162*"film" + -0.137*"football"
INFO:gensim.models.lsimodel:topic #3(43.481): 0.571*"serbie" + 0.211*"population" + -0.208*"communes" + 0.198*"district" + 0.179*"localité" + 0.176*"serbe" + 0.170*"évolution" + 0.170*"localités" + 0.169*"maplandia" + 0.164*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(39.507): 0.560*"tv" + 0.224*"série" + 0.211*"film" + 0.184*"communes" + 0.148*"album" + 0.108*"acteur" + -0.106*"parti" + 0.092*"américain" + -0.086*"ministre" + 0.083*"monuments"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.400% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.261% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #560000
INFO:gensim.models.lsimodel:topic #0(65.696): 0.173*"communes" + 0.122*"église" + 0.099*"film" + 0.093*"population" + 0.093*"joueur" + 0.092*"jpg" + 0.091*"département" + 0.083*"village" + 0.083*"château" + 0.082*"administration"
INFO:gensim.models.lsimodel:topic #1(51.042): -0.295*"communes" + 0.249*"joueur" + -0.160*"démographie" + 0.153*"rugby" + 0.149*"club" + 0.148*"équipe" + -0.144*"population" + -0.143*"serbie" + -0.142*"église" + -0.137*"département"
INFO:gensim.models.lsimodel:topic #2(46.961): -0.332*"joueur" + -0.236*"rugby" + -0.200*"club" + -0.187*"équipe" + 0.179*"film" + -0.175*"fc" + -0.162*"coupe" + -0.156*"xv" + -0.153*"serbie" + 0.139*"tv"
INFO:gensim.models.lsimodel:topic #3(43.624): 0.576*"serbie" + 0.215*"population" + 0.200*"district" + -0.199*"communes" + 0.181*"localité" + 0.180*"serbe" + 0.171*"évolution" + 0.171*"localités" + 0.170*"maplandia" + 0.165*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(39.984): 0.520*"tv" + 0.223*"film" + 0.214*"série" + 0.197*"communes" + 0.170*"album" + -0.111*"parti" + 0.103*"acteur" + 0.091*"américain" + -0.090*"ministre" + 0.088*"monuments"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.616% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.118% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #580000
INFO:gensim.models.lsimodel:topic #0(66.516): 0.165*"communes" + 0.119*"église" + 0.103*"film" + 0.094*"joueur" + 0.091*"population" + 0.090*"jpg" + 0.088*"département" + 0.081*"village" + 0.081*"château" + 0.079*"administration"
INFO:gensim.models.lsimodel:topic #1(51.440): -0.292*"communes" + 0.256*"joueur" + -0.157*"démographie" + 0.152*"rugby" + 0.152*"club" + 0.149*"équipe" + -0.144*"population" + -0.143*"église" + -0.140*"serbie" + -0.136*"département"
INFO:gensim.models.lsimodel:topic #2(47.530): -0.331*"joueur" + -0.226*"rugby" + -0.199*"club" + 0.189*"film" + -0.183*"équipe" + -0.175*"fc" + -0.159*"coupe" + -0.149*"xv" + -0.144*"serbie" + -0.141*"communes"
INFO:gensim.models.lsimodel:topic #3(43.766): 0.580*"serbie" + 0.218*"population" + 0.200*"district" + -0.189*"communes" + 0.183*"serbe" + 0.182*"localité" + 0.171*"évolution" + 0.171*"localités" + 0.170*"maplandia" + 0.165*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(40.450): 0.467*"tv" + 0.236*"film" + 0.210*"communes" + 0.203*"album" + 0.196*"série" + -0.117*"parti" + -0.097*"ministre" + 0.096*"acteur" + 0.094*"monuments" + 0.094*"sorti"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.177% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.249% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #600000
INFO:gensim.models.lsimodel:topic #0(67.282): 0.158*"communes" + 0.116*"église" + 0.106*"film" + 0.097*"joueur" + 0.089*"jpg" + 0.088*"population" + 0.085*"département" + 0.079*"château" + 0.079*"village" + 0.076*"administration"
INFO:gensim.models.lsimodel:topic #1(51.872): -0.286*"communes" + 0.270*"joueur" + 0.158*"club" + 0.154*"équipe" + 0.154*"rugby" + -0.153*"démographie" + -0.144*"église" + -0.141*"population" + -0.134*"département" + -0.133*"serbie"
INFO:gensim.models.lsimodel:topic #2(48.140): -0.326*"joueur" + -0.213*"rugby" + 0.199*"film" + -0.194*"club" + -0.177*"équipe" + -0.171*"fc" + -0.154*"coupe" + -0.152*"communes" + 0.142*"album" + -0.140*"xv"
INFO:gensim.models.lsimodel:topic #3(43.887): 0.581*"serbie" + 0.220*"population" + 0.201*"district" + 0.185*"serbe" + 0.182*"localité" + -0.178*"communes" + 0.171*"évolution" + 0.171*"localités" + 0.170*"maplandia" + 0.165*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(40.901): 0.410*"tv" + 0.245*"album" + 0.240*"film" + 0.224*"communes" + 0.176*"série" + -0.121*"parti" + 0.106*"sorti" + -0.101*"ministre" + 0.100*"monuments" + 0.094*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.737% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.255% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #620000
INFO:gensim.models.lsimodel:topic #0(68.107): 0.153*"communes" + 0.113*"église" + 0.108*"film" + 0.100*"joueur" + 0.088*"jpg" + 0.086*"population" + 0.083*"département" + 0.078*"château" + 0.078*"village" + 0.076*"tour"
INFO:gensim.models.lsimodel:topic #1(52.455): 0.282*"joueur" + -0.281*"communes" + 0.164*"club" + 0.159*"équipe" + 0.158*"rugby" + -0.147*"démographie" + -0.142*"église" + -0.139*"population" + 0.137*"fc" + 0.134*"coupe"
INFO:gensim.models.lsimodel:topic #2(48.812): -0.315*"joueur" + 0.209*"film" + -0.201*"rugby" + -0.187*"club" + -0.169*"équipe" + -0.166*"fc" + -0.163*"communes" + 0.153*"album" + -0.149*"coupe" + 0.140*"tv"
INFO:gensim.models.lsimodel:topic #3(44.168): 0.576*"serbie" + 0.219*"population" + 0.202*"district" + 0.185*"serbe" + 0.182*"localité" + 0.176*"municipalité" + 0.174*"localités" + -0.172*"communes" + 0.169*"évolution" + 0.167*"maplandia"
INFO:gensim.models.lsimodel:topic #4(41.404): 0.372*"tv" + 0.267*"album" + 0.239*"film" + 0.237*"communes" + 0.163*"série" + -0.123*"parti" + 0.112*"sorti" + -0.104*"ministre" + 0.104*"monuments" + 0.097*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.332% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.082% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #640000
INFO:gensim.models.lsimodel:topic #0(68.923): 0.147*"communes" + 0.112*"film" + 0.110*"église" + 0.102*"joueur" + 0.087*"jpg" + 0.084*"population" + 0.080*"département" + 0.078*"tour" + 0.076*"château" + 0.076*"village"
INFO:gensim.models.lsimodel:topic #1(52.916): 0.293*"joueur" + -0.274*"communes" + 0.170*"club" + 0.164*"équipe" + 0.163*"rugby" + -0.142*"démographie" + -0.141*"église" + 0.141*"fc" + 0.139*"coupe" + -0.136*"population"
INFO:gensim.models.lsimodel:topic #2(49.427): -0.305*"joueur" + 0.222*"film" + -0.193*"rugby" + -0.181*"club" + -0.172*"communes" + 0.165*"album" + -0.163*"équipe" + -0.159*"fc" + -0.144*"coupe" + 0.140*"tv"
INFO:gensim.models.lsimodel:topic #3(44.331): 0.574*"serbie" + 0.220*"population" + 0.202*"district" + 0.185*"serbe" + 0.182*"localité" + 0.178*"municipalité" + 0.174*"localités" + 0.168*"évolution" + 0.166*"maplandia" + 0.163*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(41.905): 0.320*"tv" + 0.290*"album" + 0.252*"communes" + 0.240*"film" + 0.143*"série" + -0.124*"parti" + 0.120*"sorti" + 0.110*"monuments" + -0.107*"ministre" + 0.103*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 7.354% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.086% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #660000
INFO:gensim.models.lsimodel:topic #0(69.713): 0.142*"communes" + 0.113*"film" + 0.108*"église" + 0.103*"joueur" + 0.086*"jpg" + 0.082*"population" + 0.078*"tour" + 0.078*"département" + 0.075*"château" + 0.075*"club"
INFO:gensim.models.lsimodel:topic #1(53.384): 0.307*"joueur" + -0.264*"communes" + 0.176*"club" + 0.169*"équipe" + 0.166*"rugby" + 0.147*"fc" + 0.144*"coupe" + -0.139*"église" + -0.137*"démographie" + -0.133*"population"
INFO:gensim.models.lsimodel:topic #2(50.003): -0.294*"joueur" + 0.231*"film" + -0.181*"rugby" + -0.181*"communes" + 0.176*"album" + -0.173*"club" + -0.155*"équipe" + -0.153*"fc" + 0.141*"tv" + -0.138*"coupe"
INFO:gensim.models.lsimodel:topic #3(44.533): 0.564*"serbie" + 0.221*"population" + 0.201*"district" + 0.183*"municipalité" + 0.183*"serbe" + 0.180*"localité" + 0.173*"localités" + 0.166*"évolution" + 0.163*"maplandia" + 0.160*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(42.384): 0.308*"album" + 0.279*"tv" + 0.266*"communes" + 0.229*"film" + -0.127*"parti" + 0.126*"série" + 0.122*"sorti" + -0.122*"serbie" + 0.116*"monuments" + -0.110*"ministre"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.724% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.330% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #680000
INFO:gensim.models.lsimodel:topic #0(70.494): 0.136*"communes" + 0.116*"film" + 0.105*"église" + 0.105*"joueur" + 0.085*"jpg" + 0.081*"population" + 0.078*"tour" + 0.076*"département" + 0.075*"club" + 0.075*"album"
INFO:gensim.models.lsimodel:topic #1(53.900): 0.319*"joueur" + -0.253*"communes" + 0.181*"club" + 0.174*"équipe" + 0.169*"rugby" + 0.153*"fc" + 0.148*"coupe" + -0.136*"église" + -0.132*"démographie" + -0.132*"population"
INFO:gensim.models.lsimodel:topic #2(50.613): -0.284*"joueur" + 0.240*"film" + 0.186*"album" + -0.184*"communes" + -0.171*"rugby" + -0.166*"club" + -0.148*"fc" + -0.148*"équipe" + 0.143*"tv" + -0.138*"serbie"
INFO:gensim.models.lsimodel:topic #3(44.954): 0.545*"serbie" + 0.218*"population" + 0.197*"municipalité" + 0.195*"district" + 0.181*"serbe" + 0.179*"localité" + 0.172*"localités" + 0.162*"évolution" + 0.159*"maplandia" + 0.156*"cyrillique"
INFO:gensim.models.lsimodel:topic #4(42.860): 0.318*"album" + 0.279*"communes" + 0.246*"tv" + 0.215*"film" + -0.146*"serbie" + -0.131*"parti" + 0.122*"monuments" + 0.122*"sorti" + 0.115*"département" + 0.113*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.874% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.126% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #700000
INFO:gensim.models.lsimodel:topic #0(71.290): 0.132*"communes" + 0.117*"film" + 0.108*"joueur" + 0.104*"église" + 0.084*"jpg" + 0.080*"population" + 0.078*"tour" + 0.076*"club" + 0.076*"album" + 0.074*"département"
INFO:gensim.models.lsimodel:topic #1(54.530): 0.336*"joueur" + -0.238*"communes" + 0.188*"club" + 0.180*"équipe" + 0.173*"rugby" + 0.166*"fc" + 0.156*"coupe" + -0.131*"église" + -0.128*"population" + 0.126*"football"
INFO:gensim.models.lsimodel:topic #2(51.254): -0.266*"joueur" + 0.248*"film" + 0.197*"album" + -0.193*"communes" + -0.156*"rugby" + -0.154*"club" + -0.144*"serbie" + -0.144*"fc" + 0.143*"tv" + -0.137*"équipe"
INFO:gensim.models.lsimodel:topic #3(45.366): 0.524*"serbie" + 0.217*"population" + 0.195*"municipalité" + 0.188*"district" + 0.180*"localité" + 0.180*"serbe" + 0.176*"album" + 0.166*"localités" + 0.159*"évolution" + 0.157*"maplandia"
INFO:gensim.models.lsimodel:topic #4(43.291): 0.335*"album" + 0.291*"communes" + 0.206*"tv" + 0.196*"film" + -0.172*"serbie" + -0.133*"parti" + 0.127*"monuments" + 0.122*"sorti" + 0.120*"département" + 0.116*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.743% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.196% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #720000
INFO:gensim.models.lsimodel:topic #0(72.041): 0.127*"communes" + 0.118*"film" + 0.111*"joueur" + 0.102*"église" + 0.083*"jpg" + 0.079*"club" + 0.078*"population" + 0.078*"album" + 0.078*"tour" + 0.074*"équipe"
INFO:gensim.models.lsimodel:topic #1(55.199): 0.354*"joueur" + -0.220*"communes" + 0.202*"club" + 0.187*"équipe" + 0.183*"fc" + 0.179*"rugby" + 0.165*"coupe" + 0.140*"football" + -0.125*"église" + 0.121*"championnat"
INFO:gensim.models.lsimodel:topic #2(51.868): 0.256*"film" + -0.238*"joueur" + 0.210*"album" + -0.208*"communes" + -0.149*"serbie" + 0.143*"tv" + -0.142*"club" + -0.137*"rugby" + -0.136*"fc" + -0.127*"population"
INFO:gensim.models.lsimodel:topic #3(45.662): 0.508*"serbie" + 0.215*"population" + 0.211*"album" + 0.191*"municipalité" + 0.183*"district" + 0.178*"localité" + 0.176*"serbe" + 0.161*"localités" + 0.155*"évolution" + 0.153*"film"
INFO:gensim.models.lsimodel:topic #4(43.690): 0.355*"album" + 0.300*"communes" + -0.208*"serbie" + 0.169*"film" + 0.161*"tv" + -0.132*"parti" + 0.131*"monuments" + 0.125*"département" + 0.121*"sorti" + 0.119*"liées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.433% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.197% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #740000
INFO:gensim.models.lsimodel:topic #0(72.815): 0.123*"communes" + 0.121*"film" + 0.114*"joueur" + 0.100*"église" + 0.082*"jpg" + 0.082*"club" + 0.080*"album" + 0.078*"tour" + 0.076*"population" + 0.076*"équipe"
INFO:gensim.models.lsimodel:topic #1(55.891): 0.364*"joueur" + 0.212*"club" + -0.204*"communes" + 0.194*"fc" + 0.192*"équipe" + 0.185*"rugby" + 0.171*"coupe" + 0.151*"football" + 0.128*"championnat" + -0.120*"église"
INFO:gensim.models.lsimodel:topic #2(52.444): 0.268*"film" + 0.222*"album" + -0.219*"communes" + -0.210*"joueur" + -0.148*"serbie" + 0.145*"tv" + -0.131*"population" + -0.128*"club" + -0.125*"démographie" + -0.125*"fc"
INFO:gensim.models.lsimodel:topic #3(45.909): 0.489*"serbie" + 0.244*"album" + 0.212*"population" + 0.183*"municipalité" + 0.178*"district" + 0.172*"localité" + 0.169*"serbe" + 0.166*"film" + 0.155*"localités" + 0.150*"évolution"
INFO:gensim.models.lsimodel:topic #4(44.052): 0.351*"album" + 0.310*"communes" + -0.254*"serbie" + 0.143*"film" + 0.135*"monuments" + 0.131*"département" + -0.128*"parti" + 0.124*"tv" + 0.122*"liées" + 0.119*"lieux"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.125% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.063% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #760000
INFO:gensim.models.lsimodel:topic #0(73.608): 0.121*"film" + 0.119*"communes" + 0.117*"joueur" + 0.099*"église" + 0.083*"club" + 0.081*"jpg" + 0.081*"album" + 0.079*"tour" + 0.077*"équipe" + 0.077*"population"
INFO:gensim.models.lsimodel:topic #1(56.682): 0.370*"joueur" + 0.216*"club" + 0.197*"fc" + 0.194*"équipe" + -0.190*"communes" + 0.190*"rugby" + 0.174*"coupe" + 0.158*"football" + 0.133*"championnat" + 0.120*"footballeur"
INFO:gensim.models.lsimodel:topic #2(53.170): 0.267*"film" + 0.224*"album" + -0.210*"communes" + -0.198*"joueur" + -0.167*"serbie" + -0.146*"population" + 0.138*"tv" + -0.129*"démographie" + -0.122*"club" + -0.118*"fc"
INFO:gensim.models.lsimodel:topic #3(47.049): 0.460*"serbie" + 0.235*"album" + 0.207*"population" + 0.189*"municipalité" + 0.188*"serbe" + 0.187*"localité" + 0.168*"film" + 0.168*"district" + 0.150*"évolution" + 0.150*"maplandia"
INFO:gensim.models.lsimodel:topic #4(44.525): 0.397*"album" + 0.310*"communes" + -0.195*"serbie" + 0.134*"monuments" + 0.131*"département" + 0.131*"film" + -0.129*"parti" + 0.123*"sorti" + 0.121*"liées" + 0.119*"lieux"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.545% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.101% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #780000
INFO:gensim.models.lsimodel:topic #0(74.435): 0.121*"film" + 0.117*"joueur" + 0.117*"communes" + 0.098*"église" + 0.085*"population" + 0.084*"club" + 0.081*"album" + 0.080*"jpg" + 0.079*"village" + 0.078*"tour"
INFO:gensim.models.lsimodel:topic #1(57.908): 0.319*"joueur" + 0.187*"club" + -0.183*"communes" + -0.178*"population" + 0.170*"fc" + 0.167*"équipe" + 0.158*"rugby" + -0.153*"bosnie" + -0.152*"serbie" + 0.152*"coupe"
INFO:gensim.models.lsimodel:topic #2(54.965): -0.271*"joueur" + -0.202*"serbie" + -0.201*"bosnie" + 0.198*"film" + -0.172*"herzégovine" + -0.171*"population" + -0.163*"club" + 0.161*"album" + -0.156*"fc" + -0.146*"rugby"
INFO:gensim.models.lsimodel:topic #3(50.284): 0.259*"bosnie" + 0.250*"album" + 0.248*"serbie" + 0.233*"film" + -0.229*"communes" + 0.221*"herzégovine" + 0.151*"localité" + 0.146*"serbe" + 0.135*"municipalité" + 0.135*"tv"
INFO:gensim.models.lsimodel:topic #4(45.106): 0.456*"album" + 0.295*"communes" + 0.135*"sorti" + -0.132*"parti" + 0.128*"monuments" + 0.125*"département" + 0.122*"film" + 0.115*"liées" + 0.113*"lieux" + -0.109*"ministre"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.537% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.061% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #800000
INFO:gensim.models.lsimodel:topic #0(75.286): 0.120*"film" + 0.117*"joueur" + 0.114*"communes" + 0.096*"église" + 0.093*"population" + 0.084*"village" + 0.084*"club" + 0.082*"album" + 0.079*"jpg" + 0.078*"tour"
INFO:gensim.models.lsimodel:topic #1(60.247): -0.332*"bosnie" + -0.289*"herzégovine" + -0.243*"population" + -0.205*"serbie" + -0.189*"localité" + 0.187*"joueur" + -0.172*"village" + -0.159*"municipalité" + -0.155*"habitants" + -0.153*"évolution"
INFO:gensim.models.lsimodel:topic #2(56.729): -0.371*"joueur" + -0.220*"club" + -0.207*"fc" + -0.192*"bosnie" + -0.187*"rugby" + -0.186*"équipe" + -0.178*"coupe" + -0.172*"football" + -0.168*"herzégovine" + -0.136*"championnat"
INFO:gensim.models.lsimodel:topic #3(51.939): 0.287*"album" + 0.277*"film" + -0.265*"communes" + 0.219*"bosnie" + 0.191*"herzégovine" + 0.149*"tv" + -0.146*"église" + 0.135*"sorti" + -0.128*"département" + -0.118*"monuments"
INFO:gensim.models.lsimodel:topic #4(45.704): 0.484*"album" + 0.283*"communes" + -0.139*"parti" + 0.137*"sorti" + 0.125*"monuments" + 0.120*"département" + -0.112*"ministre" + 0.111*"liées" + 0.111*"lieux" + 0.109*"démographie"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.661% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.945% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #820000
INFO:gensim.models.lsimodel:topic #0(76.066): 0.120*"film" + 0.118*"joueur" + 0.111*"communes" + 0.095*"église" + 0.093*"population" + 0.085*"club" + 0.084*"village" + 0.083*"album" + 0.078*"jpg" + 0.078*"tour"
INFO:gensim.models.lsimodel:topic #1(61.217): -0.354*"bosnie" + -0.306*"herzégovine" + -0.246*"population" + -0.201*"serbie" + -0.196*"localité" + -0.172*"village" + 0.172*"joueur" + -0.162*"municipalité" + -0.160*"serbe" + -0.157*"évolution"
INFO:gensim.models.lsimodel:topic #2(57.512): -0.374*"joueur" + -0.224*"club" + -0.210*"fc" + -0.187*"équipe" + -0.184*"bosnie" + -0.183*"rugby" + -0.181*"coupe" + -0.179*"football" + -0.160*"herzégovine" + -0.142*"championnat"
INFO:gensim.models.lsimodel:topic #3(52.507): 0.300*"album" + 0.284*"film" + -0.260*"communes" + 0.211*"bosnie" + 0.183*"herzégovine" + -0.146*"église" + 0.146*"tv" + 0.140*"sorti" + -0.126*"département" + -0.116*"monuments"
INFO:gensim.models.lsimodel:topic #4(46.154): 0.503*"album" + 0.281*"communes" + 0.135*"sorti" + -0.134*"parti" + 0.125*"monuments" + 0.120*"département" + 0.110*"lieux" + 0.110*"liées" + 0.109*"démographie" + -0.109*"ministre"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.312% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.851% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #840000
INFO:gensim.models.lsimodel:topic #0(76.822): 0.121*"joueur" + 0.121*"film" + 0.108*"communes" + 0.095*"population" + 0.093*"église" + 0.087*"club" + 0.084*"village" + 0.084*"album" + 0.078*"équipe" + 0.078*"tour"
INFO:gensim.models.lsimodel:topic #1(62.097): -0.349*"bosnie" + -0.302*"herzégovine" + -0.256*"population" + -0.199*"localité" + -0.198*"serbie" + -0.178*"village" + 0.170*"joueur" + -0.166*"serbe" + -0.166*"municipalité" + -0.160*"évolution"
INFO:gensim.models.lsimodel:topic #2(58.304): -0.375*"joueur" + -0.227*"club" + -0.210*"fc" + -0.187*"équipe" + -0.184*"football" + -0.184*"coupe" + -0.179*"rugby" + -0.174*"bosnie" + -0.151*"herzégovine" + -0.144*"championnat"
INFO:gensim.models.lsimodel:topic #3(52.947): 0.314*"album" + 0.287*"film" + -0.254*"communes" + 0.201*"bosnie" + 0.174*"herzégovine" + -0.146*"église" + 0.144*"sorti" + 0.144*"tv" + -0.124*"département" + -0.114*"monuments"
INFO:gensim.models.lsimodel:topic #4(46.583): 0.522*"album" + 0.279*"communes" + 0.131*"sorti" + -0.129*"parti" + 0.124*"monuments" + 0.120*"département" + 0.110*"église" + 0.109*"lieux" + 0.109*"liées" + 0.108*"démographie"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.019% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.970% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #860000
INFO:gensim.models.lsimodel:topic #0(77.619): 0.124*"joueur" + 0.119*"film" + 0.105*"communes" + 0.101*"population" + 0.092*"église" + 0.089*"club" + 0.088*"village" + 0.084*"album" + 0.079*"équipe" + 0.078*"bosnie"
INFO:gensim.models.lsimodel:topic #1(64.416): -0.394*"bosnie" + -0.342*"herzégovine" + -0.261*"population" + -0.211*"localité" + -0.185*"serbie" + -0.177*"serbe" + -0.175*"village" + -0.171*"municipalité" + -0.168*"évolution" + -0.150*"nationalités"
INFO:gensim.models.lsimodel:topic #2(59.363): -0.394*"joueur" + -0.237*"club" + -0.216*"fc" + -0.195*"football" + -0.193*"équipe" + -0.191*"coupe" + -0.177*"rugby" + -0.150*"championnat" + -0.139*"bosnie" + -0.136*"footballeur"
INFO:gensim.models.lsimodel:topic #3(53.663): 0.333*"album" + 0.290*"film" + -0.252*"communes" + 0.178*"bosnie" + 0.155*"herzégovine" + 0.151*"sorti" + -0.147*"église" + 0.144*"tv" + -0.124*"département" + -0.115*"monuments"
INFO:gensim.models.lsimodel:topic #4(47.103): 0.547*"album" + 0.271*"communes" + 0.129*"sorti" + 0.122*"monuments" + -0.119*"parti" + 0.117*"département" + 0.110*"église" + 0.107*"chanson" + 0.107*"démographie" + 0.106*"lieux"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.218% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.546% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #880000
INFO:gensim.models.lsimodel:topic #0(78.489): 0.125*"joueur" + 0.116*"film" + 0.116*"population" + 0.109*"bosnie" + 0.103*"communes" + 0.097*"village" + 0.096*"herzégovine" + 0.091*"église" + 0.089*"club" + 0.082*"album"
INFO:gensim.models.lsimodel:topic #1(67.670): -0.433*"bosnie" + -0.381*"herzégovine" + -0.251*"population" + -0.215*"localité" + -0.170*"serbe" + -0.169*"évolution" + -0.163*"village" + -0.162*"municipalité" + -0.158*"serbie" + -0.151*"nationalités"
INFO:gensim.models.lsimodel:topic #2(60.365): -0.403*"joueur" + -0.242*"club" + -0.220*"fc" + -0.206*"football" + -0.196*"coupe" + -0.195*"équipe" + -0.174*"rugby" + -0.153*"championnat" + -0.140*"footballeur" + -0.122*"champion"
INFO:gensim.models.lsimodel:topic #3(54.303): 0.339*"album" + 0.296*"film" + -0.250*"communes" + 0.159*"bosnie" + 0.156*"sorti" + -0.148*"église" + 0.143*"tv" + 0.141*"herzégovine" + -0.124*"département" + -0.114*"monuments"
INFO:gensim.models.lsimodel:topic #4(47.521): 0.557*"album" + 0.262*"communes" + 0.128*"sorti" + 0.119*"monuments" + 0.114*"département" + -0.113*"parti" + 0.111*"chanson" + 0.109*"église" + 0.108*"démographie" + 0.102*"lieux"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.800% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 2.120% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #900000
INFO:gensim.models.lsimodel:topic #0(79.296): 0.132*"bosnie" + 0.125*"joueur" + 0.124*"population" + 0.116*"herzégovine" + 0.114*"film" + 0.102*"village" + 0.100*"communes" + 0.089*"club" + 0.089*"église" + 0.085*"habitants"
INFO:gensim.models.lsimodel:topic #1(69.710): -0.447*"bosnie" + -0.393*"herzégovine" + -0.240*"population" + -0.213*"localité" + -0.170*"serbe" + -0.164*"évolution" + -0.158*"municipalité" + -0.155*"village" + -0.149*"nationalités" + -0.142*"serbie"
INFO:gensim.models.lsimodel:topic #2(61.122): -0.406*"joueur" + -0.243*"club" + -0.218*"fc" + -0.208*"football" + -0.195*"coupe" + -0.194*"équipe" + -0.169*"rugby" + -0.157*"championnat" + -0.140*"footballeur" + -0.124*"champion"
INFO:gensim.models.lsimodel:topic #3(54.822): 0.349*"album" + 0.302*"film" + -0.244*"communes" + 0.161*"sorti" + 0.150*"bosnie" + -0.147*"église" + 0.139*"tv" + 0.133*"herzégovine" + -0.121*"département" + -0.112*"monuments"
INFO:gensim.models.lsimodel:topic #4(47.965): 0.570*"album" + 0.255*"communes" + 0.125*"sorti" + 0.116*"monuments" + 0.112*"chanson" + 0.111*"département" + 0.109*"église" + 0.106*"démographie" + -0.105*"parti" + 0.099*"lieux"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.371% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.938% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #920000
INFO:gensim.models.lsimodel:topic #0(79.916): 0.130*"joueur" + 0.123*"bosnie" + 0.118*"population" + 0.117*"film" + 0.107*"herzégovine" + 0.098*"village" + 0.097*"communes" + 0.093*"club" + 0.088*"église" + 0.083*"album"
INFO:gensim.models.lsimodel:topic #1(69.856): -0.447*"bosnie" + -0.393*"herzégovine" + -0.243*"population" + -0.214*"localité" + -0.171*"serbe" + -0.165*"évolution" + -0.159*"municipalité" + -0.158*"village" + -0.150*"nationalités" + -0.143*"serbie"
INFO:gensim.models.lsimodel:topic #2(61.853): -0.406*"joueur" + -0.244*"club" + -0.219*"fc" + -0.210*"football" + -0.193*"coupe" + -0.191*"équipe" + -0.164*"rugby" + -0.159*"championnat" + -0.139*"footballeur" + -0.123*"champion"
INFO:gensim.models.lsimodel:topic #3(55.233): 0.358*"album" + 0.305*"film" + -0.237*"communes" + 0.165*"sorti" + 0.151*"bosnie" + -0.147*"église" + 0.135*"tv" + 0.133*"herzégovine" + -0.119*"département" + -0.109*"monuments"
INFO:gensim.models.lsimodel:topic #4(48.396): -0.578*"album" + -0.250*"communes" + -0.120*"sorti" + -0.114*"chanson" + -0.113*"monuments" + -0.111*"église" + -0.109*"département" + -0.102*"démographie" + -0.097*"lieux" + -0.096*"guitare"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.127% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.989% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #940000
INFO:gensim.models.lsimodel:topic #0(80.544): 0.136*"joueur" + 0.118*"film" + 0.114*"bosnie" + 0.113*"population" + 0.100*"herzégovine" + 0.096*"club" + 0.095*"communes" + 0.094*"village" + 0.087*"église" + 0.084*"album"
INFO:gensim.models.lsimodel:topic #1(69.990): -0.446*"bosnie" + -0.393*"herzégovine" + -0.246*"population" + -0.215*"localité" + -0.171*"serbe" + -0.165*"évolution" + -0.161*"village" + -0.160*"municipalité" + -0.150*"nationalités" + -0.143*"serbie"
INFO:gensim.models.lsimodel:topic #2(62.621): -0.410*"joueur" + -0.243*"club" + -0.218*"fc" + -0.207*"football" + -0.189*"coupe" + -0.186*"équipe" + -0.159*"championnat" + -0.157*"rugby" + -0.139*"footballeur" + -0.121*"champion"
INFO:gensim.models.lsimodel:topic #3(55.604): 0.364*"album" + 0.308*"film" + -0.230*"communes" + 0.167*"sorti" + 0.151*"bosnie" + -0.146*"église" + 0.133*"herzégovine" + 0.132*"tv" + -0.116*"département" + 0.107*"série"
INFO:gensim.models.lsimodel:topic #4(48.801): -0.573*"album" + -0.240*"communes" + 0.165*"espèce" + -0.115*"chanson" + -0.114*"sorti" + -0.111*"église" + -0.109*"monuments" + 0.109*"endémique" + -0.104*"département" + -0.097*"démographie"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.803% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.797% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #960000
INFO:gensim.models.lsimodel:topic #0(81.197): 0.140*"joueur" + 0.119*"film" + 0.109*"population" + 0.108*"bosnie" + 0.098*"club" + 0.094*"herzégovine" + 0.093*"communes" + 0.092*"village" + 0.092*"église" + 0.084*"album"
INFO:gensim.models.lsimodel:topic #1(70.196): -0.443*"bosnie" + -0.390*"herzégovine" + -0.248*"population" + -0.216*"localité" + -0.173*"serbe" + -0.166*"évolution" + -0.163*"village" + -0.161*"municipalité" + -0.151*"nationalités" + -0.144*"serbie"
INFO:gensim.models.lsimodel:topic #2(63.327): -0.411*"joueur" + -0.241*"club" + -0.218*"fc" + -0.208*"football" + -0.187*"coupe" + -0.183*"équipe" + -0.158*"championnat" + -0.152*"rugby" + -0.139*"footballeur" + -0.119*"champion"
INFO:gensim.models.lsimodel:topic #3(56.110): 0.361*"album" + 0.311*"film" + -0.224*"communes" + 0.168*"sorti" + -0.159*"église" + 0.151*"bosnie" + 0.133*"herzégovine" + 0.130*"tv" + -0.116*"département" + -0.112*"monuments"
INFO:gensim.models.lsimodel:topic #4(49.351): -0.482*"album" + 0.354*"espèce" + 0.240*"endémique" + -0.194*"communes" + 0.123*"genre" + 0.119*"publication" + 0.118*"faune" + 0.115*"distribution" + 0.114*"scientifique" + -0.111*"église"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.439% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.803% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #980000
INFO:gensim.models.lsimodel:topic #0(81.894): 0.147*"joueur" + 0.121*"film" + 0.104*"population" + 0.102*"club" + 0.100*"bosnie" + 0.090*"église" + 0.090*"communes" + 0.089*"village" + 0.088*"herzégovine" + 0.085*"album"
INFO:gensim.models.lsimodel:topic #1(70.371): -0.440*"bosnie" + -0.387*"herzégovine" + -0.250*"population" + -0.216*"localité" + -0.173*"serbe" + -0.166*"évolution" + -0.166*"village" + -0.162*"municipalité" + -0.151*"nationalités" + -0.144*"serbie"
INFO:gensim.models.lsimodel:topic #2(64.216): -0.412*"joueur" + -0.240*"club" + -0.218*"fc" + -0.208*"football" + -0.183*"coupe" + -0.178*"équipe" + -0.157*"championnat" + -0.146*"rugby" + -0.139*"footballeur" + -0.121*"bosnie"
INFO:gensim.models.lsimodel:topic #3(56.601): 0.369*"album" + 0.317*"film" + -0.215*"communes" + 0.173*"sorti" + -0.157*"église" + 0.152*"bosnie" + 0.134*"herzégovine" + 0.124*"tv" + -0.113*"département" + -0.109*"monuments"
INFO:gensim.models.lsimodel:topic #4(50.016): 0.462*"espèce" + -0.375*"album" + 0.311*"endémique" + 0.154*"faune" + 0.152*"publication" + 0.152*"genre" + 0.145*"originale" + 0.144*"distribution" + -0.143*"communes" + 0.141*"scientifique"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.324% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.749% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1000000
INFO:gensim.models.lsimodel:topic #0(82.544): 0.150*"joueur" + 0.123*"film" + 0.104*"club" + 0.100*"population" + 0.094*"bosnie" + 0.091*"église" + 0.088*"communes" + 0.087*"album" + 0.086*"équipe" + 0.086*"village"
INFO:gensim.models.lsimodel:topic #1(70.493): -0.437*"bosnie" + -0.385*"herzégovine" + -0.251*"population" + -0.216*"localité" + -0.172*"serbe" + -0.167*"village" + -0.166*"évolution" + -0.163*"municipalité" + -0.150*"nationalités" + -0.144*"serbie"
INFO:gensim.models.lsimodel:topic #2(64.835): -0.410*"joueur" + -0.238*"club" + -0.216*"fc" + -0.207*"football" + -0.180*"coupe" + -0.176*"équipe" + -0.157*"championnat" + -0.144*"rugby" + -0.138*"footballeur" + -0.130*"bosnie"
INFO:gensim.models.lsimodel:topic #3(57.196): 0.376*"album" + 0.320*"film" + -0.207*"communes" + 0.176*"sorti" + -0.160*"église" + 0.153*"bosnie" + 0.135*"herzégovine" + 0.118*"tv" + -0.113*"monuments" + -0.112*"département"
INFO:gensim.models.lsimodel:topic #4(50.896): 0.517*"espèce" + 0.348*"endémique" + -0.280*"album" + 0.172*"faune" + 0.169*"publication" + 0.163*"originale" + 0.160*"genre" + 0.156*"distribution" + 0.155*"scientifique" + 0.140*"espèces"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.002% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.848% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1020000
INFO:gensim.models.lsimodel:topic #0(83.199): 0.154*"joueur" + 0.125*"film" + 0.106*"club" + 0.096*"population" + 0.090*"église" + 0.088*"bosnie" + 0.088*"album" + 0.088*"équipe" + 0.086*"football" + 0.086*"communes"
INFO:gensim.models.lsimodel:topic #1(70.628): -0.433*"bosnie" + -0.381*"herzégovine" + -0.252*"population" + -0.214*"localité" + -0.171*"serbe" + -0.169*"village" + -0.165*"évolution" + -0.162*"municipalité" + -0.149*"nationalités" + 0.145*"joueur"
INFO:gensim.models.lsimodel:topic #2(65.548): -0.406*"joueur" + -0.235*"club" + -0.213*"fc" + -0.206*"football" + -0.177*"coupe" + -0.171*"équipe" + -0.155*"championnat" + -0.142*"bosnie" + -0.141*"rugby" + -0.136*"footballeur"
INFO:gensim.models.lsimodel:topic #3(57.662): 0.381*"album" + 0.324*"film" + -0.200*"communes" + 0.179*"sorti" + -0.158*"église" + 0.155*"bosnie" + 0.136*"herzégovine" + 0.114*"tv" + -0.110*"monuments" + -0.109*"département"
INFO:gensim.models.lsimodel:topic #4(52.020): 0.552*"espèce" + 0.372*"endémique" + 0.183*"faune" + -0.182*"album" + 0.180*"publication" + 0.176*"originale" + 0.163*"scientifique" + 0.162*"distribution" + 0.157*"genre" + 0.141*"intégral"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.707% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.804% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1040000
INFO:gensim.models.lsimodel:topic #0(83.852): 0.155*"joueur" + 0.126*"film" + 0.107*"club" + 0.093*"population" + 0.091*"église" + 0.089*"album" + 0.088*"équipe" + 0.087*"football" + 0.085*"bosnie" + 0.084*"village"
INFO:gensim.models.lsimodel:topic #1(70.890): -0.425*"bosnie" + -0.374*"herzégovine" + -0.250*"population" + -0.212*"localité" + -0.177*"village" + -0.169*"serbe" + -0.167*"municipalité" + -0.163*"évolution" + 0.154*"joueur" + -0.147*"nationalités"
INFO:gensim.models.lsimodel:topic #2(66.117): -0.400*"joueur" + -0.232*"club" + -0.208*"fc" + -0.203*"football" + -0.174*"coupe" + -0.168*"équipe" + -0.154*"championnat" + -0.152*"bosnie" + -0.136*"rugby" + -0.134*"herzégovine"
INFO:gensim.models.lsimodel:topic #3(58.227): 0.377*"album" + 0.330*"film" + -0.193*"communes" + 0.181*"sorti" + -0.163*"église" + 0.157*"bosnie" + 0.139*"herzégovine" + -0.115*"monuments" + -0.109*"département" + 0.109*"tv"
INFO:gensim.models.lsimodel:topic #4(52.952): -0.559*"espèce" + -0.374*"endémique" + -0.185*"faune" + -0.183*"publication" + -0.178*"originale" + -0.164*"scientifique" + -0.162*"distribution" + 0.158*"album" + -0.155*"genre" + -0.144*"intégral"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.491% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.796% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1060000
INFO:gensim.models.lsimodel:topic #0(84.487): 0.155*"joueur" + 0.128*"film" + 0.107*"club" + 0.093*"église" + 0.092*"population" + 0.090*"album" + 0.088*"village" + 0.088*"équipe" + 0.087*"football" + 0.083*"communes"
INFO:gensim.models.lsimodel:topic #1(71.301): -0.416*"bosnie" + -0.366*"herzégovine" + -0.247*"population" + -0.209*"localité" + -0.194*"village" + -0.178*"municipalité" + -0.166*"serbe" + -0.160*"évolution" + 0.158*"joueur" + -0.144*"nationalités"
INFO:gensim.models.lsimodel:topic #2(66.614): -0.399*"joueur" + -0.231*"club" + -0.206*"fc" + -0.202*"football" + -0.172*"coupe" + -0.166*"équipe" + -0.154*"bosnie" + -0.154*"championnat" + -0.136*"herzégovine" + -0.136*"rugby"
INFO:gensim.models.lsimodel:topic #3(58.828): 0.383*"album" + 0.331*"film" + -0.186*"communes" + 0.183*"sorti" + -0.169*"église" + 0.161*"bosnie" + 0.142*"herzégovine" + -0.119*"monuments" + -0.109*"département" + 0.103*"américain"
INFO:gensim.models.lsimodel:topic #4(53.960): -0.564*"espèce" + -0.376*"endémique" + -0.185*"faune" + -0.184*"publication" + -0.179*"originale" + -0.167*"scientifique" + -0.163*"distribution" + -0.149*"genre" + 0.148*"album" + -0.147*"intégral"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.329% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.688% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1080000
INFO:gensim.models.lsimodel:topic #0(85.097): 0.156*"joueur" + 0.130*"film" + 0.108*"club" + 0.092*"église" + 0.092*"album" + 0.089*"population" + 0.088*"équipe" + 0.087*"village" + 0.087*"football" + 0.081*"communes"
INFO:gensim.models.lsimodel:topic #1(71.455): -0.411*"bosnie" + -0.361*"herzégovine" + -0.246*"population" + -0.207*"localité" + -0.198*"village" + -0.178*"municipalité" + 0.166*"joueur" + -0.164*"serbe" + -0.159*"évolution" + -0.143*"nationalités"
INFO:gensim.models.lsimodel:topic #2(67.078): -0.397*"joueur" + -0.229*"club" + -0.205*"fc" + -0.200*"football" + -0.170*"coupe" + -0.164*"équipe" + -0.161*"bosnie" + -0.153*"championnat" + -0.142*"herzégovine" + -0.133*"rugby"
INFO:gensim.models.lsimodel:topic #3(59.335): 0.388*"album" + 0.331*"film" + 0.185*"sorti" + -0.179*"communes" + -0.168*"église" + 0.163*"bosnie" + 0.143*"herzégovine" + -0.116*"monuments" + -0.106*"département" + 0.104*"chanson"
INFO:gensim.models.lsimodel:topic #4(55.559): -0.572*"espèce" + -0.382*"endémique" + -0.188*"faune" + -0.187*"publication" + -0.182*"originale" + -0.167*"scientifique" + -0.163*"distribution" + -0.150*"intégral" + -0.138*"genre" + -0.127*"araignées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.458% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.572% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1100000
INFO:gensim.models.lsimodel:topic #0(85.757): 0.160*"joueur" + 0.132*"film" + 0.110*"club" + 0.092*"album" + 0.091*"église" + 0.089*"football" + 0.088*"équipe" + 0.086*"population" + 0.085*"village" + 0.080*"saison"
INFO:gensim.models.lsimodel:topic #1(71.656): -0.401*"bosnie" + -0.352*"herzégovine" + -0.244*"population" + -0.203*"localité" + -0.199*"village" + 0.185*"joueur" + -0.176*"municipalité" + -0.161*"serbe" + -0.157*"évolution" + -0.141*"habitants"
INFO:gensim.models.lsimodel:topic #2(67.745): -0.391*"joueur" + -0.223*"club" + -0.199*"fc" + -0.196*"football" + -0.179*"bosnie" + -0.164*"coupe" + -0.158*"équipe" + -0.157*"herzégovine" + -0.149*"championnat" + 0.142*"film"
INFO:gensim.models.lsimodel:topic #3(59.828): 0.385*"album" + 0.337*"film" + 0.187*"sorti" + -0.173*"communes" + -0.167*"église" + 0.166*"bosnie" + 0.146*"herzégovine" + -0.113*"monuments" + 0.105*"américain" + -0.104*"département"
INFO:gensim.models.lsimodel:topic #4(56.553): -0.574*"espèce" + -0.386*"endémique" + -0.189*"faune" + -0.187*"publication" + -0.182*"originale" + -0.168*"scientifique" + -0.162*"distribution" + -0.151*"intégral" + -0.132*"genre" + -0.124*"araignées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.309% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.654% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1120000
INFO:gensim.models.lsimodel:topic #0(86.396): 0.165*"joueur" + 0.133*"film" + 0.111*"club" + 0.092*"album" + 0.090*"football" + 0.090*"église" + 0.090*"équipe" + 0.084*"population" + 0.083*"village" + 0.082*"saison"
INFO:gensim.models.lsimodel:topic #1(71.886): -0.387*"bosnie" + -0.340*"herzégovine" + -0.240*"population" + 0.210*"joueur" + -0.198*"village" + -0.197*"localité" + -0.174*"municipalité" + -0.156*"serbe" + -0.152*"évolution" + -0.141*"habitants"
INFO:gensim.models.lsimodel:topic #2(68.368): -0.380*"joueur" + -0.212*"club" + -0.202*"bosnie" + -0.191*"fc" + -0.186*"football" + -0.178*"herzégovine" + -0.156*"coupe" + -0.150*"équipe" + 0.150*"film" + -0.143*"championnat"
INFO:gensim.models.lsimodel:topic #3(60.245): 0.385*"album" + 0.340*"film" + 0.188*"sorti" + 0.169*"bosnie" + -0.168*"communes" + -0.165*"église" + 0.149*"herzégovine" + -0.111*"monuments" + 0.106*"américain" + 0.104*"chanson"
INFO:gensim.models.lsimodel:topic #4(57.305): -0.574*"espèce" + -0.388*"endémique" + -0.190*"faune" + -0.188*"publication" + -0.183*"originale" + -0.168*"scientifique" + -0.162*"distribution" + -0.152*"intégral" + -0.128*"genre" + -0.120*"araignées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.202% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.497% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1140000
INFO:gensim.models.lsimodel:topic #0(87.021): 0.165*"joueur" + 0.136*"film" + 0.112*"club" + 0.093*"album" + 0.091*"football" + 0.090*"équipe" + 0.090*"église" + 0.082*"saison" + 0.081*"population" + 0.081*"village"
INFO:gensim.models.lsimodel:topic #1(72.062): -0.375*"bosnie" + -0.330*"herzégovine" + -0.236*"population" + 0.228*"joueur" + -0.196*"village" + -0.192*"localité" + -0.170*"municipalité" + -0.152*"serbe" + -0.149*"évolution" + -0.139*"habitants"
INFO:gensim.models.lsimodel:topic #2(68.852): -0.368*"joueur" + -0.218*"bosnie" + -0.206*"club" + -0.192*"herzégovine" + -0.186*"fc" + -0.181*"football" + 0.159*"film" + -0.151*"coupe" + -0.145*"équipe" + -0.140*"championnat"
INFO:gensim.models.lsimodel:topic #3(60.789): 0.371*"album" + 0.340*"film" + 0.186*"sorti" + 0.173*"bosnie" + -0.166*"église" + -0.163*"communes" + 0.152*"herzégovine" + -0.110*"monuments" + 0.106*"américain" + -0.100*"château"
INFO:gensim.models.lsimodel:topic #4(59.364): -0.571*"espèce" + -0.390*"endémique" + -0.190*"faune" + -0.185*"publication" + -0.179*"originale" + -0.167*"scientifique" + -0.154*"distribution" + -0.153*"intégral" + 0.136*"album" + -0.122*"araignées"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.684% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.598% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1160000
INFO:gensim.models.lsimodel:topic #0(87.609): 0.166*"joueur" + 0.138*"film" + 0.112*"club" + 0.093*"album" + 0.091*"football" + 0.090*"équipe" + 0.089*"église" + 0.083*"saison" + 0.083*"village" + 0.080*"population"
INFO:gensim.models.lsimodel:topic #1(72.296): -0.361*"bosnie" + -0.317*"herzégovine" + 0.246*"joueur" + -0.231*"population" + -0.201*"village" + -0.186*"localité" + -0.166*"municipalité" + -0.146*"serbe" + -0.144*"évolution" + 0.139*"club"
INFO:gensim.models.lsimodel:topic #2(69.298): -0.357*"joueur" + -0.234*"bosnie" + -0.206*"herzégovine" + -0.199*"club" + -0.179*"fc" + -0.174*"football" + 0.167*"film" + -0.145*"coupe" + -0.139*"équipe" + -0.136*"championnat"
INFO:gensim.models.lsimodel:topic #3(61.276): 0.340*"album" + 0.326*"film" + 0.178*"espèce" + 0.175*"sorti" + 0.174*"bosnie" + -0.164*"église" + -0.157*"communes" + 0.153*"herzégovine" + 0.125*"endémique" + -0.107*"monuments"
INFO:gensim.models.lsimodel:topic #4(60.516): -0.552*"espèce" + -0.376*"endémique" + 0.189*"album" + -0.183*"faune" + -0.178*"publication" + -0.169*"originale" + -0.162*"scientifique" + 0.155*"film" + -0.148*"intégral" + -0.138*"distribution"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.024% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.476% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1180000
INFO:gensim.models.lsimodel:topic #0(88.193): 0.165*"joueur" + 0.140*"film" + 0.112*"club" + 0.094*"album" + 0.090*"football" + 0.090*"équipe" + 0.089*"église" + 0.084*"village" + 0.082*"saison" + 0.078*"population"
INFO:gensim.models.lsimodel:topic #1(72.518): -0.349*"bosnie" + -0.307*"herzégovine" + 0.256*"joueur" + -0.227*"population" + -0.208*"village" + -0.181*"localité" + -0.163*"municipalité" + 0.145*"club" + -0.142*"serbe" + -0.140*"évolution"
INFO:gensim.models.lsimodel:topic #2(69.639): -0.348*"joueur" + -0.242*"bosnie" + -0.213*"herzégovine" + -0.196*"club" + -0.175*"fc" + 0.174*"film" + -0.171*"football" + -0.143*"coupe" + -0.136*"équipe" + -0.134*"championnat"
INFO:gensim.models.lsimodel:topic #3(61.752): 0.336*"album" + 0.332*"film" + 0.179*"bosnie" + 0.176*"sorti" + 0.173*"espèce" + -0.164*"église" + 0.158*"herzégovine" + -0.152*"communes" + 0.121*"endémique" + -0.106*"monuments"
INFO:gensim.models.lsimodel:topic #4(61.005): -0.555*"espèce" + -0.377*"endémique" + 0.184*"album" + -0.183*"faune" + -0.178*"publication" + -0.170*"originale" + -0.163*"scientifique" + 0.155*"film" + -0.149*"intégral" + -0.139*"distribution"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.670% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.573% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1200000
INFO:gensim.models.lsimodel:topic #0(88.761): 0.164*"joueur" + 0.142*"film" + 0.111*"club" + 0.094*"album" + 0.090*"football" + 0.089*"équipe" + 0.089*"église" + 0.084*"village" + 0.082*"saison" + 0.078*"coupe"
INFO:gensim.models.lsimodel:topic #1(72.733): -0.335*"bosnie" + -0.295*"herzégovine" + 0.271*"joueur" + -0.223*"population" + -0.206*"village" + -0.175*"localité" + -0.158*"municipalité" + 0.154*"club" + -0.137*"serbe" + -0.136*"évolution"
INFO:gensim.models.lsimodel:topic #2(69.994): -0.336*"joueur" + -0.254*"bosnie" + -0.224*"herzégovine" + -0.189*"club" + 0.180*"film" + -0.169*"fc" + -0.165*"football" + -0.138*"coupe" + -0.131*"équipe" + -0.129*"championnat"
INFO:gensim.models.lsimodel:topic #3(62.454): 0.488*"espèce" + 0.334*"endémique" + 0.166*"distribution" + 0.166*"originale" + 0.161*"publication" + 0.161*"faune" + 0.154*"film" + 0.141*"album" + 0.141*"scientifique" + 0.131*"intégral"
INFO:gensim.models.lsimodel:topic #4(61.986): 0.353*"album" + 0.336*"film" + -0.318*"espèce" + -0.214*"endémique" + 0.181*"sorti" + 0.145*"bosnie" + 0.128*"herzégovine" + -0.113*"église" + -0.107*"communes" + -0.106*"faune"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.784% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.221% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1220000
INFO:gensim.models.lsimodel:topic #0(89.410): 0.165*"joueur" + 0.145*"film" + 0.112*"club" + 0.095*"album" + 0.090*"football" + 0.090*"équipe" + 0.088*"église" + 0.083*"saison" + 0.081*"village" + 0.079*"coupe"
INFO:gensim.models.lsimodel:topic #1(72.994): -0.312*"bosnie" + 0.297*"joueur" + -0.274*"herzégovine" + -0.211*"population" + -0.197*"village" + 0.168*"club" + -0.164*"localité" + -0.149*"municipalité" + 0.144*"fc" + 0.142*"football"
INFO:gensim.models.lsimodel:topic #2(70.419): -0.311*"joueur" + -0.277*"bosnie" + -0.244*"herzégovine" + 0.193*"film" + -0.176*"club" + -0.156*"fc" + -0.153*"football" + -0.146*"population" + 0.134*"album" + -0.134*"localité"
INFO:gensim.models.lsimodel:topic #3(62.769): 0.321*"espèce" + 0.282*"film" + 0.261*"album" + 0.222*"endémique" + 0.172*"bosnie" + 0.151*"herzégovine" + -0.148*"église" + 0.145*"sorti" + 0.138*"distribution" + -0.131*"communes"
INFO:gensim.models.lsimodel:topic #4(62.373): -0.487*"espèce" + -0.329*"endémique" + 0.261*"album" + 0.253*"film" + -0.161*"faune" + -0.154*"publication" + -0.144*"scientifique" + -0.141*"originale" + 0.136*"sorti" + -0.128*"intégral"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.193% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.372% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1240000
INFO:gensim.models.lsimodel:topic #0(90.058): 0.165*"joueur" + 0.147*"film" + 0.112*"club" + 0.095*"album" + 0.090*"équipe" + 0.090*"football" + 0.088*"église" + 0.083*"saison" + 0.080*"village" + 0.078*"coupe"
INFO:gensim.models.lsimodel:topic #1(73.283): 0.317*"joueur" + -0.287*"bosnie" + -0.252*"herzégovine" + -0.199*"population" + -0.187*"village" + 0.179*"club" + 0.153*"fc" + -0.153*"localité" + 0.152*"football" + -0.139*"municipalité"
INFO:gensim.models.lsimodel:topic #2(70.777): -0.297*"bosnie" + -0.285*"joueur" + -0.261*"herzégovine" + 0.203*"film" + -0.162*"club" + -0.161*"population" + -0.145*"localité" + -0.143*"fc" + -0.141*"football" + 0.140*"album"
INFO:gensim.models.lsimodel:topic #3(63.402): 0.377*"espèce" + 0.259*"endémique" + 0.250*"film" + 0.220*"album" + 0.167*"bosnie" + 0.149*"distribution" + 0.147*"herzégovine" + -0.143*"église" + 0.135*"originale" + 0.127*"sorti"
INFO:gensim.models.lsimodel:topic #4(63.029): -0.448*"espèce" + -0.302*"endémique" + 0.287*"film" + 0.284*"album" + 0.152*"sorti" + -0.148*"faune" + -0.140*"publication" + -0.133*"scientifique" + -0.127*"originale" + -0.116*"intégral"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.685% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.296% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1260000
INFO:gensim.models.lsimodel:topic #0(90.691): 0.164*"joueur" + 0.151*"film" + 0.111*"club" + 0.096*"album" + 0.090*"équipe" + 0.089*"football" + 0.087*"église" + 0.082*"saison" + 0.079*"coupe" + 0.078*"village"
INFO:gensim.models.lsimodel:topic #1(73.624): -0.328*"joueur" + 0.272*"bosnie" + 0.239*"herzégovine" + 0.193*"population" + -0.186*"club" + 0.181*"village" + -0.158*"fc" + -0.157*"football" + 0.149*"localité" + -0.137*"équipe"
INFO:gensim.models.lsimodel:topic #2(71.217): -0.302*"bosnie" + -0.270*"joueur" + -0.266*"herzégovine" + 0.212*"film" + -0.170*"population" + -0.154*"club" + -0.152*"localité" + -0.142*"village" + 0.142*"album" + -0.135*"fc"
INFO:gensim.models.lsimodel:topic #3(64.489): 0.569*"espèce" + 0.388*"endémique" + 0.187*"faune" + 0.183*"publication" + 0.183*"originale" + 0.168*"distribution" + 0.165*"scientifique" + 0.149*"intégral" + 0.147*"araignées" + 0.113*"texte"
INFO:gensim.models.lsimodel:topic #4(63.710): 0.390*"film" + 0.348*"album" + 0.198*"sorti" + 0.192*"bosnie" + 0.169*"herzégovine" + -0.142*"église" + -0.137*"espèce" + -0.121*"communes" + 0.115*"américain" + -0.097*"monuments"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.001% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.396% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1280000
INFO:gensim.models.lsimodel:topic #0(91.329): 0.164*"joueur" + 0.154*"film" + 0.111*"club" + 0.096*"album" + 0.090*"équipe" + 0.089*"football" + 0.086*"église" + 0.082*"saison" + 0.079*"américain" + 0.079*"coupe"
INFO:gensim.models.lsimodel:topic #1(74.239): 0.319*"joueur" + -0.275*"bosnie" + -0.241*"herzégovine" + -0.202*"population" + -0.183*"village" + 0.179*"club" + -0.162*"localité" + 0.152*"fc" + 0.151*"football" + -0.140*"municipalité"
INFO:gensim.models.lsimodel:topic #2(71.871): -0.283*"bosnie" + -0.281*"joueur" + -0.249*"herzégovine" + 0.219*"film" + -0.166*"population" + -0.158*"club" + -0.154*"localité" + 0.141*"album" + -0.138*"fc" + -0.137*"football"
INFO:gensim.models.lsimodel:topic #3(65.411): -0.579*"espèce" + -0.395*"endémique" + -0.191*"faune" + -0.186*"publication" + -0.184*"originale" + -0.167*"scientifique" + -0.164*"distribution" + -0.160*"araignées" + -0.150*"intégral" + -0.114*"texte"
INFO:gensim.models.lsimodel:topic #4(64.279): 0.403*"film" + 0.336*"album" + 0.200*"sorti" + 0.191*"bosnie" + 0.169*"herzégovine" + -0.145*"église" + -0.122*"communes" + 0.118*"américain" + -0.097*"monuments" + 0.093*"chanson"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.420% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.429% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1300000
INFO:gensim.models.lsimodel:topic #0(91.961): 0.166*"joueur" + 0.155*"film" + 0.111*"club" + 0.095*"album" + 0.091*"équipe" + 0.089*"football" + 0.085*"église" + 0.082*"saison" + 0.080*"coupe" + 0.080*"américain"
INFO:gensim.models.lsimodel:topic #1(74.767): -0.332*"joueur" + 0.255*"bosnie" + 0.223*"herzégovine" + 0.195*"population" + -0.186*"club" + 0.183*"village" + -0.155*"fc" + 0.155*"localité" + -0.155*"football" + -0.139*"coupe"
INFO:gensim.models.lsimodel:topic #2(72.355): -0.290*"bosnie" + -0.263*"joueur" + -0.255*"herzégovine" + 0.227*"film" + -0.178*"population" + -0.162*"localité" + -0.149*"village" + -0.148*"club" + 0.143*"album" + -0.136*"serbe"
INFO:gensim.models.lsimodel:topic #3(65.607): 0.578*"espèce" + 0.394*"endémique" + 0.190*"faune" + 0.185*"publication" + 0.184*"originale" + 0.167*"scientifique" + 0.166*"distribution" + 0.160*"araignées" + 0.149*"intégral" + 0.114*"texte"
INFO:gensim.models.lsimodel:topic #4(64.720): 0.409*"film" + 0.328*"album" + 0.200*"sorti" + 0.193*"bosnie" + 0.170*"herzégovine" + -0.142*"église" + 0.119*"américain" + -0.117*"communes" + -0.102*"espèce" + -0.093*"monuments"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.132% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.476% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1320000
INFO:gensim.models.lsimodel:topic #0(92.604): 0.168*"joueur" + 0.156*"film" + 0.112*"club" + 0.095*"album" + 0.091*"équipe" + 0.089*"football" + 0.084*"église" + 0.082*"saison" + 0.081*"jeux" + 0.081*"américain"
INFO:gensim.models.lsimodel:topic #1(75.267): -0.351*"joueur" + 0.228*"bosnie" + 0.200*"herzégovine" + -0.194*"club" + 0.180*"population" + 0.180*"village" + -0.163*"fc" + -0.162*"football" + -0.145*"coupe" + -0.143*"équipe"
INFO:gensim.models.lsimodel:topic #2(72.798): -0.301*"bosnie" + -0.264*"herzégovine" + -0.235*"joueur" + 0.233*"film" + -0.191*"population" + -0.174*"village" + -0.170*"localité" + 0.145*"album" + -0.142*"serbe" + -0.139*"municipalité"
INFO:gensim.models.lsimodel:topic #3(66.489): -0.584*"espèce" + -0.398*"endémique" + -0.190*"faune" + -0.185*"publication" + -0.182*"originale" + -0.169*"scientifique" + -0.167*"araignées" + -0.162*"distribution" + -0.149*"intégral" + -0.114*"texte"
INFO:gensim.models.lsimodel:topic #4(65.178): 0.412*"film" + 0.316*"album" + 0.202*"bosnie" + 0.199*"sorti" + 0.178*"herzégovine" + -0.141*"église" + 0.120*"américain" + -0.116*"communes" + -0.092*"monuments" + 0.091*"localité"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.316% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.439% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1340000
INFO:gensim.models.lsimodel:topic #0(93.272): 0.172*"joueur" + 0.157*"film" + 0.113*"club" + 0.094*"album" + 0.092*"équipe" + 0.090*"football" + 0.084*"village" + 0.083*"église" + 0.083*"saison" + 0.083*"jeux"
INFO:gensim.models.lsimodel:topic #1(76.072): -0.340*"joueur" + 0.241*"bosnie" + 0.210*"herzégovine" + 0.203*"village" + 0.186*"population" + -0.185*"club" + -0.156*"fc" + -0.155*"football" + 0.145*"localité" + -0.140*"coupe"
INFO:gensim.models.lsimodel:topic #2(73.594): -0.287*"bosnie" + -0.253*"joueur" + -0.251*"herzégovine" + 0.234*"film" + -0.183*"village" + -0.177*"population" + -0.158*"localité" + 0.142*"album" + -0.138*"club" + -0.134*"serbe"
INFO:gensim.models.lsimodel:topic #3(66.690): -0.583*"espèce" + -0.397*"endémique" + -0.190*"faune" + -0.184*"publication" + -0.182*"originale" + -0.169*"scientifique" + -0.166*"araignées" + -0.163*"distribution" + -0.148*"intégral" + -0.113*"texte"
INFO:gensim.models.lsimodel:topic #4(65.681): 0.417*"film" + 0.309*"album" + 0.206*"bosnie" + 0.199*"sorti" + 0.180*"herzégovine" + -0.138*"église" + 0.122*"américain" + -0.113*"communes" + 0.090*"localité" + -0.090*"monuments"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.879% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.288% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1360000
INFO:gensim.models.lsimodel:topic #0(93.926): 0.175*"joueur" + 0.158*"film" + 0.114*"club" + 0.094*"album" + 0.093*"équipe" + 0.091*"football" + 0.085*"village" + 0.084*"jeux" + 0.083*"saison" + 0.082*"église"
INFO:gensim.models.lsimodel:topic #1(76.699): -0.351*"joueur" + 0.220*"bosnie" + 0.209*"village" + 0.192*"herzégovine" + -0.189*"club" + 0.175*"population" + -0.159*"football" + -0.159*"fc" + -0.143*"coupe" + -0.140*"équipe"
INFO:gensim.models.lsimodel:topic #2(74.171): -0.282*"bosnie" + -0.247*"herzégovine" + -0.239*"joueur" + 0.238*"film" + -0.211*"village" + -0.179*"population" + -0.157*"localité" + 0.143*"album" + -0.132*"serbe" + -0.130*"municipalité"
INFO:gensim.models.lsimodel:topic #3(67.206): -0.584*"espèce" + -0.397*"endémique" + -0.190*"faune" + -0.184*"publication" + -0.182*"originale" + -0.170*"araignées" + -0.169*"scientifique" + -0.161*"distribution" + -0.148*"intégral" + -0.113*"texte"
INFO:gensim.models.lsimodel:topic #4(66.139): 0.410*"film" + 0.294*"album" + 0.229*"bosnie" + 0.201*"herzégovine" + 0.194*"sorti" + -0.134*"église" + 0.119*"américain" + -0.107*"communes" + 0.103*"localité" + 0.093*"serbe"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.730% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.247% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1380000
INFO:gensim.models.lsimodel:topic #0(94.595): 0.179*"joueur" + 0.159*"film" + 0.115*"club" + 0.094*"équipe" + 0.094*"album" + 0.093*"football" + 0.087*"village" + 0.086*"jeux" + 0.084*"saison" + 0.084*"coupe"
INFO:gensim.models.lsimodel:topic #1(77.471): -0.351*"joueur" + 0.226*"village" + 0.215*"bosnie" + 0.187*"herzégovine" + -0.186*"club" + 0.172*"population" + -0.158*"football" + -0.157*"fc" + -0.141*"coupe" + -0.138*"équipe"
INFO:gensim.models.lsimodel:topic #2(74.958): -0.263*"bosnie" + -0.244*"joueur" + 0.239*"film" + -0.229*"herzégovine" + -0.229*"village" + -0.168*"population" + -0.156*"voïvodie" + -0.144*"localité" + 0.142*"album" + -0.141*"powiat"
INFO:gensim.models.lsimodel:topic #3(67.919): 0.399*"voïvodie" + 0.363*"powiat" + -0.247*"espèce" + 0.242*"gmina" + 0.238*"mazovie" + -0.230*"bosnie" + 0.214*"village" + -0.202*"herzégovine" + 0.191*"pologne" + -0.165*"endémique"
INFO:gensim.models.lsimodel:topic #4(67.746): -0.534*"espèce" + -0.362*"endémique" + -0.199*"voïvodie" + -0.181*"powiat" + -0.173*"faune" + -0.168*"publication" + -0.167*"originale" + -0.161*"araignées" + -0.155*"scientifique" + -0.149*"distribution"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.392% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.384% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1400000
INFO:gensim.models.lsimodel:topic #0(95.253): 0.183*"joueur" + 0.158*"film" + 0.116*"club" + 0.095*"équipe" + 0.094*"football" + 0.093*"album" + 0.092*"village" + 0.087*"jeux" + 0.085*"coupe" + 0.084*"saison"
INFO:gensim.models.lsimodel:topic #1(78.628): -0.301*"joueur" + 0.295*"village" + 0.223*"voïvodie" + 0.222*"bosnie" + 0.200*"powiat" + 0.193*"herzégovine" + 0.178*"population" + -0.159*"club" + -0.134*"football" + -0.133*"fc"
INFO:gensim.models.lsimodel:topic #2(76.167): -0.304*"joueur" + 0.224*"film" + -0.219*"village" + -0.217*"voïvodie" + -0.193*"powiat" + -0.166*"bosnie" + -0.159*"club" + -0.145*"herzégovine" + -0.137*"football" + -0.137*"fc"
INFO:gensim.models.lsimodel:topic #3(70.898): 0.387*"voïvodie" + 0.345*"powiat" + -0.307*"bosnie" + -0.270*"herzégovine" + 0.229*"mazovie" + 0.227*"gmina" + 0.184*"pologne" + -0.156*"localité" + -0.148*"population" + 0.148*"village"
INFO:gensim.models.lsimodel:topic #4(68.238): -0.589*"espèce" + -0.399*"endémique" + -0.191*"faune" + -0.184*"publication" + -0.182*"originale" + -0.180*"araignées" + -0.171*"scientifique" + -0.159*"distribution" + -0.148*"intégral" + -0.113*"texte"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 5.263% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.299% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1420000
INFO:gensim.models.lsimodel:topic #0(95.944): 0.186*"joueur" + 0.156*"film" + 0.118*"club" + 0.104*"village" + 0.096*"équipe" + 0.095*"football" + 0.092*"album" + 0.088*"jeux" + 0.086*"coupe" + 0.085*"saison"
INFO:gensim.models.lsimodel:topic #1(80.902): -0.373*"voïvodie" + -0.373*"village" + -0.322*"powiat" + -0.211*"gmina" + 0.190*"joueur" + -0.189*"bosnie" + -0.186*"mazovie" + -0.177*"pologne" + -0.165*"herzégovine" + -0.165*"population"
INFO:gensim.models.lsimodel:topic #2(77.509): -0.382*"joueur" + -0.198*"club" + 0.191*"film" + -0.188*"voïvodie" + -0.170*"football" + -0.169*"fc" + -0.161*"powiat" + -0.151*"coupe" + -0.143*"équipe" + -0.142*"championnat"
INFO:gensim.models.lsimodel:topic #3(72.751): -0.372*"bosnie" + -0.327*"herzégovine" + 0.295*"voïvodie" + 0.255*"powiat" + -0.191*"population" + -0.189*"localité" + 0.167*"gmina" + 0.165*"film" + -0.161*"serbe" + -0.154*"municipalité"
INFO:gensim.models.lsimodel:topic #4(68.637): -0.590*"espèce" + -0.399*"endémique" + -0.191*"faune" + -0.184*"publication" + -0.182*"araignées" + -0.181*"originale" + -0.171*"scientifique" + -0.158*"distribution" + -0.147*"intégral" + -0.113*"texte"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 20000) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 4.812% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 1.454% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1440000
INFO:gensim.models.lsimodel:topic #0(96.547): 0.188*"joueur" + 0.156*"film" + 0.119*"club" + 0.110*"village" + 0.097*"équipe" + 0.096*"football" + 0.091*"album" + 0.090*"jeux" + 0.087*"coupe" + 0.085*"saison"
INFO:gensim.models.lsimodel:topic #1(82.771): -0.437*"voïvodie" + -0.385*"village" + -0.368*"powiat" + -0.240*"gmina" + -0.201*"pologne" + -0.197*"mazovie" + -0.148*"bosnie" + 0.144*"joueur" + -0.140*"population" + -0.128*"herzégovine"
INFO:gensim.models.lsimodel:topic #2(78.317): -0.399*"joueur" + -0.206*"club" + 0.177*"film" + -0.176*"football" + -0.175*"fc" + -0.158*"coupe" + -0.154*"voïvodie" + -0.150*"équipe" + -0.149*"championnat" + -0.131*"footballeur"
INFO:gensim.models.lsimodel:topic #3(73.404): -0.385*"bosnie" + -0.338*"herzégovine" + 0.247*"voïvodie" + 0.208*"powiat" + -0.207*"population" + -0.198*"localité" + 0.189*"film" + -0.168*"serbe" + -0.164*"municipalité" + -0.146*"évolution"
INFO:gensim.models.lsimodel:topic #4(69.261): -0.591*"espèce" + -0.398*"endémique" + -0.190*"faune" + -0.187*"araignées" + -0.183*"publication" + -0.180*"originale" + -0.171*"scientifique" + -0.156*"distribution" + -0.146*"intégral" + -0.112*"texte"
INFO:gensim.models.lsimodel:preparing a new chunk of documents
DEBUG:gensim.models.lsimodel:converting corpus to csc format
INFO:gensim.models.lsimodel:using 100 extra samples and 2 power iterations
INFO:gensim.models.lsimodel:1st phase: constructing (100000, 500) action matrix
INFO:gensim.models.lsimodel:orthonormalizing (100000, 500) action matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.models.lsimodel:running 2 power iterations
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
DEBUG:gensim.matutils:computing QR of (100000, 500) dense matrix
INFO:gensim.models.lsimodel:2nd phase: running dense svd on (500, 2838) matrix
INFO:gensim.models.lsimodel:computing the final decomposition
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 6.065% of energy spectrum)
INFO:gensim.models.lsimodel:merging projections: (100000, 400) + (100000, 400)
DEBUG:gensim.models.lsimodel:constructing orthogonal component
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to FORTRAN order
DEBUG:gensim.models.lsimodel:converting (100000, 400) array self.u to C order
DEBUG:gensim.matutils:computing QR of (100000, 400) dense matrix
DEBUG:gensim.models.lsimodel:computing SVD of (800, 800) dense matrix
INFO:gensim.models.lsimodel:keeping 400 factors (discarding 0.364% of energy spectrum)
DEBUG:gensim.models.lsimodel:updating orthonormal basis U
DEBUG:gensim.models.lsimodel:converting (100000, 400) array q to C order
INFO:gensim.models.lsimodel:processed documents up to #1442838
INFO:gensim.models.lsimodel:topic #0(96.631): 0.188*"joueur" + 0.156*"film" + 0.119*"club" + 0.111*"village" + 0.097*"équipe" + 0.096*"football" + 0.091*"album" + 0.090*"jeux" + 0.087*"coupe" + 0.085*"saison"
INFO:gensim.models.lsimodel:topic #1(83.131): -0.447*"voïvodie" + -0.386*"village" + -0.373*"powiat" + -0.244*"gmina" + -0.204*"pologne" + -0.198*"mazovie" + -0.141*"bosnie" + 0.138*"joueur" + -0.135*"population" + -0.122*"herzégovine"
INFO:gensim.models.lsimodel:topic #2(78.430): -0.402*"joueur" + -0.208*"club" + -0.177*"football" + -0.176*"fc" + 0.175*"film" + -0.159*"coupe" + -0.151*"équipe" + -0.150*"championnat" + -0.148*"voïvodie" + -0.132*"footballeur"
INFO:gensim.models.lsimodel:topic #3(73.498): -0.387*"bosnie" + -0.339*"herzégovine" + 0.239*"voïvodie" + -0.210*"population" + 0.200*"powiat" + -0.199*"localité" + 0.192*"film" + -0.169*"serbe" + -0.165*"municipalité" + -0.147*"évolution"
INFO:gensim.models.lsimodel:topic #4(69.304): -0.591*"espèce" + -0.398*"endémique" + -0.190*"faune" + -0.187*"araignées" + -0.183*"publication" + -0.180*"originale" + -0.171*"scientifique" + -0.155*"distribution" + -0.146*"intégral" + -0.112*"texte"
INFO:gensim.utils:saving Projection object under ../../data/wiki/frwiki_lsi.projection, separately None
INFO:gensim.utils:storing numpy array 'u' to ../../data/wiki/frwiki_lsi.projection.u.npy
INFO:gensim.utils:saving LsiModel object under ../../data/wiki/frwiki_lsi, separately None
INFO:gensim.utils:not storing attribute projection
INFO:gensim.utils:not storing attribute dispatcher
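The topic lines in the log above are easier to compare between chunks once they are turned into data. Below is a minimal sketch, using only the standard library, that parses one such line into a topic id, the number printed in parentheses (which I take to be the topic's singular value, an assumption based on how the log is laid out), and the list of weighted words. The names `parse_topic_line`, `TOPIC_HEADER` and `TERM` are made up for this example and are not part of Gensim. The sample line is a shortened copy of the final topic #1 line.

```python
import re

# Header of a topic line, e.g. 'topic #1(83.131):'
TOPIC_HEADER = re.compile(r'topic #(\d+)\(([-\d.]+)\):')
# One weighted term, e.g. '-0.447*"voïvodie"'
TERM = re.compile(r'(-?\d+\.\d+)\*"([^"]+)"')


def parse_topic_line(line):
    """Return (topic_id, value, [(weight, word), ...]), or None for other log lines."""
    header = TOPIC_HEADER.search(line)
    if header is None:
        return None
    # Only look at the text after the header so the value in parentheses is not re-read.
    terms = [(float(weight), word)
             for weight, word in TERM.findall(line[header.end():])]
    return int(header.group(1)), float(header.group(2)), terms


# Shortened copy of the last 'topic #1' line from the log (first four terms only).
sample = ('INFO:gensim.models.lsimodel:topic #1(83.131): -0.447*"voïvodie" + '
          '-0.386*"village" + -0.373*"powiat" + -0.244*"gmina"')

topic_id, value, terms = parse_topic_line(sample)
print(topic_id, value, terms[0])
```

Running the parser over every topic line of a saved log would give one record per topic and chunk, which makes it possible to check, for instance, how little the top words of topic #0 move over the last chunks.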
Content source: bayesimpact/bob-emploi