Changes between Version 1 and Version 2 of OwlimSe
- Timestamp:
- 2012/03/07 15:17:53 (13 years ago)
Legend:
- Unmodified
- Added
- Removed
- Modified
OwlimSe
--- OwlimSe (v1)
+++ OwlimSe (v2)
@@ -17,20 +17,29 @@
 
 {{{
-NumberOfBuffers = 6500000
-MaxDirtyBuffers = 5000000
-AsyncQueryMaxThreads = 18
-ThreadsPerQuery = 18
-IndexTreeMaps = 512
-ThreadCleanupInterval = 1
-ResourcesCleanupInterval = 1
+JVMSetting:
+-Xmx55G -Xms30G -XX:+UseG1GC -XX:+TieredCompilation
+-Druleset=empty -Dentity-index-size=1147483647 -Dcache-memory=16645m -Dtuple-index-memory=15G -DenablePredicateList=false -DftsIndexPolicy=never -Dbuild-pcsot=false -Dbuild-ptsoc=false -Djournaling=true -Drepository-type=file-repository -Dentity-id-size=32
 }}}
 
-
-For more information, please refer to [http://docs.openlinksw.com/virtuoso/databaseadmsrv.html] [http://www.openlinksw.com/weblog/oerling/?id=1665]
+For more information, please refer to [http://docs.openlinksw.com/virtuoso/databaseadmsrv.html]
 
 
 === Load Performance === #load
 
-46mins22secs
+'''Approach 1:'''
+
+'''Approach 2:'''
+
+The idea is from UniProt, which uses OWLIM as a library as follows:
+
+Basically, they have one specific loader program in which one Java thread reads the triples into a blocking queue. A number of other threads then take triples from that queue and insert the data into OWLIM-SE (or any other Sesame API compatible triplestore), normally one inserting thread per OWLIM file-repository fragment. The inserter threads use transactions that commit every half a million statements. The basic idea is to add statements, not files.
+
+final org.openrdf.model.Statement sesameStatement = getSesameStatement(object);
+
+//Takes one from the blocking queue filled by the other thread
+
+connection.add(sesameStatement, graph);
+
+and every millionth statement, do connection.commit();
 
 
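The queue-based loader described under Approach 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the UniProt loader itself: `StatementSink`, `CountingSink`, `load`, and the string statements are hypothetical stand-ins so that the example runs with the JDK alone. In a real loader the sink would presumably wrap a Sesame repository connection, with `connection.add(sesameStatement, graph)` and `connection.commit()` in place of the two interface methods. The page mentions both half a million and one million statements per commit, so the interval is left as a parameter here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class QueueLoaderSketch {

    /** Hypothetical stand-in for a Sesame-compatible connection (add + commit). */
    interface StatementSink {
        void add(String statement);
        void commit();
    }

    /** Sink that only counts calls, so the sketch runs without a triplestore. */
    static final class CountingSink implements StatementSink {
        final AtomicLong added = new AtomicLong();
        final AtomicLong commits = new AtomicLong();
        public void add(String statement) { added.incrementAndGet(); }
        public void commit() { commits.incrementAndGet(); }
    }

    // End-of-input marker ("poison pill"), compared by identity, one per inserter.
    private static final String END = new String("END");

    /**
     * The calling thread plays the reader and fills the queue; one inserter
     * thread per sink drains it and commits every commitEvery statements.
     */
    static void load(Iterable<String> statements,
                     List<? extends StatementSink> sinks,
                     int commitEvery,
                     int queueCapacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueCapacity);
        List<Thread> inserters = new ArrayList<>();
        for (StatementSink sink : sinks) {
            Thread t = new Thread(() -> {
                long pending = 0;
                try {
                    for (String s = queue.take(); s != END; s = queue.take()) {
                        sink.add(s);
                        if (++pending == commitEvery) {
                            sink.commit();
                            pending = 0;
                        }
                    }
                    if (pending > 0) {
                        sink.commit(); // flush the final partial transaction
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            inserters.add(t);
        }
        try {
            for (String s : statements) {
                queue.put(s); // blocks while the queue is full
            }
            for (int i = 0; i < inserters.size(); i++) {
                queue.put(END);
            }
            for (Thread t : inserters) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("loader interrupted", e);
        }
    }

    public static void main(String[] args) {
        List<String> statements = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            statements.add("<s" + i + "> <p> <o> .");
        }
        CountingSink first = new CountingSink();
        CountingSink second = new CountingSink();
        load(statements, Arrays.asList(first, second), 100, 64);
        System.out.println("added=" + (first.added.get() + second.added.get())
                + " commits=" + (first.commits.get() + second.commits.get()));
    }
}
```

The bounded queue keeps the reader from running far ahead of the inserters, and giving each inserter its own sink mirrors the "one inserting thread per fragment" arrangement described above. Error handling is deliberately minimal: if an inserter fails, the reader in this sketch could block on a full queue.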