Version 2 (updated by wu, 13 years ago) |
---|
OwlimSE configuration
JVMSetting: -Xmx55G -Xms30G -XX:+UseG1GC -XX:+TieredCompilation -Druleset=empty -Dentity-index-size=1147483647 -Dcache-memory=16645m -Dtuple-index-memory=15G -DenablePredicateList=false -DftsIndexPolicy=never -Dbuild-pcsot=false -Dbuild-ptsoc=false -Djournaling=true -Drepository-type=file-repository -Dentity-id-size=32
For more information, please refer to http://docs.openlinksw.com/virtuoso/databaseadmsrv.html
Load Performance
Approach 1:
Approach 2:
The idea comes from UniProt, which uses OWLIM as a library as follows:
They have a dedicated loader program in which one Java thread reads the triples into a blocking queue. A number of other threads then take triples from that queue and insert them into OWLIM-SE (or any other Sesame API compatible triplestore), normally one inserting thread per OWLIM file-repository fragment. The inserter threads use transactions that commit every half a million statements. The basic idea is to add statements, not files.
// Take one item from the blocking queue filled by the reader thread
final org.openrdf.model.Statement sesameStatement = getSesameStatement(object);
connection.add(sesameStatement, graph);
// At every millionth statement, call connection.commit();
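The reader/inserter pattern described above can be sketched in plain Java as follows. This is only a minimal illustration, not the UniProt loader itself: the `TripleSink` interface is a hypothetical stand-in for the `add`/`commit` part of a Sesame connection, statements are plain strings instead of `org.openrdf.model.Statement` objects, and the queue capacity, the `load` method and the `commitEvery` parameter are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

class QueueLoaderSketch {

    /** Hypothetical stand-in for the add/commit part of a Sesame connection. */
    interface TripleSink {
        void add(String statement) throws Exception;
        void commit() throws Exception;
    }

    // Unique sentinel that tells an inserter thread to stop (compared by identity).
    private static final String END = new String("<end-of-input>");

    /** The calling thread fills the queue; one inserter thread per sink drains it. */
    static long load(Iterable<String> triples, List<TripleSink> sinks, int commitEvery)
            throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000); // assumed capacity
        AtomicLong inserted = new AtomicLong();
        AtomicReference<Exception> failure = new AtomicReference<>();
        List<Thread> inserters = new ArrayList<>();

        for (TripleSink sink : sinks) {
            Thread worker = new Thread(() -> {
                long pending = 0; // statements added since the last commit
                try {
                    for (String s = queue.take(); s != END; s = queue.take()) {
                        // After a failure, keep draining so the reader never blocks.
                        if (failure.get() != null) continue;
                        try {
                            sink.add(s);
                            inserted.incrementAndGet();
                            if (++pending == commitEvery) {
                                sink.commit();
                                pending = 0;
                            }
                        } catch (Exception e) {
                            failure.compareAndSet(null, e);
                        }
                    }
                    // Commit the last, partially filled transaction.
                    if (pending > 0 && failure.get() == null) sink.commit();
                } catch (Exception e) {
                    failure.compareAndSet(null, e);
                }
            });
            worker.start();
            inserters.add(worker);
        }

        for (String triple : triples) queue.put(triple);             // reader side
        for (int i = 0; i < sinks.size(); i++) queue.put(END);       // one sentinel per inserter
        for (Thread worker : inserters) worker.join();

        if (failure.get() != null) throw failure.get();
        return inserted.get();
    }

    /** In-memory sink used only for the demonstration in main. */
    static class MemorySink implements TripleSink {
        final List<String> stored = Collections.synchronizedList(new ArrayList<>());
        public void add(String statement) { stored.add(statement); }
        public void commit() { /* a real connection would commit its transaction here */ }
    }

    public static void main(String[] args) throws Exception {
        List<String> triples = Arrays.asList(
                "<s1> <p> <o1> .", "<s2> <p> <o2> .", "<s3> <p> <o3> .");
        List<TripleSink> sinks = Arrays.asList(new MemorySink(), new MemorySink());
        long n = load(triples, sinks, 2);
        System.out.println("inserted " + n + " statements");
    }
}
```

In a real loader each sink would presumably wrap its own repository connection, and `commitEvery` would be set to the batch size mentioned above (half a million or a million statements). The bounded queue keeps the reader from running far ahead of the inserters, which limits memory use during large uploads.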
Allie upload
PDBJ upload
Uniprot upload
DDBJ upload