Version 2 (updated by: wu, 13 years ago)

--

* OwlimSE configuration

* Load performance

* SPARQL query performance

OwlimSE configuration

JVM settings:
-Xmx55G -Xms30G -XX:+UseG1GC -XX:+TieredCompilation
-Druleset=empty -Dentity-index-size=1147483647 -Dcache-memory=16645m -Dtuple-index-memory=15G -DenablePredicateList=false -DftsIndexPolicy=never -Dbuild-pcsot=false -Dbuild-ptsoc=false -Djournaling=true -Drepository-type=file-repository -Dentity-id-size=32

For more information, please refer to http://docs.openlinksw.com/virtuoso/databaseadmsrv.html

Load performance

Approach 1:

Approach 2:

The idea comes from UniProt, which uses OWLIM as a library, as follows:

Basically, they have a dedicated loader program in which one Java thread reads the triples into a blocking queue. Several other threads then take triples from that queue and insert them into OWLIM-SE (or any other Sesame API-compatible triple store), normally one inserter thread per OWLIM file-repository fragment. The inserter threads use transactions that commit every half a million statements. The basic idea is to add statements, not files.

final org.openrdf.model.Statement sesameStatement = getSesameStatement(object);

// Takes one item from the blocking queue filled by the reader thread

connection.add(sesameStatement, graph);

At every millionth statement, call connection.commit();
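
The loader described above can be sketched roughly as follows. This is only a minimal illustration, not UniProt's actual program: the names QueueLoader, TripleSink, load and batchSize, as well as the queue capacity of 10,000, are assumptions made for this example. Triples are plain strings and TripleSink is a hypothetical stand-in for the add/commit calls of a Sesame connection, so the sketch runs without the Sesame libraries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: one reader thread fills a queue, one inserter thread per sink drains it. */
final class QueueLoader {

    /** Hypothetical stand-in for the add/commit calls of a Sesame connection. */
    interface TripleSink {
        void add(String triple) throws Exception;
        void commit() throws Exception;
    }

    // End-of-input marker, compared by identity so no real triple can match it.
    private static final String END = new String("END");

    /** Loads all triples, committing on each sink after every batchSize statements. */
    static void load(Iterable<String> triples, List<? extends TripleSink> sinks, long batchSize) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000); // assumed capacity
        AtomicReference<Exception> failure = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            try {
                for (String t : triples) queue.put(t);
                // One end marker per inserter so that every inserter terminates.
                for (int i = 0; i < sinks.size(); i++) queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        List<Thread> inserters = new ArrayList<>();
        for (TripleSink sink : sinks) {
            inserters.add(new Thread(() -> drain(queue, sink, batchSize, failure)));
        }

        reader.start();
        for (Thread t : inserters) t.start();
        try {
            reader.join();
            for (Thread t : inserters) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        }
        if (failure.get() != null) {
            throw new IllegalStateException("load failed", failure.get());
        }
    }

    private static void drain(BlockingQueue<String> queue, TripleSink sink,
                              long batchSize, AtomicReference<Exception> failure) {
        long pending = 0; // statements added since the last commit
        try {
            while (true) {
                String triple = queue.take();
                if (triple == END) break;
                if (failure.get() != null) continue; // keep draining so the reader never blocks
                try {
                    sink.add(triple);
                    if (++pending == batchSize) {
                        sink.commit();
                        pending = 0;
                    }
                } catch (Exception e) {
                    failure.compareAndSet(null, e);
                }
            }
            // Commit the final, partially filled batch.
            if (pending > 0 && failure.get() == null) sink.commit();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            failure.compareAndSet(null, e);
        } catch (Exception e) {
            failure.compareAndSet(null, e);
        }
    }
}
```

In a real loader each sink would presumably wrap its own repository connection, with add(triple) delegating to connection.add(sesameStatement, graph) and batchSize set to the commit interval mentioned above. The bounded queue keeps the reader from running far ahead of the inserters.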

Allie upload

PDBJ upload

Uniprot upload

DDBJ upload

SPARQL query performance

PDBJ query performance

Uniprot query performance

DDBJ query performance