 * [#Configure OwlimSE Configuration]
 * [#load Load performance]
 * [#allieload Allie upload]
 * [#pdbjload PDBJ upload]
 * [#uniprotload Uniprot upload]
 * [#ddbjload DDBJ upload]
 * [#Sparql Sparql query performance]
 * [#alliequery Allie query performance]
 * [#pdbjquery PDBJ query performance]
 * [#uniprotquery Uniprot query performance]
 * [#ddbjquery DDBJ query performance]

=== OwlimSE Configuration === #Configure

{{{
JVMSetting:
-Xmx55G
-Xms30G
-XX:+UseG1GC
-XX:+TieredCompilation
-Druleset=empty
-Dentity-index-size=1147483647
-Dcache-memory=16645m
-Dtuple-index-memory=15G
-DenablePredicateList=false
-DftsIndexPolicy=never
-Dbuild-pcsot=false
-Dbuild-ptsoc=false
-Djournaling=true
-Drepository-type=file-repository
-Dentity-id-size=32
}}}

For more information, please refer to [http://docs.openlinksw.com/virtuoso/databaseadmsrv.html].

=== Load Performance === #load

'''Approach 1:'''

'''Approach 2:'''

The idea comes from UniProt, which uses OWLIM as a library, as follows.

They have one dedicated loader program in which a single Java thread reads the triples into a blocking queue. A number of other threads then take triples from that queue and insert the data into OWLIM-SE (or any other Sesame API compatible triplestore). Normally there is one inserting thread per OWLIM file-repository fragment. The inserter threads use transactions that commit every half a million statements.

The basic principle is to add statements, not files:

{{{
final org.openrdf.model.Statement sesameStatement = getSesameStatement(object); // Takes one from the blocking queue filled by the other thread
connection.add(sesameStatement, graph);
// ... and at every millionth statement, do:
connection.commit();
}}}

=== Allie upload === #allieload

=== PDBJ upload === #pdbjload

=== Uniprot upload === #uniprotload

=== DDBJ upload === #ddbjload

=== Sparql query performance === #Sparql

=== Allie query performance === #alliequery

=== PDBJ query performance === #pdbjquery

=== Uniprot query performance === #uniprotquery

=== DDBJ query performance === #ddbjqu