== Triple Store Survey for Life Science Data ==

 * [#overview Overview]
 * [#platform Platform]
 * [#data Data]
 * Database
   * Bigdata [wiki:bigdata => Bigdata ]
   * 4store [wiki:4store => 4store ]
   * Virtuoso [wiki:Virtuoso => Virtuoso ]
   * Owlim-se [wiki:OwlimSe => OwlimSe ]
   * Mulgara [wiki:Mulgara => Mulgara ]
 * Summary [wiki:summarize => summarize]

=== Overview === #overview

=== Platform === #platform

 * Machine:
   * OS: GNU/Linux
   * CPU: GenuineIntel 6; model name: Intel(R) Xeon(R) CPU E5649 @ 2.53GHz; 12 cores, 24 threads with hyper-threading
   * Mem: 65996128 kB (about 63 GiB)
   * Hard disk: SCSI RAID 0 (three 1 TB hard disks, two of which are used to store data)
 * Software:
   * JDK: 1.6.0_26
   * Virtuoso: 6.3 commercial
   * OwlimSE: 4.3.4238
   * Mulgara: 2.1.12
   * 4store: 1.1.4
   * Bigdata: RWSTORE_1_1_0

=== Data === #data

Allie: .n3 format, 94,420,989 triples. SPARQL queries: attachment:allie.txt.

PDBJ: .rdf.gz format, 589,987,335 triples in 77,878 files, from [ftp://ftp.pdbj.org/XML/rdf/]. SPARQL queries: attachment:pdbj.txt.

UniProt: .rdf.gz format, about 4 billion triples, from [ftp://ftp.uniprot.org/pub/databases/uniprot/]. The three largest files are uniprot.rdf.gz, uniparc.rdf.gz, and uniref.rdf.gz. The experiments used the November 2011 release. SPARQL queries: attachment:uniprot.txt or [http://beta.sparql.uniprot.org/].

DDBJ: .rdf.gz format, about 8 billion triples in 330 files, from [ftp://ftp.ddbj.nig.ac.jp/ddbj_database/ddbj/]. SPARQL queries: attachment:ddbj.txt.
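
The triple counts listed above can serve as a sanity check after loading a dataset into a store. The following is a minimal sketch of such a check, not the procedure used in this survey. It assumes the store exposes an HTTP SPARQL endpoint that supports SPARQL 1.1 aggregates and JSON results, which may not hold for every store version listed under Platform. The endpoint URL and all function names are hypothetical. Only the two datasets with exact counts are included.

```python
import json
import urllib.parse
import urllib.request

# Exact triple counts as reported in the Data section.
EXPECTED_TRIPLES = {
    "Allie": 94_420_989,
    "PDBJ": 589_987_335,
}

# Counts every triple in the default graph (SPARQL 1.1 aggregate).
COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_count_url(endpoint):
    """Return a GET URL that sends COUNT_QUERY to a SPARQL endpoint."""
    return endpoint + "?" + urllib.parse.urlencode({"query": COUNT_QUERY})


def parse_count(response_text):
    """Extract ?n from a SPARQL JSON results document."""
    doc = json.loads(response_text)
    return int(doc["results"]["bindings"][0]["n"]["value"])


def fetch_count(endpoint):
    """Run the count query against a live endpoint (needs network access)."""
    request = urllib.request.Request(
        build_count_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_count(response.read().decode("utf-8"))


def check(dataset, actual):
    """Return (matches, difference) against the count reported above."""
    expected = EXPECTED_TRIPLES[dataset]
    return actual == expected, actual - expected
```

For example, `check("Allie", fetch_count("http://localhost:8890/sparql"))` would compare a local store against the Allie figure. The URL here is only a placeholder for wherever the store under test is listening. On the larger datasets a full count can take a long time, so this is best run once after loading rather than as part of a timed benchmark.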