Changes from Version 31 to Version 32 of ~FederatedBench
- Timestamp:
- 2015/01/09 17:32:15 (10 years ago)
Legend:
- Unmodified
- Added
- Removed
- Modified
~FederatedBench
* [#platform Platform]
* [wiki:data Data & Query]

=== Overview === #overview
Most RDF data is available via SPARQL endpoints. A federated query, which retrieves RDF data across multiple sources, is indispensable for understanding biological processes, diseases, and drug development, and for biological data integration. Queries against RDF data via SPARQL endpoints can be processed over the original data sources. We evaluate whether the existing federated SPARQL endpoint query systems can satisfy the requirements, mainly response performance, of big life sciences data.

Unlike FedBench, SP2Bench, and the fine-grained evaluation of SPARQL endpoint federation systems, all of which use a simulated federated environment and either synthetic data or a subset of real data, we use real life science data and real endpoints. The result sets in those benchmarks are small and cannot show the performance when the retrieved data are big. In addition, whether SPARQL 1.1 federated queries can be executed directly, without a federated query engine, has not been reported. In this report, we test FedX, SPLENDID, ADERIS, and ANAPSID, and we also report the performance of SPARQL 1.1 queries with and without these engines.
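As a concrete illustration of the federated queries discussed above, the sketch below carries a SPARQL 1.1 query that joins data from two endpoints with `SERVICE` clauses. The endpoint URLs and triple patterns are hypothetical placeholders for illustration only, not the actual benchmark queries.

```python
# A minimal sketch of a SPARQL 1.1 federated query using SERVICE clauses.
# The endpoint URLs and triple patterns are illustrative placeholders,
# NOT the actual FederatedBench queries.
FEDERATED_QUERY = """
PREFIX up:   <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?protein ?label
WHERE {
  # First remote source: match proteins at one endpoint.
  SERVICE <https://sparql.uniprot.org/sparql> {
    ?protein a up:Protein .
  }
  # Second remote source: fetch labels for the same bindings elsewhere.
  SERVICE <https://example.org/other-endpoint/sparql> {
    ?protein rdfs:label ?label .
  }
}
LIMIT 100
"""

print(FEDERATED_QUERY)
```

Any SPARQL 1.1-capable endpoint that accepts this query will itself dispatch the two `SERVICE` blocks to the named remote endpoints, which is the "direct execution without a federated query engine" scenario the report compares against.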
||ANAPSID ||GNU Lesser GPL ||Python ||No ||available ||Yes ||predicate ||Yes ||

=== Method === #approach

We use five real biological SPARQL endpoints and designed five basic queries, varying the number of endpoints actually queried, the number of triple patterns (from 4 to 9), and the number of results (from 5 to 11,000). We also rewrite queries 3 and 5 with a "LIMIT 100" clause. To keep the server and network environment stable, we execute each query sequentially on all engines and repeat this five times; we then discard the largest value and take the average of the remaining four. To test the performance when users run federated SPARQL 1.1 queries directly at an endpoint instead of through a federated query engine, we rewrite all five queries with SERVICE keywords, also swapping the order of the two SERVICE clauses, and execute each query at one of the five endpoints.
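The timing aggregation described above (five repeated runs per query, drop the largest value, average the rest) can be sketched as follows; the function name is illustrative, not taken from the benchmark's actual tooling.

```python
def aggregate_runtime(samples):
    """Given the five measured runtimes of one query on one engine,
    discard the largest value and return the mean of the remaining
    four, as described in the Method section."""
    if len(samples) != 5:
        raise ValueError("expected exactly five repeated measurements")
    trimmed = sorted(samples)[:-1]      # drop the biggest value
    return sum(trimmed) / len(trimmed)  # average of the other four

# Example: one outlier run (50) is excluded from the average.
print(aggregate_runtime([10, 11, 12, 13, 50]))  # -> 11.5
```

Dropping only the maximum (rather than, say, both extremes) matches the procedure in the text: a single slow run is most likely caused by transient server or network load, so it is treated as noise.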