All Examples

Tables

For example, we consider several tables as shown.

Clients : which has columns of client ID (cid), name, and address
Products : which has columns of product ID (pid), name, category, and price
Orders : which has columns of dates, client ID (cid), product ID (pid), and quantity (qty)
Classes : which has only one column of class
Students : which has columns of student ID (sid), name, and class
Tests : which has columns of tests ID (id), student ID (sid), course, and score
Degrees : which has columns of student ID (sid), and degree

We insert a few records for each table as follows:

Clients
cid	name	address
1	Sato	Tokyo
2	Suzuki	Tokyo
3	Takahashi	Chiba
4	Tanaka	Ibaraki
5	Ito	Ibaraki

Products
pid	name	category	price
1	shirt	110	100
2	T-shirt	110	200
3	shorts	111	350
4	pants	111	500
5	suit	210	1000

Orders
dates	cid	pid	qty
2020-01-01	1	1	3
2020-01-01	1	2	10
2020-01-01	2	1	2
2020-01-02	1	3	5
2020-01-02	2	4	20

Classes
class
A
B
C
D
E

Students
sid	name	class
1	Sato	A
2	Suzuki	B
3	Takahashi	A
4	Tanaka	B
5	Ito	C

Tests
id	sid	course	score
1	1	Math	80
2	1	Science	75
3	2	Math	59
4	3	Science	55
5	3	Physics	92

Degrees
sid	degree
1	BSc
2	BA
3	BA
4	BSc
5	BSc

Q(simple)

Q(simple) computes, for each order ID, the sum of quantities ordered on 2020-01-01.
It can be described in Quelg as follows:

\( Q({\bf simple}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for}~(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\bf simple}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm where}~(o.{\rm dates} = \texttt{"2020-01-01"}) \\ \phantom{Q({\bf simple}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm yield}~o) \\ ~~~~~{\rm where}~~\alpha = \{({\rm qty}, {\rm SUM}, {\rm qty\_sum})\} \)
A query written in Quelg like the above query is represented in SQL as:

	  select x.pid as pid, sum(x.qty) as qty_sum
	  from (select o.*
                from orders as o
                where o.dates = "2020-01-01") as x
          group by x.pid

By using our optimization, the above SQL query can be optimized as:

	  select o.pid as pid, sum(o.qty) as qty_sum
	  from orders as o
          where o.dates = "2020-01-01"
          group by o.pid

Q(\(g_2 \circ f \circ g_1\))

Suppose we have a query \(g_2 \circ f \circ g_1\) where \(g_i\) are queries with grouping and \(f\) is a query without grouping, and \(\circ\) denotes function composition.
Let \(g_1\) be the following function.

\( g_1(t_1, t_2) = {\rm for}~(x \leftarrow \mathcal{G}_{(\{{\rm dates},{\rm pid}\}, \alpha_1)}~(t_1)) \\ \phantom{g_1(t_1, t_2) = } {\rm yield}~\{ {\rm pid} = x.{\rm pid}, \\ \phantom{g_1(t_1, t_2) = {\rm yield}~} {\rm sales} = {\rm for}~(p \leftarrow t_2) \\ \phantom{g_1(t_1, t_2) = {\rm yield}~ {\rm sales} =} {\rm where}~(x.{\rm pid} = p.{\rm pid}) \\ \phantom{g_1(t_1, t_2) = {\rm yield}~ {\rm sales} =}{\rm yield}~\{ {\rm category} = p.{\rm category}, {\rm sale} = p.{\rm price} * x.{\rm qty\_sum} \} \} \\ ~~~~~{\rm where}~~\alpha_1 = \{ ({\rm qty}, {\rm SUM}, {\rm qty\_sum}) \} \)
We can obtain a concrete query by applying \(g_1\) to two tables.
\(g_1\) first executes \( \mathcal{G}_{(\{{\rm dates},{\rm pid}\}, \alpha_1)}~(t_1) \) which computes a bag of the sum of quantity in each group classified by the value of the dates and pid field in the table \(t_1\).
Then it computes a nested data consisting of a bag of records with the fields pid and sales, and the sales field is a bag of records.
The next function \(f\) extracts the sales field of an input \(t\), which is assumed to be a bag of records, and does some computation for each record.

\( f(t) = {\rm for}~(y \leftarrow t) \\ \phantom{f(t) =} {\rm for}~(z \leftarrow y.{\rm sales}) \\ \phantom{f(t) =} {\rm yield}~\{ {\rm category} = z.{\rm category}, {\rm sale} = z.{\rm sale} * 0.8 \} \)
The third function in this series is \(g_2\), which computes the sum of sales grouped by category, and extracts summation of sales (sales_sum) and multiplies it by 100.

\( g_2(t) = {\rm for}~(v \leftarrow \mathcal{G}_{({\rm category}, \alpha_2)}(t)) \\ \phantom{g_2(t) =} {\rm yield}~\{ {\rm result} = v.{\rm sale\_sum} * 100 \} \\ ~~~~~{\rm where}~~\alpha_2 = \{ ({\rm sale}, {\rm SUM}, {\rm sale\_sum}) \} \)
We can compose the three functions and apply the result to concrete tables to obtain a concrete query.
So, Q(\(g_2 \circ f \circ g_1\)) can be written in Quelg as follows:

\( Q(g_2 \circ f \circ g_1) = (g_2 \circ f \circ g_1)~({\rm table}(``{\rm orders}"), {\rm table}(``{\rm products}")) \)

A query written in Quelg like the above query is represented in SQL as:

	  select y.sale_sum * 100 as result
	  from (select z.category as category, sum(z.sale) as sale_sum
                from (select p.category as category, p.price * x.qty_sum * 0.8 as sale
                      from (select y.dates as dates, y.pid as pid, sum(y.qty) as qty_sum
                            from (select o.*
                                  from orders as o) as y
                            group by y.dates, y.pid) as x, products as p
                      where x.pid = p.pid) as z
                group by z.category) as v

By using our optimization, the above SQL query can be optimized as:

	  select sum(z.sale) * 100 as result
	  from (select p.category as category, p.price * sum(o.qty) * 0.8 as sale
                from orders as o, products as p
                where o.pid = p.pid
                group by o.date, o.pid, p.pid, p.name, p.category, p.price) as z
	  group by z.category

Q(getSales)

Q(getSales) fetches the record with the same pid from the products table and orders table and calculates the sales.
It performs grouping based on pid as the key, and calculates the average of sales.
It can be described in Quelg as follows:

\( Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for}~(x \leftarrow {\rm for} (p \leftarrow {\rm table}(“{\rm products}”)) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow}{\rm yield} \{ {\rm order} = {\rm for}(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow {\rm yield} \{{\rm order} =}{\rm where}(p.{\rm pid} = o.{\rm pid}) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow {\rm yield} \{{\rm order} =}{\rm yield}\{ {\rm pid} = o.{\rm pid}, \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow {\rm yield} \{{\rm order} = {\rm yield} \{}{\rm sales} = p.{\rm price} * o.{\rm qty}\}\}) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm for}(y \leftarrow x.{\rm order}) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm yield}~y) \\ ~~~~~{\rm where}~~\alpha = \{({\rm sales}, {\rm AVG}, {\rm sales\_avg})\} \)
It uses a nested data structure, so we use normalization rules to normalize it into a query that can be translated directly into SQL.
A query written in Quelg like the above query is represented in SQL as:

	  select x.pid as pid, avg(x.sales) as sales_avg
	  from (select o.pid as pid, p.price * o.qty as sales
                from products as p, orders as o
                where p.pid = o.pid) as x
          group by x.pid

By using our optimization, the above SQL query can be optimized as:

	  select o.pid as pid, avg(p.price * o.qty) as sales_avg
          from products as p, orders as o
          where p.pid = o.pid
          group by o.pid

Q(getCount)

Q(getCount) performs grouping twice by nested \(\mathcal{G}\)-operators.
It groups the test tables by sid, and further groups the query with the total number of courses by the total number of courses.
Finally, it calculates the total number of course_count.
It can be described in Quelg as follows:

\( Q({\bf getCount}) = \mathcal{G}_{({\rm course\_count}, \alpha_2)}(\mathcal{G}_{({\rm sid}, \alpha_1)}({\rm for}~(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf getCount}) = \mathcal{G}_{({\rm course\_count}, \alpha_2)}(\mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm yield}~t) \\ ~~~~~{\rm where}~~\alpha_1 = \{({\rm course}, {\rm COUNT}, {\rm course\_count})\} \\ ~~~~~{\rm where}~~\alpha_2 = \{({\rm course\_count}, {\rm COUNT}, {\rm student\_count})\} \)
A query written in Quelg like the above query is represented in SQL as:

	  select y.course_count as course_count, count(y.course_count) as student_count
	  from (select x.sid as sid, count(x.course) as course_count
                from (select t.*
                      from tests as t) as x
                group by x.sid) as y
          group by y.course_count

By using our optimization, the above SQL query can be optimized as:

	  select y.course_count as course_count, count(y.course_count) as student_count
	  from (select t.sid as sid, count(t.course) as course_count
                from tests as t
                group by t.sid) as y
          group by y.course_count

Q(abstraction)

Q(abstraction) combines lambda abstraction and a \(\mathcal{G}\)-operator.
It takes the classes that match the arguments from the students table and groups them by class.
Then it returns the total number of classes.
It can be described in Quelg as follows:

\( Q({\bf abstraction})' = \lambda class. \mathcal{G}_{({\rm class}, \alpha)}({\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{Q({\bf abstraction})' = \lambda class. \mathcal{G}_{({\rm class}, \alpha)}(}{\rm where}(s.{\rm class} = class) \\ \phantom{Q({\bf abstraction})' = \lambda class. \mathcal{G}_{({\rm class}, \alpha)}(}{\rm yield}~s) \\ Q({\bf abstraction}) = Q({\bf abstraction})'~~“A” \\ ~~~~~{\rm where}~~\alpha = \{({\rm class}, {\rm COUNT}, {\rm class\_count})\} \\ \)
A query written in Quelg like the above query is represented in SQL as:

	  select x.class as class, count(x.class) as class_count
          from (select s.*
                from students as s
                where s.class = 'A') as x
          group by x.class

By using our optimization, the above SQL query can be optimized as:

	  select s.class as class, count(s.class) as class_count
          from students as s
          where s.class = 'A'
          group by s.class

Q(predicate)

Q(predicate) groups the data using the predicate given as the argument of the query.
The predicate for this query is \(\lambda x. x > 50 \), which eventually computes the total number of scores.
It can be described in Quelg as follows:

\( Q({\bf predicate})' = \lambda p.~\mathcal{G}_{({\rm course}, \alpha)}({\rm for}~(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf predicate})' = \lambda p.~\mathcal{G}_{({\rm course}, \alpha)}(}{\rm where}~(t.{\rm course} = “{\rm Math}”~\land~p(t.{\rm score})) \\ \phantom{Q({\bf predicate})' = \lambda p.~\mathcal{G}_{({\rm course}, \alpha)}(}{\rm yield}~t) \\ Q({\bf predicate}) = Q({\bf predicate})'~~(\lambda x.~x~>~50) \\ ~~~~~{\rm where}~~\alpha = \{({\rm score}, {\rm COUNT}, {\rm score\_count})\} \)
A query written in Quelg like the above query is represented in SQL as:

	  select x.course as course, count(x.score) as score_count
          from (select t.*
                from tests as t
                where t.course = 'Math' and t.score > 50) as x
          group by x.course

By using our optimization, the above SQL query can be optimized as:

	  select t.course as course, count(t.score) as score_count
          from tests as t
          where t.course = 'Math' and t.score > 50
          group by t.course

Q(getScore)

Q(getScore) used nested data structures as intermediate data in the computation with grouping.
It can be described in Quelg as follows:

\( nestedData = {\rm for}(c \leftarrow {\rm table}(”{\rm classes}”)) \\ \phantom{nestedData = }{\rm yield}~\{ {\rm class} = c.{\rm class}, \\ \phantom{nestedData = {\rm yield}~\{}{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}({\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{nestedData = {\rm yield}~\{{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm for}(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{nestedData = {\rm yield}~\{{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm where} (c.{\rm class} = s.{\rm class} \land s.{\rm sid} = t.{\rm sid}) \\ \phantom{nestedData = {\rm yield}~\{{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm yield}~\{{\rm sid} = s.{\rm sid}, {\rm score} = t.{\rm store} \})\} \\ \\ Q({\bf getScore}) = \mathcal{G}_{({\rm class}, \alpha_2)}({\rm for}(x \leftarrow nestedData) \\ \phantom{Q({\bf getScore}) = \mathcal{G}_{({\rm class}, \alpha_2)}(}{\rm for}(y \leftarrow x.{\rm students}) \\ \phantom{Q({\bf getScore}) = \mathcal{G}_{({\rm class}, \alpha_2)}(}{\rm yield}~\{{\rm class} = x.{\rm class}, {\rm score\_avg} = y.{\rm score\_avg}\}) \\ ~~~~~{\rm where}~~\alpha_1 = \{({\rm score}, {\rm AVG}, {\rm score\_avg})\} \\ ~~~~~{\rm where}~~\alpha_2 = \{({\rm score\_avg}, {\rm AVG}, {\rm score\_avg})\} \)
\(nestedData\) uses a nested data structure, so we use normalization rules to normalize it into a query that can be translated directly into SQL.
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select z.class as class, avg(z.score_avg) as score_avg
	  from (select c.class as class, y.score_avg as score_avg
	        from classes as c, lateral(select x.sid as sid, avg(x.score) as score_avg
	                                 from (select s.sid as sid, t.score as score
                                               from students as s, tests as t
                                               where c.class = s.class and s.sid = t.sid) as x
	                                 group by x.xid) as y ) as z
	  group by z.class

By using our optimization, the above SQL query can be optimized as:

	  select z.class as class, avg(z.score_avg) as score_avg
	  from (select c.class as class, avg(t.score) as score_avg
	        from classes as c, students as s, tests as t
	        group by c.class, s.sid) as z
	  group by z.class

Q(getQty)

Q(getQty) retrieves records with the same pid from the products and orders tables, then performs grouping by the name and dates fields as keys.
It calculates the total number of the products for each date.
It can be described in Quelg as follows:

\( Q({\rm getQty}) = {\rm for}(p \leftarrow {\rm table}(“{\rm products}”)) \\ \phantom{Q({\rm getQty}) =}{\rm for}(y \leftarrow \mathcal{G}_{(\{{\rm name}, {\rm dates} \}, \alpha)}({\rm for}(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\rm getQty}) = {\rm for}(y \leftarrow \mathcal{G}_{(\{{\rm name}, {\rm dates} \}, \alpha)}(}{\rm where} (p.{\rm pid} = o.{\rm pid}) \\ \phantom{Q({\rm getQty}) = {\rm for}(y \leftarrow \mathcal{G}_{(\{{\rm name}, {\rm dates} \}, \alpha)}(}{\rm yield} \{ {\rm name} = p.{\rm name}, {\rm dates} = o.{\rm dates}, {\rm qty} = o.{\rm qty} \})) \\ \phantom{Q({\rm qty\_sum}) =}{\rm yield} \{ {\rm name} = p.{\rm name}, {\rm dates} = y.{\rm dates}, {\rm qty\_sum} = y.{\rm qty\_sum} \} \\ ~~~~~{\rm where}~~\alpha = \{({\rm qty}, {\rm SUM}, {\rm qty\_sum})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select p.name as name, y.dates as dates, y.qty_sum as qty_sum
	  from products as p, lateral (select x.name as name, x.dates as dates, sum(x.qty) as qty_sum
	                               from (select p.name as name, o.dates as dates, o.qty as qty
	                                     from orders as o
                                             where p.pid = o.pid) as x
                                       group by x.name, x.dates) as y

By using our optimization, the above SQL query can be optimized as:

	  select p.name as name, o.dates as dates, sum(o.qty) as qty_sum
	  from products as p, orders as o
	  where p.pid = o.pid
	  group by p.pid, p.name, p.price, o.dates

Q(multiple)

Q(multiple) is a query in which a table and the \(\mathcal{G}\)-operator are mixed in the input of each for-constructor.
It can be described in Quelg as follows:

\( Q({\bf multiple}) = {\rm for}(c \leftarrow {\rm table}(“{\rm clients}”)) \\ \phantom{Q({\bf multiple}) =}{\rm for}(p \leftarrow {\rm table}(“{\rm products}”)) \\ \phantom{Q({\bf multiple}) =}{\rm for}(x \leftarrow \mathcal{G}_{(\{ {\rm name}, {\rm dates} \}, \alpha)}({\rm for}(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\bf multiple}) = {\rm for}(x \leftarrow \mathcal{G}_{\{ {\rm name}, {\rm dates} \}, \alpha})(}{\rm where} (c.{\rm cid} = o.{\rm cid} \land o.{\rm pid} = p.{\rm pid}) \\ \phantom{Q({\bf multiple}) = {\rm for}(x \leftarrow \mathcal{G}_{\{ {\rm name}, {\rm dates} \}, \alpha})(}{\rm yield} \{ {\rm name} = c.{\rm name}, {\rm dates} = o.{\rm dates}, {\rm sales} = p.{\rm price} * o.{\rm qty} \}) \\ \phantom{Q({\rm multiple}) =}{\rm yield} \{ {\rm name} = c.{\rm name}, {\rm dates} = y.{\rm dates}, {\rm sales\_sum} = y.{\rm sales\_sum} \} \\ ~~~~~{\rm where}~~\alpha = \{({\rm sales}, {\rm SUM}, {\rm sales\_sum})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select c.name as name, x.dates as dates, x.sales_sum as sales_sum
	  from clients as c, products as p, lateral (select y.name as name, y.dates as dates, sum(y.sales) as sales_sum
	                                             from (select c.name as name, o.dates as dates, p.price * o.qty as sales
	                                                   from orders as o
	                                                   where c.cid = o.cid and o.pid = p.pid) as y
                                                     group by y.name, y.dates) as x

By using our optimization, the above SQL query can be optimized as:

	  select c.name as name, o.dates as dates, sum(p.price * o.qty) as sales_sum
	  from clients as c, products as p, orders as o
	  where c.cid = o.cid and o.pid = p.pid
	  group by c.cid, c.name, c.address, p.pid, p.name, p.category, p.price, o.dates

Q(for-\(\mathcal{G}\))

Q(for-\(\mathcal{G}\)) has the \(\mathcal{G}\)-operator in the output of the for-constructor.
This query groups the students table, the degrees table, and the grouped queries (results from the tests table that match the sid from the students table and the tests table, grouped by degrees) with all elements of the students and degrees tables, and returns the average of their scores.
It can be described in Quelg as follows:

\( Q({\bf for-}\mathcal{G}) = {\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{Q({\bf for-}\mathcal{G}) = }{\rm for}(d \leftarrow {\rm table}(“{\rm degrees}”)) \\ \phantom{Q({\bf for-}\mathcal{G}) = }{\rm where} (s.{\rm sid} = d.{\rm sid}) \\ \phantom{Q({\bf for-}\mathcal{G}) = }\mathcal{G}_{({\rm degree}, \alpha)}({\rm for}(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf for-}\mathcal{G}) = \mathcal{G}_{({\rm degree}, \alpha)}(}{\rm where} (s.{\rm sid} = t.{\rm sid}) \\ \phantom{Q({\bf for-}\mathcal{G}) = \mathcal{G}_{({\rm degree}, \alpha)}(}{\rm yield} \{ {\rm degree} = d.{\rm degree}, {\rm score} = t.{\rm score} \}) \\ ~~~~~{\rm where}~~\alpha = \{({\rm score}, {\rm AVG}, {\rm score\_avg})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select x.degree as degree, avg(x.score) as score_avg
	  from students as s, degrees as d, lateral (select d.degree as degree, t.score as score
                                                     from tests as t
                                                     where s.sid = t.sid) as x
	  where s.sid = d.sid
	  group by s.sid, s.name, s.class, d.sid, d.degree, x.degree

By using our optimization, the above SQL query can be optimized as:

	  select d.degree as degree, avg(t.score) as score_avg
	  from students as s, degrees as d, tests as t
	  where s.sid = d.sid and s.sid = t.sid
	  group by s.sid, s.name, s.class, d.sid, d.degree

Q(compose)

Q(compose) combines two queries with grouping.
The first query takes the sid as an argument and returns a record from the students table that matches the sid.

\( Q({\bf compose})' = \lambda sid. {\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{Q({\bf compose})' = \lambda sid.}{\rm where} (s.{\rm sid} = sid) \\ \phantom{Q({\bf compose})' = \lambda sid.}{\rm yield}~s \\ \)
The second query takes Q(compose)' as an argument, extracts the records from the tests table that match the sid, groups them by the sid, and returns the average value of the score.

\( Q({\bf compose})'' = \lambda s. \mathcal{G}_{({\rm sid}, \alpha)}({\rm for}(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf compose})'' = \lambda s. \mathcal{G}_{({\rm sid}, \alpha)}(}{\rm where} (s.{\rm sid} = t.{\rm sid}) \\ \phantom{Q({\bf compose})'' = \lambda s. \mathcal{G}_{({\rm sid}, \alpha)}(}{\rm yield} \{ {\rm sid} = s.{\rm sid}, {\rm score} = t.{\rm score} \}) \\ \)
The composed query of these two queries is expressed as follows:

\( Q({\bf compose}) = \lambda x.{\rm for}(y \leftarrow Q({\bf compse})' ~x)~Q({\bf compse})'' ~y \\ ~~~~~{\rm where}~~\alpha = \{({\rm score}, {\rm AVG}, {\rm score\_avg})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select x.sid as sid, avg(x.score) as score_avg
	  from students as s, lateral (select s.sid as sid, t.score as score
                                       from tests as t
                                       where s.sid = t.sid) as x
          where s.sid = 1
          group by s.sid, s.name, s.class, x.sid

By using our optimization, the above SQL query can be optimized as:

	  select s.sid as sid, avg(t.score) as score_avg
	  from students as s, tests as t
          where s.sid = 1 and s.sid = t.sid
          group by s.sid, s.name, s.class