All Examples


Tables

For example, we consider several tables as shown.

  • Clients : which has columns of client ID (cid), name, and address
  • Products : which has columns of product ID (pid), name, category, and price
  • Orders : which has columns of dates, client ID (cid), product ID (pid), and quantity (qty)
  • Classes : which has only one column of class
  • Students : which has columns of student ID (sid), name, and class
  • Tests : which has columns of tests ID (id), student ID (sid), course, and score
  • Degrees : which has columns of student ID (sid), and degree

We insert a few records for each table as follows:

Clients
cid
name
address
1 Sato Tokyo
2 Suzuki Tokyo
3 Takahashi Chiba
4 Tanaka Ibaraki
5 Ito Ibaraki
Products
pid
name
category
price
1 shirt 110 100
2 T-shirt 110 200
3 shorts 111 350
4 pants 111 500
5 suit 210 1000
Orders
dates
cid
pid
qty
2020-01-01 1 1 3
2020-01-01 1 2 10
2020-01-01 2 1 2
2020-01-02 1 3 5
2020-01-02 2 4 20
Classes
class
A
B
C
D
E
Students
sid
name
class
1 Sato A
2 Suzuki B
3 Takahashi A
4 Tanaka B
5 Ito C
Tests
id
sid
course
score
1 1 Math 80
2 1 Science 75
3 2 Math 59
4 3 Science 55
5 3 Physics 92
Degrees
sid
degree
1 BSc
2 BA
3 BA
4 BSc
5 BSc

Q(simple)

Q(simple) computes, for each order ID, the sum of quantities ordered on 2020-01-01.
It can be described in Quelg as follows:

\( Q({\bf simple}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for}~(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\bf simple}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm where}~(o.{\rm dates} = \texttt{"2020-01-01"}) \\ \phantom{Q({\bf simple}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm yield}~o) \\ ~~~~~{\rm where}~~\alpha = \{({\rm qty}, {\rm SUM}, {\rm qty\_sum})\} \)
A query written in Quelg like the above query is represented in SQL as:

	  select x.pid as pid, sum(x.qty) as qty_sum
	  from (select o.*
                from orders as o
                where o.dates = "2020-01-01") as x
          group by x.pid 

By using our optimization, the above SQL query can be optimized as:

	  select o.pid as pid, sum(o.qty) as qty_sum
	  from orders as o
          where o.dates = "2020-01-01"
          group by o.pid 

Q(\(g_2 \circ f \circ g_1\))

Suppose we have a query \(g_2 \circ f \circ g_1\) where \(g_i\) are queries with grouping and \(f\) is a query without grouping, and \(\circ\) denotes function composition.
Let \(g_1\) be the following function.

\( g_1(t_1, t_2) = {\rm for}~(x \leftarrow \mathcal{G}_{(\{{\rm dates},{\rm pid}\}, \alpha_1)}~(t_1)) \\ \phantom{g_1(t_1, t_2) = } {\rm yield}~\{ {\rm pid} = x.{\rm pid}, \\ \phantom{g_1(t_1, t_2) = {\rm yield}~} {\rm sales} = {\rm for}~(p \leftarrow t_2) \\ \phantom{g_1(t_1, t_2) = {\rm yield}~ {\rm sales} =} {\rm where}~(x.{\rm pid} = p.{\rm pid}) \\ \phantom{g_1(t_1, t_2) = {\rm yield}~ {\rm sales} =}{\rm yield}~\{ {\rm category} = p.{\rm category}, {\rm sale} = p.{\rm price} * x.{\rm qty\_sum} \} \} \\ ~~~~~{\rm where}~~\alpha_1 = \{ ({\rm qty}, {\rm SUM}, {\rm qty\_sum}) \} \)
We can obtain a concrete query by applying \(g_1\) to two tables.
\(g_1\) first executes \( \mathcal{G}_{(\{{\rm dates},{\rm pid}\}, \alpha_1)}~(t_1) \) which computes a bag of the sum of quantity in each group classified by the value of the dates and pid field in the table \(t_1\).
Then it computes a nested data consisting of a bag of records with the fields pid and sales, and the sales field is a bag of records.
The next function \(f\) extracts the sales field of an input \(t\), which is assumed to be a bag of records, and does some computation for each record.

\( f(t) = {\rm for}~(y \leftarrow t) \\ \phantom{f(t) =} {\rm for}~(z \leftarrow y.{\rm sales}) \\ \phantom{f(t) =} {\rm yield}~\{ {\rm category} = z.{\rm category}, {\rm sale} = z.{\rm sale} * 0.8 \} \)
The third function in this series is \(g_2\), which computes the sum of sales grouped by category, and extracts summation of sales (sales_sum) and multiplies it by 100.

\( g_2(t) = {\rm for}~(v \leftarrow \mathcal{G}_{({\rm category}, \alpha_2)}(t)) \\ \phantom{g_2(t) =} {\rm yield}~\{ {\rm result} = v.{\rm sale\_sum} * 100 \} \\ ~~~~~{\rm where}~~\alpha_2 = \{ ({\rm sale}, {\rm SUM}, {\rm sale\_sum}) \} \)
We can compose the three functions and apply the result to concrete tables to obtain a concrete query.
So, Q(\(g_2 \circ f \circ g_1\)) can be written in Quelg as follows:

\( Q(g_2 \circ f \circ g_1) = (g_2 \circ f \circ g_1)~({\rm table}(``{\rm orders}"), {\rm table}(``{\rm products}")) \)

A query written in Quelg like the above query is represented in SQL as:

	  select y.sale_sum * 100 as result
	  from (select z.category as category, sum(z.sale) as sale_sum
                from (select p.category as category, p.price * x.qty_sum * 0.8 as sale
                      from (select y.dates as dates, y.pid as pid, sum(y.qty) as qty_sum
                            from (select o.*
                                  from orders as o) as y
                            group by y.dates, y.pid) as x, products as p
                      where x.pid = p.pid) as z
                group by z.category) as v 

By using our optimization, the above SQL query can be optimized as:

	  select sum(z.sale) * 100 as result
	  from (select p.category as category, p.price * sum(o.qty) * 0.8 as sale
                from orders as o, products as p
                where o.pid = p.pid
                group by o.date, o.pid, p.pid, p.name, p.category, p.price) as z
	  group by z.category 

Q(getSales)

Q(getSales) fetches the record with the same pid from the products table and orders table and calculates the sales.
It performs grouping based on pid as the key, and calculates the average of sales.
It can be described in Quelg as follows:

\( Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for}~(x \leftarrow {\rm for} (p \leftarrow {\rm table}(“{\rm products}”)) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow}{\rm yield} \{ {\rm order} = {\rm for}(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow {\rm yield} \{{\rm order} =}{\rm where}(p.{\rm pid} = o.{\rm pid}) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow {\rm yield} \{{\rm order} =}{\rm yield}\{ {\rm pid} = o.{\rm pid}, \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}({\rm for} (x \leftarrow {\rm yield} \{{\rm order} = {\rm yield} \{}{\rm sales} = p.{\rm price} * o.{\rm qty}\}\}) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm for}(y \leftarrow x.{\rm order}) \\ \phantom{Q({\bf getSales}) = \mathcal{G}_{({\rm pid}, \alpha)}(}{\rm yield}~y) \\ ~~~~~{\rm where}~~\alpha = \{({\rm sales}, {\rm AVG}, {\rm sales\_avg})\} \)
It uses a nested data structure, so we use normalization rules to normalize it into a query that can be translated directly into SQL.
A query written in Quelg like the above query is represented in SQL as:

	  select x.pid as pid, avg(x.sales) as sales_avg
	  from (select o.pid as pid, p.price * o.qty as sales
                from products as p, orders as o
                where p.pid = o.pid) as x
          group by x.pid 

By using our optimization, the above SQL query can be optimized as:

	  select o.pid as pid, avg(p.price * o.qty) as sales_avg
          from products as p, orders as o
          where p.pid = o.pid
          group by o.pid 

Q(getCount)

Q(getCount) performs grouping twice by nested \(\mathcal{G}\)-operators.
It groups the test tables by sid, and further groups the query with the total number of courses by the total number of courses.
Finally, it calculates the total number of course_count.
It can be described in Quelg as follows:

\( Q({\bf getCount}) = \mathcal{G}_{({\rm course\_count}, \alpha_2)}(\mathcal{G}_{({\rm sid}, \alpha_1)}({\rm for}~(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf getCount}) = \mathcal{G}_{({\rm course\_count}, \alpha_2)}(\mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm yield}~t) \\ ~~~~~{\rm where}~~\alpha_1 = \{({\rm course}, {\rm COUNT}, {\rm course\_count})\} \\ ~~~~~{\rm where}~~\alpha_2 = \{({\rm course\_count}, {\rm COUNT}, {\rm student\_count})\} \)
A query written in Quelg like the above query is represented in SQL as:

	  select y.course_count as course_count, count(y.course_count) as student_count
	  from (select x.sid as sid, count(x.course) as course_count
                from (select t.*
                      from tests as t) as x
                group by x.sid) as y
          group by y.course_count 

By using our optimization, the above SQL query can be optimized as:

	  select y.course_count as course_count, count(y.course_count) as student_count
	  from (select t.sid as sid, count(t.course) as course_count
                from tests as t
                group by t.sid) as y
          group by y.course_count 

Q(abstraction)

Q(abstraction) combines lambda abstraction and a \(\mathcal{G}\)-operator.
It takes the classes that match the arguments from the students table and groups them by class.
Then it returns the total number of classes.
It can be described in Quelg as follows:

\( Q({\bf abstraction})' = \lambda class. \mathcal{G}_{({\rm class}, \alpha)}({\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{Q({\bf abstraction})' = \lambda class. \mathcal{G}_{({\rm class}, \alpha)}(}{\rm where}(s.{\rm class} = class) \\ \phantom{Q({\bf abstraction})' = \lambda class. \mathcal{G}_{({\rm class}, \alpha)}(}{\rm yield}~s) \\ Q({\bf abstraction}) = Q({\bf abstraction})'~~“A” \\ ~~~~~{\rm where}~~\alpha = \{({\rm class}, {\rm COUNT}, {\rm class\_count})\} \\ \)
A query written in Quelg like the above query is represented in SQL as:

	  select x.class as class, count(x.class) as class_count
          from (select s.*
                from students as s
                where s.class = 'A') as x
          group by x.class 

By using our optimization, the above SQL query can be optimized as:

	  select s.class as class, count(s.class) as class_count
          from students as s
          where s.class = 'A'
          group by s.class 

Q(predicate)

Q(predicate) groups the data using the predicate given as the argument of the query.
The predicate for this query is \(\lambda x. x > 50 \), which eventually computes the total number of scores.
It can be described in Quelg as follows:

\( Q({\bf predicate})' = \lambda p.~\mathcal{G}_{({\rm course}, \alpha)}({\rm for}~(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf predicate})' = \lambda p.~\mathcal{G}_{({\rm course}, \alpha)}(}{\rm where}~(t.{\rm course} = “{\rm Math}”~\land~p(t.{\rm score})) \\ \phantom{Q({\bf predicate})' = \lambda p.~\mathcal{G}_{({\rm course}, \alpha)}(}{\rm yield}~t) \\ Q({\bf predicate}) = Q({\bf predicate})'~~(\lambda x.~x~>~50) \\ ~~~~~{\rm where}~~\alpha = \{({\rm score}, {\rm COUNT}, {\rm score\_count})\} \)
A query written in Quelg like the above query is represented in SQL as:

	  select x.course as course, count(x.score) as score_count
          from (select t.*
                from tests as t
                where t.course = 'Math' and t.score > 50) as x
          group by x.course 

By using our optimization, the above SQL query can be optimized as:

	  select t.course as course, count(t.score) as score_count
          from tests as t
          where t.course = 'Math' and t.score > 50
          group by t.course 

Q(getScore)

Q(getScore) used nested data structures as intermediate data in the computation with grouping.
It can be described in Quelg as follows:

\( nestedData = {\rm for}(c \leftarrow {\rm table}(”{\rm classes}”)) \\ \phantom{nestedData = }{\rm yield}~\{ {\rm class} = c.{\rm class}, \\ \phantom{nestedData = {\rm yield}~\{}{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}({\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{nestedData = {\rm yield}~\{{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm for}(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{nestedData = {\rm yield}~\{{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm where} (c.{\rm class} = s.{\rm class} \land s.{\rm sid} = t.{\rm sid}) \\ \phantom{nestedData = {\rm yield}~\{{\rm students} = \mathcal{G}_{({\rm sid}, \alpha_1)}(}{\rm yield}~\{{\rm sid} = s.{\rm sid}, {\rm score} = t.{\rm store} \})\} \\ \\ Q({\bf getScore}) = \mathcal{G}_{({\rm class}, \alpha_2)}({\rm for}(x \leftarrow nestedData) \\ \phantom{Q({\bf getScore}) = \mathcal{G}_{({\rm class}, \alpha_2)}(}{\rm for}(y \leftarrow x.{\rm students}) \\ \phantom{Q({\bf getScore}) = \mathcal{G}_{({\rm class}, \alpha_2)}(}{\rm yield}~\{{\rm class} = x.{\rm class}, {\rm score\_avg} = y.{\rm score\_avg}\}) \\ ~~~~~{\rm where}~~\alpha_1 = \{({\rm score}, {\rm AVG}, {\rm score\_avg})\} \\ ~~~~~{\rm where}~~\alpha_2 = \{({\rm score\_avg}, {\rm AVG}, {\rm score\_avg})\} \)
\(nestedData\) uses a nested data structure, so we use normalization rules to normalize it into a query that can be translated directly into SQL.
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select z.class as class, avg(z.score_avg) as score_avg
	  from (select c.class as class, y.score_avg as score_avg
	        from classes as c, lateral(select x.sid as sid, avg(x.score) as score_avg
	                                 from (select s.sid as sid, t.score as score
                                               from students as s, tests as t
                                               where c.class = s.class and s.sid = t.sid) as x
	                                 group by x.xid) as y ) as z
	  group by z.class 

By using our optimization, the above SQL query can be optimized as:

	  select z.class as class, avg(z.score_avg) as score_avg
	  from (select c.class as class, avg(t.score) as score_avg
	        from classes as c, students as s, tests as t
	        group by c.class, s.sid) as z
	  group by z.class 

Q(getQty)

Q(getQty) retrieves records with the same pid from the products and orders tables, then performs grouping by the name and dates fields as keys.
It calculates the total number of the products for each date.
It can be described in Quelg as follows:

\( Q({\rm getQty}) = {\rm for}(p \leftarrow {\rm table}(“{\rm products}”)) \\ \phantom{Q({\rm getQty}) =}{\rm for}(y \leftarrow \mathcal{G}_{(\{{\rm name}, {\rm dates} \}, \alpha)}({\rm for}(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\rm getQty}) = {\rm for}(y \leftarrow \mathcal{G}_{(\{{\rm name}, {\rm dates} \}, \alpha)}(}{\rm where} (p.{\rm pid} = o.{\rm pid}) \\ \phantom{Q({\rm getQty}) = {\rm for}(y \leftarrow \mathcal{G}_{(\{{\rm name}, {\rm dates} \}, \alpha)}(}{\rm yield} \{ {\rm name} = p.{\rm name}, {\rm dates} = o.{\rm dates}, {\rm qty} = o.{\rm qty} \})) \\ \phantom{Q({\rm qty\_sum}) =}{\rm yield} \{ {\rm name} = p.{\rm name}, {\rm dates} = y.{\rm dates}, {\rm qty\_sum} = y.{\rm qty\_sum} \} \\ ~~~~~{\rm where}~~\alpha = \{({\rm qty}, {\rm SUM}, {\rm qty\_sum})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select p.name as name, y.dates as dates, y.qty_sum as qty_sum
	  from products as p, lateral (select x.name as name, x.dates as dates, sum(x.qty) as qty_sum
	                               from (select p.name as name, o.dates as dates, o.qty as qty
	                                     from orders as o
                                             where p.pid = o.pid) as x
                                       group by x.name, x.dates) as y 

By using our optimization, the above SQL query can be optimized as:

	  select p.name as name, o.dates as dates, sum(o.qty) as qty_sum
	  from products as p, orders as o
	  where p.pid = o.pid
	  group by p.pid, p.name, p.price, o.dates 

Q(multiple)

Q(multiple) is a query in which a table and the \(\mathcal{G}\)-operator are mixed in the input of each for-constructor.
It can be described in Quelg as follows:

\( Q({\bf multiple}) = {\rm for}(c \leftarrow {\rm table}(“{\rm clients}”)) \\ \phantom{Q({\bf multiple}) =}{\rm for}(p \leftarrow {\rm table}(“{\rm products}”)) \\ \phantom{Q({\bf multiple}) =}{\rm for}(x \leftarrow \mathcal{G}_{(\{ {\rm name}, {\rm dates} \}, \alpha)}({\rm for}(o \leftarrow {\rm table}(“{\rm orders}”)) \\ \phantom{Q({\bf multiple}) = {\rm for}(x \leftarrow \mathcal{G}_{\{ {\rm name}, {\rm dates} \}, \alpha})(}{\rm where} (c.{\rm cid} = o.{\rm cid} \land o.{\rm pid} = p.{\rm pid}) \\ \phantom{Q({\bf multiple}) = {\rm for}(x \leftarrow \mathcal{G}_{\{ {\rm name}, {\rm dates} \}, \alpha})(}{\rm yield} \{ {\rm name} = c.{\rm name}, {\rm dates} = o.{\rm dates}, {\rm sales} = p.{\rm price} * o.{\rm qty} \}) \\ \phantom{Q({\rm multiple}) =}{\rm yield} \{ {\rm name} = c.{\rm name}, {\rm dates} = y.{\rm dates}, {\rm sales\_sum} = y.{\rm sales\_sum} \} \\ ~~~~~{\rm where}~~\alpha = \{({\rm sales}, {\rm SUM}, {\rm sales\_sum})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select c.name as name, x.dates as dates, x.sales_sum as sales_sum
	  from clients as c, products as p, lateral (select y.name as name, y.dates as dates, sum(y.sales) as sales_sum
	                                             from (select c.name as name, o.dates as dates, p.price * o.qty as sales
	                                                   from orders as o
	                                                   where c.cid = o.cid and o.pid = p.pid) as y
                                                     group by y.name, y.dates) as x 

By using our optimization, the above SQL query can be optimized as:

	  select c.name as name, o.dates as dates, sum(p.price * o.qty) as sales_sum
	  from clients as c, products as p, orders as o
	  where c.cid = o.cid and o.pid = p.pid
	  group by c.cid, c.name, c.address, p.pid, p.name, p.category, p.price, o.dates 

Q(for-\(\mathcal{G}\))

Q(for-\(\mathcal{G}\)) has the \(\mathcal{G}\)-operator in the output of the for-constructor.
This query groups the students table, the degrees table, and the grouped queries (results from the tests table that match the sid from the students table and the tests table, grouped by degrees) with all elements of the students and degrees tables, and returns the average of their scores.
It can be described in Quelg as follows:

\( Q({\bf for-}\mathcal{G}) = {\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{Q({\bf for-}\mathcal{G}) = }{\rm for}(d \leftarrow {\rm table}(“{\rm degrees}”)) \\ \phantom{Q({\bf for-}\mathcal{G}) = }{\rm where} (s.{\rm sid} = d.{\rm sid}) \\ \phantom{Q({\bf for-}\mathcal{G}) = }\mathcal{G}_{({\rm degree}, \alpha)}({\rm for}(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf for-}\mathcal{G}) = \mathcal{G}_{({\rm degree}, \alpha)}(}{\rm where} (s.{\rm sid} = t.{\rm sid}) \\ \phantom{Q({\bf for-}\mathcal{G}) = \mathcal{G}_{({\rm degree}, \alpha)}(}{\rm yield} \{ {\rm degree} = d.{\rm degree}, {\rm score} = t.{\rm score} \}) \\ ~~~~~{\rm where}~~\alpha = \{({\rm score}, {\rm AVG}, {\rm score\_avg})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select x.degree as degree, avg(x.score) as score_avg
	  from students as s, degrees as d, lateral (select d.degree as degree, t.score as score
                                                     from tests as t
                                                     where s.sid = t.sid) as x
	  where s.sid = d.sid
	  group by s.sid, s.name, s.class, d.sid, d.degree, x.degree 

By using our optimization, the above SQL query can be optimized as:

	  select d.degree as degree, avg(t.score) as score_avg
	  from students as s, degrees as d, tests as t
	  where s.sid = d.sid and s.sid = t.sid
	  group by s.sid, s.name, s.class, d.sid, d.degree 

Q(compose)

Q(compose) combines two queries with grouping.
The first query takes the sid as an argument and returns a record from the students table that matches the sid.

\( Q({\bf compose})' = \lambda sid. {\rm for}(s \leftarrow {\rm table}(“{\rm students}”)) \\ \phantom{Q({\bf compose})' = \lambda sid.}{\rm where} (s.{\rm sid} = sid) \\ \phantom{Q({\bf compose})' = \lambda sid.}{\rm yield}~s \\ \)
The second query takes Q(compose)' as an argument, extracts the records from the tests table that match the sid, groups them by the sid, and returns the average value of the score.

\( Q({\bf compose})'' = \lambda s. \mathcal{G}_{({\rm sid}, \alpha)}({\rm for}(t \leftarrow {\rm table}(“{\rm tests}”)) \\ \phantom{Q({\bf compose})'' = \lambda s. \mathcal{G}_{({\rm sid}, \alpha)}(}{\rm where} (s.{\rm sid} = t.{\rm sid}) \\ \phantom{Q({\bf compose})'' = \lambda s. \mathcal{G}_{({\rm sid}, \alpha)}(}{\rm yield} \{ {\rm sid} = s.{\rm sid}, {\rm score} = t.{\rm score} \}) \\ \)
The composed query of these two queries is expressed as follows:

\( Q({\bf compose}) = \lambda x.{\rm for}(y \leftarrow Q({\bf compse})' ~x)~Q({\bf compse})'' ~y \\ ~~~~~{\rm where}~~\alpha = \{({\rm score}, {\rm AVG}, {\rm score\_avg})\} \\ \)
This query is converted as a correlated subquery in SQL, which makes it a very inefficient query.
A query written in Quelg like the above query is represented in SQL as:

	  select x.sid as sid, avg(x.score) as score_avg
	  from students as s, lateral (select s.sid as sid, t.score as score
                                       from tests as t
                                       where s.sid = t.sid) as x
          where s.sid = 1
          group by s.sid, s.name, s.class, x.sid 

By using our optimization, the above SQL query can be optimized as:

	  select s.sid as sid, avg(t.score) as score_avg
	  from students as s, tests as t
          where s.sid = 1 and s.sid = t.sid
          group by s.sid, s.name, s.class