Commit graph

50 commits

Author SHA1 Message Date
sin-ack 3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Idan Horowitz 086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Timothy Flynn bfe1bd9726 LibSQL: Convert binary SQL operations to be fallible
Now that expression evaluation can use TRY, we can allow binary operator
methods to fail as well. This also fixes a few instances of converting a
Value to a double when we meant to convert to an integer.
2022-02-13 21:30:38 +00:00
Timothy Flynn e13c96157c LibSQL: Implement converting float and tuple values to a boolean 2022-02-13 21:30:38 +00:00
Timothy Flynn 2397836f8e LibSQL+SQLServer: Introduce and use ResultOr<ValueType>
The result of a SQL statement execution is either:
    1. An error.
    2. The list of rows inserted, deleted, selected, etc.

(2) is currently represented by a combination of the Result class and
the ResultSet list it holds. This worked okay, but issues start to
arise when trying to use Result in non-statement contexts (for example,
when introducing Result to SQL expression execution).

What we really need is for Result to be a thin wrapper that represents
both (1) and (2), and to not have any explicit members like a ResultSet.
So this commit removes ResultSet from Result, and introduces ResultOr,
which is just an alias for AK::ErrorOrr. Statement execution now returns
ResultOr<ResultSet> instead of Result. This further opens the door for
expression execution to return ResultOr<Value> in the future.

Lastly, this moves some other context held by Result over to ResultSet.
This includes the row count (which is really just the size of ResultSet)
and the command for which the result is for.
2022-02-10 23:11:13 +01:00
Timothy Flynn f1f0770d68 LibSQL: Do not crash when SELECTing from an empty table
The crash was caused by getting the first element of an empty vector.
2022-02-10 12:20:35 +00:00
Timothy Flynn 373b467302 LibSQL+SQLServer: Move LibSQL/SQLResult.[h,cpp] to LibSQL/Result.[h,cpp]
Rename the file to match the new class name.
2022-02-10 12:20:35 +00:00
Timothy Flynn 6620f19979 LibSQL+SQLServer: Return the new Result class from statement executions
We can now TRY anything that returns a SQL::Result or an AK::Error.
2022-02-10 12:20:35 +00:00
Mahmoud Mandour 794d79e315 LibSQL: Implement DESCRIBE TABLE tests 2022-02-05 00:35:03 +01:00
Timothy Flynn 6efbafa6e0 Everywhere: Update copyrights with my new serenityos.org e-mail :^) 2022-01-31 18:23:22 +00:00
mnlrsn 66216d3af6 LibSQL: Add simple REGEXP match
The implementation of LIKE uses regexes under the hood, and this
implementation of REGEXP takes the same approach. It employs
PosixExtended from LibRegex with case insensitive and Unicode flags
set. The implementation of LIKE is based on SQLlite specs, but SQLlite
does not offer directions for a built-in regex functionality, so this
one uses LibRegex.
2022-01-23 22:34:53 +03:30
Jan de Visser 6e9f06fc9f LibSQL: Introduce SELECT ... LIMIT xxx OFFSET yyy
What it says on the tin.
2022-01-16 11:17:15 +01:00
Jan de Visser 7fc901d1b3 LibSQL+SQLServer: Implement first cut of SELECT ... ORDER BY foo
Ordering is done by replacing the straight Vector holding the query
result in the SQLResult object with a dedicated Vector subclass that
inserts result rows according to their sort key using a binary search.
This is done in the ResultSet class.

There are limitations:
- "SELECT ... ORDER BY 1" (or 2 or 3 etc) is supposed to sort by the
n-th result column. This doesn't work yet
- "SELECT ... column-expression alias ... ORDER BY alias" is supposed to
sort by the column with the given alias. This doesn't work yet

What does work however is something like
```SELECT foo FROM bar SORT BY quux```
i.e. sorted by a column not in the result set. Once functions are
supported it should be possible to sort by random functions.
2022-01-16 11:17:15 +01:00
Guilherme Gonçalves f91d471843 LibSQL: Implement LIKE SQL expressions 2022-01-07 10:50:39 +03:30
Guilherme Gonçalves e957c078d5 LibSQL: Properly parse ESCAPE expressions
The evaluation order of method parameters is unspecified in C++, and
so we couldn't rely on parse_statement() being called before
parse_escape() when building a MatchExpression.

With this patch, we explicitly parse what we need in the right order,
before building the MatchExpression object.
2022-01-07 10:50:39 +03:30
Sam Atkins 9e3a786a64 Tests: Cast unused smart-pointer return values to void 2021-12-05 15:31:03 +01:00
Jan de Visser c369626ac1 LibSQL: Gracefully react to unimplemented valid SQL
Fixes a crash that was caused by a syntax error which is difficult to
catch by the parser: usually identifiers are accepted in column lists,
but they are not in a list of column values to be inserted in an INSERT.

Fixed this by putting in a heuristic check; we probably need a better
way to do this.

Included tests for this case.

Also introduced a new SQL Error code, `NotYetImplemented`, and return
that instead of crashing when encountering unimplemented SQL.
2021-12-04 20:49:22 +03:30
Jan de Visser 001949d77a LibSQL: Improve error handling
The handling of filesystem level errors was basically non-existing or
consisting of `VERIFY_NOT_REACHED` assertions. Addressed this by
* Adding `open` methods to `Heap` and `Database` which return errors.
* Changing the interface of methods of these classes and clients
downstream to propagate these errors.

The constructors of `Heap` and `Database` don't open the underlying
filesystem file anymore.

The SQL statement handlers return an `SQLErrorCode::InternalError`
error code if an error comes back from the lower levels. Note that some
of these errors are things like duplicate index entry errors that should
be caught before the SQL layer attempts to actually update the database.

Added tests to catch attempts to open weird or non-existent files as
databases.

Finally, in between me writing this patch and submitting the PR the
AK::Result<Foo, Bar> template got deprecated in favour of ErrorOr<Foo>.
This resulted in more busywork.
2021-12-04 20:49:22 +03:30
Jan de Visser 3425730294 LibSQL: Implement table joins
This patch introduces table joins. It uses a pretty dumb algorithm-
starting with a singleton '__unity__' row consisting of a single boolean
value, a cartesian product of all tables in the 'FROM' clause is built.
This cartesian product is then filtered through the 'WHERE' clause,
again without any smarts just using brute force.

This patch required a bunch of busy work to allow for example the
ColumnNameExpression having to deal with multiple tables potentially
having columns with the same name.
2021-11-10 14:47:49 +01:00
Jan de Visser 1c50e9aadc LibSQL: Add current statement to the ExecutionContext
Because SQL is the craptastic language that it is, sometimes expressions
need to know details about the calling statement. For example the tables
in the 'FROM' clause may be needed to determine which columns are
referenced in 'WHERE' expressions. So the current statement is added
to the ExecutionContext and a new 'execute' overload on Statement is
created which takes the Database and the Statement and builds an
ExecutionContaxt from those.
2021-11-10 14:47:49 +01:00
Jan de Visser 7ea54db430 LibSQL: Add 'schema' and 'table' to TupleElementDescriptor
These are needed to distinguish columns from different tables with the
same column name in one and the same (joined) Tuple. Not quite happy
yet with this API; I think some sort of hierarchical structure would be
better but we'll burn that bridge when we get there :^)
2021-11-10 14:47:49 +01:00
Jan de Visser 7496f17620 LibSQL Tests: Add tests for SELECT ... WHERE ... 2021-10-25 12:59:42 +02:00
Mahmoud Mandour 0e5b2c923d LibSQL: Add an INSERT without column names test
This adds a passing test of an insert statement that contains no column
names and assumes full tuple input
2021-10-04 15:51:48 +02:00
Mahmoud Mandour 235573f7ba LibSQL: Test INSERT statement with wrong number of values 2021-10-04 15:51:48 +02:00
Mahmoud Mandour 4df85840c3 LibSQL: Test INSERT statement with wrong data types 2021-10-04 15:51:48 +02:00
Mahmoud Mandour 6065383e05 LibSQL: Check NoError individually in execution tests
This enables tests to check that a statement is erroneous, we could not
do so by only calling `execute()` since it asserted that no errors
occurred.
2021-10-04 15:51:48 +02:00
Ben Wiederhake 2e4ec891da Everywhere: Fix format-vulnerabilities
Command used:
grep -Pirn '(out|warn)ln\((?!["\)]|format,|stderr,|stdout,|output, ")' \
     AK Kernel/ Tests/ Userland/
(Plus some manual reviewing.)

Let's pick ArgsParser as an example:
    outln(file, m_general_help);
This will fail at runtime if the general help happens to contain braces.

Even if this transformation turns out to be unnecessary in a place or
two, this way the code is "more obviously" correct.
2021-09-11 15:16:26 +01:00
Daniel Bertalan d7b6cc6421 Everywhere: Prevent risky implicit casts of (Nonnull)RefPtr
Our existing implementation did not check the element type of the other
pointer in the constructors and move assignment operators. This meant
that some operations that would require explicit casting on raw pointers
were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` =>
  `Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>`
  in LibIMAP/Client.cpp)

This, of course, allows gross violations of the type system, and makes
the need to type-check less obvious before downcasting. Luckily, while
adding the `static_ptr_cast`s, only two truly incorrect usages were
found; in the other instances, our casts just needed to be made
explicit.
2021-09-03 23:20:23 +02:00
Andrew Kaster 58797a1289 Tests: Remove all file(GLOB) from CMakeLists in Tests
Using a file(GLOB) to find all the test files in a directory is an easy
hack to get things started, but has some drawbacks. Namely, if you add
a test, it won't be found again without re-running CMake. `ninja` seems
to do this automatically, but it would be nice to one day stop seeing it
rechecking our globbed directories.
2021-09-02 09:08:23 +02:00
Jan de Visser 85a84b0794 LibSQL: Introduce Serializer as a mediator between Heap and client code
Classes reading and writing to the data heap would communicate directly
with the Heap object, and transfer ByteBuffers back and forth with it.
This makes things like caching and locking hard. Therefore all data
persistence activity will be funneled through a Serializer object which
in turn submits it to the Heap.

Introducing this unfortunately resulted in a huge amount of churn, in
which a number of smaller refactorings got caught up as well.
2021-08-21 22:03:30 +02:00
Jan de Visser d074a601df LibSQL+SQLServer: Bare bones INSERT and SELECT statements
This patch provides very basic, bare bones implementations of the
INSERT and SELECT statements. They are *very* limited:
- The only variant of the INSERT statement that currently works is
   SELECT INTO schema.table (column1, column2, ....) VALUES
      (value11, value21, ...), (value12, value22, ...), ...
   where the values are literals.
- The SELECT statement is even more limited, and is only provided to
  allow verification of the INSERT statement. The only form implemented
  is: SELECT * FROM schema.table

These statements required a bit of change in the Statement::execute
API. Originally execute only received a Database object as parameter.
This is not enough; we now pass an ExecutionContext object which
contains the Database, the current result set, and the last Tuple read
from the database. This object will undoubtedly evolve over time.

This API change dragged SQLServer::SQLStatement into the patch.

Another API addition is Expression::evaluate. This method is,
unsurprisingly, used to evaluate expressions, like the values in the
INSERT statement.

Finally, a new test file is added: TestSqlStatementExecution, which
tests the currently implemented statements. As the number and flavour of
implemented statements grows, this test file will probably have to be
restructured.
2021-08-21 22:03:30 +02:00
Jan de Visser b74721e604 LibSQL: Redesign Value implementation and add new types
The implemtation of the Value class was based on lambda member variables
implementing type-dependent behaviour. This was done to ensure that
Values can be used as stack-only objects; the simplest alternative,
virtual methods, forces them onto the heap. The problem with the the
lambda approach is that it bloats the Values (which are supposed to be
lightweight objects) quite considerably, because every object contains
more than a dozen function pointers.

The solution to address both problems (we want Values to be able to live
on the stack and be as lightweight as possible) chosen here is to
encapsulate type-dependent behaviour and state in an implementation
class, and let the Value be an AK::Variant of those implementation
classes. All methods of Value are now basically straight delegates to
the implementation object using the Variant::visit method.

One issue complicating matters is the addition of two aggregate types,
Tuple and Array, which each contain a Vector of Values. At this point
Tuples and Arrays (and potential future aggregate types) can't contain
these aggregate types. This is limiting and needs to be addressed.

Another area that needs attention is the nomenclature of things; it's
a bit of a tangle of 'ValueBlahBlah' and 'ImplBlahBlah'. It makes sense
right now I think but admit we probably can do better.

Other things included here:
- Added the Boolean and Null types (and Tuple and Array, see above).
- to_string now always succeeds and returns a String instead of an
  Optional. This had some impact on other sources.
- Added a lot of tests.
- Started moving the serialization mechanism more towards where I want
  it to be, i.e. a 'DataSerializer' object which just takes
  serialization and deserialization requests and knows for example how
  to store long strings out-of-line.

One last remark: There is obviously a naming clash between the Tuple
class and the Tuple Value type. This is intentional; I plan to make the
Tuple class a subclass of Value (and hence Key and Row as well).
2021-08-21 22:03:30 +02:00
Jan de Visser a5e28f2897 LibSQL: Make TupleDescriptor a shared pointer instead of a stack object
Tuple descriptors are basically the same for for example all rows in
a table. Makes sense to share them instead of copying them for every
single row.
2021-08-21 22:03:30 +02:00
Daniel Bertalan 6821cd45ed Tests: Fix compile errors on Clang
Since Clang enables a couple of warnings that we don't have in GCC,
these were not caught before. Included fixes:

- Use correct printf format string for `size_t`
- Don't compare Nonnull(Ref|Own)Ptr` to nullptr
- Fix unsigned int& => unsigned long& conversion
2021-07-14 13:12:25 +02:00
Jan de Visser a034774e3a LibSQL+SQLServer: Build SQLServer system service
This patch introduces the SQLServer system server. This service is
supposed to be the only process/application talking to database storage.
This makes things like locking and caching more reliable, easier to
implement, and more efficient.

In LibSQL we added a client component that does the ugly IPC nitty-
gritty for you. All that's needed is setting a number of event handler
lambdas and you can connect to databases and execute statements on them.

Applications that wish to use this SQLClient class obviously need to
link LibSQL and LibIPC.
2021-07-08 17:55:59 +04:30
Jan de Visser 30691549fd LibSQL: Move Order and Nulls enums from SQL::AST to SQL namespace
The Order enum is used in the Meta component of LibSQL. Using this enum
meant having to include the monster AST/AST.h include file. Furthermore,
they are sort of basic and therefore can live in the general SQL
namespace. Moved to LibSQL/Type.h.

Also introduced a new class, SQLResult, which is needed in future
patches.
2021-07-08 17:55:59 +04:30
Jan de Visser bd5a04fffe LibSQL: Reduce run time of TestSqlDatabase
Scanning tables is a linear process using pointers in the table's
tuples, and does not involve more 'stochastic' code paths like index
traversals. Therefore the 1000 and 10000 row tests were basically
overkill and added nothing we can't find out with less rows.
2021-06-24 09:23:14 +02:00
Jan de Visser 5c4890411b LibSQL: Make lexer and parser more standard SQL compliant
SQL was standardized before there was consensus on sane language syntax
constructs had evolved. The language is mostly case-insensitive, with
unquoted text converted to upper case. Identifiers can include lower
case characters and other 'special' characters by enclosing the
identifier with double quotes. A double quote is escaped by doubling it.
Likewise, a single quote in a literal string is escaped by doubling it.

All this means that the strategy used in the lexer, where a token's
value is a StringView 'window' on the source string, does not work,
because the value needs to be massaged before being handed to the
parser. Therefore a token now has a String containing its value. Given
the limited lifetime of a token, this is acceptable overhead.

Not doing this means that for example quote removal and double quote
escaping would need to be done in the parser or in AST node
construction, which would spread lexing basically all over the place.
Which would be suboptimal.

There was some impact on the sql utility and SyntaxHighlighter component
which was addressed by storing the token's end position together with
the start position in order to properly highlight it.

Finally, reviewing the tests for parsing numeric literals revealed an
inconsistency in which tokens we accept or reject: `1a` is accepted but
`1e` is rejected. Related to this is the fate of `0x`. Added a FIXME
reminding us to address this.
2021-06-24 00:36:53 +02:00
Jan de Visser 4198f7e1af LibSQL: Move Lexer and Parser machinery to AST directory
The SQL engine is expected to be a fairly sizeable piece of software.
Therefore we're starting to restructure the codebase for growth.
2021-06-24 00:36:53 +02:00
coderdreams 49340f98f7 LibSQL: Create databases in writable directory 2021-06-22 18:54:40 +04:30
Jan de Visser 87bd69559f LibSQL: Database layer
This patch implements the beginnings of a database API allowing for the
creation of tables, inserting rows in those tables, and retrieving those
rows.
2021-06-19 22:06:45 +02:00
Jan de Visser 267eb3b329 LibSQL: Hash index implementation for the SQL storage layer
This patch implements a basic hash index. It uses the extendible hashing
algorith. Also includes a test file.
2021-06-19 22:06:45 +02:00
Jan de Visser 224804b424 LibSQL: BTree index, Heap, and Meta objects for SQL Storage layer
Unfortunately this patch is quite large.

The main functionality included are a BTree index implementation and
the Heap class which manages persistent storage.

Also included are a Key subclass of the Tuple class, which is a
specialization for index key tuples. This "dragged in" the Meta layer,
which has classes defining SQL objects like tables and indexes.
2021-06-19 22:06:45 +02:00
Jan de Visser 2a46529170 LibSQL: Basic dynamic value classes for SQL Storage layer
This patch adds the basic dynamic value classes used by the SQL Storage
layer. The most elementary class is Value, which holds a typed Value
which can be converted to standard C++ types. A Tuple is a collection
of Values described by a TupleDescriptor, which specifies the names,
types, and ordering of the elements in the Tuple.

Tuples and Values can be serialized and deserialized to and from
ByteBuffers. This is mechanism which is used to save them to disk.

Tuples are used as keys in SQL indexes and rows in SQL tables.

Also included is a test file.
2021-06-19 22:06:45 +02:00
Timothy Flynn c7cd81bce8 LibSQL: Limit the number of nested subqueries
SQLite hasn't documented a limit on https://www.sqlite.org/limits.html
for the maximum number of nested subqueries. However, its parser is
generated with Yacc and has an internal limit of 100 for general nested
statements.

Fixes https://crbug.com/oss-fuzz/35022.
2021-06-08 19:08:13 +02:00
Timothy Flynn 4e974e6d60 LibSQL: Rename expression tree depth limit test case
Meant to rename this before committing the test - 'stack_limit' isn't a
great name when there's multiple test cases for various stack overflows.
2021-06-08 19:08:13 +02:00
Timothy Flynn f8f36effc9 LibSQL: Limit the allowed depth of an expression tree
According to the definition at https://sqlite.org/lang_expr.html, SQL
expressions could be infinitely deep. For practicality, SQLite enforces
a maxiumum expression tree depth of 1000. Apply the same limit in
LibSQL to avoid stack overflow in the expression parser.

Fixes https://crbug.com/oss-fuzz/34859.
2021-06-05 23:48:18 +04:30
Timothy Flynn a870eac0eb LibSQL: Report a syntax error for unsupported LIMIT clause syntax
Rather than aborting when a LIMIT clause of the form 'LIMIT expr, expr'
is encountered, fail the parser with a syntax error. This will be nicer
for the user and fixes the following fuzzer bug:
https://crbug.com/oss-fuzz/34837
2021-06-03 08:30:13 +02:00
Timothy Flynn ab79599a5e LibSQL: Return an error for empty common table expression lists
SQL::CommonTableExpressionList is required to be non-empty. Return an
error if zero common table expressions were parsed.

Fixes #7627
2021-06-01 23:48:21 +04:30
Brian Gianforcaro 597de3356f Tests: Move LibSQL tests to Tests/LibSQL 2021-05-06 17:54:28 +02:00