1. TODO:

Static means outside of call stack, but not necessarily program-wide.

Think about having caller (c) and stack (k) scopes separately; returning c casts to local, returning k casts to k?

Think on passing scoped variables to lambda, i.e. closure. Done: it may only accept static or undefined pointers.

Variants and atomics have their own semantics on assigning.

Only classes may have traits. This allows trait methods to be mut.

Remove units, because it is not defined where to store them, i.e. in which address space.

is? and of? affect compilation, they are called on a real type instead of imaginary. All other methods are called on imaginary. But as is called on imaginary? No, on real as well.

final var = Std@rand(42, 0.5) : Variant<SBin32, FBin64>
var.abs()
# var~Numeric.abs()

Real type is constant, while imaginary type can change during runtime.

Restricting on a real type makes the imaginary type equal to the real type:

trait Drawable2D { decl draw() }
trait Drawable3D { decl draw() }

class Point
  derive Drawable2D
    impl ~draw as draw2d;
  end

  derive Drawable3D
    impl ~draw as draw3d;
  end
end

# Changing `Drawable2D` to `Drawable3D` here
# shall not change the branch contents.
def foo(x : Drawable2D)
  @typeof(x)  # => Undef~Drawable2D
  \@typeof(x) # => Point~Drawable2D

  x.draw() == x~Drawable2D.draw()

  # NOTE:
  if \@typeof(x)[0] == Point
    @typeof(x) # => Undef~Drawable2D
  end

  # BUT: (`is?` is SpEcIaL)
  if x is? Point
    @typeof(x) # => Point~Point
    x.point()
    # x.draw()
    x~Drawable2D.draw()
    x.draw2d()
    x.draw3d()
    x~Drawable3D.draw()
  end
end

Returned type may have Auto.

2. Document

It contains explanations and rationale, which are rare in the Standard. It also "speaks" to the reader (e.g. "you", "we").

3. Philosophy

Target agnosticism. The language makes no assumptions about the target. All that is known is that there is a processing unit, registers and instructions.

Onyx defines the concept of a function, abstract data structures (Array, Tuple, namespaces, trait, union, struct, class, enum, Variant, Union, Lambda, Function, Type, Block, Literal, Reference, Pointer), storages (local, caller, instance, static, undefined), lifetime, and common math types.

TODO: Only functions may be exported. Structs, enums, typedefs are externed instead.

A target may be binary, decimal or even quantum, and may or may not contain an ALU and an FPU. It is possible to query whether a target implements a given type natively. An entity is a black box until observed. Interchange formats are defined: an SBin8 is not necessarily stored in 8 bits, but its .bits method returns Bit[8], formatted in a special way.

A Pointer is just a pointer to data. It may point to memory or to a register. The size of a Pointer is undefined. However, Pointer has to($int*) methods defined, which allocate memory on the stack.

4. Design goals

Stay low-level, but provide tools for powerful abstractions. For example, a C pointer is target-dependent, whereas all we know about an Onyx pointer is its storage. We call ptr.to($int*), and the target may allocate it on the stack.

5. Notation

Keywords are written like this: \$bb"let"\$. Example identifiers are written like this: \$"foo"\$. For example, \$bb"let"\$ \$"foo"\$ = \$"bar"\$().

6. Comments

A comment begins with # and spans until the end of the line.

A comment adjacent to a member declaration or implementation statement is called documentation.

The Standard contains an informative appendix for comment styling.

An implementation is required to provide a command to generate API documentation data, e.g. nxc api -fjson -o main.json main.nx. Only documentation comments are included in the generated API data. The API data format is a normative part of the Standard, and provides specifications for C header (see Section 9, “Interoperability”), JSON, YAML, XML, MessagePack and NXAPI binary transfer formats.

6.1. Comment intrinsics

The Standard contains an informative list of comment intrinsics for special treatment.

The syntax of a comment intrinsic is :\$"intrinsic"\$(\$"args"\$):, where the argument part may be omitted if the intrinsic has zero arity.

A comment intrinsic is not expanded during API data generation, e.g. :ditto: is preserved as-is. It is the responsibility of an API data consumer to consume and handle intrinsics properly. A non-standardized intrinsic is thus not an error, e.g. :unknown: is legal during API data generation. A misused intrinsic, e.g. a :fmt: referencing a missing pattern, is also not an error during API data generation.

Intrinsics are ignored at this stage because comments are not a part of the resulting program.

The standardized comment intrinsics are listed below.

6.1.1. :ditto:

A :ditto: comment intrinsic copies documentation from the previous member in the current file.

# This is doc.
let x = 42

# This is a comment.
#

# :ditto:
let y = 42

Results in:

# This is doc.
let x = 42

# This is doc.
let y = 42
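
Since intrinsics are left for the API data consumer to handle, the :ditto: resolution above could be sketched on the consumer side in Python. This is only an illustration under assumptions: the Member record and the resolve_ditto helper are hypothetical names that do not come from the Standard, and a chain of :ditto: comments is assumed to resolve to the nearest real documentation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Member:
    """Hypothetical record for one documented member of a file."""
    name: str
    doc: str


def resolve_ditto(members):
    """Replace a `:ditto:` doc with the doc of the previous member."""
    resolved = []
    for member in members:
        if member.doc.strip() == ":ditto:" and resolved:
            # Copy the already resolved documentation of the previous member
            # (assumption: chained `:ditto:` comments resolve transitively).
            member = replace(member, doc=resolved[-1].doc)
        resolved.append(member)
    return resolved
```

For the example above, resolving the members x and y gives both of them the "This is doc." documentation.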

6.1.2. :super:

A :super: comment intrinsic copies the comment from the super declaration; it is applicable to overwrites, inherited functions etc.

Without :super:, a documentation comment fully replaces previous documentation.

struct Foo
  # A doc.
  def a;

  # B doc.
  def b;
end

struct Bar
  extend Foo
    # C doc.
    reimpl a;

    # :super:
    # D doc.
    reimpl b;
  end
end

Results in:

struct Bar
  # C doc.
  def a;

  # B doc.
  # D doc.
  def b;
end

6.1.3. :nodoc:

A :nodoc: comment intrinsic disables documentation for the currently documented member until a matching :doc: intrinsic is met.

# :nodoc:
# Is useless in non-doc comments.
#

# This is doc.
# :nodoc: This would not be included.
# :doc: And this would.
# :nodoc: This would not again.
let x = 42

# Note that previous nodoc does not matter here.
let y = 42

Results in:

# This is doc.
# And this would.
let x = 42

# Note that previous nodoc does not matter here.
let y = 42
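
The toggling behaviour of :nodoc: and :doc: could be sketched in Python as follows. This is a rough consumer-side illustration, not a normative algorithm: the filter_doc name is hypothetical, the input is assumed to be the documentation lines of a single member with the leading "# " already stripped, and documentation is assumed to be enabled again at the start of every member.

```python
import re

_TOGGLE = re.compile(r"(:nodoc:|:doc:)")


def filter_doc(lines):
    """Return the doc lines of one member with `:nodoc:` spans removed."""
    enabled = True  # assumed: documentation is enabled at the start of a member
    result = []
    for line in lines:
        was_enabled, saw_intrinsic, kept = enabled, False, []
        for part in _TOGGLE.split(line):
            if part == ":nodoc:":
                enabled, saw_intrinsic = False, True
            elif part == ":doc:":
                enabled, saw_intrinsic = True, True
            elif enabled:
                kept.append(part)
        text = "".join(kept).strip()
        # Keep the line if text survived, or if it is an untouched empty doc line.
        if text or (was_enabled and not saw_intrinsic):
            result.append(text)
    return result
```

Because the state is local to one call, a :nodoc: left open on one member does not leak into the next member, which matches the note on y above.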

6.1.4. :patt:

:patt(\$"name"\$, \$"args"\$):, :endpatt(\$"name"\$): and :fmt(\$"name"\$, \$"args"\$): comment intrinsics are used for comment patterns.

Within a pattern, the %{\$"var"\$} syntax is used to insert a variable.

For \$"name"\$ and \$"args"\$, double or single quotes are optional, but required if the text contains symbols that could be misinterpreted, i.e. ), : and ,.

Patterns are local to the file.

# :patt("trg-dep", entity, default):
# %{entity} is target-dependent, defaults to %{default}.
# :endpatt("trg-dep"):

# :fmt("trg-dep", 'Alignment', 8):
let x = 42

Results in:

# Alignment is target-dependent, defaults to 8.
let x = 42
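
A consumer-side expansion of comment patterns could look roughly like the Python sketch below. Everything named here (parse_args, expand_patterns) is hypothetical, the argument parser only covers the quoting rules described above, and an unknown pattern is assumed to be left untouched because a misused intrinsic is not an error.

```python
import re

_PATT = re.compile(r":patt\((.*)\):")
_ENDPATT = re.compile(r":endpatt(\(.*\))?:")
_FMT = re.compile(r":fmt\((.*)\):")
_VAR = re.compile(r"%\{(\w+)\}")
_ARG = re.compile(r"""\s*("[^"]*"|'[^']*'|[^,]+)\s*(?:,|$)""")


def parse_args(text):
    """Split intrinsic arguments on commas, honouring optional quotes."""
    return [m.group(1).strip().strip("\"'") for m in _ARG.finditer(text)]


def expand_patterns(lines):
    """Expand `:fmt:` intrinsics using `:patt:` definitions from the same file."""
    patterns = {}   # name -> (parameter names, body lines); local to the file
    current = None  # (name, params, body) of the pattern being defined
    output = []
    for line in lines:
        if current is not None:
            if _ENDPATT.search(line):
                patterns[current[0]] = (current[1], current[2])
                current = None
            else:
                current[2].append(line)
            continue
        patt = _PATT.search(line)
        if patt:
            name, *params = parse_args(patt.group(1))
            current = (name, params, [])
            continue
        fmt = _FMT.search(line)
        if fmt:
            name, *values = parse_args(fmt.group(1))
            if name in patterns:
                params, body = patterns[name]
                bindings = dict(zip(params, values))
                for body_line in body:
                    # Unknown variables are left as-is rather than rejected.
                    output.append(_VAR.sub(
                        lambda m: bindings.get(m.group(1), m.group(0)), body_line))
                continue
        output.append(line)
    return output
```

Feeding it the six source lines of the example above yields the blank line, the expanded "# Alignment is target-dependent, defaults to 8." comment and the let statement.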

7. Entities

In Onyx, an entity may be declared and possibly implemented.

During compilation, the program AST is continuously being appended to, in real time. Therefore, the order of declaration matters. Unlike in other languages, referencing a not-yet-declared entity triggers a panic.

This code panics, because y is not declared prior to usage:

# let x = y + 1 # Panic!
let y = 42

Note that the following code leads to undefined behavior, because the x expression is evaluated immediately:

let y = unsafe! uninitialized SInt32
let x = y + 1 # Undefined behaviour
y = 42

7.1. decl

A declaration statement (decl) of an entity tells or reminds a compiler that such an entity exists.

Namespace, annotation, trait and unit types are implicitly declared; for example, namespace Foo is equivalent to decl namespace Foo.

7.2. impl

An implementation statement (impl) implements a previously declared entity.

Only a data type, or a function or macro member, may be implemented.

7.3. def

A definition (def) is a declaration and implementation of an entity in the same statement.

Struct, class and enum types, as well as reference, function and macro members are implicitly defined; for example, struct Foo is equivalent to def struct Foo. However, even such an entity may be explicitly declared prior to implementation, for example:

decl struct Foo;

# Either one would be valid,
# but a struct may only
# be implemented once!
#

impl struct Foo;
def struct Foo;
struct Foo; # `def` is implied

7.4. moveimpl

An entity implementation may be moved under another name using a moveimpl statement. For example, a moveimpl foo as bar statement moves the implementation from foo to bar, effectively un-declaring foo.

However, only the specified declaration is moved. For example:

def foo(arg ~ Real) { x }
moveimpl foo(arg ~ SInt) to bar

Leads to:

def foo(arg ~ Real && !SInt) { x }
def bar(arg ~ SInt) { x }

7.5. reimpl

An entity implementation may be re-implemented using a reimpl statement. For example, def foo { return 1 }; reimpl foo { return 2 } results in def foo { return 2 }.

Akin to Section 7.4, “moveimpl”, only the specified declaration is re-implemented.

An as clause acts like Section 7.4, “moveimpl”, for example:

def foo(arg ~ Real) { return 1 }
reimpl foo(arg ~ SInt) as bar { return 2 }

Results in:

def foo(arg ~ Real && !SInt) { return 1 }
def bar(arg ~ SInt) { return 2 }

7.6. undecl

A declaration may be un-declared using an undecl statement, e.g. undecl foo. From that point, a compiler is no longer aware of the declaration until the entity is declared again.

7.7. alias

An alias statement declares an alias to an entity.

alias =
  "alias",
  ref, {",", ref},
  ("=" | "to"),
  ref;

Forwarded and recursive aliases are allowed as long as they are resolvable.

An alias statement conveys arguments to the target entity. An omitted arguments list implies conveying all of the arguments. A * in the arguments list captures all the remaining arguments and passes them to the target entity, e.g. alias SInt32Pointer<*> = Pointer<SInt32, *>.

A single alias statement may contain multiple aliases to the target entity, separated by commas.

primitive Int<Bitsize ~ \%nat, Signed ~ \%bool>
  def subtract(another : self) : self;
  alias sub, - to subtract
end

alias SInt<Bitsize: Z> = Int<Z, true>
alias UInt<Bitsize: Z> = Int<Signed: false, Bitsize: Z>

alias UInt1 = UInt<1>
alias Bit, Bool = UInt1 # Multiple forwarded aliases

7.8. Visibility modifiers

A decl, hence also def, statement may have a visibility modifier, which affects the visibility of the declared entity.

An entity declared public is visible outside of the current scope. An entity declared protected is only visible in the same or a child scope. An entity declared private is only visible in the current scope.

A top-level entity cannot be declared protected. A top-level entity declared private is only visible in the current file.

8. Directives

A directive is an instruction to the compiler.

File dependency directives instruct the compiler to depend on certain files.

8.1. require

Files can be required using a require directive.

The same file may be required multiple times. It is guaranteed to be processed only once, when it is required for the first time.

If a required file path has no extension, .nx is appended.

A require directive may list multiple files to require, and an optional path to prepend to each required file path. For example, require "foo", "bar/baz.nx" from "/myloc" is equivalent to require "/myloc/foo.nx", "/myloc/bar/baz.nx".

A non-relative file path is first looked up relative to the folder containing the requiring file, i.e. ./.

A compiler is required to provide a way to pass folder paths to look up required files in, e.g. -R/usr/nx. These paths are prepended if a require statement has no from clause. For example, given the -R/usr/nx flag, a require "foo" statement would look up the file in the following order:

  1. ./foo

  2. ./foo.nx

  3. /usr/nx/foo

  4. /usr/nx/foo.nx

The -R feature comes in handy when you need to switch the dependency source folder, for example to match the target.
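
The lookup order described above can be sketched in Python. This is an illustrative model only: lookup_candidates and its parameters are hypothetical names, and it is an assumption that a from clause replaces both the ./ lookup and the -R paths.

```python
import posixpath


def lookup_candidates(name, from_path=None, search_paths=()):
    """List the paths to try for `require name`, in lookup order.

    `search_paths` mirrors the `-R` flags; `from_path` mirrors a `from` clause.
    """
    # Assumption: with a `from` clause only that path is used.
    bases = [from_path] if from_path is not None else [".", *search_paths]
    candidates = []
    for base in bases:
        path = posixpath.join(base, name)
        candidates.append(path)
        if not posixpath.splitext(name)[1]:
            # No extension given: also try the path with `.nx` appended.
            candidates.append(path + ".nx")
    return candidates
```

Calling lookup_candidates("foo", search_paths=["/usr/nx"]) reproduces the four-step list above: ./foo, ./foo.nx, /usr/nx/foo, /usr/nx/foo.nx.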

Wildcard requirements are possible, as defined by the POSIX standards, e.g. require "./*" or require "./**". The order of wildcard lookups is standardized.

A translation environment maintains the AST of the program being compiled, so the order of requiring files matters. If a required file references a not-yet-declared entity, a compiler panics.

8.2. import

An import directive imports C header files.

Rules similar to require are applied to an import directive. The default imported file extension is .h. A compiler is required to provide a way to pass import lookup paths, e.g. -I/usr/include.

More information on handling imported entities is found at Section 9, “Interoperability”.

8.3. using

A using directive either includes a namespace or applies a refinement in the current scope, limited to the file.

If namespace and refinement keywords are omitted, the exact kind of a using directive operand is inferred from the type being used. Otherwise, the type is forced.

namespace Foo
  let bar = 42
end

# bar = 43 # Panic! `bar` is not defined

using Foo
# using namespace Foo # To be more explicit

bar = 43 # OK

9. Interoperability

An Onyx compiler is required to be aware of the C Standard. Non-standard C conformance is optional, but discouraged.

There is no Onyx ABI, and Onyx functions have undefined symbols in assembly. To make an Onyx function "visible" in an object file, hence callable from assembly, a developer should export it as a C function.

The Standard does not define any built-in "entry" function semantics. It is a linker’s responsibility to start a process with a function considered the entry one. Luckily, an Onyx compiler is required to be able to emit AST. A linker script may then be generated pointing to an annotated function.

C header files may be included (imported in Onyx terminology), and all the C entities imported throughout the program compilation are accessible via the $<id> notation, e.g. $printf.

A C function call is unsafe, and shall have parentheses regardless of arity.

Some Onyx type specializations can be autocast to a C type upon calling a C function, e.g. String<10, UTF8> is char[10]. Unfortunately, the looseness of the C Standard restricts automatic conversion of, say, String<10, UTF16LE> to char16_t[10], because char16_t is not guaranteed to contain exactly UTF-16 encoded characters. In such cases, explicit, maybe unsafe, conversion or coercion is required. Moreover, a C function has undefined scope, thus passing an Onyx pointer to it would likely be unsafe in itself.

import "stdio.h"

export void main(int argc, char** argv) {
  # Threadsafe Onyx code inside
  # an exported function body
  #

  let msg = "Hello, world!\0" : String<14, UTF8>
  unsafe! $puts(&msg)
}

Although the safety level within an exported function is threadsafe by default, calling an exported function is still unsafe.

An exported function shall not be throwing.

An exported function, as well as its argument declarations, may be annotated. An exported function argument reference is writeable by default, unless it has a const qualifier.

An exported function, as well as its argument declarations, may be documented using Onyx comments. Only the documentation of the exported function itself is preserved on export; argument documentation is ignored.

A C primitive type has its layout defined per compilation. It usually depends on the target platform data model. For example, passing { c: { int { size: 32, signedness: 2c } } } to an Onyx compiler would make all $int instances 32-bit and two’s complement within Onyx context. However, due to the target dependency, an explicit conversion is still required.

An advanced Onyx compiler usually has built-in mapping for C data type sizes, so you won’t need to pass them every time you compile your code.

An Onyx compiler is required to implement a subset of C operators.

export int sum(int a, int b) {
  return a + b # Use native C summation operator
}

export int sub(int a, int b) {
  # `$int` may be greater than 32 bits,
  # thus data loss is possible.
  let onyx_a = a.to!(SBin32)

  # Would only work with `{ c: { int {
  # size: 32, signedness: 2c } } }`.
  # Otherwise, the behaviour is undefined.
  let onyx_b = unsafe! b as SBin32

  let result = onyx_a - onyx_b

  # Data loss is possible if
  # `$int` bitsize is < 32.
  return result.to!($int)
}

export void main() {
  try
    # Native C equation operator
    @assert(unsafe! sum(1, 2) == 3)

    @assert(unsafe! sub(1, 2).to_i32!() == -1)
  catch
    unsafe! $exit($(EXIT_FAILURE))
  end
}

9.1. C data types

A C struct or union definition may also be exported. An exported data type is treated as if it were imported.

An imported named C struct or union has semantics identical to an Onyx struct or Union. This implies:

  • Default initializers, Onyx-style. Note that exported C entities are zero-initialized.

  • Reopening with static function and method declaration thanks to UFCS.

  • Unsafe access to union options, which may be flattened to fields.

import "math.h", "stdlib.h"

export struct strukt_t {
  double x;
  double y;

  unsigned sw;
  union {
    float foo;
    short bar;
  };
};

reopen $strukt_t
  # Note that the function is not exported.
  #
  # ≡ `public threadsafe def length(&this : $strukt_t*cr) : $double`
  def length
    return unsafe! $sqrt($pow(this.x, 2) + $pow(y, 2))
  end

  def val : Variant<$float, $short>
    # Accessing union-ed fields is unsafe.
    (sw == 0) ? (unsafe! foo) : (unsafe! bar)
  end
end

export void main() {
  # `sw` is zero-initialized.
  let s : $strukt_t = $strukt_t(1, 2, foo: 42)

  try
    # Passes `s` by pointer.
    @assert(s.length().to_f64!().~=(2.236, 0.01))
    @assert(s.val().<unsafe!>as!(FBin64) == 42)
  catch
    unsafe! $exit($(EXIT_FAILURE))
  end
}

9.2. C variables

A C variable may be exported as well. As in C, zero initialization is implied if the value is omitted.

An exported variable reference has static scope in Onyx and is writeable unless it has a const qualifier.

It is possible to export a C-static variable as well. It would be treated as private in terms of Onyx.

TODO: What happens upon taking a file-local variable’s address and passing it outside? In regard to Onyx as well.

Example 1. Exporting a C variable

import "stdlib.h"

export int foo;

export void main() {
  try
    @assert($foo : $int&srw == 0)
  catch
    unsafe! $exit($(EXIT_FAILURE))
  end
}

9.3. C macros

C preprocessor macros may also be imported and exported.

A C macro within Onyx context is used as follows: ${MACRO}. The evaluation result of a macro is inserted into Onyx after some pre-processing defined below. Nested macros are processed in C, and only the final result is pre-processed by Onyx.

Appending an Onyx suffix is legal, e.g. with the C macro MACRO evaluating to 42, ${MACRO}i64 would result in 42i64.

A C macro is not evaluated within a comment. If you really need that, emit a comment from an Onyx macro, and make use of a C macro within the Onyx macro, for example:

# Note: a C macro is terminated with a newline,
# therefore you should not use inline comments
# for documenting exported C macros.
export #define MACRO 42

# ${MACRO} is not evaluated here.

{{
  "# " .. nx.c.lkp("MACRO"):evaluate() ..
  " is evaluated here."
}}

Within an exporting context, C macros may intuitively be used as they’d be in pure C. For example, export MY_MACRO_VOID main() { could theoretically expand to export void main() {. The resulting code is then parsed as usual.

Fundamentally, a C macro evaluation result may not be cross-platform in terms of C.

For example, the C constant 4294967296ul is too large for a target where unsigned long is 32 bits and thus limited to 4294967295. This would therefore result in a panic on some platforms.

But what really matters in crossing the C-Onyx boundary is a C constant’s contents.

For instance, l and ll suffixes are ignored in Onyx. What Onyx sees after the C macro evaluation is 4294967296u, and you shall explicitly constrain a macro evaluation result to the bitsize you want, with 32 being the reasonable default. For example, with ${BIG}u64 evaluating to 4294967296uu64 the code would be valid and cross-platform.

TODO: Shall think more on ll suffixes and that cross-platform stuff.

9.3.1. C numeric constants

If a C macro evaluates to a numerical constant without any suffixes, the following rules apply:

  • Octal constants (those beginning with zero) are converted to the Onyx 0o notation. For example, 052 in C becomes 0o52 in Onyx.

  • Hexadecimal constants are converted into the canonical Onyx notation, where x is always in lower case and hexadecimal digits are in upper case. For example, 0X2a in C becomes 0x2A in Onyx.

  • Floating-point constants beginning or ending with a dot get the missing leading or trailing zero inserted. For example, .1 in C becomes 0.1 in Onyx, and 1. in C becomes 1.0 in Onyx.

  • Exponent symbols in floating-point constants are converted to their lower-case counterparts. For example, 1E3 in C becomes 1e3 in Onyx, and 0XAbP2 in C becomes 0xABp2 in Onyx.
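
The four rules above can be condensed into a small Python sketch. It is only an approximation for illustration: c_numeric_to_onyx is a hypothetical helper, the input is assumed to be a well-formed, suffix-free C numeric constant, and the leading or trailing zero rule is not applied to hexadecimal floating-point constants.

```python
def c_numeric_to_onyx(text: str) -> str:
    """Convert a suffix-free C numeric constant to Onyx literal notation."""
    if text[:2].lower() == "0x":
        # Hexadecimal: lower-case `x` and exponent `p`, upper-case digits.
        mantissa, sep, exponent = text[2:].lower().partition("p")
        return "0x" + mantissa.upper() + sep + exponent
    if len(text) > 1 and text[0] == "0" and text.isdigit():
        # Octal: a leading zero followed by digits only.
        return "0o" + text[1:]
    # Decimal integer or floating-point constant.
    mantissa, sep, exponent = text.lower().partition("e")
    if mantissa.startswith("."):
        mantissa = "0" + mantissa
    if mantissa.endswith("."):
        mantissa += "0"
    return mantissa + sep + exponent
```

For instance, c_numeric_to_onyx("0XAbP2") gives "0xABp2" and c_numeric_to_onyx("1.") gives "1.0", in line with the examples in the list.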

An Onyx integer literal is constrained to SBin32 by default. Therefore, if a C integer constant does not fit into 32 bits, a compiler would panic. To fix that, manually constrain a C macro evaluation result. For example, with export #define MACRO 2147483648, ${MACRO} would result in 2147483648, which would panic. Using a suffix would fix that: ${MACRO}i64 evaluates to 2147483648i64, which is legal.

A u suffix on a C constant macro results in a u suffix in Onyx as well. For example, with export #define N 42U, ${N} becomes 42u.

An l or ll suffix on a C constant macro is ignored. You should still constrain the result to the type you want, with SBin32 or UBin32 being the implicit defaults. For example, with export #define BIG 8589934592ull, ${BIG} would result in 8589934592u, which does not actually fit into 32 bits and thus panics. ${BIG}u64 would result in 8589934592uu64, which is a valid Onyx literal.
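
The suffix handling and the 32-bit default can be modelled with a short Python sketch. The import_c_integer helper is hypothetical, it only accepts decimal C integer constants and Onyx suffixes of the form i<bits> or u<bits>, and it raises OverflowError where a compiler would panic.

```python
import re

_C_INT = re.compile(r"(?P<digits>\d+)(?P<suffix>[uUlL]*)")
_NX_SUFFIX = re.compile(r"(?P<kind>[iu])(?P<bits>\d+)")


def import_c_integer(constant, onyx_suffix=""):
    """Model `${MACRO}<onyx_suffix>` for a decimal C integer constant."""
    match = _C_INT.fullmatch(constant)
    if match is None:
        raise ValueError(f"not a decimal C integer constant: {constant!r}")
    value = int(match["digits"])
    unsigned = "u" in match["suffix"].lower()
    # The `u` suffix is carried over, while `l` and `ll` are dropped.
    literal = match["digits"] + ("u" if unsigned else "")

    bits = 32  # implicit default: SBin32 or UBin32
    if onyx_suffix:
        nx = _NX_SUFFIX.fullmatch(onyx_suffix)
        if nx is None:
            raise ValueError(f"unsupported suffix in this sketch: {onyx_suffix!r}")
        unsigned = unsigned or nx["kind"] == "u"
        bits = int(nx["bits"])

    limit = 2 ** bits if unsigned else 2 ** (bits - 1)
    if value >= limit:
        raise OverflowError(f"{literal}{onyx_suffix} does not fit into {bits} bits")
    return literal + onyx_suffix
```

With these assumptions, import_c_integer("8589934592ull") raises, while import_c_integer("8589934592ull", "u64") returns "8589934592uu64".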

9.3.2. C text constants

A single-byte or u8-prefixed integer character constant is left as-is. For example, 'f' and u8'f' in C both result in 'f' in Onyx.

Octal character codepoints are converted to such in Onyx, for example '\141' in C becomes '\o141' in Onyx.

In Onyx, the \141 notation means decimal ordinal 141. Therefore, '\97' == 'a' in Onyx.

A hexadecimal sequence character constant becomes a hexadecimal codepoint character literal in Onyx notation. For example, '\xaB' in C becomes '\xAB' in Onyx.

C escape sequences are a subset of those in Onyx, therefore they are left as-is.

Multi-codepoint C constants are disallowed, because in Onyx a single character shall contain exactly one codepoint. See also C18 §6.4.4.4 ¶10:

The value of an integer character constant containing more than one character (e.g., 'ab'), […​] is implementation-defined.

Multi-byte character C constants, i.e. those prefixed with u, U and L, result in the same single-codepoint Onyx character literal. Note that if a constant codepoint is out of the range defined by the C environment, the compiler would panic.

Onyx character literals have UCS charset by default, thus being able to hold any Unicode character.

# OK, as long as it fits in a UTF-16 code unit.
#
# Note that if `__STDC_UTF_16__` is not set,
# the behaviour is undefined: a compiler
# may panic upon this macro evaluation.
#
export #define MYCHAR u'貓'

export void main() {
  # # TODO:
  # #
  # # A) This would even panic on `u'f'`, because it's always a 16-bit codeunit.
  # # If so, what's the exact result of evaluation? Some `'f'utf16`? Nah.
  # #
  # # B) No panic on `u'f'`, as it evaluates to `'f'`, and `'f'u8` is valid.
  # #
  # final c = ${MYCHAR}u8 # Panic! Can not constrain literal UCS character `貓` to `UBin8`

  # OK, because `貓` codepoint value fits into 16 bits
  final c = ${MYCHAR}u16
}

# Would usually panic upon evaluation,
# because `🍌` does not fit into 16 bits.
#
# However, if `__STDC_UTF_16__` is not set,
# then the behaviour is undefined: this may
# actually become a valid constant in C.
#
export #define MYCHAR u'🍌'

export void main() {
  # Evaluates to `c = '🍌'`, which is valid in Onyx.
  # But again, the compiler may still panic
  # if `u'🍌'` is invalid in C.
  final c1 = ${MYCHAR}
}

A C string constant results in an Onyx string literal, which has UTF-8 encoding by default, regardless of C constant prefixes.

Adjacent string literals are concatenated (C translation phase 6). A terminating null character is appended to a C string constant (C translation phase 7). For example, U"\xfF" U"f" in C becomes "\x{FF}f\0" in Onyx.

The surrounding context does not affect C macro evaluation.

For example, $puts(${MYSTRING}) would result exactly in $puts("mystring\0"). However, $puts expects a pointer, and "mystring\0" is String<9, UTF8>; the compiler would thus panic. To fix that, change the code to $puts(&${MYSTRING}), which would evaluate to $puts(&"mystring\0").

9.3.3. Other C macros

TODO: Initializers, e.g. export #define FOO {1, 2, 3} and then ${FOO}. Probably evaluates to 1, 2, 3, thus allowing Array(${FOO}), Strukt(${FOO}) etc. Struct initializers, e.g. export #define BAR { .x = 1, .y = 2 }, with ${BAR} evaluating to x: 1, y: 2.

9.4. External declarations

A C function, struct, union or variable may be marked as externally declared using the extern directive from Onyx code. This is equivalent to importing a declaration: the entity is expected to be defined at some later point, during linkage. The behaviour is similar to declaration in Onyx: a single entity may be `extern`ed multiple times, but only `export`ed or imported once.

Example 2. Externing C entities

The following code would fail to compile if symbol foo is not resolved during linkage.

import "stdio.h"

extern int foo;

export void main() {
  unsafe! $printf(&"%d\n", $foo)
}

10. Literals

10.1. Literal constrainment

When read from source, a literal has inferred constrainment in accordance with Table 1, “Basic literal constrainments”.

In code, a literal constrainment has the form \<constrainment>, e.g. 0.5 : \q. A literal constrainment is a type-level restriction, not an instance-level restriction. It may be used to restrict a literal in-place or define a generic literal argument, e.g. def foo(arg: _ : L) forall L : \q.

Both def foo(arg: L) forall L : \q and def foo(arg : \q) would panic, because a literal shall not be an instance.

A constrainment is defined by a regular expression. For example, /f(?<Bitsize>\d+)/ defines \f16, \f32 etc.

TODO: Restricting a literal to a concrete type is possible, but not vice versa, e.g. 42 : \%n : SBin32, but not 42 : SBin32 : \%n. Also 42 : \%u : SBin32 is not possible.

TODO: Only basic \%real, \%int, \%nat, \%bool, \%string and \%char restrictions are needed. No literal instances. Still can apply (partial) suffixes.

Table 1. Basic literal constrainments

Literal examples  Constrainment regex  Default type  Notes
false, true       /b/                  Bool          A boolean literal.
0, 1              /n/                  SBin32        A natural number (\$NN\$) literal.
-1                /z/                  SBin32        An integer (\$ZZ\$) literal.
1.0               /q/                  FBin64        A rational number (\$QQ\$) literal.
'a'               /c/                  Char<UCS>     A character literal.

Special literals
:abc              \y                   N/A           A symbol literal.

Compound literals are basic literals with some modifications. The inferred basic constrainment becomes a part of the compound constrainment. For example, 0/1 : \rn is a ratio of two natural literals.

Table 2. Compound literal constraints

Literal examples    Constrainment regex                                Default type (1)                          Notes
0//1                /r(?<Constraint>.+)/, e.g. ri32                    Ratio<Constraint>                         A ratio (as quotient of two integers) literal.
0j                  /j(?<Constraint>.+)/, e.g. ji32                    Imaginary<Constraint>                     An imaginary number literal.
0..1                \..\$tt"L"\$, e.g. \..n                            Range<\$tt"L"\$>                          An interval literal.
"abc"               \s                                                 String<UTF8>                              A string literal.
[0, 1]              \\$tt"L"\$[\$tt"Z"\$], e.g. \n[2]                  Array<\$tt"L"\$, \$tt"Z"\$>               An array literal.
<0, 1>              \\$tt"L"\$<\$tt"Z"\$>, e.g. \n<2>; or              Vector<\$tt"L"\$, \$tt"Z"\$>              A vector literal.
                    \(\$tt"L"\$)x\$tt"Z"\$, e.g. \(n)x2
|[0, 1], [2, 3]|r   \(\$tt"L"\$)x\${tt"Z"}\$\$tt"D"\$, e.g. \(n)x2x2r  Tensor<\$tt"L"\$, *\$tt"Z"\$, \$tt"D"\$>  A tensor literal.
(0, 1.0)            (\${tt"L"}\$), e.g. (\n, \q)                       Tuple<*\$tt"L"\$>                         A tuple of literals.
(foo: 0, bar: 1.0)  (\${tt"L"}\$), e.g. (foo: \n, bar: \q)             Struct<*\$tt"L"\$>

1 Constraint is the default type of the base literal, e.g. 1/0 would default to Ratio<SBin32>.

A complete constrainment is one constraining to a complete type. For example, \i32 is complete, but \n is not.

A numeric literal constrainment can be further constrained to a specific numeric type using a suffix from Table 3, “Numeric literal constrainments”. The resulting constrainment equals the suffix applied. For example, 1u8 : \u8.

Simply constraining a literal has the same effect as applying a suffix to it, e.g. 1 : \u8 ≡ 1u8.

TODO: A specific-type constrainment (even partial) restricts the target type. For example, 1u cannot be used as FBin32; also 1q cannot be used as an integer.

A character literal may also have a numeric suffix appended: it would turn it into a numeric literal, e.g. 'a'u8 : \u8 == 97. Appending a numeric suffix to a string turns it into an array of numeric literals representing the string’s code units, e.g. "abc"u8 : \u8[3] == %u8[97 98 99].

A literal value and its suffixes may be separated with underscores or wrapped in parentheses. For example, (42n)_i32. A constrainment is always flattened, thus e.g. \(z)_fb64 is interchangeable with \f64.

Table 3. Numeric literal constrainments

Regex                                            Applicable to (1)  Default type
/n/                                              \n                 SBin32
/z/                                              \n, \z             SBin32
/q/                                              \n, \z, \q         FBin64
/s?ib?(?<Bitsize>\d+)?/                          \n, \z             SBin<Bitsize = 32>
/ui?b?(?<Bitsize>\d+)?/                          \n                 UBin<Bitsize = 32>
/(si?|i)d(?<Digits>\d+)/                         \n, \z             SDec<Digits>
/ui?d(?<Digits>\d+)/                             \n                 UDec<Digits>
/fb?(?<Bitsize>\d+)?/                            \n, \z, \q         FBin<Bitsize = 64>
/f?d(?<Bitsize>\d+)?/                            \n, \z, \q         FDec<Bitsize = 64>
/s?Q(?<Bitsize>\d+)?(e(?<Exponent>-?\d))?/ (2)   \n, \z, \q         SXBin<Bitsize, Exponent>
/uQ(?<Bitsize>\d+)?(e(?<Exponent>-?\d))?/ (2)    \n, \z, \q (3)     UXBin<Bitsize, Exponent>
/s?D(?<Total>\d+)?(f(?<Fractional>-?\d+))?/ (2)  \n, \z, \q         SXDec<Total, Fractional>
/uD(?<Total>\d+)?(f(?<Fractional>-?\d+))?/ (2)   \n, \z, \q (3)     UXDec<Total, Fractional>
/p(?<Bitsize>\d+)/                               \n, \z, \q         Posit<Bitsize>

1 Always applicable to \c and \s.
2 Either one of generic arguments is required.
3 Literal signedness is checked.
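
To illustrate how a suffix regex with a named group maps to a type and its default generic argument, here is a Python sketch covering three of the binary rows of the table. The suffix_type helper is hypothetical; the regexes are the table's own, rewritten in Python's (?P<name>...) group syntax.

```python
import re

# (regex from Table 3 in Python syntax, type name, default generic argument)
_SUFFIXES = [
    (re.compile(r"s?ib?(?P<Bitsize>\d+)?"), "SBin", 32),
    (re.compile(r"ui?b?(?P<Bitsize>\d+)?"), "UBin", 32),
    (re.compile(r"fb?(?P<Bitsize>\d+)?"), "FBin", 64),
]


def suffix_type(suffix):
    """Return the type a numeric suffix constrains to, or None if unknown."""
    for pattern, name, default in _SUFFIXES:
        match = pattern.fullmatch(suffix)
        if match:
            # An omitted bitsize falls back to the default from the table.
            return f"{name}<{match['Bitsize'] or default}>"
    return None
```

For example, suffix_type("u8") gives "UBin<8>", suffix_type("f") gives "FBin<64>", and the long form suffix_type("fb64") also gives "FBin<64>".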

A character literal (\c) may have a character set suffix appended before a numerical suffix. A string literal (\s) may have an encoding suffix appended before a numerical suffix.

A text literal suffix replaces the constrainment, and makes it incompatible with the initial constrainment, e.g. 'a' : \ucs is legal, but 'a'ucs : \c is not.

The language is only aware of Unicode and its modern encodings, excluding other character sets.

Table 4. Character set literal suffixes

Regex  Type
/ucs/  Char<UCS>

Table 5. Encoding literal suffixes

Regex              Type
/utf8/             String<UTF8>
/utf(16|32)[lb]e/  E.g. String<UTF16LE>
/ucs(2|4)/         E.g. String<UCS2>

10.2. Range literals

TODO: Only have [] intervals?

Literal         Magic literal    Math equivalent  Type
\$A\$..\$B\$    %r[\$A\$ \$B\$]  \$[A, B]\$       Range<\$"T"\$, true, true>
\$A\$...\$B\$   %r[\$A\$ \$B\$)  \$[A, B)\$       Range<\$"T"\$, true, false>
N/A             %r(\$A\$ \$B\$]  \$(A, B]\$       Range<\$"T"\$, false, true>
\$A\$....\$B\$  %r(\$A\$ \$B\$)  \$(A, B)\$       Range<\$"T"\$, false, false>

If omitted, \$A\$ defaults to :min, and \$B\$ defaults to :max. For example, 0.. == 0..:max, .. == :min..:max. A magic literal requires both ends to be set explicitly (still allowing symbols, e.g. %ri[min \$B\$)).
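
The operator-to-bounds mapping and the :min / :max defaults can be sketched in Python. The parse_range helper is hypothetical and, as a simplifying assumption, only accepts integer or symbol endpoints, so that dots in a fractional endpoint cannot be confused with the operator.

```python
import re

_RANGE = re.compile(r"(?P<a>-?\d+|:\w+)?(?P<op>\.{2,4})(?P<b>-?\d+|:\w+)?")

# operator -> (lower bound included, upper bound included)
_BOUNDS = {"..": (True, True), "...": (True, False), "....": (False, False)}


def parse_range(text):
    """Return (lower, upper, lower_included, upper_included) for a range literal."""
    match = _RANGE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a range literal supported by this sketch: {text!r}")
    include_a, include_b = _BOUNDS[match["op"]]
    # Omitted ends default to the `:min` and `:max` symbols.
    return (match["a"] or ":min", match["b"] or ":max", include_a, include_b)
```

With this model, parse_range("0..") and parse_range("0..:max") return the same tuple, which mirrors the 0.. == 0..:max equivalence above.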

11. Values

A value is an instance of a data type. For example, 42 may be a value of data type SBin32.

A runtime entity is either a value (val : \$T\$), a reference to a value (ref : Reference<\$T\$> : \$T\$&), or a pointer to a value (ptr : Pointer<\$T\$> : \$T\$*). The latter two are known as Section 12, “Indirect values”.

A reference has the same internal representation as a pointer, but the access semantics of the referenced value are different.

A reference is an lvalue; these terms are interchangeable. A value or a pointer to a value is an rvalue.

11.1. Assigning values

Assigning an rvalue to an lvalue simply moves the value into the lvalue, making the lvalue the sole owner of the value.

Assigning an lvalue to another lvalue of the same type calls a copy initializer on the right operand, and moves the rvalue result to the left operand. In symbols, \$l_0\$ : \$T\$& = \$l_1\$ : \$T\$& \$=>\$ \$l_0\$ = \$T\$(&\$l_1\$) : \$T\$.

TODO: Assigning to a variant.

let x ~ SBin32& = 42
let y ~ SBin32& = x # Calls a copy initializer: `SBin32(&x)`
y = 43
@assert(x == 42) # Did not change `x`

When passing an argument to a function declared as decl foo(arg : T&), the foo(arg: x) call syntax (or simply foo(x)) is syntactic sugar for foo(arg: = x), where arg: references the callee’s argument lvalue. The same applies to the moving semantics described in Section 11.2, “Moving values”, i.e. foo(arg: <-x). However, a function argument does not have a value yet (not even a default one), so neither pushing nor swapping is applicable.

It is not applicable to pushing (e.g. foo(arg: <<= x)), because the argument default value is set if the argument is empty after the pass, and there is no syntax defined to receive the pushed value.

TODO: Think about default value semantics: maybe the default value is set prior to passing? If so, both pushing and swapping may be possible.

11.2. Moving values

A reference may be turned into an rvalue using the <- unary operator. After that, the reference is considered moved. Effectively, moving implies direct copying of the value data, skipping a copy initializer call.

A moved lvalue itself shall not be used anymore, unless set again. Therefore, only an explicitly declared (e.g. with \$"let"\$) local-scoped reference may be safely moved. Otherwise, moving is unsafe, but possible. When moving safely, a compiler would panic if there is at least a possibility of using a moved lvalue, for example, when moving depends on runtime.

A \$l\$0 <- \$l\$1 expression is a syntactic sugar for \$l\$0 = <- \$l\$1. Without any receiver, a <-\$l\$ expression effectively finalizes the referenced value.

Example 3. Moving an lvalue
let x = 42
let y <- x # Moves `x` into `y`
# x # Panic! Use after move (UAM)
y = 43 # Changes `y`
unsafe! x = 44 # Undefined behaviour, but does not affect `y`
x = 45 # Set `x` again

@assert(x == 45)
@assert(y == 43)

def foo(list : Std::List&);
let list = Std::List()

foo(list) # Copy the list
# foo(list: list) # Ditto

foo(<-list) # Move the list instead of copying it
# foo(list: <-list) # Ditto
# list # Panic! UAM

Returning an lvalue implicitly moves it, i.e. return \$l\$ is equal to return <-\$l\$. Therefore, it is not possible to return an lvalue, and hence a reference.

An rvalue may also be moved, i.e. <-\$r\$ is not an error.
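
The "moved until set again" rule can be illustrated with a small Python model. This is a sketch under stated assumptions: the `Slot` class and `MovedError` are hypothetical names, and the check happens at runtime here, whereas the text above describes a compile-time panic.

```python
class MovedError(RuntimeError):
    """Raised when a moved slot is read before being set again."""


class Slot:
    """Hypothetical model of a local lvalue that tracks the moved state."""
    _EMPTY = object()

    def __init__(self, value):
        self._value = value

    def get(self):
        if self._value is Slot._EMPTY:
            raise MovedError("use after move")
        return self._value

    def move(self):
        # Models `<-x`: hand the value out and mark the slot as moved.
        value = self.get()
        self._value = Slot._EMPTY
        return value

    def set(self, value):
        # Setting a moved slot again makes it usable.
        self._value = value


x = Slot(42)
y = Slot(x.move())   # models `let y <- x`
```

Reading `x` after the move raises `MovedError` in this model until `x.set(...)` is called, which loosely corresponds to the "unless set again" clause.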

11.3. Pushing values

Assigning or moving into an lvalue returns the left operand, i.e. the affected reference, finalizing the old value. It is possible to do a push-assign (<<=) or push-move (<<-) instead, which return the old value as an rvalue.

Example 4. Pushing into lvalues
let x = 42
@assert((x = 43) == 43)   # Replaces the old value
@assert((x <<= 44) == 43) # Pushes the old value

let y = 17
@assert((y <<- x) == 17)
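
The difference between replacing and pushing can be mimicked in Python. The `Cell` class below is a hypothetical stand-in for an lvalue; it returns plain values, whereas in Onyx a regular assignment returns the affected reference.

```python
class Cell:
    """Hypothetical model of an lvalue holding a single value."""

    def __init__(self, value):
        self.value = value

    def assign(self, new):
        # Models `x = new`: the old value is dropped, the new one is visible.
        self.value = new
        return self.value

    def push(self, new):
        # Models `x <<= new`: the old value is handed back to the caller.
        old, self.value = self.value, new
        return old


x = Cell(42)
replaced = x.assign(43)   # the old value 42 is gone
pushed_out = x.push(44)   # the old value 43 is returned
```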

11.4. Swapping values

Two indirect values referencing values of the same type may swap their values using the swap operator <->. The operation shall be allowed by the scope constraints (for example, it is not possible to swap indirect values with undefined scopes), and is fragile. The left operand is then returned.

Example 5. Swapping lvalues
let x = 42
let y = 43
@assert((x <-> y) ~ SBin32& == 43) # New `x` value is 43

12. Indirect values

An indirect value is either a reference or a pointer to a value. Indirect values share common semantics, such as scope, space, readability and writeability (the latter two are commonly known as accessibility, see Section 12.3, “Accessibility”).

12.1. References

TODO: A reference may be restricted to an rvalue; this would copy it, returning an rvalue. Thus, let x : Std::List; x : Std::List is OK, but copies.

TODO: An rvalue is really a temporal reference, but for some reason it’s moved instead of copying upon assignment, e.g. let list = Std::List() |.shuffle(). Assigning a temporal reference moves it?

A reference type Reference<Type: \$T\$, Scope: \$S\$, Space: \$P\$, Readable: \$R\$, Writeable: \$W\$> can be shortcut as \$T\$&\$SPRW\$, e.g. SBin32&lrw0 == Reference<SBin32, :local, 0, true, true>. For scope one-letter shortcuts, see Section 13, “Scopes”.

A variable reference is declared using the \$bb "let"\$ \$"var"\$ syntax. By default, a variable reference is both writeable and readable, i.e. let var : \$T\$&rw. A variable may also be declared write-only, e.g. let buff : SBin32&sw. Within a class declaration, special \$bb "get"\$ and \$bb "set"\$ declarations may be used, which do not affect the "real" reference accessibility.

A constant reference is declared using the \$bb "final"\$ \$"const"\$ syntax. A constant reference is read-only by default, i.e. final const : \$"T"\$&r. However, a constant reference may be declared inaccessible by restricting it to a \$"T"\$& : \$"T"\$&\$"RW"\$ type, e.g. final dead : SBin32&s.

A reference declaration type annotation is optional and (usually) may be inferred.

A reference declaration may have one of \$"local"\$, \$"instance"\$ or \$"static"\$ scope modifiers, e.g. let local var. Implicit default scope modifiers are defined for certain scopes, read more in Section 13, “Scopes”.

Accessing a reference transparently accesses the referenced value. For example, (\$l\$ : \$T\$&).\$m\$ accesses the \$m\$ member of the value referenced by \$l\$. The same applies to lookup, i.e. \$T\$&::\$m\$ transparently looks up \$T\$::\$m\$. This paragraph is important, because it means that a reference itself can not be accessed, but only the value it references.

A value type itself shall not be a reference, i.e. \$r\$ : \$T\$& is illegal, which also makes references to references and pointers to references illegal.

This behaviour is different from C++, where references are first-class types and may be (almost) freely passed around.

12.2. Pointers

Similar to references, the shortcut semantic is applicable to a Pointer type, but with the * symbol, e.g. SBin32*lrw0 == Pointer<SBin32, :local, 0, true, true>.

Akin to C, pointer to pointer, i.e. \$"T"\$**, is legal, with arbitrary depth.

Akin to C, a reference may be safely cast to a pointer using the &(\$"l"\$ : \$"T"\$&) : \$"T"\$* semantic, and vice versa. For example, let x : SBin32&lrw = 42, then &x : SBin32*lrw, and then *&x : SBin32&lrw again.

In fact, a reference is similar to pointer, but implies different underlying value access semantics, and can not be referenced to.

As in C, a pointee is accessed using the -> operator, e.g. ptr->foo. However, in Onyx, the -> operator by itself turns a pointer into reference, i.e. ((\$"ptr"\$ : \$"T"\$*)->) : \$"T"\$&.

12.3. Accessibility

An object is accessed in runtime using the . notation, which transparently passes a pointer to the object as the first argument to the callee in accordance with the UFCS, e.g. obj.foo() is equal to obj::foo(&obj).

Indirect value readability and writeability are commonly referenced as accessibility. Thus, a neither readable nor writeable indirect value is inaccessible.

Reading means either moving an lvalue or assigning it, i.e. reading the underlying value. Note that passing an indirect value around is not considered reading.

Writing means writing directly into the underlying value space, e.g. assigning to an indirect value. Note that mutating an underlying value of a class type is not considered writing, i.e. final list : mut Std::List<SBin32>(), and then list << 42 is legit; but "mutating" any other type is considered writing. That’s one of the outstanding features of a class type.

Positive readability is designated with lowercase r symbol in the indirect value shortcut semantic; for example, \$T\$&r is a readable reference. Writeability uses the letter w. Negative \$x\$-ability is designated with an uppercase letter, e.g. \$T\$&RW is inaccessible.

A \$x\$ indirect value may be safely conveyed into an outer scope as a non-\$x\$ indirect value. For example, a \$T\$*rw may be safely auto-cast to a \$T\$*Rw argument, but vice versa would be unsafe.
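
The rule "an option may be preserved or disabled, but not enabled" can be written down as a predicate. The following Python sketch is illustrative only; the function name and its boolean parameters are assumptions, not part of the language.

```python
def accessibility_cast_is_safe(src_readable, src_writeable,
                               dst_readable, dst_writeable):
    """Each option may be preserved or disabled, but never enabled."""
    read_ok = src_readable or not dst_readable
    write_ok = src_writeable or not dst_writeable
    return read_ok and write_ok


rw_to_Rw = accessibility_cast_is_safe(True, True, False, True)   # T*rw -> T*Rw
Rw_to_rw = accessibility_cast_is_safe(False, True, True, True)   # T*Rw -> T*rw
```

Under these assumptions, `rw_to_Rw` is true (readability is dropped) and `Rw_to_rw` is false (readability would be enabled).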

12.4. Spaces

An indirect value space is a platform-defined natural value, declared as a Space : ~\%nat = 0 argument. Note that omitting the Space argument implies the default zero space.

An indirect value with undefined space is incomplete. Indirect values with different spaces are incompatible.

In an indirect value shortcut notation, space is a natural number, usually put in the very end, e.g. T&lrw0.

The Standard defines space mappings for common platforms.
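
To make the shortcut notation concrete, here is a hedged Python sketch of a parser for the part that follows `&` or `*`. It assumes the letter order used in the examples of this document (scope, readability, writeability, space), treats an omitted r or w as negative (as in \$T\$& : \$T\$&RW), and reports an omitted scope as None, leaving it to inference. The names `IndirectSuffix` and `parse_suffix` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# One-letter scope shortcuts, as used throughout the examples.
SCOPES = {"t": "temporal", "l": "local", "c": "caller",
          "i": "instance", "s": "static", "u": "undefined"}


class IndirectSuffix(NamedTuple):
    scope: Optional[str]   # None: no scope letter was given
    readable: bool
    writeable: bool
    space: int


def parse_suffix(suffix: str) -> IndirectSuffix:
    """Parse the part after `&` or `*`, e.g. "lrw0" or "RW"."""
    match = re.fullmatch(r"([tlcisu])?([rR])?([wW])?(\d+)?", suffix)
    if match is None:
        raise ValueError(f"malformed shortcut suffix: {suffix!r}")
    scope, read, write, space = match.groups()
    return IndirectSuffix(
        scope=SCOPES.get(scope),
        readable=read == "r",              # omitted or uppercase: not readable
        writeable=write == "w",            # omitted or uppercase: not writeable
        space=int(space) if space else 0,  # omitted space implies zero
    )
```

For example, `parse_suffix("lrw0")` yields a local, readable, writeable value in space 0, while `parse_suffix("sw")` yields a static, write-only one.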

13. Scopes

Defined scopes are temporal (t), local (l), caller (c), instance (i) and static (s). An indirect value may also have an undefined (u) scope. The letter in parentheses is the scope shortcut used in indirect value shortcuts, e.g. \$"T"\$*l has local scope.

When passing an aggregate (i.e. non-scalar) value to an outer scope (e.g. returning from a function or passing as an argument), each of its fields' scopes is checked and auto-cast (if applicable) separately.

In Onyx, arrays, tuples and structs are very similar. It is their access semantics that differ. It can be said that array elements accessed with [] are "fields" of the array named [0], [1] etc.

In this example, an array of local pointers can be passed as an array of caller pointers. However, it shall not be returned.

def foo(ary : SBin32*cw[1])
  ary[0]-> = 43
end

def main
  let x = 42
  let ary = [&x] : SBin32*lrw[1]

  foo(ary) # OK, auto-casts the element's scope
  @assert(x == 43)

  # return ary # Panic! Can not return a local-scoped pointer
end

This example is similar, but a custom struct is used instead of an array.

struct Wrapper
  # Note how it points to an instance scope,
  # i.e. to the one the object is in.
  let wrapped : SBin32*irw
end

def foo
  let x = 42

  final wrapper = Wrapper(&x)
  @assert(wrapper :? Wrapper&lrw)

  *(wrapper.wrapped) = 43
  @assert(x == 43)

  # # Panics because `wrapped`
  # # has instance, hence local, scope.
  # return wrapper.wrapped # Panic!
end

Similar rules apply to value existence. \$T\$*l&s does not make sense, as there is no local scope in the static scope. That said, Array<\$T\$*l>&l and Array<\$T\$*s>&l are valid, but Array<\$T\$*l>&s is not.

Note that a non-indirect value does not have scope; it is pure data, which can be passed in any direction.

A field of a pointer type with scope other than instance, static or undefined shall not be declared. However, a generic-typed field may be specialized with another scope. For example:

struct Foo<T>
  # # Does not make sense to
  # # have local pointer here.
  # let ptr : SBin32*l # Panic!

  let ptr : T

  # # Could've used instance
  # # pointer as an alternative.
  # # See the `Wrapper` example above.
  # let ptr : SBin32*i
end

final global_x = 42
# TODO: Address space inference here?
# Would likely put into `.global` on PTX.
final global_foo = Foo<SBin32*sr>(&global_x) # OK

def bar
  final x = 42
  final foo = Foo<SBin32*lr>(&x) # OK

  # # As mentioned above, each field is checked independently;
  # # it is not possible to pass a local pointer outside, thus panicking.
  # return foo # Panic!
end

13.1. Scope casting

An indirect value of one scope may be cast to another scope using the as operator in accordance with Table 6, “Scope casting”. For example, (ptr : T*c) as T*l is threadsafe, but (ptr : T*l) as T*c is unsafe.

Disabling or preserving an accessibility option is threadsafe (e.g. making an *rw pointer *r-only), but enabling it back is unsafe (e.g. casting a &W reference to &rw).

Casting to undefined scope is threadsafe with respect to the aforementioned accessibility casting (e.g. (ptr : T*r) as T*rw becomes unsafe). Casting from undefined scope is always unsafe.

Table 6. Scope casting

Source scope  Source accessibility    Safety of casting to target scope
                                      Temporal    Local        Caller       Static      Undefined
------------  ----------------------  ----------  -----------  -----------  ----------  ----------
Temporal      Any                     Threadsafe  Unsafe       Unsafe       Unsafe      Threadsafe
Local         Any                     Threadsafe  Threadsafe   Unsafe       Unsafe      Threadsafe
Caller        Any                     Threadsafe  Threadsafe   Threadsafe   Unsafe      Threadsafe
Static        Read-only and constant  Threadsafe  Threadsafe   Threadsafe   Threadsafe  Threadsafe
Static        Writeable or mutable    Threadsafe  Fragile (1)  Fragile (1)  Threadsafe  Threadsafe
Undefined     Any                     Unsafe      Unsafe       Unsafe       Unsafe      Threadsafe

(1) Because other threads may simultaneously write or mutate the value.
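
Table 6 can be read as a lookup function. The Python sketch below encodes only the table itself and ignores the separate accessibility casting rule; the names `scope_cast_safety`, `TARGETS` and the `writeable` flag (standing for "writeable or mutable") are assumptions made for illustration.

```python
THREADSAFE, FRAGILE, UNSAFE = "threadsafe", "fragile", "unsafe"

TARGETS = ("temporal", "local", "caller", "static", "undefined")

# Rows of Table 6, keyed by source scope; columns follow TARGETS.
_ROWS = {
    "temporal":  (THREADSAFE, UNSAFE, UNSAFE, UNSAFE, THREADSAFE),
    "local":     (THREADSAFE, THREADSAFE, UNSAFE, UNSAFE, THREADSAFE),
    "caller":    (THREADSAFE, THREADSAFE, THREADSAFE, UNSAFE, THREADSAFE),
    "static_ro": (THREADSAFE,) * 5,
    "static_rw": (THREADSAFE, FRAGILE, FRAGILE, THREADSAFE, THREADSAFE),
    "undefined": (UNSAFE, UNSAFE, UNSAFE, UNSAFE, THREADSAFE),
}


def scope_cast_safety(source: str, target: str, writeable: bool = False) -> str:
    """Look up the safety of casting `source` scope to `target` scope."""
    row = source
    if source == "static":
        # Static scope has two rows, depending on accessibility.
        row = "static_rw" if writeable else "static_ro"
    return _ROWS[row][TARGETS.index(target)]
```

With this encoding, `scope_cast_safety("caller", "local")` is threadsafe and `scope_cast_safety("local", "caller")` is unsafe, matching the example given before the table.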

13.2. Indirect value argument autocasting

Only a pointer with caller, static or undefined scope may be declared as a function argument type. A reference shall not be declared a function argument type, because a reference to a reference is impossible.

The Scope argument is a ghost generic argument. Pointer<Scope: Undef> would therefore trigger a separate specialization.

When a pointer is passed to a function, it may be automatically cast to the target argument scope with safety defined in Table 7, “Pointer argument autocasting”. Otherwise, manual scope casting is required. The resulting safety of a call is the lowest safety from the callee safety modifier and the autocasting safety of its arguments from the table, plus the accessibility casting constraints.

Table 7. Pointer argument autocasting

Caller-side pointer                   Autocasting safety by a declared argument’s scope
Scope         Accessibility           Caller       Static (1)   Undefined
------------  ----------------------  -----------  -----------  ----------
Temporal      Any                     N/A          N/A          Threadsafe
Local         Any                     Threadsafe   N/A          Threadsafe
Caller        Any                     Threadsafe   N/A          Threadsafe
Static        Read-only and constant  Threadsafe   Threadsafe   Threadsafe
Static        Writeable or mutable    Fragile (2)  Threadsafe   Threadsafe
Undefined     Any                     Unsafe       N/A          Threadsafe

(1) Manual cast to static scope is required prior to passing, see Table 6, “Scope casting”.
(2) Because other threads may simultaneously write or mutate the value.
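
The "lowest safety wins" rule for a call can be sketched in Python on top of Table 7. This is a simplified model under stated assumptions: safety levels are assumed to be ordered unsafe < fragile < threadsafe, N/A is represented by None, the accessibility casting constraints are left out, and all names are hypothetical.

```python
# Ordered from the lowest safety level to the highest one (an assumption).
LEVELS = ("unsafe", "fragile", "threadsafe")

ARG_SCOPES = ("caller", "static", "undefined")

# Rows of Table 7; None stands for N/A (a manual cast is required).
_AUTOCAST = {
    "temporal":  (None, None, "threadsafe"),
    "local":     ("threadsafe", None, "threadsafe"),
    "caller":    ("threadsafe", None, "threadsafe"),
    "static_ro": ("threadsafe", "threadsafe", "threadsafe"),
    "static_rw": ("fragile", "threadsafe", "threadsafe"),
    "undefined": ("unsafe", None, "threadsafe"),
}


def autocast_safety(pointer_scope, arg_scope, writeable=False):
    """Safety of autocasting a caller-side pointer to a declared argument."""
    row = pointer_scope
    if pointer_scope == "static":
        row = "static_rw" if writeable else "static_ro"
    return _AUTOCAST[row][ARG_SCOPES.index(arg_scope)]


def call_safety(callee_safety, arg_safeties):
    """The lowest safety among the callee modifier and its arguments."""
    if any(s is None for s in arg_safeties):
        raise ValueError("an argument requires a manual scope cast")
    return min([callee_safety, *arg_safeties], key=LEVELS.index)
```

For instance, passing a writeable static pointer as a caller-scoped argument to a threadsafe function makes the call fragile in this model.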

13.3. Returned indirect value scope autocasting

Returning a reference implicitly moves it, thus making returning a reference impossible.

Otherwise, a pointer with scope other than local may be returned from a function. Its scope is automatically cast at the caller side in accordance with Table 8, “Returned pointer scope autocasting”.

An observer never sees raw instance scope. It always turns into the containing object’s.

Table 8. Returned pointer scope autocasting

Returned pointer scope    Caller-side resulting pointer scope
----------------------    ----------------------------------------
Temporal                  Temporal
Local                     N/A
Caller                    Local (see Section 13.6, “Caller scope”)
Static                    Static
Undefined                 Undefined
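
Table 8, together with the remark about instance scope, can be expressed as a small mapping. The following Python sketch is illustrative; `caller_side_scope` and its `object_scope` parameter are hypothetical names.

```python
# Table 8 as a mapping; None stands for N/A.
RETURNED_SCOPE = {
    "temporal": "temporal",
    "local": None,
    "caller": "local",
    "static": "static",
    "undefined": "undefined",
}


def caller_side_scope(returned_scope, object_scope=None):
    """Scope of a returned pointer as seen by the caller."""
    if returned_scope == "instance":
        # An observer never sees raw instance scope.
        if object_scope is None:
            raise ValueError("instance scope needs the containing object's scope")
        return object_scope
    result = RETURNED_SCOPE[returned_scope]
    if result is None:
        raise ValueError("a local-scoped pointer shall not be returned")
    return result
```

In this model a returned caller-scoped pointer becomes local, and an instance-scoped pointer returned from a method of a static object becomes static.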

13.4. Temporal scope

A temporal-scoped indirect value shall not be preserved for future use. Therefore, a reference to a temporal-scoped pointer (e.g. let x : T*t&) is illegal, which makes it impossible to pass a temporal-scoped indirect value anywhere; it may only be accessed immediately or returned.

An indirect value of any scope other than undefined may be thread-safely cast to a temporal-scoped indirect value, but not vice versa.

Counter-example for passing a temporal-scoped pointer as an argument: def foo(list : List<T>*c, element : T*c): after resizing of the list inside the body, element may become invalid. Also, returning the element from the function would cast it to local scope on the caller side, which is inappropriate.
class Std::List<T>
  let pointer : Void*

  mut def [](index : Size) : T*trw
    # Returning a reference would not make sense here,
    # because returning implies moving, and moving
    # turns the reference into an rvalue.
    #
    # Thus, return a temporal pointer to an element.
    # Temporal it is because the list may be resized at any
    # moment, and the element pointer would become invalid.
    return unsafe! pointer[index] as T*trw
  end
end

final list = mut Std::List(1, 2, 3)
let x = 42

# final e : SBin32*trw&lr = list[1] # Panic! Can not have a reference
                                    # to a temporal pointer

final e : SBin32&lr = *list[1] # OK, copies `2` into `e`

*list[1] = x # OK, copies value from `x` into the element

13.5. Local scope

References declared within a function body or arguments list with \$"local"\$ modifier (which is the default one) always have local scope. Only references with \$"local"\$ scope modifier may be local-scoped. That means that neither let \$x\$ : \$T\$&s nor static let \$x\$ : \$T\$&l are legal.

Once the scope containing an explicitly declared local-scoped reference terminates, the referenced value is finalized, but only once.

It is not possible to safely pass a local-scoped pointer to an outer scope. However, a local-scoped pointer may be safely passed as a caller-scoped pointer argument. Note that it does not make sense to declare a local-scoped pointer argument, i.e. \$a\$ : \$T\$*lr&, because where would it point to?

# def foo(final local arg : SBin32&lrW0) # Ditto
def foo(arg : SBin32&)
  arg : SBin32&lr0  # Inferred to be local

  # final x : SBin32&s # Panic! A reference declared with `local` scope
                       # modifier may only have local `Scope` argument

  let x = 42
  x : SBin32&lrw0 # Inferred to be local

  bar(&x : SBin32*lrw) # Can pass local-scoped ind-val as caller-scoped
  @assert(x == 43)
end

# Note the `&ref` syntactic sugar,
# which turns a pointer into reference.
def bar(&ref : SBin32*cw&lr)
  ref = 43 # Change caller-scoped reference
end

13.6. Caller scope

A caller-scoped pointer is known to point at a value existing somewhere in the call stack, and therefore shall not be passed outside of it (the call stack), but can be safely returned.

Returning a caller-scoped pointer always casts it to a local-scoped pointer on the caller side, because there is no way to preserve whether the pointer points to a value existing in the caller scope or somewhere higher in the call stack.

def tap(arg : SBin32*c&) : SBin32*c
  return arg
end

def foo(a : SBin32*c&)
  let b = 42

  let x = tap(a) : SBin32*l # Here, `a` really points to the caller
                            # scope, but we can't know that

  let y = tap(&b) : SBin32*l
end

There is no way to declare a caller-scoped reference other than by dereferencing a caller-scoped pointer, which is ephemeral by nature.

13.7. Instance scope

Instance scope is guaranteed to span at least to the containing object’s lifetime.

field

A field is a reference declared with an \$bb"instance"\$ scope modifier, which is the default and only applicable for a reference declaration within a data type definition. A reference declaration within a data type definition may also be declared with a \$bb"static"\$ scope modifier. In that case, it would not be a field anymore, but simply a static reference.

An instance-scoped indirect value type is only used within a field declaration, e.g. instance let ptr : T*i. For an observer, an instance scope translates to the containing object’s scope. Therefore, accessing an object’s field returns a reference with the same scope as the object’s. Consequently, casting to and from instance scope is absent from Table 6, “Scope casting”.

struct Point
  # `instance val` is implied.
  val x : FBin64

  # Return a pointer to `x`.
  #
  # The declaration has return type inferred;
  # the returned scope would equal `this`'s.
  #
  # NOTE: `&this*` is a syntactic
  # sugar for `&this : self*`.
  #
  # NOTE: The function is always threadsafe,
  # because there is no actual reference access.
  def get_x_ptr(&this*)
    return &(this.x)
  end

  # # NOTE: A more wordy, but similar implementation.
  # # NOTE: Could've used `S, R, W, P` instead of `T`.
  # def get_x_ptr(
  #   &this : Pointer<self, *T>
  # ) : Pointer<FBin64, *T> forall T
  #   return &(this.x)
  # end
end

# A static point.
final sp = Point(42)

@[Std::Entry]
export void main() {
  # Note the static scope
  sp.x : FBin64&sr
  sp:get_x_ptr() : FBin64*sr

  # A local point
  final lp = Point(43)

  # Note the local scope
  lp.x : FBin64&lr
  lp:get_x_ptr() : FBin64*lr
}

struct Wrapper
  # Using an instance-scoped indirect
  # value as a field type restriction.
  val ptr : FBin64*irw

  # # This is implied.
  # impl initialize(&this*crw, ptr : FBin64*crw)
  #   this.ptr = ptr
  # end
end

@[Std::Entry]
export void main() {
  static final x = 0f64

  # Onyx does not support non-trivial
  # initializers within static context
  static final w = unsafe! uninitialized Wrapper

  # `initialize` is special in terms that it safely
  # casts `this : self&r` to `this : self&rw`.
  # But static to caller casting is still fragile.
  fragile! w:initialize(&x)

  w.ptr : FBin64*srw&sr # `ptr`'s scope becomes static
}

When returned from a method, an instance-scoped pointer scope is cast to the object’s scope, from the perspective of the caller. An instance-scoped pointer shall not be safely cast to any scope other than undefined, because it would eliminate the "cast to object’s scope" feature.

It is possible to safely pass an instance-scoped pointer as a caller-scoped pointer argument.

A field’s type may be declared an instance-scoped pointer. The pointer shall then have the same scope as the containing object.

struct Wrapper
  let a : FBin64

  def get_a() : FBin64*irw
    a : FBin64&irw # This is an instance-scoped reference

    # Again, returning a reference would turn it
    # into an rvalue, which is not what we want.
    return &a : FBin64*irw
  end

  # An instance-scoped pointer expands
  # to the containing object's scope.
  let b : SBin32*irw

  def double_b()
    # It is still instance-scoped here.
    *(b : SBin32*irw) *= 2
  end
end

let b = 42

# `w` has local scope, thus its `b`
# becomes `SBin32*lrw` for the observer.
final w = Wrapper(a: 17, b: &b)

# An instance pointer becomes a local
# pointer, inherited from `w`'s scope
final a : FBin64*lrw = w.get_a() : FBin64*lrw

w.double_b()
@assert(b == 84)

TIP: Only local-scoped references are finalized, and a is a local-scoped pointer. Therefore, no double-finalization would happen.

13.8. Static scope

Statically-scoped indirect values reference values existing in the static scope, i.e. outside of the call stack, which are guaranteed to be available at any moment of program execution.

TODO: The definition of "static" is tricky for GPU kernels. Should put better thought at it.

A reference declared in a namespace, trait or unit type declaration has implicit \$"static"\$ scope modifier. A reference declared in a struct, class or enum type declaration may be declared statically-scoped with explicit \$"static"\$ scope modifier.

A statically-scoped indirect value may be safely cast to a local-, caller-, instance- or undefined-scoped indirect value, but not vice versa.

13.9. Undefined scope

Indirect values with undefined scope are safe to pass around, but the values they’re referencing can not be safely accessed. For example, with \$x\$ : \$T\$* : \$T\$*uRW, it is unsafe to either call a method on \$x\$ or dereference it, reading its value.

A C pointer has undefined scope by default (it is also neither readable nor writeable), and therefore should be unsafely cast to a desired pointer type prior to using, for example:

extern int* get_some_int_ptr(void);

def main
  final ptr = unsafe! $get_some_int_ptr() : $int*
  final result = *(unsafe! ptr as SBin32*sr) # Now we can read from it
end

An indirect value of any scope may be safely cast to an undefined-scoped indirect value, but not vice versa.

14. Functions

A function may declare generic arguments. For that, a type identifier unavailable in the current scope shall be listed in a forall clause. A function generic argument is available within the function prototype and body. An unrestricted argument

If a function body contains delayed macros, then it is guaranteed to specialize per matching type of a fuzzy restriction?

Built-in methods are is? == :?, of? == ~? and as!. They can not be overloaded, but can be used as binary operators, e.g. (x is? T) == (x :? T) == (x.is?(T)). as! means unconditional coercion and unsafe unless the argument is self or a compiler can prove safety of the coercion (e.g. for a local-scoped variant instance within a branch).

x.is?(Undef~U) == x.of?(U).

The to(type) method and the to_ family of methods may also be overloaded. x to SInt32 == x.to(SInt32) == x.to_i32, also x.to_$i.

TODO: {% if nx.ctx.impl.recv then %}.

Trivial functions

A trivial function is one that does not call any runtime code when inlined; it may only perform reference assignments. Examples of trivial functions are the default initializer for structs and primitive initializers, such as those for numbers.

14.1. Function declarations

Function declarations are threadsafe by default.

14.2. Methods new

A function member declared within a data type with instance scope (which is the implicit default one) is called a method.

A method may be called on an object using the obj.method(args) syntax. Note that as any other function call, a method call requires parentheses even with zero arity.

Within a method, a special this reference is available.

If the containing type is a value type, then this is a read-only copy of the caller, i.e. this : self&lr.

Otherwise, if the containing type is a reference type, then this is a read-only reference to the caller, i.e. this : mut<self>&cr or this : const<self>&cr. The mutability of this in a reference type method is controlled with a method mutability modifier, e.g. mut decl foo(). Read more about mutability in Section 18.2, “Classes”.

A static function may also be called on a type using the same methods-call syntax, e.g. (obj : T)::func() \$-=\$ T::func() \$-=\$ T.func(). Nothing is implicitly passed to such a call. This behaviour is also applicable to the static field access syntax, e.g. obj::ref \$-=\$ T::ref \$-=\$ T.ref.

14.3. Methods

method

A function member declared in a data type accepting an instance of the type (or a pointer to an instance of the type if the type is a by-ref type) as the first anonymous argument. Also see UFCS.

A function declared with an instance scope modifier, which is the implicit default for a function declaration within a data type declaration, is similar to a function declared with a static scope modifier with the very first argument declared as this : self or &this : self*cr, based on the type kind.

For example, decl foo() in a struct declaration would be similar to static decl foo(this : self).

Consequently, this : self&lr or this : self&cr is implicitly defined within a function declared with an instance scope modifier, referring to the implicit argument declaration.

If you want to have a custom-scoped or custom-accessible this, consider declaring a statically-scoped function with the argument restriction you need.

An identifier lookup within such a function adds this. after the local scope lookup. For example, if x is not found in the local scope, it is then attempted to be qualified as this.x. Note that a local identifier can shadow an instance reference.
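
The lookup order described above (local scope first, then `this.`) can be modelled as follows. This Python sketch is only an illustration; the `Instance` class and the `lookup` function are hypothetical and say nothing about how a compiler actually implements qualification.

```python
class Instance:
    """Hypothetical stand-in for `this` with a few fields."""

    def __init__(self, **fields):
        self.fields = fields


def lookup(name, local_scope, this):
    # Local identifiers are tried first, so they shadow instance members.
    if name in local_scope:
        return local_scope[name]
    # Otherwise the name is qualified as `this.<name>`.
    if name in this.fields:
        return this.fields[name]
    raise NameError(name)


this = Instance(x=1, y=2)
local_scope = {"y": 20}   # a local `y` shadows the instance `y`
```

Here `lookup("x", local_scope, this)` falls through to the instance field, while `lookup("y", local_scope, this)` resolves to the local identifier.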

A method is declared using one of the following semantics:

struct T # Or `class T`
  decl method(args) # `instance decl` is implied
  static decl method(this : self, args) (1) (2)
end

decl T:method(args)
decl T::method(this : T, args) (1) (2)
1 To qualify as a method, only the type of the first argument matters. Therefore, it may be named other than this.
2 this : T shall be changed to this : T*cr for a by-ref type to qualify as a method.

A method may be called on an object using one of the following semantics:

obj.method(args)
obj::method(obj, args) # Or `(&obj, args)` if called on a by-ref type
T::method(obj, args)   # Ditto

As any other function call, a method call requires parentheses even with zero arity.

Example 6. By-val type methods
struct Point
  val x, y : FBin64

  def append(another : self)
    return self(
      # `this` is implicitly declared in a method,
      # referring to the instance copy.
      x: ((this : self&lr).x : FBin64&lr) + another.x,

      # An identifier lookup starts from `this.`.
      y: y + another.y)
  end

  # # Could've been declared as a static function instead.
  # static threadsafe def append(that : self, another : self) : self
  #   return self(that.x + another.x, that.y + another.y)
  # end
end

def main
  final p = Point(1, 2)
  final q = p.append(Point(3, 4))
  # final q = p::append(p, Point(3, 4))     # Ditto
  # final q = Point::append(p, Point(3, 4)) # Ditto
  @assert(q.x == 4)
  @assert(q.y == 6)
end

Class methods have this.field syntactic sugar for an argument declaration, which expands to the field assignment. For example, impl Foo::initialize(this.x); sets x to 42 upon calling Foo(42) before executing the function body.

Example 7. By-ref type methods
class Foo
  let x : SBin32

  # An initializer implementation
  # is required for a class.
  impl initialize(this.x);

  # `mut` defines mutability of `this`.
  mut def double_x()
    # Ditto for lookup.
    #
    # Note how `.x` returns a writeable
    # reference even if `this` is read-only.
    # This behaviour is related to mutability.
    (x : SBin32&crw) += (this : self&cr).x
  end

  # # Could've been implemented as a static function instead.
  # static threadsafe def double_x(&this : mut<self>*cr) : SBin32
  #   return this.x *= 2
  # end
end

def main
  final f = mut Foo(42)

  f.double_x()
  # f::double_x(&f)   # Ditto
  # Foo::double_x(&f) # Ditto

  @assert(f.x == 84)
end

15. Arguments

Function and generic arguments share the same syntax. An argument requires an explicit name or index. An argument may have an alias, a type restriction and a default value.

A function argument declaration has the same semantics as a value declaration. By default, a function argument is implicitly constant, i.e. def foo(x) is equivalent to def foo(final x). Alternatively, an argument value may be declared variable: def foo(let x). It is not possible to pass a constant as a variable argument.

A constant value may be unsafely cast to a variable:

def foo(let arg);
final x = 42
foo(unsafe! x.as(SInt32&w))

16. Types

Types are namespaces, traits, units, structs, classes, enums and annotations. The classification is known as a type kind. Structs, classes and enums are known as data types; their instances, called objects, may exist in runtime.

16.1. Type instances

A type instance is an instance of a type itself, e.g. let x : \SInt32 : Type<SInt32> = SInt32.

A type instance may be used in runtime, have its static members accessed (including initialization and comparison to other types), and even used as a generic argument.

A type instance by itself does not carry any type information in runtime.

Example 8. Type instances
A freestanding type instance
let x = SBin32

@assert(@typeof(x) == \SBin32)
@assert(@sizeof(x) == 0)

@assert(x == SBin32)      # Calling a static function `::==(another_type)`
@assert(x<Bitsize> == 32) # Accessing a generic argument
@assert(x(42) == 42i32)   # Calling a static function `::()`
@assert(x::Max == 2147483647i32) # Accessing a static reference

# def foo(x : \T) forall T ~ SInt;
def foo(x ~ \SInt);

foo(x) # OK, equivalent to `foo(SBin32)`

A variant of type instances
let x = Std@rand(SBin32, FBin64)
x : <\SBin32, \FBin64>?&lrw

if x is? \SBin32
  @assert((x as SBin32)(42) == 42i32)
end

16.2. Literal instances

A type instance may be a literal instance. A literal instance has basic arithmetic functions and can be used both as a generic and runtime argument.

A literal instance has type Literal<\$L\$>, where \$L\$ is a literal constrainment from Table 1, “Basic literal constrainments”.

16.3. Generic types

Data type

A struct, class or enum type.

Object

An instance of a data type, existing in runtime.

By-val type

A struct or an enum type.

By-ref type

A class or Lambda type.

Generic type

A type containing at least one generic argument.

Type reference

A reference to a type with or without generic arguments from the source or macro code, e.g. SBin<32> or Std.

A generic type is a type containing at least one generic argument. A generic argument may be used within the type.

16.4. Type specialization

Qualification of an identifier (i.e. a lookup) under a type reference triggers the reference specialization. A specialization occurs once per unique generic arguments combination. An omitted generic argument is valid, has nil value in macros, and contributes to the uniqueness. A non-generic type may have at most one specialization.

A specialization triggers evaluation of delayed macros contained directly within the type declaration.

A delayed macro contained directly within a struct or class type declaration may evaluate to an instance field implementation.

A specialization of a struct or class child type triggers specialization of its parent.

A specialization of a deriving type triggers specialization of all the traits it derives from, in the order of derivation.

A complete type is a data type reference specialized with defined occupied size, or a unit type reference (which always has zero size). Any other type is an incomplete type.

Only a complete type shall be used as a runtime value type. However, an incomplete type instance is allowed, e.g. let x : \SInt = SInt.

16.5. Members

A type reference may contain member entities: functions, macros, values and types. This classification is known as member kind.

In that sense, every type is a namespace.

Function and value members have storage, which is either instance or static. In a trait, struct, class or enum type declaration, a function or value member declaration implicitly has instance storage, which may be changed to static. However, an enum type declaration disallows instance value member declarations, therefore the storage of a value member shall be explicitly set to static there. In a namespace, unit or annotation type declaration, a function or value member declaration always has static storage, and it shall not be changed.

16.6. Behavioural erasure

With constraints applied to a value, a compiler may or may not be able to interact with it in certain ways, e.g. call a specific method. This is known as behavioural erasure.

Both the real and the erased type of a value are always known.

Any type (including special types like Type, Void etc.) has built-in is?, of? and as methods defined, collectively known as reflection methods. Reflection methods are well-known and may be used as binary operators, e.g. x is? T. The names is? and of? shall not be used as function names, i.e. they shall not be overloaded. as may be overloaded, e.g. for 0.5.as($float).

# Determine if the instance
# is of exactly type `T`.
#
# ```
# x = 42
# @assert(x is? SInt32)
# ```
decl is?(\T) : Bool forall T
alias :? = is? # E.g. `x :? T`

# Determine if the instance is of
# a type less than or equal to `T`.
#
# ```
# x = 42
# @assert(x of? Int)
# ```
decl of?(\T) : Bool forall T
alias ~? = of? # E.g. `x ~? T`

# Return the instance itself.
decl as(\self) : self

# Unsafely coerce the instance as
# an instance of another type.
unsafe decl as(\T) : T forall T

Reflection methods affect behavioural erasure of an entity. as becomes a fragile method when a compiler can prove it is not unsafe.

A value may be constrained using the : and ~ binary operators, where : requires its right operand to be a complete type.

In this example, a value of the well-known type SInt32 is behaviourally erased to SInt, so we can’t access the constant Max, which is only defined for sized `SInt`s.

let x : SInt32 = 42 # `x` is constrained to `SInt32`
@assert(x::Max == 2_147_483_647) # OK

# # We're constraining `x` to `SInt`, and then
# # try to access its `::Max` constant
# x~SInt::Max # Panic! `Max` is not defined for `SInt`

# Constraining to `SInt` in the current scope.
# Now the compiler treats `x` as `SInt`,
# but its true type is still preserved.
x = x ~ SInt
@typeof(x) # => SInt (SInt32) # Compiler still knows the real type

@assert(x ~ SInt)
# x::Max # Panic! Ditto

# `~SInt` could theoretically be `SInt32`,
# and we can check it in runtime.
# A compiler may elide the actual comparison.
if x :? SInt32
  @assert(x::Max == 2_147_483_647)
end

An unconstrained generic argument has implicit type Any. An Any type instance does not allow any access other than reflection method calls.

def foo(x : T) forall T # eq. to `forall T ~ Any`
  # During the initial parsing,
  # no real type is present
  @typeof(x)  # => Any

  # Actual type is revealed during a
  # specialization, but it's still erased
  \@typeof(x) # => Any (SInt32)

  # x += 1 # Panic! `Any` does not have method `+`

  if x :? SInt32
    \@typeof(x) # => SInt32 # No erasure is applied anymore
    x += 1 # OK
  end
end

foo(42i32)

Behavioural erasure ignores any definitions from outside the constrained scope. The code in the example below would continue working even if the Drawable3D trait were derived by Line, or if an entirely new Drawable4D trait were introduced and derived by both structs.

trait Drawable2D
  decl draw()
end

trait Drawable3D
  decl draw()
end

# Point has the following methods:
#
# ```
# final p = Point()
# p.draw2d()
# p.draw3d()
# p~Drawable2D.draw()
# p~Drawable3D.draw()
# p.draw() # Panic! `Point:draw` is ambiguous between
#          # `Point~Drawable2D:draw` and `Point~Drawable3D:draw`
# ```
struct Point
  derive Drawable2D
    # Callable as `Point~Drawable2D:draw`
    # and `Point:draw2d`
    impl draw() as self.draw2d;
  end

  derive Drawable3D
    # Callable as `Point~Drawable3D:draw`
    # and `Point:draw3d`
    impl draw() as self.draw3d;
  end
end

# Line has the following methods:
#
# ```
# final l = Line()
# l.draw()
# l~Drawable2D.draw()
# ```
struct Line
  derive Drawable2D
    impl draw()
  end
end

def draw2d(x : T) forall T ~ Drawable2D
  @typeof(x) # => Drawable2D
  \@typeof(x) # => Drawable2D (Point) # Or `Line`

  x.draw() # OK, `:draw` is defined for any `Drawable2D`
  # x.draw2d() # Panic! `:draw2d` is not defined for `Drawable2D`

  if x is? Point
    \@typeof(x) # => Point (Point)
    x.draw2d() # OK, can call `Point`-specific method
  end
end

draw2d(Point())
draw2d(Line())

Either in the form of a type annotation, or as a binary operator, a restriction operator contributes to return-type overloading.

def read() : String*
def read() : Std::Twine

# let x = read() # Panic! Can not infer type of `x`

let x : String* = read() # OK
let x = read() : Std::Twine # OK

# Still enough information to
# unambiguously choose an overload.
let x : Pointer = read()
@typeof(x) # => Pointer (String*) # The value is erased, however
x as String* # The coercion is safe here

16.7. Type expressions

A type expression consists of multiple type references joined with logical operators &&, ||, ^, ! and grouped with parentheses. A freestanding type reference is a degenerate case of a type expression. A type expression containing at least one logical operator is a complex type expression.

A type expression may be flattened to a comma-separated list of currently specialized complete types matching the expression using the * unary operator. For example, *(SInt && !SInt32) would likely evaluate to SInt8, SInt16, SInt64, SInt128 (note the missing SInt32). A flattened list of types may be used as a list of generic arguments, for example, Union<*(SInt32 || FBin64)> would evaluate to Union<SInt32, FBin64>. As a syntactic sugar, a freestanding complex type expression or a freestanding flattened list turns into a Variant of types contained in the flattened expression, e.g. 𝐴 || 𝐵 : *(𝐴 || 𝐵) : Variant<*(𝐴 || 𝐵)>.

Flattening only yields complete types that are already specialized at that moment; type references that are not yet specialized are not included.
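Flattening can be thought of as filtering the set of currently specialized complete types with the logical expression. The following is a Python model, not Onyx; the contents of `SPECIALIZED` and the helper names `is_sint` and `flatten` are assumptions made for illustration only.

```python
# Illustrative Python model, not Onyx. The registry contents are assumed.
SPECIALIZED = ["SInt8", "SInt16", "SInt32", "SInt64", "SInt128", "FBin64"]


def is_sint(name: str) -> bool:
    # Stand-in for "this type matches `SInt`".
    return name.startswith("SInt")


def flatten(predicate) -> list:
    # Only types that are already specialized can appear in the result.
    return [t for t in SPECIALIZED if predicate(t)]


# Models `*(SInt && !SInt32)`
flat = flatten(lambda t: is_sint(t) and t != "SInt32")
print(flat)  # ['SInt8', 'SInt16', 'SInt64', 'SInt128']

# Models a flattened list used as generic arguments of a Variant
variant = "Variant<" + ", ".join(flat) + ">"
print(variant)  # Variant<SInt8, SInt16, SInt64, SInt128>

# Models `Union<*(SInt32 || FBin64)>`
union_args = flatten(lambda t: t in ("SInt32", "FBin64"))
print(union_args)  # ['SInt32', 'FBin64']
```

A type that exists in source but has never been specialized would simply be absent from `SPECIALIZED`, which is why the result depends on the moment of flattening.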

Flattening a type expression is aligned with flattening a tuple type, e.g. Union<*(A, B)> : Union<A, B>.

A wildcard type may be used within a type expression, for example * < 𝑇 means "all types satisfying the < 𝑇 condition". A T::* expression would match all types directly under the T namespace, for example T::A, but not T::B::C or T. A T::** expression would match all types under the T namespace, for example T::A and T::B::C, but not T. These may be combined, e.g. T::** < (U && V). A result of an expression containing a wildcard is a flattened list of matching types. Hence, *(A && B) is equivalent to * < (A && B).
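The difference between `T::*` and `T::**` comes down to how deep the match may go below the namespace. A minimal Python model (not Onyx) follows; the function names and the sample type paths are hypothetical.

```python
# Illustrative Python model, not Onyx. Names and sample paths are hypothetical.
def match_direct(namespace: str, path: str) -> bool:
    """Models `T::*`: types directly under the namespace."""
    prefix = namespace + "::"
    rest = path[len(prefix):]
    return path.startswith(prefix) and rest != "" and "::" not in rest


def match_nested(namespace: str, path: str) -> bool:
    """Models `T::**`: types at any depth under the namespace."""
    prefix = namespace + "::"
    return path.startswith(prefix) and len(path) > len(prefix)


types = ["T", "T::A", "T::B::C", "U::A"]
direct = [t for t in types if match_direct("T", t)]
nested = [t for t in types if match_nested("T", t)]
print(direct)  # ['T::A']
print(nested)  # ['T::A', 'T::B::C']
```

In both cases `T` itself is excluded, matching the description above.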

A type expression may be enumerated upon using mapping (->), filtering (-?>) and negative filtering (-!>) operators.

A mapping block is not a "logic" complex type expression, but rather an algebraic type expression, where types are operated upon using <, <~, , == etc. built-in operators, and the mapped type is referenced with $ or $0 (which is aligned with anonymous block arguments syntax).

Filtering and negative filtering blocks are complex type expressions, where the filtered type is matched.

An example of an enumeration would be : *(AbstractLogger)-!>(UnwantedLogger)->$&, which evaluates to a variant of pointers to all AbstractLogger specializations known at the moment of specialization, excluding the UnwantedLogger type.

16.8. Type restriction

Runtime values (which include function return values) can be restricted to a concrete type using the : T notation, where T is a type expression. Such a restriction is a concrete type restriction.

The notation is similar to the one used in type theory, e.g. 2 : nat.

If a restriction type expression contains generic arguments, they are checked sequentially and recursively, in the order of declaration in the restriction. For example, in x : Array<Size: 3, Element: Foo<Bar>>:

  1. Ensure that x is Array

  2. Ensure that x::<Size> is 3

  3. Ensure that x::<Element> is Foo

  4. Ensure that x::<Element>::<[0]> is Bar
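The four steps above amount to a depth-first walk over the restriction. The Python model below (not Onyx) records the order of checks; `TypeRef`, `check` and the exact-name comparison are simplifying assumptions, and a real compiler would apply its own matching rules.

```python
# Illustrative Python model, not Onyx. `TypeRef` and `check` are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class TypeRef:
    name: str
    # Generic arguments in declaration order (dicts keep insertion order).
    args: Dict[str, "Arg"] = field(default_factory=dict)


Arg = Union[TypeRef, int, str]


def check(actual: Arg, expected: Arg, path: str, log: List[str]) -> bool:
    if isinstance(expected, TypeRef):
        log.append(f"{path} is {expected.name}")
        if not isinstance(actual, TypeRef) or actual.name != expected.name:
            return False
        # Recurse into generic arguments in the order the restriction lists them.
        for key, sub in expected.args.items():
            if key not in actual.args:
                return False
            if not check(actual.args[key], sub, f"{path}::<{key}>", log):
                return False
        return True
    # A literal generic argument, e.g. `Size: 3`.
    log.append(f"{path} is {expected}")
    return actual == expected


def make_array() -> TypeRef:
    return TypeRef("Array", {
        "Size": 3,
        "Element": TypeRef("Foo", {"[0]": TypeRef("Bar")}),
    })


log: List[str] = []
ok = check(make_array(), make_array(), "x", log)
for step in log:
    print(step)
# x is Array
# x::<Size> is 3
# x::<Element> is Foo
# x::<Element>::<[0]> is Bar
```

The walk stops at the first failing step, so later generic arguments are never inspected once an earlier one mismatches.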

A tuple is simply a generic type with its types listed as generic arguments. If a restriction is a tuple, its elements are checked sequentially, e.g. in x : (A, B), x must be a tuple of two types A and B.

A generic argument may be restricted to a type instance or a literal. For example, Foo<T : \U || 42 || \%s> would only allow Foo<U>, Foo<42> or Foo<"bar">.

Foo<T : U> would be invalid.

When a type restriction is applied to an argument declaration, it is said that the declaration is type-annotated; the restriction defines the type of the argument.

When a type restriction is applied to a runtime expression, it is used as a binary operator; the restriction is used to ensure the type of the expression. A restriction binary operator returns the left operand on success, panicking otherwise.

There are also soft-check versions of restriction binary operators: :? / is? and ~? / of?, which evaluate to a boolean value.

let x : SInt32 = 42 # A type-annotated variable definition
Std.print(x : SInt32) # Ensure that `x` is `SInt32`

# Soft-check if `x` is `SInt32`.
# Would possibly evaluate to the `true` literal
if x :? SInt32
  Std.print("`x` is always `SInt32`")
else
  Std.print("`x` is always not `SInt32`")
end

# An algebraic expression
# is applicable here.
if @typeof(x) == SInt32
  Std.print("`x` is always `SInt32`")
else
  Std.print("`x` is always not `SInt32`")
end

let y = Std@rand(42, 0.5) : Variant<SInt32, FBin64>

# Soft-check if `y` is currently `SInt32`.
# Would perform a runtime check.
#
# NOTE: It calls `.is?(SInt32)` on
# the actual option of the variant.
if y is? SInt32
  Std.print("`y` is currently `SInt32`")
else
  Std.print("`y` is currently `FBin64`")
end

# An algebraic expression is not applicable here,
# because `@typeof` evaluates in compile-time, so
# `@typeof(y)` would always be `Variant<SInt32, FBin64>`.
if @typeof(y) == SInt32 # Would always evaluate to `false`
  @unreachable
end

16.9. Fuzzy type restriction

: T is a concrete type restriction, whereas ~ T is a fuzzy type restriction.

A concrete restriction requires the expression to evaluate to a concrete type, whereas a fuzzy restriction does not. Instead, a fuzzy restriction requires the restricted value type to be one of the concrete types matching the type expression. For example, ~ SInt matches any SIntN, where N is the bitsize.

Therefore, a fuzzy restriction shall not be used as a field or local value type annotation. But if it is used as an argument declaration type annotation, (a) it leads to specialization for every matching type, (b) it may use polymorphism.

DRAFT: When a type is fuzzy-restricted, you can not query its real type? So this is orthogonal to a concrete type restriction.

17. Namespaces

A namespace type may only contain static functions and values.

18. Data types

A data type defines the meaning of a runtime value. It may be a user-defined or anonymous struct, class or enum, or a primitive such as an array, tuple, vector, scalar number etc. Namespace and trait types are not data types.

18.1. Structs

A struct is a user-defined named container of named references.

A struct is defined using the struct keyword, e.g. struct Foo. See Section 7, “Entities” for more info about defining entities.

As any other type declaration, a struct allows static member declarations. Static reference and function member declarations within a struct require an explicit static scope modifier.

Instance-scoped references, i.e. fields, in a struct are defined using the val statement, e.g. val x : FBin64. A val statement does not have a scope. Accessibility of a field depends on the containing struct’s. For example, if strukt is read-only, then strukt.x would also be read-only. A val statement is not allowed to have a default value clause.

The val keyword is chosen over let to reflect the dependent nature of its accessibility, whereas let implies always being both readable and writeable. Otherwise, val has semantics similar to let.

A function declared within a struct with the implicit default instance scope modifier becomes a method with a this : self&lr local constant available within the body representing a copy of the caller. Consequently, qualification of identifiers within a method’s body implies the this.* lookup. Read more in Section 14.3, “Methods”.

A struct type implements a default trivial initializer accepting field values in the order of declaration.

# `complete public def struct Point` is implied.
struct Point
  val x, y : FBin64

  # `instance public def length()` is implied.
  def length
    (x ** 2 + this.y ** 2).sqrt()
  end

  # Use a static function to implement
  # a custom initialization logic.
  static def zero
    self(0, 0)
  end
end

def main
  # A basic literal initialization
  final p1 = Point{ 1, 2 }
  @assert(p1.length() ~= 2.23)

  # A static function call
  final p2 = Point::zero()
  @assert(p2.length() == 0)

  # Use of an uninitialized instance
  final p3 = unsafe! uninitialized Point
  p3.x = 1
  # p3.length() # UB
end

A struct type does not declare a finalize method. But it still has lifetime and finalizes its fields in the order of declaration once it dies.

Even if a finalize method is declared in a struct, it won’t be called automatically.

Once a struct is implemented with a complete completeness modifier (which is the default implicit one), it cannot have new fields declared. But it may be further reopened to have new methods declared. A struct implemented with an explicit incomplete completeness modifier cannot be initialized, but allows further reopenings to declare new fields.

A struct declared with an abstract modifier shall never be directly initialized. Further reopenings of the struct shall also include the abstract modifier.

A struct has undefined ordering of elements in memory unless annotated with @[Ordered]. It also has undefined alignment unless annotated with @[Aligned]. A struct may be packed by applying the @[Packed] annotation.

A struct may extend another struct at most once. Only a (currently) complete struct may be extended. An extending struct default initializer contains all the fields in the order of declaration. Memory boundaries of an extended struct are guaranteed to precede the extending’s. Therefore, an extending struct may be thread-safely coerced into the extended one. Extending a struct does not inherit its static members.

abstract struct Parent
  # The ordering of `a` and `b` is undefined.
  # For example, `b` may actually precede `a`.
  val a : SBin32
  val b : FBin64

  def sum
    a.to_f64() + b
  end

  static def noop;
end

# `<` is a shortcut for `extend Parent;`
struct Child < Parent
  # The ordering of `c` and `d` is also undefined.
  # However, `a` and `b` are guaranteed to precede `c` and `d`.
  val c, d : UBin16

  reimpl sum
    Parent::noop() # Static members are not inherited
    super() + c.to_f64() + d.to_f64()
  end

  # # Would not trigger name collision.
  # static def noop;
end

def main
  # Child::noop() # Panic! `Child::noop()` is not declared

  # Note the augmented default initializer
  final c = Child(1i32, 2f64, 3u16, 4u16)
  @assert(c.sum() == 10)
  @assert(c~Parent.sum() == 3)

  # This is a threadsafe operation
  final p = c as Parent

  # Note that `Parent` is `abstract`, but it
  # still can be "initialized" indirectly
  @assert(p.sum() == 3)
end

18.2. Classes

Class semantics is the way to do object-oriented programming in Onyx. A class implies encapsulation and resource control.

A class type does not have a literal initializer defined. Instead, it may have multiple initialize method declarations which are delegated to upon a Klass() call. All fields without default values shall be initialized in every initialize method implementation. Regardless of the safety of a defined initialize method, a Klass() call is always threadsafe.

A class type may have its fields declared using the familiar let and final statements, as well as class-specific get and set statements. The latter two only allow certain accessibility from the observer’s point of view, but the field is fully accessible from the class itself. Class field declarations allow default values and visibility modifiers. A final field shall not be changed after being set in an initialize method or to its default value.

Class method declarations allow special this.field argument declaration syntax, which is a syntactic sugar for accepting and rewriting a field value in the very beginning of the method’s body.

By default, class field and method declarations have an implicit private visibility modifier.

A class instance has lifetime and it finalizes all its fields upon death, in the order of declaration. An explicit finalize() method may be defined, which would be called prior to finalizing the fields. Regardless of the safety of the defined finalize() method, an implicit finalization of a class instance is always threadsafe.

TODO: Mutability. TODO: Traits.

mut class Car
  # A publicly-visible variable field.
  public let acceleration : FBin64

  # A publicly-visible field which shall not
  # be changed after set in an `initialize` method.
  # Note that it does not have a default value,
  # thus shall be set in every `initialize` method.
  public final max_speed : FBin64

  # This field may only be read from the
  # outside, i.e. `car.velocity : FBin64&r`,
  # but it's fully accessible from the
  # inside, i.e. `this.velocity : FBin64&crw`.
  #
  # TODO: The declaration type restriction is implicitly
  # `: Infer`, which immediately expands to `: FBin64`.
  public get velocity = 0f64

  # This variable is only visible within the class itself.
  let resources : $void*

  # This is marked unsafe to avoid duplicate resource acquisition.
  public unsafe def initialize(this.max_speed, acceleration = 0)
    threadsafe!
      this.acceleration = acceleration
      resources = unsafe! $malloc(100)
    end
  end

  # A public function to update the car's
  # velocity based on its acceleration.
  public def update(time_passed: `Δ)
    velocity = (velocity + acceleration * `Δ).min(max_speed)
  end

  # Ditto for duplicate resource deallocation.
  # Note that this definition is private so we
  # don't accidentally call it manually.
  unsafe def finalize
    $free(resources)
  end
end

def main
  final car = Car(100)
  car.acceleration = 10
  car.update(time_passed: 3)
  @assert(car.velocity == 30)
end

18.3. Enums

An enum type is a collection of named integer values.

By default, the underlying type of an enum is SInt32, but it can be changed explicitly, e.g. enum Foo : UInt16. Only an Int type may be an enum underlying type.

The very first defined enum value has an implicit underlying value of zero. Each subsequent enum value is implicitly the previous defined value incremented by one. An enum value definition may have an explicit underlying value assigned, e.g. val Foo = 3.
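The numbering rule can be modelled in a few lines. This is a Python model, not Onyx; `assign_enum_values` and the sample member names are hypothetical, and `None` stands for a definition without an explicit underlying value.

```python
# Illustrative Python model, not Onyx. Names are hypothetical.
from typing import Dict, List, Optional, Tuple


def assign_enum_values(decls: List[Tuple[str, Optional[int]]]) -> Dict[str, int]:
    values: Dict[str, int] = {}
    next_value = 0  # the very first value defaults to zero
    for name, explicit in decls:
        value = next_value if explicit is None else explicit
        values[name] = value
        next_value = value + 1  # the next implicit value follows the previous one
    return values


colors = assign_enum_values([
    ("Red", None),
    ("Green", None),
    ("Blue", 5),     # explicit underlying value
    ("Alpha", None),
])
print(colors)  # {'Red': 0, 'Green': 1, 'Blue': 5, 'Alpha': 6}
```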

For Rust-like enums, create a distinct alias for a Variant.

Example 9. Rust-like enums
distinct alias MyRustEnum = Variant<SInt32, Vector<FBin64, 2>>
  def product : SInt32 || FBin64
    if this.is?(SInt32)
      return this.as!(SInt32)
    else
      return this.as!((FBin64)x2).product()
    end
  end
end

19. Built-in types

19.1. Variants

A variant is a union of values (called options) with an unsigned integer switch determining its actual option.

Akin to a union, the order of options in a variant is irrelevant, i.e. Variant<𝐴, 𝐵> == Variant<𝐵, 𝐴>.

Internal layout of a variant is undefined.

An access to a variant in runtime is first attempted on the variant instance itself, and then transparently delegated to its actual option. Therefore, an accessed member shall be implemented for every option of the variant.

The rule also applies to built-in methods is?, of? and as!.
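The delegation rule can be modelled as attribute lookup that falls through to the actual option, but only for members every option declares. The sketch below is Python, not Onyx; the `Variant`, `Foo` and `Bar` classes are hypothetical, and the rejection happens at runtime here, whereas an Onyx compiler would reject the access at compile time.

```python
# Illustrative Python model, not Onyx. `Variant`, `Foo` and `Bar` are hypothetical.
class Foo:
    def x(self):
        return "Foo.x"


class Bar:
    def x(self):
        return "Bar.x"

    def y(self):
        return "Bar.y"


class Variant:
    def __init__(self, options, value):
        if type(value) not in options:
            raise TypeError("value is not one of the variant options")
        self._options = tuple(options)
        self._value = value

    def switch(self):
        # Defined on the variant itself, so it is found before delegation.
        return self._options.index(type(self._value))

    def __getattr__(self, name):
        # Reached only when the variant itself lacks `name`.
        if name.startswith("_"):
            raise AttributeError(name)
        missing = [o.__name__ for o in self._options if not hasattr(o, name)]
        if missing:
            raise AttributeError(f"`{name}` is not declared for {', '.join(missing)}")
        return getattr(self._value, name)


var = Variant((Foo, Bar), Bar())
print(var.x())       # Bar.x
print(var.switch())  # 1
try:
    var.y()          # rejected although the actual option is `Bar`
except AttributeError as error:
    print(error)     # `y` is not declared for Foo
```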

An as! call is threadsafe on a local-scoped reference, or fragile on a statically-scoped reference, to a variant which is proved to have a concrete option.

A simple example demonstrating access to a variant’s options.

struct Foo { decl x() }
struct Bar { decl x(); decl y() }

final var = Std@rand(Foo(), Bar()) : Variant<Foo, Bar>
var.x() # OK, declared for both options
# var.y() # Panic! `y` is not declared for `Foo`

# A `.switch()` call returns an unsigned integer with
# minimum required bitsize to store the variant switch.
@assert(var.switch() :? UBin1)

# Calling `.as!()` is unsafe here,
# because there is no compile-time
# information to guarantee that
# `var` is actually `Bar`.
unsafe! var.as!(Bar).y()

if var.is?(Bar)
  # Here, a compiler is able to prove
  # that `var` is currently `Bar`, thus
  # the `.as!(Bar)` call is threadsafe.
  var.as!(Bar).y()

  # Note that `var` itself is still a variant.
  var = Foo()

  # # A compiler knows that the variant
  # # can not currently be `Bar`.
  # var.y() # Panic!
end

Passing a variant outside of the Onyx context has no defined semantics, but it is entirely feasible. The example below may be further improved using macros.

export union val_t {
  int int_v;
  double double_v;
};

export struct var_t {
  int sw;
  union val_t un;
};

export enum SWITCH {
  INT,
  DOUBLE
};

export struct var_t get_variant() {
  final var = Std@rand($int(42), $double(42)) : Variant<$int, $double>

  return case var
    when $int
      $var_t(
        sw: $SWITCH::INT,
        un: $val_t(int_v: var.as($int)))
    when $double
      $var_t(
        sw: $SWITCH::DOUBLE,
        un: $val_t(double_v: var.as($double)))
  end
}

20. Lifetime

TODO: A structure is finalized on the containing scope side. If a class has a finalize() method definition, it is called prior to finalizing its fields. The NoFinalize annotation may be unsafely applied to a reference to skip finalization at the end of the reference’s lifetime.

# To move just one
# element of a structure
#

@[unsafe! NoFinalize]
final ary = [1, 2, 3]

final x = @rand(2)
final moved <- ary[x]

ary.each -> |e, i|
  if i != x
    @finalize(e)
  end
end

return moved

TODO: Only identified references with local scope are finalized. For example, in let e = &ary[0], *e : T&lrw is not an identified reference. Should think better on the naming: let x = 42, x is an identified reference, i.e. has memory in the current scope.

21. Piping

An entity may be piped to reduce code duplication.

A "self-pipe"

An x |𝐸 expression expands to (x𝐸; x). For example, x |.y = 42 expands to (x.y = 42; x). 𝐸 must be either a lookup (|:, |::) or a method call (|.). No whitespace is allowed after | to avoid confusion with the binary operator |. To control precedence, use parentheses. For example, x |.a = b |.foo expands to (x.a = (b.foo; b); x), but x |.a=(b) |.foo expands to ((x.a = b; x).foo; x).

A "block pipe"

An x |> 𝐸 expression expands to (𝐸), where 𝐸 is a block of code with a single anonymous block argument x. For example, x |> foo($) expands to (foo(x)). Unlike self-pipes, block pipes are always "flat", e.g. x |> $.foo |> $.bar expands to ((x.foo).bar). Therefore, the block pipes concept is similar to POSIX pipes.

A "tapping pipe" or "self-returning block pipe"

An x <|> 𝐸 expression is similar to a block pipe, but it expands to (𝐸; x), returning the argument. For example, x <|> foo($) expands to (foo(x); x).

The name comes from Ruby’s #tap method.
Example 10. Piping
x
  |.foo     # Simple pipe, returns `x`
  |> $.bar  # Block pipe, returns the evaluation result
  <|> $.baz # Tapping pipe, returns the argument (`x.bar`)

  # A multi-line block pipe version
  # with block boundaries and prologue
  |> |(arg)| do
    qux(arg)
  end

# Is equivalent to:
#

final %%0 = ((x.foo; x).bar)
qux((%%0.baz; %%0))
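The three pipe kinds differ only in what they return. A rough Python model (not Onyx) of the expansions follows; `self_pipe`, `block_pipe`, `tap_pipe` and the `X` class are hypothetical helpers, and lambdas stand in for Onyx blocks.

```python
# Illustrative Python model, not Onyx. Helper names are hypothetical.
def self_pipe(x, effect):
    """Models `x |E`, which expands to `(xE; x)`."""
    effect(x)
    return x


def block_pipe(x, block):
    """Models `x |> E`, which expands to `(E)`."""
    return block(x)


def tap_pipe(x, block):
    """Models `x <|> E`, which expands to `(E; x)`."""
    block(x)
    return x


calls = []


class X:
    def foo(self):
        calls.append("foo")

    def bar(self):
        calls.append("bar")
        return "bar-result"


x = X()
piped = self_pipe(x, lambda v: v.foo())                         # x |.foo
piped = block_pipe(piped, lambda v: v.bar())                    # |> $.bar
piped = tap_pipe(piped, lambda v: calls.append(f"baz({v})"))    # <|> $.baz
result = block_pipe(piped, lambda arg: f"qux({arg})")           # |> |(arg)| do qux(arg) end

print(calls)   # ['foo', 'bar', 'baz(bar-result)']
print(result)  # qux(bar-result)
```

The tapping step runs its block on the result of `bar` and then passes that same result on, which mirrors the `%%0` temporary in the expansion above.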