A story of exception encoding in BuckleScript

May 6, 2020

We recently made significant improvements to our exception encoding, and we are excited enough about them that we want to highlight the changes and explain a little about how exceptions work when compiling to JS.

The new encoding allows us to provide proper, clear stacktrace information whenever a Reason/OCaml exception is thrown. This is particularly important when you have code running in production that needs to collect those stacktraces for diagnostics.

What's the difference?

exception My_exception { x : int};

let loop = () => {
 for (i in 0 to 100) {
   if (i == 10) {
     raise (My_exception { x : i})
   };
 };
};
loop ();

When we compile and run this piece of code with the old exception encoding, this is what we'd get:

exn_demo$node src/exn_demo.bs.js 

/Users/hongbozhang/git/exn_demo/src/exn_demo.bs.js:11
      throw [
      ^
[ [ 'Exn_demo.My_exception', 1, tag: 248 ], 10 ]

With our new improvements, we now get way better results:

bucklescript$node jscomp/test/exn_demo.js

/Users/hongbozhang/git/bucklescript/jscomp/test/exn_demo.js:10
      throw {
      ^
{
  RE_EXN_ID: 'Exn_demo.My_exception/1',
  x: 10,
  Error: Error
      at loop (/Users/hongbozhang/git/bucklescript/jscomp/test/exn_demo.js:13:20)
      at Object.<anonymous> (/Users/hongbozhang/git/bucklescript/jscomp/test/exn_demo.js:21:1)
      at ...
}

That's basically it! In the rest of this post, we want to give you some insight into what the data representation of exceptions looks like, and how it was changed to expose useful stacktraces.

Why it is tricky to preserve stack-traces in ReasonML exceptions

Whenever you are using a Reason / OCaml exception (a so called "native exception"), you are actually using a data structure which is not the same as a JS runtime exception. That means that each exception representation invokes a different stacktrace handling mechanism:

In JS, the stacktrace is collected immediately when an Error object is created / thrown, while in native Reason / OCaml, such data is not attached to the exception object at all (you can't just access e.stack to retrieve the stacktrace). This is because collecting the stacktrace in a native environment highly depends on the runtime support (e.g. if a flag was provided to attach the stacktrace data).

Our goal was to provide a way to get the same stacktrace for native exceptions as you would with JS exceptions. This is all part of our on-going work to plan and implement the optimal encoding for all the different ReasonML data types for the JS runtime (just like with our previous changes to the bool, unit and records representation as well).

What's the classical ReasonML exception encoding?

In ReasonML, an exception is basically structured data. Let's have a look at the two exception definitions below:

exception A of { x : int , y : string}
exception B

exception A is encoded as an array of 3 slots. The first slot is a block by itself (called an identity block), while the second slot is for field x and the third slot for field y.

exception B is just the identity block.

The identity block is an array of 2 slots. The first slot is a string like "B", while the second slot is a unique integer. In more detail, the native array will also have a magic tag 248 attached which is not relevant for our purposes though.
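
As a rough illustration, the old encoding can be mimicked by hand in plain JS. This is a sketch based on the description above and on the logged output shown earlier, not actual compiler output; the names `idA`, `idB`, `exnA`, `exnB` and the integer stamps are made up for the example.

```js
// Identity block: [name, unique integer], plus the magic tag 248.
const idA = ["A", 1];
idA.tag = 248;
const idB = ["B", 2];
idB.tag = 248;

// Exception A with x = 1 and y = "x": identity block first, then one slot per field.
const exnA = [idA, 1, "x"];
// Exception B has no payload, so the value is just the identity block.
const exnB = idB;

let caught;
try {
  throw exnA;
} catch (e) {
  caught = e;
}

console.log(caught[0] === idA); // true: the first slot identifies the exception
console.log(caught.stack);      // undefined: a thrown array carries no stacktrace
```

The last line is the crux of the problem: an array has no place where a stacktrace would naturally live.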

What's the new exception encoding?

We had to simplify and unify the encoding for the different exception cases to make it possible to compile exceptions into an object instead of an array. Let's take a look at the two exception values below for example:

A ({ x : 1, y : "x"})
B

The two values will be compiled into

{RE_EXN_ID : "A/uuid", x : 1, y : "x" }
{RE_EXN_ID : "B/uuid"}

As you can see, all exceptions (no matter with or without payload) share the same encoding.

What will happen when you raise an exception?

raise (A {x : 1 , y : "x"})

It generates the following JS:

throw {RE_EXN_ID: "A/uuid", x : 1 , y : "x", Error : new Error ()}

The output above shows that we are now able to attach the stacktrace as an Error attribute very easily, since every exception is now an object instead of an array. Really cool!

It's important to note that a stacktrace will only be attached when you raise an exception. In other words, the stacktrace will not be attached just by creating an exception value (which is different from the behavior of JS's new Error()).
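
The difference between creating and raising can be sketched in plain JS. The `raise` helper below is hypothetical and only stands in for what a Reason `raise` compiles to; copying the object with a spread is an assumption made to keep the example small. The key name `RE_EXN_ID` is used purely for illustration.

```js
// Creating the exception value: a plain object, no stacktrace yet.
const exn = { RE_EXN_ID: "A/uuid", x: 1, y: "x" };

// Hypothetical stand-in for a compiled `raise`:
// the Error object (and with it the stack) is created at the throw site.
function raise(e) {
  throw { ...e, Error: new Error() };
}

let caught;
try {
  raise(exn);
} catch (e) {
  caught = e;
}

console.log("Error" in exn);                // false: nothing attached on creation
console.log(caught.Error instanceof Error); // true: attached when raised
console.log(caught.Error.stack);            // stacktrace starting at the throw site
```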

What does that mean for JS interop?

Note that in the JS world, users can throw pretty much any value they want. It is even totally valid to throw undefined. In ReasonML, when you try to catch an exception, the compiler will convert any arbitrary value to a ReasonML exception behind the scenes:

If it is already a ReasonML exception, then the conversion will be a no-op (no runtime cost)
Otherwise, it will be wrapped as a Js.Exn.Error obj
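
A minimal sketch of such a conversion, written by hand in JS, could look as follows. Both function names, the way a ReasonML exception is detected, and the shape of the wrapper object are assumptions for illustration only; the real runtime may differ, and user code should not depend on the `RE_EXN_ID` key.

```js
// Hypothetical check: treat any object carrying a string RE_EXN_ID as a Reason exception.
function isReasonException(e) {
  return e != null && typeof e === "object" && typeof e.RE_EXN_ID === "string";
}

// Hypothetical conversion applied to whatever value was caught.
function toReasonException(e) {
  if (isReasonException(e)) {
    return e; // already a Reason exception: returned as-is
  }
  // Anything else (an Error, a string, even undefined) gets wrapped.
  return { RE_EXN_ID: "Js.Exn.Error", payload: e };
}

const notFound = { RE_EXN_ID: "Not_found" };
console.log(toReasonException(notFound) === notFound);          // true
console.log(toReasonException(new Error("boom")).RE_EXN_ID);    // Js.Exn.Error
console.log(toReasonException(undefined).RE_EXN_ID);            // Js.Exn.Error
```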

Here is an example on how you'd access the exception value within a Reason try expression:

try (someJSFunctionThrowing()) {
| Not_found => ... // catch ReasonML exception 1
| Invalid_argument(_) => ... // catch ReasonML exception 2
| Js.Exn.Error(obj) => ... // catch JS exception
}

The obj value in the Js.Exn.Error branch is an opaque type to maintain type soundness, so if you need to interact with this value, you need to classify it into a concrete type first.

Caveat

Please note that it's not allowed to rely on the key name of RE_EXN_ID. It's an implementation detail which will probably be changed into a symbol in the future.
Don't overuse exceptions. Remember that exceptions should only be used in exceptional cases like division by zero. Whenever you want to express erroneous results, use the result or option type instead.

Bonus

Now with our new exception encoding in place, a hidden feature called extensible variant suddenly got way more interesting as well. Practically speaking, native exceptions are actually a special form of an extensible variant, so both are benefiting from the same representation changes!

Happy hacking and we would like your feedback!

What's new in release 7.3

April 13, 2020

bs-platform@7.3 is available for testing. You can try it with npm install bs-platform@7.3.1.

For those unfamiliar with bs-platform, it is the platform for compiling ReasonML and OCaml to fast and readable JavaScript.

This is a major release with some highlighted features as below:

Generalized uncurry calling convention support

You can now use an uncurried function as conveniently as a curried one. This is an exciting change, and we wrote a separate post with the details.

For uncurried support, we also fixed a long standing issue so that type inference follows naturally using the new encoding.

bar 
 -> Belt.Array.mapU((.b)=>b.foo /*no type annotation needed */)

`unit` value is compiled into `undefined`

In ReasonML, when a function does not return any meaningful value, it returns the value () of type unit. In the native backend, the dummy value () is compiled into the constant zero. We used to inherit this in the JS backend as well. However, this is a semantic mismatch, since in JS a function that does not return anything returns undefined by default. In this release, we make it more consistent with JS by compiling () into undefined. Since in JS a return undefined can be omitted in tail position, this leads to some other nice enhancements.

let log = x => Js.log(x)

The generated code used to be

function log(x){
    console.log(x);
    return /* () */ 0;
}

It's now

function log(x){
    console.log(x)
}
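
The JS behavior this relies on is easy to check by hand. The snippet below is only a small demonstration of that language rule, reusing the shape of the new output; the `result` variable is introduced for the example.

```js
function log(x) {
  console.log(x);
}

// A function that falls off the end without a return statement yields undefined.
const result = log("hello");
console.log(result === undefined); // true
```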

Various improvements in code generation

We have increased the readability of the generated code in several common places. We believe we have reached an important milestone: if you write code using features that have counterparts in JS, the generated code is readable. This is not a small achievement, given that quite a lot of the compiler code base is shared between the native backend and the JS backend.

Some features are not available in JS, for example complex pattern matches; the readability of the code generated for those will continue to be improved.

Take the enhancements below as examples:

meaningful pattern match variable names

let popUndefined = s =>
  switch (s.root) {
  | None => Js.undefined
  | Some(x) =>
    s.root = x.tail;
    Js.Undefined.return(x.head);
  };

function popUndefined(s) {
-  var match = s.root;
-  if (match !== null) {
-    s.root = match.tail;
-    return match.head;
+  var x = s.root;
+  if (x !== undefined) {
+    s.root = x.tail;
+    return x.head;
   }
   
 }

When pattern matching against a compound expression, the compiler used to use the temporary name match. We now employ better heuristics to generate meaningful names for such temporary variables.

Eliminate intermediate variable names when inlining

 function everyU(arr, b) {
   var len = arr.length;
-  var arr$1 = arr;
   var _i = 0;
-  var b$1 = b;
-  var len$1 = len;
   while(true) {
     var i = _i;
-    if (i === len$1) {
+    if (i === len) {
       return true;
-    } else if (b$1(arr$1[i])) {
-      _i = i + 1 | 0;
-      continue ;
-    } else {
+    }
+    if (!b(arr[i])) {
       return false;
     }
+    _i = i + 1 | 0;
+    continue ;
   };
 }

The above diff is the generated code for Belt.Array.everyU. The intermediate variables were introduced when inlining an auxiliary function; such duplication was removed in this release.

Flatten if/else branch making use of JS's early return idiom

Take the same diff from above, you will notice that the second else following if(..) continue is removed.

Below are similar diffs benefiting from such enhancement:

 function has(h, key) {
@@ -133,21 +123,18 @@ function has(h, key) {
   var nid = Caml_hash_primitive.caml_hash_final_mix(Caml_hash_primitive.caml_hash_mix_string(0, key)) & (h_buckets.length - 1 | 0);
   var bucket = h_buckets[nid];
   if (bucket !== undefined) {
-    var key$1 = key;
     var _cell = bucket;
     while(true) {
       var cell = _cell;
-      if (cell.key === key$1) {
+      if (cell.key === key) {
         return true;
-      } else {
-        var match = cell.next;
-        if (match !== undefined) {
-          _cell = match;
-          continue ;
-        } else {
-          return false;
-        }
       }
+      var nextCell = cell.next;
+      if (nextCell === undefined) {
+        return false;
+      }
+      _cell = nextCell;
+      continue ;
     };
   } else {
     return false;
@@ -155,17 +142,17 @@ function has(h, key) {
 }

--- a/lib/js/belt_List.js
+++ b/lib/js/belt_List.js
@@ -15,9 +15,8 @@ function head(x) {
 function headExn(x) {
   if (x) {
     return x[0];
-  } else {
-    throw new Error("headExn");
   }
+  throw new Error("headExn");
 }

For loop minor-enhancement

 function shuffleInPlace(xs) {
   var len = xs.length;
-  for(var i = 0 ,i_finish = len - 1 | 0; i <= i_finish; ++i){
+  for(var i = 0; i < len; ++i){
     swapUnsafe(xs, i, Js_math.random_int(i, len));
   }
-  return /* () */0;
+  
 }

Reason's for .. in loop only supports iterating over a closed interval, so it is quite common to write for (i in 0 to Array.length(x) - 1) { .. }. We made the tweak above to make the generated code more readable.
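
To see that the two loop shapes visit the same indices, here is a small hand-written check. The helper names `closedInterval` and `halfOpen` are made up for this sketch and do not appear in the generated code.

```js
// Old shape: closed interval with a precomputed upper bound.
function closedInterval(len) {
  const seen = [];
  for (let i = 0, i_finish = len - 1 | 0; i <= i_finish; ++i) {
    seen.push(i);
  }
  return seen;
}

// New shape: the usual half-open JS loop.
function halfOpen(len) {
  const seen = [];
  for (let i = 0; i < len; ++i) {
    seen.push(i);
  }
  return seen;
}

console.log(closedInterval(5).join() === halfOpen(5).join()); // true
console.log(closedInterval(0).length, halfOpen(0).length);    // 0 0
```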

A full list of changes is available here: https://github.com/BuckleScript/bucklescript/blob/master/Changes.md#73

Generalized uncurry support in BuckleScript 7.3

March 26, 2020

ReasonML is a curried language, while JS is an uncurried language. When compiling ReasonML into JS, the semantic mismatch causes a lot of headaches.

After several years of research and development, we have reached an ideal situation in the next release: adding a lightweight uncurried calling convention to ReasonML.

Why we need a native uncurried calling convention

The curried call is inherently slower than the uncurried call.

A naive implementation of curried calls, like the one PureScript uses, generates very slow code:
```
let curriedFunction = x => y => z => x + y +z ;
let curriedApply = curriedFunction(1)(2)(3); // memory allocation triggered
```
BuckleScript does tons of optimizations and very aggressive arity inference so that the curried function is compiled into a multiple-arity function, and when the application is supplied with the exact arguments -- which is true in most cases, it is applied like normal functions.

However, such optimizations do not apply to higher-order functions:
```
let highOrder = (f,a,b)=> f (a, b) 
// can not infer the arity of `f` since we know
// nothing about the arity of `f`, unless
// we do the whole program optimization
```
In cases where arity inference does not help, the arity guessing has to be delayed into the runtime.
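
What "delaying the arity guess to runtime" can mean is sketched below in plain JS. The helper `app` is a simplified, hypothetical illustration that inspects `f.length` at call time; it is not the actual runtime code shipped with BuckleScript.

```js
// Apply `f` to `args` without knowing the arity of `f` at compile time.
function app(f, args) {
  // Treat a zero-arity function as taking one argument so the recursion always makes progress.
  const arity = f.length === 0 ? 1 : f.length;
  if (args.length === arity) {
    return f(...args); // exact application: a plain call
  }
  if (args.length > arity) {
    // Over-application: call with the first `arity` args, then apply the result to the rest.
    return app(f(...args.slice(0, arity)), args.slice(arity));
  }
  // Partial application: allocate a closure that waits for the remaining arguments.
  return (...rest) => app(f, args.concat(rest));
}

const add3 = (x, y, z) => x + y + z;
const curriedAdd3 = (x) => (y) => (z) => x + y + z;

console.log(app(add3, [1, 2, 3]));        // 6
console.log(app(curriedAdd3, [1, 2, 3])); // 6
console.log(app(add3, [1])(2, 3));        // 6
```

Every call through such a helper pays for the length check, and possibly for slicing and closure allocation, which is the cost an uncurried calling convention avoids.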

Bindings to JS world

When we create bindings for higher-order functions in the JS world, we would like to have native uncurried functions which behave the same as in the JS world -- no semantic mismatch.

Generalized uncurried calling convention in this release

Before release 7.3, we had already introduced an uncurried calling convention. However, it had serious limitations -- uncurried functions could not be polymorphic, it did not support labels, and the error messages leaked the underlying encoding -- now all those limitations are gone!

Previously

The error messages above are cryptic and hard to understand, and the limitation of not supporting recursive functions made uncurried support pretty weak.

Now those limitations are all gone: you can have polymorphic uncurried recursive functions, and they support labels.

The error messages are also significantly improved.

When the uncurried function is used in a curried context

let add = (. x, y ) => x + y;

let u = add (1, 2)

The old error message:

Error: This expression has type (. int, int) => int
    This is not a function; it cannot be applied.

The new error message

Error: This function has uncurried type, it needs to be applied in ucurried style

When the curried function is used in the uncurried context


let add = ( x, y ) => x + y;

let u = add (.1, 2)

The old error message:

Error: This expression has type (int, int) => int
    but an expression was expected of type (. 'a, 'b) => 'c

The new error message:

Error: This function is a curried function where an uncurried function is expected

When the arity does not match

let add = (. x, y ) => x + y;

let u = add (.1, 2,3)

The old message:

Error: This expression has type (. int, int) => int
    but an expression was expected of type (. 'a, 'b, 'c) => 'd
    These two variant types have no intersection

The new message:

Error: This function has arity2 but was expected arity3

Note the generalized uncurry support also applies to objects, so that you can use obj##meth (~label1=a,~label2=b).

The only place where the uncurried call is not supported is optional arguments. If you are mostly targeting the JS runtime, we suggest trying uncurried functions by default, and we would like to hear your feedback!

You can already test it today by npm install bs-platform@7.3.0-dev.1 (Windows support will be coming soon).

Announcing bs-platform 7.2

March 12, 2020

Today we are proud to release bs-platform 7.2!

For those unfamiliar with bs-platform, it is the platform for compiling ReasonML and OCaml to fast and readable JavaScript.

You can try it with npm i bs-platform!

This is a major release with some highlighted features as below:

In memory loading stdlib

Since this release, the binary artifacts generated by the stdlib are loaded from memory instead of from an external file system, which means much faster compilation and installation.

Previously we recommended installing bs-platform globally to save on installation time.

However, with this release the installation is so fast that we recommend installing it locally (per project) instead, since there's no additional cost and it provides better isolation.

You can use it with a nice tool called npx, for example, npx bsb.

The installation is also compatible with --ignore-scripts for major platforms (see Richard Feldman's talk on the security implications), and is more stable with yarn.

More technical details can be found in this post.

let %private

In OCaml's module system, everything is public by default. The only way to hide some values is by providing a separate signature that lists the public fields and their types:

module A : { let b : int} = {
    let a = 3 ;
    let b = 4 ; 
}

let %private gives you an option to mark private fields directly

module A  = {
    let%private a  = 3;
    let b  = 4;
}

let%private also applies to file level modules, so in some cases, users do not need to provide a separate interface file just to hide some particular values.

Note interface files are still recommended as a general best practice since they give you better separate compilation units and also they're better for documentation. Still, let%private is useful in the following scenarios:

Code generators. Some code generators want to hide some values but it is sometimes very hard or time consuming for code generators to synthesize the types for public fields.
Quick prototyping. During prototyping, we still want to hide some values, but the interface file is not stable yet. let%private provides this convenience.

Int64 performance optimization

We received feedback from some users that various Int64 operations became bottlenecks in their code performance, in particular Int64.to_string.

We responded to this, and after some hard work - but without changing the underlying representation - our Int64.to_string is even faster than bigint for common inputs.

A micro-benchmark for comparison:

running on 7.1
Int64.to_string: 367.788ms # super positive number 
Int64.to_string: 140.451ms # median number
Int64.to_string: 375.471ms # super negative number

bigint
Int64.to_string: 25.151ms
Int64.to_string: 12.278ms
Int64.to_string: 21.011ms

latest
Int64.to_string: 43.228ms
Int64.to_string: 5.764ms
Int64.to_string: 43.270ms

We also apply such optimizations to other Int64 operations.

Note that Int64 is implemented in OCaml itself without any raw JavaScript. This case is a compelling hint that our optimizing compiler not only provides expressivity and type-safety guarantees, but also empowers users to write maintainable, efficient code.

File level compilation flags

In this release, we also provide a handy flag to allow users to override some configurations at the file level.

[@bs.config {flags: [|"-w", "a", "-bs-no-bin-annot"|]}]; // toplevel attributes

A full list of changes is available here: https://github.com/BuckleScript/bucklescript/blob/master/Changes.md#72

Loading stdlib from memory

February 20, 2020

In the next release, we are going to load stdlib from memory instead of from external files, which will make the BuckleScript toolchain more accessible and performant.

You can try it via npm i bs-platform@7.2.0-dev.4

How does it work

When the compiler compiles a module test.ml, the module Test will import some modules from stdlib. This is inevitable since even basic operators in BuckleScript, for example (+), are defined in the Pervasives module, which is part of the stdlib.

Traditionally, the compiler will consult Pervasives.cmi, which is a binary artifact describing the interface of the Pervasives module and Pervasives.cmj, which is a binary artifact describing the implementation of the Pervasives module. Pervasives.cm[ij] and other modules in stdlib are shipped together with the compiler.

This traditional mode has some consequences:

- The compiler is not stand-alone and relocatable. Even if we have the compiler prebuilt for different platforms, we still have to compile stdlib post-installation. postinstall is supported by npm, but it has various issues with yarn.

- It's hard to split the compiler from the generated stdlib JS artifacts. When a BuckleScript user deploys apps depending on BuckleScript, in theory, the app only needs to deploy those generated JS artifacts; the native binary is not needed in production. However, the artifacts are still loaded since they are bundled together. Allowing easy delivery of compiled code is one of the community’s most desired [feature requests](https://github.com/BuckleScript/bucklescript/issues/2772).

In this release, we solve the problem by embedding the binary artifacts into the compiler directly and loading it on demand.

To make this possible, we try to make the binary data platform agnostic and as compact as possible to avoid size bloat. The entry point for loading cmi/cmj files also had to be adapted to this new approach.

So whenever the compiler tries to load a module from stdlib, it will consult a lazy data structure in the compiler itself instead of consulting an external file system.
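
The idea of a lazy, in-memory table can be sketched as follows. Everything here is hypothetical and only illustrates the on-demand pattern: the `embedded` object, the `decode` function and `loadStdlibModule` are made-up names, and the real compiler is not written in JS.

```js
// Hypothetical stand-in for the binary data embedded in the compiler: module name -> raw data.
const embedded = {
  Pervasives: "raw-data-for-Pervasives",
  List: "raw-data-for-List",
};

let decodeCount = 0;

// Hypothetical decoder; in this sketch it only counts how often it runs.
function decode(raw) {
  decodeCount += 1;
  return { decodedFrom: raw };
}

const cache = new Map();

// Decode a module's data the first time it is requested, then serve it from the cache.
function loadStdlibModule(name) {
  if (cache.has(name)) {
    return cache.get(name);
  }
  if (!Object.prototype.hasOwnProperty.call(embedded, name)) {
    return undefined; // not part of the embedded stdlib
  }
  const value = decode(embedded[name]);
  cache.set(name, value);
  return value;
}

loadStdlibModule("Pervasives");
loadStdlibModule("Pervasives");
console.log(decodeCount); // 1: decoded once, no file system access involved
```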

What's the benefit?

More accessibility

Package installation now becomes downloading for prebuilt platforms. In the future, we can make it installable from a system package manager as well. The subtle interaction with [yarn reinstall](https://github.com/BuckleScript/bucklescript/issues/2799) is also solved once and for all.

Easy separation between compiler and JS artifacts

The compiler is just one relocatable file. This makes the separation between the compiler and generated JS artifacts easier. The remaining work is mostly to design a convention between compiler and stdlib version schemes.

Yes, better compile performance

A large set of files is not loaded from the file system but rather from memory now!

Fast installation and reinstallation

Depending on your network speed, the installation is reduced from 15 seconds to 3 seconds. Reinstallation is almost a no-op now.

JS playground is easier to build

We translate the compiler into JS so that developers can play with it in the browser. To make this happen, we used to fake the IO system; this is not needed any more since no IO happens when compiling a single file to a string.

Some internal changes

To make this happen, the layout of binaries has been changed to the following structure. It is not recommended that users depend on the layout, but it happens. Here is the new layout:


|-- bsb // node wrapper of bsb.exe
|-- bsc // node wrapper of bsc.exe
|
|-- win32
|     |-- bsb.exe
|     |-- bsc.exe 
|
|---darwin
|     |-- bsb.exe
|     |-- bsc.exe
|
|---linux
|     |-- bsb.exe
|     |-- bsc.exe

Union types in BuckleScript

February 7, 2020

Union types

Union types describe a value that can be one of several types. In JS, it is common to use the vertical bar (|) to separate each type, so number | string | boolean is the type of a value that can be a number, a string, or a boolean.

Following up on the last post: since the introduction of unboxed attributes in 7.1.0, we can create such types as follows:

type t = 
    | Any : 'a  -> t 
[@@unboxed]    
let a (v : a) = Any v
let b (v : b) = Any v
let c (v : c) = Any v

[@unboxed]
type t =
  | Any('a): t;
let a = (v: a) => Any(v);
let b = (v: b) => Any(v);
let c = (v: c) => Any(v);

Note: due to the unboxed attribute, Any a shares the same runtime representation as a; however, we need to make sure that users can only construct values of type t from values of type a, b, or c. By making use of the module system, we can achieve this:

module A_b_c : sig 
  type t 
  val a : a -> t 
  val b : b -> t 
  val c : c -> t   
end= struct 
type t = 
    | Any : 'a  -> t 
[@@unboxed]    
let a (v : a) = Any v
let b (v : b) = Any v
let c (v : c) = Any v
end

module A_b_c: {
  type t;
  let a: a => t;
  let b: b => t;
  let c: c => t;
} = {
  [@unboxed]
  type t =
    | Any('a): t;  
  let a = (v: a) => Any(v);
  let b = (v: b) => Any(v);
  let c = (v: c) => Any(v);
};

What happens when we need to know specifically whether we have a value of type `a`? This is a case by case issue; it depends on whether there are some intersections in the runtime encoding of `a`, `b` or `c`. For some primitive types, it is easy enough to use `Js.typeof` to tell the difference between, e.g, `number` and `string`.

Like type guards in TypeScript, we have to trust the user's knowledge to differentiate between union types. However, such user level knowledge is isolated in a single module so that we can reason about its correctness locally.

Let's have a simple example, number_or_string first:

module Number_or_string : sig 
    type t 
    type case = 
        | Number of float 
        | String of string
    val number : float -> t 
    val string : string -> t 
    val classify : t -> case             
end = struct 
    type t = 
        | Any : 'a -> t 
    [@@unboxed]     
    type case = 
        | Number of float 
        | String of string
    let number (v : float) = Any v 
    let string (v : string) = Any v     
    let classify (Any v : t) : case = 
        if Js.typeof v = "number" then Number (Obj.magic v  : float)
        else String (Obj.magic v : string)
end

module Number_or_string: {
  type t;
  type case =
    | Number(float)
    | String(string);
  let number: float => t;
  let string: string => t;
  let classify: t => case;
} = {
  [@unboxed]
  type t =
    | Any('a): t;
  type case =
    | Number(float)
    | String(string);
  let number = (v: float) => Any(v);
  let string = (v: string) => Any(v);
  let classify = (Any(v): t): case =>
    if (Js.typeof(v) == "number") {
      Number(Obj.magic(v): float);
    } else {
      String(Obj.magic(v): string);
    };
};
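
Seen from the JS side, the module above behaves roughly like the hand-written sketch below. This is not compiler output: the `{ kind, value }` objects are a hypothetical tag encoding chosen for readability, and only the identity behavior of the unboxed constructors follows from the text.

```js
// Because `Any` is unboxed, the constructors are the identity at runtime.
const number = (v) => v;
const string = (v) => v;

// `classify` turns the untagged value into a tagged one.
// The { kind, value } shape is an assumption made for this sketch.
function classify(v) {
  return typeof v === "number"
    ? { kind: "Number", value: v }
    : { kind: "String", value: v };
}

console.log(number(3) === 3);                // true: no wrapper at runtime
console.log(classify(number(3)).kind);       // Number
console.log(classify(string("three")).kind); // String
```

Since only `number` and `string` can build a `t`, the else branch of `classify` can safely assume a string; that assumption is exactly the user level knowledge the module boundary protects.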

Note that here we use Obj.magic to do an unsafe type cast which relies on Js.typeof. In practice, people may use instanceof; the following is an imaginary example:

module A_or_b : sig 
    type t 
    val a : a -> t 
    val b : b -> t 
    type case = 
        | A of a 
        | B of b 
    val classify : t -> case
end = struct
    type t = 
        | Any : 'a -> t
    [@@unboxed]   
    type case = 
        | A of a 
        | B of b 
    let a (v : a) = Any v 
    let b (v : b) = Any v
    let classify ( Any v : t)  = 
        if [%raw{|function (a) { return  a instanceof globalThis.A}|}] v then A (Obj.magic v : a)
        else B (Obj.magic v : b)
end

module A_or_b: {
  type t;
  let a: a => t;
  let b: b => t;
  type case =
    | A(a)
    | B(b);
  let classify: t => case;
} = {
  [@unboxed]
  type t =
    | Any('a): t;
  type case =
    | A(a)
    | B(b);
  let a = (v: a) => Any(v);
  let b = (v: b) => Any(v);
  let classify = (Any (v): t) =>
    if ([%raw {|function (a) { return  a instanceof globalThis.A}|}](v)) {
      A(Obj.magic(v): a);
    } else {
      B(Obj.magic(v): b);
    };
};

Here we suppose a is of JS class type A, and we use instanceof to test it. Note we use some unsafe code locally, but as long as such code is carefully reviewed, it has a safe boundary at the module level.

To conclude: thanks to unboxed attributes and the module language, we introduce a systematic way to convert values from union types (untagged union types) to algebraic data types (tagged union types). This sort of conversion relies on user level knowledge and has to be reviewed carefully. For some cases where classify is not needed, it can be done in a completely type safe way.

bs-platform release 7.1.0

February 4, 2020

bs-platform@7.1.0 is a major release. You can try it with npm i -g bs-platform! (If you have permission issues, try sudo npm i --unsafe-perm -g bs-platform)

It was called 7.0.2 but was bumped to 7.1.0 due to a soundness fix (a breaking change) as follows:

Previously, the empty array [||] was polymorphic. This happens to be sound in native, since an array there is not resizable, so users cannot do anything with an empty one. But in JS, we introduced a binding for push which can change the size of an array dynamically. In this case, an empty array cannot be polymorphic any more.

Removing push is possible, but it makes arrays in JS context less useful. To fix this issue while keeping push, we make [||] weakly typed so that its type inference is deferred until the first time it is used. If it is never used across the module, it has to be annotated with a concrete type; otherwise, the type checker will complain.

Several highlighted features are listed as follows:

Raw JavaScript Parsing/Checking

BuckleScript allows users to embed raw JavaScript code as an escape hatch; it used to treat such pieces of code as a black box.

In this release we vendor a JavaScript parser (thanks to flowtype) for syntax checking and simple semantics analysis over raw. This is on-going work, but it is already useful now.

First, we now report syntax errors properly for raw.

Second, for simple semantics analysis, we can tell whether the code inside raw is a function or not, and what the arity of a raw function is:

let f = [%raw "function(x){return x}"]

let f = [%raw "function(x){return x}"];

Now we know f is a function declaration with no side effect; it can be removed by the dead code analyzer if not used. We also know its arity so that when it's called we know whether it's fully applied or not.

Because this sort of information can be derived from raw directly, the special raw form we introduced as follows is no longer needed:

let f = fun%raw x -> {|x|}

let f = [%raw x => {|x|}];

To reduce the interop API surface, this form is now discouraged.

We're also exploring using such knowledge on JS literals and regexes checking.

Unboxed Types

One major feature introduced in this release is unboxed types which is blogged here.

Uniform Warning System

Previously, warnings were reported in two ways:

The OCaml compiler style: -w +10
Ad-hoc warnings introduced by flags -bs-warn-unimplemented-external

In this release, we integrate the two, so that BuckleScript warnings are handled in the same way as OCaml's own warnings. For example, the warning attribute below can now also turn off BuckleScript warnings.

[@warning "-101"]; // file-level config

Based on this effort, we have changed all BuckleScript warnings into OCaml style warnings to reduce user-level complexity.

The newly introduced warnings are listed via bsc -warn-help:

101 BuckleScript warning: Unused bs attributes
102 BuckleScript warning: polymorphic comparison introduced (maybe unsafe)
103 BuckleScript warning: about fragile FFI definitions
104 BuckleScript warning: bs.deriving warning with customized message
105 BuckleScript warning: the external name is inferred from val name is unsafe from refactoring when changing value name
106 BuckleScript warning: Unimplemented primitive used:
107 BuckleScript warning: Integer literal exceeds the range of representable integers of type int
108 BuckleScript warning: Uninterpreted delimiters (for unicode)

We also recommend users to turn on warnerror and only disable warnings for some specific files.

We've also upgraded the Reason parser refmt to 3.6.0.

A full list of changes is available here: https://github.com/BuckleScript/bucklescript/blob/master/Changes.md#702

What's new in release 7 (cont)

December 27, 2019

[EDIT pre Dec. 27th: yes, we know the dateline is wrong :-) the actual publish date of this post is November 28th, but we're not changing the dateline because that would break the published URL of the post.]

The second dev release 7.0.0-dev.2 is released for testing!

As we mentioned in the previous post, we compile records into JS objects in this release. This makes the generated code more idiomatic. However, it is not enough to write idiomatic bindings that manipulate arbitrary JS objects, since the keys of JS objects can be arbitrary strings that are not expressible in ReasonML syntax. We therefore now support user-level customization, which makes idiomatic bindings really easy.

type entry = {
  [@bs.as "EXACT_MAPPING_TO_JS_LABEL"]
  x: int,
  [@bs.as "EXACT_2"]
  y: int,
  z: obj,
}
and obj = {
  [@bs.as "hello"]
  hi: int,
};

let f4 = ({x, y, z: {hi}}) => (x + y + hi) * 2;

type entry  = {
  x : int  ; [@bs.as "EXACT_MAPPING_TO_JS_LABEL"]
  y : int ; [@bs.as "EXACT_2"]
  z : obj
} 
and obj = {
  hi : int ; [@bs.as "hello"]  
}    

let f4  { x; y; z = {hi }} = 
  (x + y + hi) * 2

function f4(param) {
  return (((param.EXACT_MAPPING_TO_JS_LABEL + param.EXACT_2 | 0) + param.z.hello | 0) << 1);
}

As you can see, you can manipulate JS objects using Reason pattern match syntax, and the generated code is highly efficient. More importantly, bindings to JS will be significantly simplified.
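
To make the mapping concrete, the generated function can be called with a plain JS object whose keys are the bs.as names. The `entry` object below is made up for this example; the function body is copied from the generated code above.

```js
function f4(param) {
  return (((param.EXACT_MAPPING_TO_JS_LABEL + param.EXACT_2 | 0) + param.z.hello | 0) << 1);
}

// A plain JS object using the customized key names (x, y and hi on the Reason side).
const entry = { EXACT_MAPPING_TO_JS_LABEL: 1, EXACT_2: 2, z: { hello: 3 } };

console.log(f4(entry)); // 12, i.e. (1 + 2 + 3) * 2
```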

Happy Hacking.

BuckleScript holiday release!

December 20, 2019

bs-platform@7.0.2-dev.1 is released for testing!

Try it via

npm i -g bs-platform@7.0.2-dev.1

This release contains several bug fixes for refmt (updated from 3.5.1 to 3.5.4). We also spent quite some time improving compiler performance. For example, we optimized our specialized hash-based data structures, so we expect roughly 5% better build times. We would like to collect more benchmark data, so we welcome any feedback / benchmarks from our community!

A highlight of this release is Generalized Unboxed Support (the so-called [@unboxed] annotation). Here's a short definition from the official OCaml Manual:

unboxed can be used on a type definition if the type is a single-field record or a concrete type with a single constructor that has a single argument. It tells the compiler to optimize the representation of the type by removing the block that represents the record or the constructor (i.e. a value of this type is physically equal to its argument). In the case of GADTs, an additional restriction applies: the argument must not be an existential variable, represented by an existential type variable, or an abstract type constructor applied to an existential type variable.

Note: The aforementioned restriction on GADTs only applies to OCaml's native compiler, not to BuckleScript's JavaScript compilation. So we get the full benefit of the feature, with fewer confusing error messages!

The exciting thing about this feature is that we now have more ways of expressing our programs with our usual type-safe records and variants without sacrificing runtime performance ("zero cost interop").

The best way to understand this feature is by looking at the following examples:

Unboxed variants:

[@unboxed]
type t = A(int);
let x = A(3);

will translate to the following JS:

var x = 3;

As you can see, we are "unboxing" the int value from the internal variant representation, so the variant becomes completely invisible at runtime. Great for e.g. mapping to stringly typed JavaScript enums!
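To illustrate why a wrapper that disappears at runtime is still useful, here is a minimal sketch in OCaml syntax. The names user_id, order_id, and show_user are hypothetical. The two wrappers are distinct types for the type checker, while each value is represented as a plain int.

```ocaml
(* Hypothetical wrapper types: distinct for the type checker,
   plain ints in the runtime representation. *)
type user_id = User_id of int [@@unboxed]
type order_id = Order_id of int [@@unboxed]

let show_user (User_id n) = "user#" ^ string_of_int n

let u = User_id 7
let label = show_user u

(* show_user (Order_id 7) would be rejected by the type checker. *)
```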

Unboxed records (1 field only):

[@unboxed]
type t2 = {f: string};
let x = {f: "foo"};

will translate to the following JS:

var x = "foo";

The same principle applies as with variants. Now a lot of people will probably ask: "Why would I ever want a 1 field record?". There are multiple reasons; one of them would be a ref type, which is just syntactic sugar for a { contents: 'a } record.

Another use case is expressing higher-rank polymorphism without cost:

[@unboxed]
type r = {f: 'a. 'a => 'a};
let map_pair = (r, (p1, p2)) => (r.f(p1), r.f(p2));

Note: 'a. 'a => 'a describes a polymorphic function interface, where f can be called with many different types (e.g. r.f(1) and r.f("hi")). The compiler will not lock 'a to the first type it sees (e.g. int) at the first call site. The parameter 'a is therefore polymorphic!

By unboxing such records with a single polymorphic function field, we will get rid of the value restriction in our existing encoding of uncurried functions. This will be a major feature!
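For readers who prefer OCaml syntax, the same record can be sketched as follows, together with a call that uses the field at two different types. The value identity and the sample pair are assumptions added for illustration.

```ocaml
(* A record whose single field is a polymorphic function. *)
type r = { f : 'a. 'a -> 'a } [@@unboxed]

(* r.f is applied at two different types within one call. *)
let map_pair r (p1, p2) = (r.f p1, r.f p2)

(* Illustrative value: the identity function satisfies 'a. 'a -> 'a. *)
let identity = { f = (fun x -> x) }

let (n, s) = map_pair identity (1, "hi")
```

Without the explicit 'a. quantifier on the field, a plain function argument would be fixed to a single type inside map_pair, so the pair (1, "hi") would not type check.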

Unboxed GADTs:

Since GADTs are lesser known in Reason syntax, we also added an OCaml snippet to give a better idea of how the example data structure is defined.

[@unboxed]
type t = 
  | Any ('a) : t; 

let array = [|Any(3), Any("a")|];

(* OCaml *)
type t = 
  | Any : 'a -> t
[@@unboxed]

let array = [|Any 3; Any "a"|]

The examples above will translate to the following JS:

var array = [ 3, "a"];

As you can already tell, this feature will give us much better possibilities for interop with polymorphic array representations in JavaScript (without losing any type safety!).

As a more concrete use-case, this will give users the possibility to define types such as int_or_string.

Note: Even though this GADT t has a constructor named Any, it is not the same as any in TypeScript. An Any value is constrained to a certain contract ('a -> t), and the array [|Any(3), Any("a")|] is inferred as an array(t). When users want to use Any values, they need to unpack them, process the value inside, and repack them again. Pretty neat, right?

Conclusion

This release will introduce the [@unboxed] annotation to give us better ways to do zero cost interop with variants, records, higher-rank polymorphic functions, and GADTs. Under-the-hood improvements will give us better performance as well!

We are really excited about these changes, and we hope you are too. Please check out our newest bs-platform@7.0.2-dev.1 release and let us know if you find any issues!

A detailed list of changes is available here: https://github.com/BuckleScript/bucklescript/blob/master/Changes.md#702

Happy hacking!

Appendix

A sophisticated explanation of why unboxed lifts some of OCaml's type system limitations

Structural types (objects, classes, polymorphic variants, functions, etc.) in OCaml are regular types, and OCaml always expands them when dealing with such types. This imposes some limitations on structural types; for example, non-regular definitions are not allowed. Non-structural types (variants, records) do not have such limitations, so with unboxed we can use a non-structural type as an indirection without changing the runtime representation.
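A minimal sketch of this idea, under the assumption that the type checker rejects the commented-out definition as non-regular. The names nest, leaf, and value_of are hypothetical. Wrapping the object type in a single-constructor variant makes the non-regular recursion acceptable, and [@@unboxed] removes the extra block that the constructor would otherwise introduce.

```ocaml
(* Expected to be rejected: an object type is structural, so this
   recursive definition would have to be regular.

   type 'a nest = < value : 'a; next : ('a * 'a) nest option >
*)

(* Accepted: the variant acts as a nominal indirection, and [@@unboxed]
   means it adds no block to the runtime representation. *)
type 'a nest =
  | Nest of < value : 'a; next : ('a * 'a) nest option >
[@@unboxed]

(* Illustrative helpers: build a single node and read its value back. *)
let leaf v = Nest (object method value = v method next = None end)

let value_of (Nest o) = o#value

let one = value_of (leaf 1)
```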

What's new in release 7

November 18, 2019

The new major version of BuckleScript is coming: 7.0.0-dev.1 is released for testing!

We will keep maintaining 5.* and 6.* (for OCaml 4.02 and 4.06) for a while, but from this release on we are moving forward and focusing on release 7.* (for OCaml 4.06).

This is a major release that comes with lots of nice features listed here.

We talk about some highlights here.

refmt upgraded to the latest version, which comes with better error messages
OCaml records compiled into JS objects

This is one of the most desired features, and it has finally landed.

See the generated code below for excitement!

type t = {
  x: int,
  y: int,
  z: int,
};

let obj = {x: 3, y: 2, z: 2};

let obj2 = {...obj, y: 4};

type t = {
  x : int;
  y : int;
  z : int 
}

let obj = { x = 3 ; y = 2; z = 2}

let obj2 = { obj with y = 4}

var obj2 = {
  x: 3,
  y: 4,
  z: 2
};

var obj = {
  x: 3,
  y: 2,
  z: 2
};

This change makes records much more useful, and their interaction with private types and the unboxed option type will make interop with JS much nicer!

As always, we continue improving our optimizer in various commits. We believe that not only a better language but also a high-quality implementation is key to pushing typed functional programming into industry.

Happy hacking!
