The Scala compiler has to do a lot of work to turn your code into JVM bytecode that can be executed. For several reasons, this work is broken into steps that are executed sequentially: the so-called phases of the compiler.
The Phases
If you want to see the phases listed, simply run in your terminal:
$ scalac -Xshow-phases
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
    phase name  id  description
    ----------  --  -----------
        parser   1  parse source into ASTs, perform simple desugaring
         namer   2  resolve names, attach symbols to named trees
packageobjects   3  load package objects
         typer   4  the meat and potatoes: type the trees
        patmat   5  translate match expressions
superaccessors   6  add super accessors in traits and nested classes
    extmethods   7  add extension methods for inline classes
       pickler   8  serialize symbol tables
     refchecks   9  reference/override checking, translate nested objects
       uncurry  10  uncurry, translate function values to anonymous classes
     tailcalls  11  replace tail calls by jumps
    specialize  12  specialized-driven class and method specialization
 explicitouter  13  this refs to outer pointers
       erasure  14  erase types, add interfaces for traits
   posterasure  15  clean up erased inline classes
      lazyvals  16  allocate bitmaps, translate lazy vals into lazified defs
    lambdalift  17  move nested functions to top level
  constructors  18  move field definitions into constructors
       flatten  19  eliminate inner classes
         mixin  20  mixin composition
       cleanup  21  platform-specific cleanups, generate reflective calls
    delambdafy  22  remove lambdas
         icode  23  generate portable intermediate code
           jvm  24  generate JVM bytecode
      terminal  25  the last phase during a compilation run
As you can see above, at the time of this writing the compiler uses 25 phases, starting with the parser and ending with the terminal phase, which ends the compilation run.
Please note that the reality of the compiler is not as clean as this documentation would have you believe. For efficiency and other reasons, some phases are intertwined and should be considered a single phase. In other cases, a phase does more than advertised.
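As a mental model only, the sequential structure can be sketched as a fold over a list of transformations, where each phase receives the output of the previous one. Everything in this sketch (Tree, Phase, compile, and the two toy phases) is a hypothetical name invented for illustration; the real compiler works on typed syntax trees, not strings.

```scala
// Toy model only: Tree, Phase, and the phases below are hypothetical names
// for illustration, not types from the real compiler.
object PhasePipelineSketch {
  final case class Tree(code: String, log: List[String] = Nil)

  final case class Phase(name: String, transform: String => String)

  // Runs the phases in order; each phase sees the output of the previous one
  // and its name is appended to the log.
  def compile(source: String, phases: List[Phase]): Tree =
    phases.foldLeft(Tree(source)) { (tree, phase) =>
      Tree(phase.transform(tree.code), tree.log :+ phase.name)
    }

  val phases: List[Phase] = List(
    Phase("desugar", _.replace("a + b", "a.$plus(b)")),
    Phase("addParens", _.replace("def f =", "def f() ="))
  )

  val result: Tree = compile("def f = a + b", phases)
}
```

Here result.log records the order in which the toy phases ran, much like the id column in the listing above.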
Phases in the Compiler
If you open your compiler codebase and go to class Global, line 405, you will see:
// phaseName = "parser"
lazy val syntaxAnalyzer = new {
  val global: Global.this.type = Global.this
} with SyntaxAnalyzer {
  val runsAfter = List[String]()
  val runsRightAfter = None
  override val initial = true
}
This corresponds to the first phase of the compiler, parser. All the phases are defined as lazy val, in order, after this entry.
If you want to learn more about a specific phase you can always go to the class implementing it. In the example above, that would be SyntaxAnalyzer. Most of the phases have documentation that explains the function of the phase.
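The runsAfter field in the snippet above names the phases that must already have run before this one. A minimal sketch of how such constraints can be turned into an execution order follows. Component and order are hypothetical names, the constraint values are illustrative, and this simple loop is an assumption made for the example, not the algorithm the compiler itself uses to assemble its phases.

```scala
// Hypothetical sketch: Component and order are illustrative names, and this
// loop is not the algorithm the real compiler uses to assemble phases.
object PhaseOrderSketch {
  final case class Component(phaseName: String, runsAfter: List[String])

  // Repeatedly schedules every component whose prerequisites are already placed.
  def order(components: List[Component]): List[String] = {
    @annotation.tailrec
    def loop(pending: List[Component], placed: List[String]): List[String] = {
      val (ready, blocked) =
        pending.partition(_.runsAfter.forall(name => placed.contains(name)))
      if (ready.isEmpty) {
        // Nothing can be scheduled: either we are done or the constraints are cyclic.
        require(blocked.isEmpty, s"unsatisfiable constraints: ${blocked.map(_.phaseName)}")
        placed
      } else loop(blocked, placed ++ ready.map(_.phaseName))
    }
    loop(components, Nil)
  }

  // Declared out of order on purpose; the constraints alone fix the result.
  val ordered: List[String] = order(List(
    Component("typer", List("packageobjects")),
    Component("namer", List("parser")),
    Component("parser", Nil),
    Component("packageobjects", List("namer"))
  ))
}
```

The component with an empty runsAfter list is scheduled first, which mirrors the role of the parser entry shown above.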
Some things that may not be obvious from reading the documentation of the phases, but are relevant:
- parser, the very first phase, tackles a lot of syntactic sugar. For example, by the end of this phase for-comprehensions have been transformed into a series of flatMap, map, and withFilter calls.
- namer, packageobjects, and typer (phases 2, 3, and 4) are effectively a single phase. Although they are split into three for implementation reasons, the dependencies between them mean you can consider them one.
- After typer (phase 4) completes, all the type checking has been done. For example, any error caused by a type class missing an implicit instance is reported at this stage. The remaining 20 phases work with code that type-checks. Guaranteed ;)
- pickler (phase 8) generates attributes for class files which are later used when compiling against binaries. This is what allows you to use a jar file as a library without having to bring the source code of the library along.
- uncurry (phase 10) turns function values (like val f: Int => Int) into anonymous classes. On Java 8 this benefits from the new structures introduced to work with lambdas.
- lambdalift (phase 17) lifts nested functions out of methods, a different task from that of uncurry.
- constructors (phase 18) generates the constructors for the classes. Keep in mind that Scala constructors are very different from the ones expected by the JVM (for instance, any expression in the body of a class is executed during construction), so there’s quite a bit going on in this phase.
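Three of the points above can be observed from ordinary Scala code. The sketch below shows source-level equivalents, not the trees the compiler produces: a for-comprehension next to its hand-written desugaring (guards become withFilter in current Scala versions), a function value next to an explicit Function1 instance, and a class body whose statements run during construction. The names PhaseEffectsSketch and Greeter are made up for this example.

```scala
import scala.collection.mutable

object PhaseEffectsSketch {
  // parser: a for-comprehension and a hand-written equivalent.
  val xs = List(1, 2, 3)
  val ys = List(10, 20)
  val sugared: List[Int] = for { x <- xs; if x % 2 == 1; y <- ys } yield x * y
  val desugared: List[Int] =
    xs.withFilter(x => x % 2 == 1).flatMap(x => ys.map(y => x * y))

  // uncurry: a function value behaves like an instance of a class with apply.
  val f: Int => Int = x => x + 1
  val fAsClass: Int => Int = new Function1[Int, Int] {
    def apply(x: Int): Int = x + 1
  }

  // constructors: every statement in a class body runs during construction,
  // in source order, interleaved with the field initializers.
  final class Greeter(log: mutable.Buffer[String]) {
    log += "body start"
    val greeting: String = "hello"
    log += s"greeting = $greeting"
  }

  val log: mutable.ListBuffer[String] = mutable.ListBuffer.empty[String]
  val greeter = new Greeter(log)
}
```

After new Greeter(log) returns, the buffer holds both messages in order, which is the behaviour the constructors phase has to reproduce inside a JVM constructor.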
Example
If we talk about phases we want to mention the -Xprint flag in scalac. This flag allows you to see the differences between phases.
To test it, create a file with the following code:
class Foo {
  val i = 23
  val j = "blah"
  val k = i + j
  def wibble = {
    for (c <- k) yield c * 2
  }
}
and compile it with:
$ scalac -Xprint:all <yourfile>.scala
The compiler prints the code as it looks after each phase that modified something. For example, after the parser phase the output looks like:
[[syntax trees at end of parser]] // sample.scala
package <empty> {
class Foo extends scala.AnyRef {
def <init>() = {
super.<init>();
()
};
val i = 23;
val j = "blah";
val k = i.$plus(j);
def wibble = k.map(((c) => c.$times(2)))
}
}
Notice that the class is now inside a package <empty>, as we didn’t declare any package. A method <init> that acts like a pseudo-constructor has been added, the + operation assigned to k has been expanded to a call to the method $plus, and the for comprehension in wibble has been expanded to a map call.
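The same rewrites can be reproduced by hand in source code. This is a small sketch, with ParserSugarSketch as a made-up name: it uses integer addition rather than the Int plus String of the sample file, and it relies on scala.reflect.NameTransformer from the standard library to show where names such as $plus come from.

```scala
import scala.reflect.NameTransformer

object ParserSugarSketch {
  val i = 23

  // An infix operator is sugar for an ordinary method call.
  val sum: Int = i + 2
  val sumExplicit: Int = i.+(2)

  // Symbolic method names are encoded into identifiers the JVM accepts.
  val plusEncoded: String = NameTransformer.encode("+")
  val timesEncoded: String = NameTransformer.encode("*")

  // A single-generator for/yield is sugar for a map call.
  val k = "ab"
  val sugared = for (c <- k) yield c * 2
  val desugared = k.map(c => c * 2)
}
```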
Quite a lot has changed in just the first phase :) After typing finishes we get the following output:
[[syntax trees at end of namer]] // sample.scala: tree is unchanged since parser
[[syntax trees at end of packageobjects]] // sample.scala: tree is unchanged since parser
[[syntax trees at end of typer]] // sample.scala
package <empty> {
class Foo extends scala.AnyRef {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i: Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j: String = Foo.this.j;
private[this] val k: String = Foo.this.i.+(Foo.this.j);
<stable> <accessor> def k: String = Foo.this.k;
def wibble: scala.collection.immutable.IndexedSeq[Int] = scala.this.Predef.augmentString(Foo.this.k).map[Int, scala.collection.immutable.IndexedSeq[Int]](((c: Char) => c.*(2)))(scala.this.Predef.fallbackStringCanBuildFrom[Int])
}
}
The main difference is that types have been assigned. For example, our val i has type Int, and every other val and def in the class has been given a type as well. We can also see the synthetic methods added to access the values of i, j, and k.
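To get a feel for what typer fills in, you can write the inferred pieces out by hand. The sketch below is an approximation under stated assumptions: TyperSketch is a made-up name, k is given directly as a String, and on newer Scala versions the implicit builder argument shown in the printed tree no longer exists, so only the result type, the parameter type, and the Predef.augmentString conversion are spelled out.

```scala
object TyperSketch {
  val k: String = "23blah"

  // As written by the programmer: no types, no conversions.
  def wibble = for (c <- k) yield c * 2

  // Roughly what typer works out: an explicit result type, an explicit
  // parameter type, and the conversion from String inserted by hand.
  def wibbleTyped: scala.collection.immutable.IndexedSeq[Int] =
    Predef.augmentString(k).map((c: Char) => c * 2)
}
```

Both methods produce the same sequence; the second one simply leaves less for the compiler to infer.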
You can experiment with your own files to see changes in the output of scalac. A warning though: avoid using extends App constructs, as the compiler treats them in a specific way that may clutter your output.
The full output for the example above follows:
$ scalac -Xprint:all sample.scala
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
[[syntax trees at end of parser]] // sample.scala
package <empty> {
class Foo extends scala.AnyRef {
def <init>() = {
super.<init>();
()
};
val i = 23;
val j = "blah";
val k = i.$plus(j);
def wibble = k.map(((c) => c.$times(2)))
}
}
[[syntax trees at end of namer]] // sample.scala: tree is unchanged since parser
[[syntax trees at end of packageobjects]] // sample.scala: tree is unchanged since parser
[[syntax trees at end of typer]] // sample.scala
package <empty> {
class Foo extends scala.AnyRef {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i: Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j: String = Foo.this.j;
private[this] val k: String = Foo.this.i.+(Foo.this.j);
<stable> <accessor> def k: String = Foo.this.k;
def wibble: scala.collection.immutable.IndexedSeq[Int] = scala.this.Predef.augmentString(Foo.this.k).map[Int, scala.collection.immutable.IndexedSeq[Int]](((c: Char) => c.*(2)))(scala.this.Predef.fallbackStringCanBuildFrom[Int])
}
}
[[syntax trees at end of patmat]] // sample.scala: tree is unchanged since typer
[[syntax trees at end of superaccessors]] // sample.scala: tree is unchanged since typer
[[syntax trees at end of extmethods]] // sample.scala: tree is unchanged since typer
[[syntax trees at end of pickler]] // sample.scala: tree is unchanged since typer
[[syntax trees at end of refchecks]] // sample.scala: tree is unchanged since typer
[[syntax trees at end of uncurry]] // sample.scala
package <empty> {
class Foo extends Object {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = Foo.this.i().+(Foo.this.j());
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq[Int] = scala.this.Predef.augmentString(Foo.this.k()).map[Int, scala.collection.immutable.IndexedSeq[Int]]({
@SerialVersionUID(value = 0) final <synthetic> class $anonfun extends scala.runtime.AbstractFunction1[Char,Int] with Serializable {
def <init>(): <$anon: Char => Int> = {
$anonfun.super.<init>();
()
};
final def apply(c: Char): Int = c.*(2)
};
(new <$anon: Char => Int>(): Char => Int)
}, scala.this.Predef.fallbackStringCanBuildFrom[Int]())
}
}
[[syntax trees at end of tailcalls]] // sample.scala: tree is unchanged since uncurry
[[syntax trees at end of specialize]] // sample.scala: tree is unchanged since uncurry
[[syntax trees at end of explicitouter]] // sample.scala
package <empty> {
class Foo extends Object {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = Foo.this.i().+(Foo.this.j());
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq[Int] = scala.this.Predef.augmentString(Foo.this.k()).map[Int, scala.collection.immutable.IndexedSeq[Int]]({
@SerialVersionUID(value = 0) final <synthetic> class $anonfun extends scala.runtime.AbstractFunction1[Char,Int] with Serializable {
def <init>($outer: Foo.this.type): <$anon: Char => Int> = {
$anonfun.super.<init>();
()
};
final def apply(c: Char): Int = c.*(2);
<synthetic> <paramaccessor> <artifact> private[this] val $outer: Foo.this.type = _;
<synthetic> <stable> <artifact> def $outer(): Foo.this.type = $anonfun.this.$outer
};
(new <$anon: Char => Int>(Foo.this): Char => Int)
}, scala.this.Predef.fallbackStringCanBuildFrom[Int]())
}
}
[[syntax trees at end of erasure]] // sample.scala
package <empty> {
class Foo extends Object {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = Foo.this.i().+(Foo.this.j());
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq = new collection.immutable.StringOps(scala.this.Predef.augmentString(Foo.this.k()).$asInstanceOf[String]()).map({
@SerialVersionUID(value = 0) final <synthetic> class $anonfun extends scala.runtime.AbstractFunction1 with Serializable {
def <init>($outer: Foo): <$anon: Function1> = {
$anonfun.super.<init>();
()
};
final def apply(c: Char): Int = c.*(2);
<synthetic> <paramaccessor> <artifact> private[this] val $outer: Foo = _;
<synthetic> <stable> <artifact> def $outer(): Foo = $anonfun.this.$outer;
final <bridge> <artifact> def apply(v1: Object): Object = scala.Int.box($anonfun.this.apply(unbox(v1)))
};
(new <$anon: Function1>(Foo.this): Function1)
}, scala.this.Predef.fallbackStringCanBuildFrom()).$asInstanceOf[scala.collection.immutable.IndexedSeq]()
}
}
[[syntax trees at end of posterasure]] // sample.scala
package <empty> {
class Foo extends Object {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = Foo.this.i().+(Foo.this.j());
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq = new collection.immutable.StringOps(scala.this.Predef.augmentString(Foo.this.k())).map({
@SerialVersionUID(value = 0) final <synthetic> class $anonfun extends scala.runtime.AbstractFunction1 with Serializable {
def <init>($outer: Foo): <$anon: Function1> = {
$anonfun.super.<init>();
()
};
final def apply(c: Char): Int = c.*(2);
<synthetic> <paramaccessor> <artifact> private[this] val $outer: Foo = _;
<synthetic> <stable> <artifact> def $outer(): Foo = $anonfun.this.$outer;
final <bridge> <artifact> def apply(v1: Object): Object = scala.Int.box($anonfun.this.apply(unbox(v1)))
};
(new <$anon: Function1>(Foo.this): Function1)
}, scala.this.Predef.fallbackStringCanBuildFrom()).$asInstanceOf[scala.collection.immutable.IndexedSeq]()
}
}
[[syntax trees at end of lazyvals]] // sample.scala: tree is unchanged since posterasure
[[syntax trees at end of lambdalift]] // sample.scala
package <empty> {
class Foo extends Object {
def <init>(): Foo = {
Foo.super.<init>();
()
};
private[this] val i: Int = 23;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = "blah";
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = Foo.this.i().+(Foo.this.j());
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq = new collection.immutable.StringOps(scala.this.Predef.augmentString(Foo.this.k())).map({
(new <$anon: Function1>(Foo.this): Function1)
}, scala.this.Predef.fallbackStringCanBuildFrom()).$asInstanceOf[scala.collection.immutable.IndexedSeq]();
@SerialVersionUID(value = 0) final <synthetic> class $anonfun$wibble$1 extends scala.runtime.AbstractFunction1 with Serializable {
def <init>($outer: Foo): <$anon: Function1> = {
$anonfun$wibble$1.super.<init>();
()
};
final def apply(c: Char): Int = c.*(2);
<synthetic> <paramaccessor> <artifact> private[this] val $outer: Foo = _;
<synthetic> <stable> <artifact> def $outer(): Foo = $anonfun$wibble$1.this.$outer;
final <bridge> <artifact> def apply(v1: Object): Object = scala.Int.box($anonfun$wibble$1.this.apply(scala.Char.unbox(v1)))
}
}
}
[[syntax trees at end of constructors]] // sample.scala
package <empty> {
class Foo extends Object {
private[this] val i: Int = _;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = _;
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = _;
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq = new collection.immutable.StringOps(scala.this.Predef.augmentString(Foo.this.k())).map({
(new <$anon: Function1>(Foo.this): Function1)
}, scala.this.Predef.fallbackStringCanBuildFrom()).$asInstanceOf[scala.collection.immutable.IndexedSeq]();
@SerialVersionUID(value = 0) final <synthetic> class $anonfun$wibble$1 extends scala.runtime.AbstractFunction1 with Serializable {
final def apply(c: Char): Int = c.*(2);
final <bridge> <artifact> def apply(v1: Object): Object = scala.Int.box($anonfun$wibble$1.this.apply(scala.Char.unbox(v1)));
def <init>($outer: Foo): <$anon: Function1> = {
$anonfun$wibble$1.super.<init>();
()
}
};
def <init>(): Foo = {
Foo.super.<init>();
Foo.this.i = 23;
Foo.this.j = "blah";
Foo.this.k = Foo.this.i().+(Foo.this.j());
()
}
}
}
[[syntax trees at end of flatten]] // sample.scala
package <empty> {
class Foo extends Object {
private[this] val i: Int = _;
<stable> <accessor> def i(): Int = Foo.this.i;
private[this] val j: String = _;
<stable> <accessor> def j(): String = Foo.this.j;
private[this] val k: String = _;
<stable> <accessor> def k(): String = Foo.this.k;
def wibble(): scala.collection.immutable.IndexedSeq = new collection.immutable.StringOps(scala.this.Predef.augmentString(Foo.this.k())).map({
(new <$anon: Function1>(Foo.this): Function1)
}, scala.this.Predef.fallbackStringCanBuildFrom()).$asInstanceOf[scala.collection.immutable.IndexedSeq]();
def <init>(): Foo = {
Foo.super.<init>();
Foo.this.i = 23;
Foo.this.j = "blah";
Foo.this.k = Foo.this.i().+(Foo.this.j());
()
}
};
@SerialVersionUID(value = 0) final <synthetic> class anonfun$wibble$1 extends scala.runtime.AbstractFunction1 with Serializable {
final def apply(c: Char): Int = c.*(2);
final <bridge> <artifact> def apply(v1: Object): Object = scala.Int.box(anonfun$wibble$1.this.apply(scala.Char.unbox(v1)));
def <init>($outer: Foo): <$anon: Function1> = {
anonfun$wibble$1.super.<init>();
()
}
}
}
[[syntax trees at end of mixin]] // sample.scala: tree is unchanged since flatten
[[syntax trees at end of cleanup]] // sample.scala: tree is unchanged since flatten
[[syntax trees at end of delambdafy]] // sample.scala: tree is unchanged since flatten
[[syntax trees at end of icode]] // sample.scala: tree is unchanged since flatten
[[syntax trees at end of jvm]] // sample.scala: tree is unchanged since flatten