diff options
Diffstat (limited to 'src/boot')
| -rw-r--r-- | src/boot/README | 405 | ||||
| -rw-r--r-- | src/boot/fe/ast.ml | 27 | ||||
| -rw-r--r-- | src/boot/me/trans.ml | 33 | ||||
| -rw-r--r-- | src/boot/me/typestate.ml | 50 | ||||
| -rw-r--r-- | src/boot/util/common.ml | 1 |
5 files changed, 497 insertions, 19 deletions
diff --git a/src/boot/README b/src/boot/README new file mode 100644 index 00000000..30c1f0e9 --- /dev/null +++ b/src/boot/README @@ -0,0 +1,405 @@ +An informal guide to reading and working on the rustboot compiler. +================================================================== + +First off, my sincerest apologies for the lightly-commented nature of the +compiler, as well as the general immaturity of the codebase; rustboot is +intended to be discarded in the near future as we transition off it, to a +rust-based, LLVM-backed compiler. It has taken longer than expected for "the +near future" to arrive, and here we are published and attracting contributors +without a good place for them to start. It will be a priority for the next +little while to make new contributors feel welcome and oriented within the +project; best I can do at this point. We were in a tremendous rush even to get +everything organized to this minimal point. + +If you wish to expand on this document, or have one of the +slightly-more-familiar authors add anything else to it, please get in touch or +file a bug. Your concerns are probably the same as someone else's. + + + +High-level concepts, invariants, 30,000-ft view +=============================================== + +Rustboot has 3 main subdirectories: fe, me, and be (front, mid, back +end). Helper modules and ubiquitous types are found in util/. + +The entry-point for the compiler is driver/main.ml, and this file sequences +the various parts together. + + +The 4 central data structures: +------------------------------ + +#1: fe/ast.ml defines the AST. The AST is treated as immutable after parsing + despite containing some mutable types (hashtbl and such). Many -- though + not all -- nodes within this data structure are wrapped in the type 'a + identified. This is important. An "identified" AST node is one that the + parser has marked with a unique node_id value. This node_id is used both + to denote a source location and, more importantly, to key into a large + number of tables later in the compiler. Most additional calculated + properties of a program that the compiler derives are keyed to the node_id + of an identified node. + + The types 'a identified, node_id and such are in util/common.ml + + +#2: me/semant.ml defines the Semant.ctxt structure. This is a record of + tables, almost all of which are keyed by node_id. See previous comment + regrding node_id. The Semant module is open in most of the modules within + the me/ directory, and they all refer liberally to the ctxt tables, either + directly or via helper functions in semant. Semant also defines the + mid-end pass-management logic, lookup routines, type folds, and a variety + of other miscallaneous semantic-analysis helpers. + + +#3: be/il.ml defines the IL. This is a small, typed IL based on a type system + that is relatively LLVM-ish, and a control-flow system that is *not* + expression/SSA based like LLVM. It's much dumber than that. The root of + the interesting types in this file is the type 'emitter', which is a + growable buffer along with a few counters. An emitter is essentially a + buffer of quads. A quad, in turn, is a primitive virtual instruction + ('quad' because it is in its limit a 3-address machine, plus opcode) which + we then ... tend to turn directly into x86 anyways. Sorry; it wasn't clear + during initial construction that we'd wind up stopping at x86, so the IL + is probably superfluous, but there it is. + + The IL types are operand = cell | immediate, and cell = reg | mem. Plus a + certain quantity of special-casing and noise for constant-pointer + propagation and addressing modes and whatnot. + + +#4: be/asm.ml defines the Asm.frag type, which is a "chunk of binary-ish + stuff" to put in an output file. Words, bytes, lazily-resolved fixups, + constant expressions, 0-terminated strings, alignment boundaries, etc. You + will hopefully not need to produce a lot of this yourself; most of this is + already being emitted. + + An important type that gets resolved here is fixup, from util/common.ml. + Fixups are things you can wrap around a frag using an Asm.DEF frag, which + get their size and position (both in-file and in-memory) calculated at + asm-time; but you can refer to them before they're resolved. So any time + the compiler needs to refer to "the place / size this thingy will be, when + it finally gets boiled down to frags and emitted" we generate a fixup and + use that. Functions and static data structures, for example, tend to get + fixups assigned to them early on in the middle-end of the compiler. + + +Control and information flow within the compiler: +------------------------------------------------- + +- driver/main.ml assumes control on startup. Options are parsed, platform is + detected, etc. + + +- fe/lexer.ml does lexing in any case; fe/parser.ml holds the fundamental + parser-state and parser-combinator functions. Parsing rules are split + between 3 files: fe/cexp.ml, fe/pexp.ml, and fe/item.ml. This split + represents the general structure of the grammar(s): + + - The outermost grammar is called "cexp" (crate expression), and is an + expression language that describes the crate directives found in crate + files. It's evaluated inside the compiler. + + - The next grammar is "item", which is a statement language that describes + the directives, declarations and statements found in source files. If + you compile a naked source file, you jump straight to item and then + synthesize a simple crate structure around the result. + + - The innermost grammar is "pexp" (parsed expression), and is an + expression language used for the shared expression grammar within both + cexp and item. Pexps within cexp are evaluated in the compiler + (non-constant, complex cexps are errors) whereas pexps within items are + desugared to statements and primitive expressions. + + - The AST is the output from the item grammar. Pexp and cexp do not escape + the front-end. + + +- driver/main.ml then builds a Semant.ctxt and threads it through the various + middle-end passes. Each pass defines one or more visitors, which is an FRU + copy of the empty_visitor in me/walk.ml. Each visitor performs a particular + task, encapsulates some local state in local variables, and leaves its + results in a table. If the table it's calculating is pass-local, it will be + a local binding within the pass; if it's to be shared with later passes, it + will be a table in Semant.ctxt. Pass order is therefore somewhat important, + so I'll describe it here: + + - me/resolve.ml looks up names and connects them to definitions. This + includes expanding all types (as types can occur within names, as part + of a parametric name) and performing all import/export/visibility + judgments. After resolve, we should not be doing any further name-based + lookups (with one exception: typestate does some more name + lookup. Subtle reason, will return to it). + + Resolve populates several of the tables near the top of Semant.ctxt: + + ctxt_all_cast_types + ctxt_all_defns + ctxt_all_item_names + ctxt_all_item_types + ctxt_all_lvals + ctxt_all_stmts + ctxt_all_type_items + ctxt_block_items + ctxt_block_slots + ctxt_frame_args + ctxt_lval_to_referent + ctxt_node_referenced + ctxt_required_items + ctxt_slot_is_arg + ctxt_slot_keys + + The most obviously critical of these are lval_to_referent and all_defns, + which connect subsequent visitors from a reference node to its referent + node, and catalogue all the possible things a referent may be. + + Part of resolving that is perhaps not obvious is the task of resolving + and normalizing recursive types. This is what TY_iso is for. Recursive + types in rust have to pass through a tag type on their recursive edge; + TY_iso is an iso-recursive group of tags that refer only to one another; + within a TY_iso, the type term "TY_idx n" refers to "the nth member of + the current TY_iso". Resolve is responsible for finding such groups and + tying them into such closed-form knots. + + TY_name should be completely eliminated in any of the types exiting + resolve. + + + - me/type.ml is a unification-based typechecker and inference engine. This + is as textbook-y as we could make it. It rewrites "auto" slots in the + ctxt_all_defns table when it completes (these are the slots with None as + their Ast.slot_ty). + + This file is organized around tyspecs and tyvars. A tyspec is a + constraint on an unknown type that is implied by its use; tyspecs are + generated during the AST-walk, placed in ref cells (tyvars), and the + cells are and unified with one another. If two tyvars unify, then a new + constraint is created with the tighter of the two and the two previous + tyvars are updated to point to the unified spec. Ideally all constraints + eventually run into a source of a concrete type (or a type otherwise + uniquely-determined by its tyspecs). If not, the type is underdetermined + and we get a type error. Similarly if two tyvars that are supposed to + unify clash in some way (integer unify-with string, say) then there is + also a type error. + + + - me/typestate.ml is a dataflow-based typestate checker. It is responsible + for ensuring all preconditions are met, including init-before-use. It + also determines slot lifecycle boundaries, and populates the context + tables: + + ctxt_constr_ids + ctxt_constrs + ctxt_copy_stmt_is_init + ctxt_post_stmt_slot_drops + ctxt_postconditions + ctxt_poststates + ctxt_preconditions + ctxt_prestates + + It is organized around constr_keys, a bunch of bitsets, and a CFG. + + A constr_key is a normalized value representing a single constraint that + we wish to be able to refer to within a typestate. Every constr_key gets + a bit number assigned to it. A condition (and a typestate) is a + bit-vector, in which the set bits indicate the constr_keys (indexed by + associatd number) that hold in the condition/typestate. + + There are 4 such bitsets generated for each node in the CFG: + precondition/postcondition and prestate/poststate. The visitors here + figure out all the constr_keys we'll need, then assign all the pre/post + conditions, generate the CFG, calculate the typestates from the CFG, and + check that every typestate satisfies its precondition. + + (Due to the peculiarity that types are pure terms and are not 'a + identified in our AST, we have to do some name-lookup in here as well + when normalizing the const_keys). + + + - Effect is relatively simple: it calculates the effect of each type and + item, and checks that they either match their declarations or are + authorized to be lying. + + + - Loop is even simpler: it calculates loop-depth information for later use + generating foreach loops. It populates the context tables: + + ctxt_block_is_loop_body + ctxt_slot_loop_depths + ctxt_stmt_loop_depths + + + - Alias checks slot-aliasing to ensure none of the rules are broken about + simultaneous aliases and such. It also populates the table + ctxt_slot_is_aliased. + + + - Layout determines the layout of frames, arguments, objects, closures and + such. This includes deciding which slot should go in a vreg and + generating fixups for all frame-spill regions. It populates the context + tables: + + ctxt_block_is_loop_body + ctxt_call_sizes + ctxt_frame_blocks + ctxt_frame_sizes + ctxt_slot_is_obj_state + ctxt_slot_offsets + ctxt_slot_vregs + ctxt_spill_fixups + + There is a useful chunk of ASCII-art in the leading comment of layout, + if you want to see how a frame goes together, I recommend reading it. + + + - Trans is the big one. This is the "translate AST to IL" pass, and it's a + bit of a dumping ground, sadly. Probably 4x the size of any other + pass. Stuff that is common to the x86 and LLVM backends is factored out + into transutil.ml, but it hardly helps. Suggestions welcome for + splitting it further. + + Trans works *imperatively*. It maintains a stack of emitters, one per + function (or helper-function) and emits Il.quads into the top-of-stack + emitter into while it walks the statements of each function. If at any + point it needs to pause to emit a helper function ("glue function") it + pushes a new emitter onto the stack and emits into that. + + Trans populates the context tables: + + ctxt_all_item_code + ctxt_block_fixups + ctxt_data + ctxt_file_code + ctxt_file_fixups + ctxt_fn_fixups + ctxt_glue_code + + The entries in the tables ending in _code are of type Semant.code, which + is an abstract type covering both function and glue-function code; each + holds an executable block of quads, plus an aggregate count of vregs and + a reference to the spill fixup for that code. + + +- Once it completes trans, driver/main.ml does the "finishing touches": + register allocates each emitted code value (be/ra.ml), emits dwarf for the + crate (me/dwarf.ml), selects instructions (be/x86.ml), then selects one of + the object-file backends (be/elf.ml, be/macho.ml or be/pe.ml) and emits the + selected Asm.frag to it. Hopefully little of this will require further work; + the most incomplete module here is probably dwarf.ml but the remainder are + mostly stable and don't tend to change much, aside from picking bugs out of + them. + + + +Details and curiosities to note along the way: +============================================== + +- Where you might expect there to be a general recursive expression type for + 'expr', you'll find only a very limited non-recursive 3-way switch: binary, + unary, or atom; where atom is either a literal or an lval. This is because + all the "big" expressions (pexps) were boiled off during the desugaring + phase in the frontend. + + +- There are multiple ways to refer to a path. Names, lvals and cargs all + appear to have similar structure (and do). They're all subsets of the + general path grammar, so all follow the rough shape of being either a base + anchor-path or an ext (extension) path with structural recursion to the + left. + + Cargs (constraint arguments) are the sort of paths that can be passed to + constraints in the typestate system, and can contain the special symbol "*" + in the grammar, meaning "thing I am attached to". This is the symbol + BASE_formal in the carg_base type. + + Names are the sort of paths that refer to types or other items. Not slots. + + Lvals are the sort of paths that *might* refer to slots, but we don't + generally know. So they can contain the dynamic-indexing component + COMP_atom. For example, x.(1 + 2).y is an lval. + + +- Only one of these forms is 'a identified: an lval. And moreover, only the + lval *base* is identified; the remainder of the path has to be projected + forward through the referent after lookup. This also means that when you + lookup anything else by name, you have to be using the result immediately, + not storing it in a table for later. + + +- Types are not 'a identified. This means that you (generally) cannot refer to + a *particular* occurrence of a type in the AST and associate information + with it. Instead, we treat types as "pure terms" (not carrying identity) and + calculate properties of them on the fly. For this we use a general fold + defined in me/semant.ml, the family of functions held in a ty_fold + structure, and passed to fold_ty. + + +- There is a possibly-surprising type called "size" in util/common. This is a + type representing a "size expression" that may depend on runtime + information, such as the type descriptors passed to a frame at runtime. This + exists because our type-parameterization scheme is, at the moment, + implemented by passing type descriptors around at runtime, not + code-expansion a la C++ templates. So any time we have a translated indexing + operation or such that depends on a type parameter, we wind up with a size + expression including SIZE_param_size or SIZE_param_align, and have to do + size arithmetic at runtime. Upstream of trans, we generate sizes willy-nilly + and then decide in trans, x86, and dwarf whether they can be emitted + statically or via runtime calculation at the point of use. + + +- Trans generates position-independent code (PIC). This means that it never + refers to the exact position of a fixup in memory at load-time, always the + distance-to-a-fixup from some other fixup, and/or current PC. On x86 this + means we wind up copying the "get next pc thunk" trick used on linux + systems, and/or storing "crate relative" addresses. The runtime and compiler + "know" (unfortunately sometimes quite obscurely) that an immediate pointer + should be encoded as relative-to a given displacement base, and work with + those as necessary. Similarly, they emit code to reify pointer immediates + (add the displacements to displacement-bases) before handing them off to + (say) C library functions that expect "real" pointers. This is all somewhat + messy. + + +- There is one central static data structure, "rust_crate", which is emitted + into the final loadable object and contains pointers to all subsequent + information the runtime may be interested in. It also serves as the + displacement base for a variety of PIC-ish displacements stored + elsewhere. When the runtime loads a crate, it dlsym()s rust_crate, and then + digs around in there. It's the entry-point for crawling the crate's + structure from outside. Importantly: it also contains pointers to the dwarf. + + +- Currently we drive linking off dwarf. That is: when a crate needs to 'use' + an item from another dwarf crate, we dlopen / LoadLibrary and find the + "rust_crate" value, follow its pointers to dwarf tables, and scan around the + dwarf DIE tree resolving the hierarchical name of the used item. This may + change, we decided to recycle dwarf for this purpose early in the language + evolution and may, given the number of simplifications that have occurred + along the way, be able to fall back to C "mangled name" linkage at some + point. Though that decision carries a number of serious constraints, and + should not be taken lightly. + + + +Probably-bad ideas we will want to do differently in the self-hosted compiler: +============================================================================== + +- We desugar too early in rustboot and should preserve the pexp structure + until later. Dherman is likely to argue for movement to a more + expression-focused grammar. This may well happen. + +- Multiple kinds of paths enforced by numerous nearly-isomorphic ML type + constructors is pointless once we're in rust; we can just make type + abbreviations that carry constraints like path : is_name(*) or such. + +- Storing auxiliary information in semant tables is awkward, and we should + figure out a suitably rusty idiom for decorating AST nodes in-place. + Inter-pass dependencies should be managed by augmenting the AST with + ever-more constraints (is_resolved(ast), is_typechecked(ast), etc.) + +- Trans should be organized as pure and value-producing code, not imperatively + emitting quads into emitters. LLVM will enforce this anwyays. See what + happened in lltrans.ml if you're curious what it'll look (more) like. + +- The PIC scheme will have to change, hopefully get much easier. + diff --git a/src/boot/fe/ast.ml b/src/boot/fe/ast.ml index f3991e9b..0b74da3c 100644 --- a/src/boot/fe/ast.ml +++ b/src/boot/fe/ast.ml @@ -1339,6 +1339,33 @@ and fmt_crate (ff:Format.formatter) (c:crate) : unit = let (view,items) = c.node.crate_items in fmt_mod_view ff view; fmt_mod_items ff items +;; + +let ty_children (ty:ty) : ty array = + let children_of_ty_tag ty_tag = Array.concat (htab_vals ty_tag) in + let children_of_ty_fn ty_fn = + let (ty_sig, _) = ty_fn in + let in_slots = ty_sig.sig_input_slots in + let slots = Array.append in_slots [| ty_sig.sig_output_slot |] in + arr_filter_some (Array.map (fun slot -> slot.slot_ty) slots) + in + match ty with + TY_tup tys -> tys + | TY_vec ty' | TY_chan ty' | TY_port ty' | TY_box ty' | TY_mutable ty' + | TY_constrained (ty', _) -> + [| ty' |] + | TY_rec fields -> Array.map snd fields + | TY_tag ty_tag -> children_of_ty_tag ty_tag + | TY_iso ty_iso -> + children_of_ty_tag (ty_iso.iso_group.(ty_iso.iso_index)) + | TY_fn ty_fn -> children_of_ty_fn ty_fn + | TY_obj (_, methods) -> + Array.concat (List.map children_of_ty_fn (htab_vals methods)) + | TY_any | TY_nil | TY_bool | TY_mach _ | TY_int | TY_uint | TY_char + | TY_str | TY_idx _ | TY_task | TY_native _ | TY_param _ + | TY_named _ | TY_type -> + [| |] +;; let sprintf_expr = sprintf_fmt fmt_expr;; let sprintf_name = sprintf_fmt fmt_name;; diff --git a/src/boot/me/trans.ml b/src/boot/me/trans.ml index f77386a9..85dd5265 100644 --- a/src/boot/me/trans.ml +++ b/src/boot/me/trans.ml @@ -2453,9 +2453,8 @@ let trans_visitor note_drop_step ty "drop_ty: obj path"; let binding = get_element_ptr cell Abi.binding_field_binding in let null_jmp = null_check binding in + let rc_jmp = drop_refcount_and_cmp binding in let obj = deref binding in - let rc = get_element_ptr obj 0 in - let rc_jmp = drop_refcount_and_cmp rc in let tydesc = get_element_ptr obj 1 in let body = get_element_ptr obj 2 in let ty_params = @@ -2505,8 +2504,7 @@ let trans_visitor let _ = check_box_rty cell in let null_jmp = null_check cell in - let rc = box_rc_cell cell in - let j = drop_refcount_and_cmp rc in + let j = drop_refcount_and_cmp cell in (* FIXME (issue #25): check to see that the box has * further box members; if it doesn't we can elide the @@ -2525,7 +2523,7 @@ let trans_visitor note_drop_step ty "drop_ty: done box-drop path"; | MEM_interior when type_is_structured ty -> - note_drop_step ty "drop:ty structured-interior path"; + note_drop_step ty "drop_ty structured-interior path"; iter_ty_parts ty_params cell ty (drop_ty ty_params) curr_iso; note_drop_step ty "drop_ty: done structured-interior path"; @@ -2740,14 +2738,35 @@ let trans_visitor emit (Il.jmp Il.JE Il.CodeNone); j - and drop_refcount_and_cmp (rc:Il.cell) : quad_idx = + and drop_refcount_and_cmp (boxed:Il.cell) : quad_idx = iflog (fun _ -> annotate "drop refcount and maybe free"); + let rc = box_rc_cell boxed in + if cx.ctxt_sess.Session.sess_trace_gc || + cx.ctxt_sess.Session.sess_trace_drop + then + begin + trace_str true "refcount--"; + trace_word true boxed; + trace_word true rc + end; emit (Il.binary Il.SUB rc (Il.Cell rc) one); emit (Il.cmp (Il.Cell rc) zero); let j = mark () in emit (Il.jmp Il.JNE Il.CodeNone); j + and incr_refcount (boxed:Il.cell) : unit = + let rc = box_rc_cell boxed in + if cx.ctxt_sess.Session.sess_trace_gc || + cx.ctxt_sess.Session.sess_trace_drop + then + begin + trace_str true "refcount++"; + trace_word true boxed; + trace_word true rc + end; + add_to rc one + and drop_slot (ty_params:Il.cell) (cell:Il.cell) @@ -2917,7 +2936,7 @@ let trans_visitor | (MEM_rc_struct, MEM_rc_struct) -> (* Lightweight copy: twiddle refcounts, move pointer. *) anno "refcounted light"; - add_to (box_rc_cell src) one; + incr_refcount src; if not initializing then drop_ty ty_params dst dst_ty None; diff --git a/src/boot/me/typestate.ml b/src/boot/me/typestate.ml index 3a13561a..6e0b57e1 100644 --- a/src/boot/me/typestate.ml +++ b/src/boot/me/typestate.ml @@ -988,23 +988,30 @@ let lifecycle_visitor * used later on in translation. *) - let (live_block_slots:(node_id Stack.t) Stack.t) = Stack.create () in + let (live_block_slots:(node_id, unit) Hashtbl.t) = Hashtbl.create 0 in + let (block_slots:(node_id Stack.t) Stack.t) = Stack.create () in let (implicit_init_block_slots:(node_id,node_id) Hashtbl.t) = Hashtbl.create 0 in + let push_slot sl = + Stack.push sl (Stack.top block_slots) + in + let mark_slot_init sl = - Stack.push sl (Stack.top live_block_slots) + Hashtbl.replace live_block_slots sl () in let visit_block_pre b = - Stack.push (Stack.create()) live_block_slots; + Stack.push (Stack.create()) block_slots; begin match htab_search implicit_init_block_slots b.id with None -> () - | Some slot -> mark_slot_init slot + | Some slot -> + push_slot slot; + mark_slot_init slot end; inner.Walk.visit_block_pre b in @@ -1026,7 +1033,7 @@ let lifecycle_visitor let visit_block_post b = inner.Walk.visit_block_post b; - let blk_live = Stack.pop live_block_slots in + let blk_slots = Stack.pop block_slots in let stmts = b.node in let len = Array.length stmts in if len > 0 @@ -1037,9 +1044,22 @@ let lifecycle_visitor Ast.STMT_ret _ | Ast.STMT_be _ -> () (* Taken care of in visit_stmt_post below. *) - | _ -> - let slots = stk_elts_from_top blk_live in - note_drops s slots + | _ -> + (* The blk_slots stack we have has accumulated slots in + * declaration order as we walked the block; the top of the + * stack is the last-declared slot. We want to generate + * slot-drop obligations here for the slots in top-down order + * (starting with the last-declared) but only hitting those + * slots that actually got initialized (went live) at some + * point in the block. + *) + let slots = stk_elts_from_top blk_slots in + let live = + List.filter + (fun i -> Hashtbl.mem live_block_slots i) + slots + in + note_drops s live end; in @@ -1081,6 +1101,9 @@ let lifecycle_visitor init_lval lv_dst end; + | Ast.STMT_decl (Ast.DECL_slot (_, sloti)) -> + push_slot sloti.id + | Ast.STMT_init_rec (lv_dst, _, _) | Ast.STMT_init_tup (lv_dst, _) | Ast.STMT_init_vec (lv_dst, _) @@ -1107,7 +1130,7 @@ let lifecycle_visitor (fst f.Ast.for_each_slot).id - | _ -> () + | _ -> () end; inner.Walk.visit_stmt_pre s in @@ -1117,9 +1140,14 @@ let lifecycle_visitor match s.node with Ast.STMT_ret _ | Ast.STMT_be _ -> - let stks = stk_elts_from_top live_block_slots in + let stks = stk_elts_from_top block_slots in let slots = List.concat (List.map stk_elts_from_top stks) in - note_drops s slots + let live = + List.filter + (fun i -> Hashtbl.mem live_block_slots i) + slots + in + note_drops s live | _ -> () in diff --git a/src/boot/util/common.ml b/src/boot/util/common.ml index 168c9f0a..0ea39e2d 100644 --- a/src/boot/util/common.ml +++ b/src/boot/util/common.ml @@ -341,7 +341,6 @@ let bool_of_option x = Some _ -> true | None -> false - (* * Auxiliary stack functions. *) |